MySQL High Performance
I have recently seen several cases when performance for MySQL queries on a single table was terrible. The reason was simple: the wrong indexes were added and so the execution plan was poor. Here are guidelines to help you optimize various kinds of single-table queries.
Disclaimer: I will be presenting general guidelines and I do not intend to cover all scenarios. I am pretty confident that you can find examples where what I am writing does not work, but I am also confident that it will help you most of the time. Also I will not discuss features you can find in MySQL 5.6+ like Index Condition Pushdown to keep things simple. Be aware that such features can actually make a significant difference in query response time (for good or for bad).
An index can perform up to 3 actions: filter, sort/group and cover. While the first 2 actions are self-explanatory, not everyone may know what a ‘covering index’ is. Actually that’s very easy. The general workflow for a basic query is:
1. Use an index to find matching records and get the pointers to data.
2. Use the pointers to fetch the corresponding data records.
3. Return the records.
When a covering index can be used, the index already covers all fields requested in the query, so step #2 can be skipped and the workflow is now:
1. Use an index to find matching records
2. Return records
In many cases, indexes are small and can fit in memory while the data is large and does not fit in memory: by using a covering index, you can avoid lots of disk operations and performance can be orders of magnitude better.
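To make the covering-index idea concrete, here is a minimal sketch (the table and index names are hypothetical, not from the post):

```sql
-- Hypothetical table for illustration
CREATE TABLE t (
  id INT PRIMARY KEY,
  c INT,
  c1 INT,
  c2 INT,
  payload VARCHAR(255),
  KEY idx_c_c1_c2 (c, c1, c2)
);

-- The index contains every column the query needs,
-- so step #2 (fetching the data rows) can be skipped:
EXPLAIN SELECT c1, c2 FROM t WHERE c = 100;
```

If the Extra column of the EXPLAIN output shows "Using index", the query is being served from the index alone.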
Let’s now look at different common scenarios.
This is the most basic scenario:

SELECT * FROM t WHERE c = 100
The idea is of course to add an index on (c). However, note that if the criterion is not selective enough, the optimizer may choose to perform a full table scan, which will then be more efficient.
Also note that a frequent variation of this query is when you only select a small subset of fields instead of all fields:

SELECT c1, c2 FROM t WHERE c = 100

Here it could make sense to create an index on (c, c1, c2) because it will be a covering index. Do not create an index on (c1, c2, c)! It will still be covering, but it will not be usable for filtering (remember that you can only use a left-most prefix of an index to filter).

Multiple equalities

SELECT * FROM t WHERE c = 100 AND d = 'xyz'
This is also very easy to optimize: just add an index on (c, d) or (d, c).
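As a small sketch (names are made up): with equalities only, the order of the columns in the WHERE clause does not matter, and either index column order can filter on both conditions:

```sql
ALTER TABLE t ADD INDEX idx_c_d (c, d);

-- Both conditions are equalities, so the optimizer can use
-- the full (c, d) index whichever way the predicates are written:
EXPLAIN SELECT * FROM t WHERE c = 100 AND d = 'xyz';
EXPLAIN SELECT * FROM t WHERE d = 'xyz' AND c = 100;
```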
Equality and inequality

SELECT * FROM t WHERE c > 100 AND d = 'xyz'

Here we must be careful: as soon as we use a column with an inequality, this prevents us from using further columns in the index.
Therefore if we create an index on (d, c), we will be able to filter both on c and d, this is good.
But if we create an index on (c, d), we will only be filtering on c, which is less efficient.
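One way to see the difference (hypothetical index names) is to create both candidates and compare the key_len value in the EXPLAIN output, which tells you how many index columns are actually used for filtering:

```sql
-- Good: equality column first, inequality column last
ALTER TABLE t ADD INDEX idx_d_c (d, c);

-- Less good: the range on c stops index usage at c,
-- so d cannot be used for filtering
ALTER TABLE t ADD INDEX idx_c_d (c, d);

-- Force each index in turn and compare key_len:
EXPLAIN SELECT * FROM t FORCE INDEX (idx_d_c)
WHERE c > 100 AND d = 'xyz';

EXPLAIN SELECT * FROM t FORCE INDEX (idx_c_d)
WHERE c > 100 AND d = 'xyz';
```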
So unlike the situation where you only have equalities, the order of the columns matters when inequalities are used.

Multiple inequalities

SELECT * FROM t WHERE c > 100 AND b < 10 AND d = 'xyz'
As we have 2 inequalities, we already know that we will not be able to filter on both conditions (*). So we have to make a decision: will we filter on (d, b) or on (d, c)?
It is not possible to tell which option is better without looking at the data: simply choose the column where the inequality is the most selective. The main point is that you must put the column(s) with an equality first.
(*) Actually there is a way to ‘filter’ on both inequalities: partition on b and add an index on (d, c), or partition on c and add an index on (d, b). The details are out of the scope of this post, but it might be an option for some situations.

Equalities and sort

SELECT * FROM t WHERE c = 100 AND d = 'xyz' ORDER BY b
As mentioned in the first paragraph, an index can filter and sort so this query is easy to optimize. However like for inequalities, we must carefully choose the order of the columns in the index: the rule is that we will filter first, and then sort.
With that in mind, it is easy to know that (c, d, b) or (d, c, b) will be good indexes while (b, c, d) or (b, d, c) are not as good (they will sort but not filter).
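A quick sketch (hypothetical index name): when the index both filters and sorts, the EXPLAIN output should not contain "Using filesort" in the Extra column:

```sql
ALTER TABLE t ADD INDEX idx_c_d_b (c, d, b);

-- With (c, d, b) the matching rows come out of the index
-- already ordered by b within each (c, d) group, so no
-- extra sort pass is needed:
EXPLAIN SELECT * FROM t WHERE c = 100 AND d = 'xyz' ORDER BY b;
```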
And if we have:

SELECT c1, c2 FROM t WHERE c = 100 AND d = 'xyz' ORDER BY b
We can create a super efficient index that will filter, sort and be covering: (c, d, b, c1, c2).

Inequality and sort
We have 2 main variations here. The first one is:

SELECT * FROM t WHERE c > 100 AND d = 'xyz' ORDER BY b
Two options look reasonable in this case:
1. Filter on d and sort by b (with an index on (d, b)).
2. Filter on d and c (with an index on (d, c)), and sort the matching records afterwards.
Which strategy is more efficient? It will depend on your data, so you will have to experiment.
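A simple way to experiment (index names are made up) is to create both candidate indexes and force each one in turn, comparing the plans and the actual response times on your data:

```sql
ALTER TABLE t ADD INDEX idx_d_b (d, b);  -- filter on d, sort by b
ALTER TABLE t ADD INDEX idx_d_c (d, c);  -- filter on d and c, sort afterwards

EXPLAIN SELECT * FROM t FORCE INDEX (idx_d_b)
WHERE c > 100 AND d = 'xyz' ORDER BY b;

EXPLAIN SELECT * FROM t FORCE INDEX (idx_d_c)
WHERE c > 100 AND d = 'xyz' ORDER BY b;
```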
The second variation is:

SELECT * FROM t WHERE c > 100 ORDER BY b
This time we have no equality, so we have to choose between filtering and sorting. Most likely you will choose filtering.

Conclusion
Not all cases have been covered in this post but you can already see that in some cases you will create poor MySQL indexes if you are not careful. In a future post, I will present a case that can look confusing at first sight but which is easy to understand if you already know everything mentioned here.
The post Indexing 101: Optimizing MySQL queries on a single table appeared first on MySQL Performance Blog.
With highlights that included news of Percona’s acquisition of Tokutek, a lively keynote discussion with Apple legend Steve “Woz” Wozniak, scores of technical sessions, tutorials and a festive MySQL community dinner and game night, last week’s Percona Live MySQL Conference and Expo had something for everyone.
More than 1,200 attendees from around the world converged upon Santa Clara, California for the event, which included for the first time a two-day OpenStack Live track alongside a two-day crash course for aspiring MySQL DBAs called “MySQL 101.”
With the Tokutek acquisition, announced last Tuesday by Peter Zaitsev, Percona becomes the first company to offer both MySQL and MongoDB software and solutions. Percona has also taken over development and support for TokuDB® and TokuMX™ as well as the revolutionary Fractal Tree® indexing technology that enables those products to deliver improved performance, reliability and compression for modern Big Data applications. Peter talks in depth about the technologies in last week’s post.
Also on Tuesday, “Woz” and Jim Doherty, EVP of Sales & Marketing, talked about a range of issues associated with technology and innovation.
The Apple co-founder and inventor of the Apple I and Apple II computers also shared his thoughts on what influenced him growing up, his approach to problem solving, childhood education, artificial intelligence and more.
The other keynotes included:
- Peter Zaitsev’s opening keynote
- Peter Zaitsev announces Percona’s acquisition of Tokutek
- Oracle’s Tomas Ulin, whose keynote was titled “Celebrating MySQL“
- Facebook’s Harrison Fisk, who spoke about “Polyglot Persistence @Facebook“
- A panel discussion on the next disruptive technology titled “What comes after the Cloud and Big Data?“
- “WebScaleSQL: The Power of Collaboration,” by Facebook’s Steaphan Greene
- MariaDB’s Patrik Sallner, whose keynote was titled “1+1>2“
- VMware’s Robert Hodges answered the question, “What Has the Cloud Done for My Data Lately?“
- Rob Young presented “Percona’s Take on Smart Data Management – What’s Next for Products and Services“
- And Tesora’s Amrith Kumar spoke about “Building Databases for the ‘-as-a-Service’ World“
In addition to the above keynotes, all of the sessions were recorded and will soon be available to registered attendees, who had access to them at the conference. Most of the slides will also soon be available to everyone via the Percona Live 2015 site: just click the session you’re interested in and scroll to the bottom of the page to view the slides.
Congratulations go out to all of the MySQL Community Awards 2015 winners with a special thanks to Shlomi Noach and Jeremy Cole for running the awards program.
Special thanks also goes out to the Percona Live and OpenStack Live 2015 conference committees, which together organized a fantastic week of events. And of course none of the events would have been possible without our generous Percona Live sponsors and OpenStack Live sponsors.
Finally, a round of applause for Percona’s director of conferences, Kortney Runyan, for her monumental efforts organizing the event. Kortney could not have succeeded without the support of our multiple service vendors including Ireland Presentations, Carleson Production Group, Tricord, the Hyatt Santa Clara, and the Santa Clara Convention Center, to name just a few.
Also announced last week was the new venue for Percona Live Europe, which will be held September 21-22 in Amsterdam. Percona Live Amsterdam promises to be bigger and better than ever. The Call for Papers is open for this exciting new venue so be sure to submit your proposals now.
See you in Amsterdam this September! And be sure to save the date for Percona Live 2016 – April 18-21 at the Hyatt Santa Clara and the Santa Clara Convention Center.
P.S. For more Percona Live and OpenStack Live 2015 photos…
MySQL sharding is one of the most used and surely the most abused MySQL scaling technology. My April 2 Dzone article, “To Shard, or Not to Shard,” proved there is indeed quite an interest in this topic.
As such, I’m hosting a live webinar tomorrow (April 22) that will shed light on questions about sharding with MySQL. It’s titled: “To Shard or Not to Shard: That is the Question!”
I’ll be answering questions such as:
- Is sharding right for your application or should you use other scaling technologies?
- If you’re sharding, what things do you need to consider and which questions do you need to have answered?
- What kind of specific technologies can assist you with sharding?
I hope you can make it for this April 22 webinar. It starts at 10 a.m. Pacific time. Please register now and bring your questions, as sharing them with me and the other attendees is half of the fun of live webinars.
Or if you prefer, share your questions about sharding with MySQL in the comments section below, and I’ll do my best to answer them. I’ll be writing a followup post that will include all questions and my answers soon. A recording of this webinar along with my slides will also be available here afterwards.
The post Considering Sharding with MySQL? Join my April 22 webinar. Questions welcome! appeared first on MySQL Performance Blog.
When optimizing queries and investigating performance issues, MySQL comes with built-in support for profiling queries, aka SET profiling = 1;. This is already awesome and simple to use, but why look for a PERFORMANCE_SCHEMA alternative?
Because profiling will be removed soon (it is already deprecated as of MySQL 5.6 and 5.7). Also, the built-in profiling capability can only be enabled per session, which means that you cannot capture profiling information for queries running from other connections. If you are using Percona Server, the profiling option for log_slow_verbosity is a nice alternative; unfortunately, not everyone is using Percona Server.
Now, for a quick demo: I execute a simple query and profile it below. Note that all of these commands are executed from a single session to my test instance.

mysql> SHOW PROFILES;
+----------+------------+----------------------------------------+
| Query_ID | Duration   | Query                                  |
+----------+------------+----------------------------------------+
|        1 | 0.00011150 | SELECT * FROM sysbench.sbtest1 LIMIT 1 |
+----------+------------+----------------------------------------+
1 row in set, 1 warning (0.00 sec)

mysql> SHOW PROFILE SOURCE FOR QUERY 1;
+----------------------+----------+-----------------------+------------------+-------------+
| Status               | Duration | Source_function       | Source_file      | Source_line |
+----------------------+----------+-----------------------+------------------+-------------+
| starting             | 0.000017 | NULL                  | NULL             |        NULL |
| checking permissions | 0.000003 | check_access          | sql_parse.cc     |        5797 |
| Opening tables       | 0.000021 | open_tables           | sql_base.cc      |        5156 |
| init                 | 0.000009 | mysql_prepare_select  | sql_select.cc    |        1050 |
| System lock          | 0.000005 | mysql_lock_tables     | lock.cc          |         306 |
| optimizing           | 0.000002 | optimize              | sql_optimizer.cc |         138 |
| statistics           | 0.000006 | optimize              | sql_optimizer.cc |         381 |
| preparing            | 0.000005 | optimize              | sql_optimizer.cc |         504 |
| executing            | 0.000001 | exec                  | sql_executor.cc  |         110 |
| Sending data         | 0.000025 | exec                  | sql_executor.cc  |         190 |
| end                  | 0.000002 | mysql_execute_select  | sql_select.cc    |        1105 |
| query end            | 0.000003 | mysql_execute_command | sql_parse.cc     |        5465 |
| closing tables       | 0.000004 | mysql_execute_command | sql_parse.cc     |        5544 |
| freeing items        | 0.000005 | mysql_parse           | sql_parse.cc     |        6969 |
| cleaning up          | 0.000006 | dispatch_command      | sql_parse.cc     |        1874 |
+----------------------+----------+-----------------------+------------------+-------------+
15 rows in set, 1 warning (0.00 sec)
To demonstrate how we can achieve the same with Performance Schema, we first identify our current connection id. In the real world, you might want to get the connection/processlist id of the thread you want to watch, i.e. from SHOW PROCESSLIST.

mysql> SELECT THREAD_ID INTO @my_thread_id
    -> FROM threads WHERE PROCESSLIST_ID = CONNECTION_ID();
Query OK, 1 row affected (0.00 sec)
Next, we identify the bounding EVENT_IDs for the statement stages. We will look for the statement we want to profile using the query below against the events_statements_history_long table. Your LIMIT clause may vary depending on how many queries the server might be getting.

mysql> SELECT THREAD_ID, EVENT_ID, END_EVENT_ID, SQL_TEXT, NESTING_EVENT_ID
    -> FROM events_statements_history_long
    -> WHERE THREAD_ID = @my_thread_id
    ->   AND EVENT_NAME = 'statement/sql/select'
    -> ORDER BY EVENT_ID DESC LIMIT 3 \G
*************************** 1. row ***************************
       THREAD_ID: 13848
        EVENT_ID: 419
    END_EVENT_ID: 434
        SQL_TEXT: SELECT THREAD_ID INTO @my_thread_id FROM threads WHERE PROCESSLIST_ID = CONNECTION_ID()
NESTING_EVENT_ID: NULL
*************************** 2. row ***************************
       THREAD_ID: 13848
        EVENT_ID: 374
    END_EVENT_ID: 392
        SQL_TEXT: SELECT * FROM sysbench.sbtest1 LIMIT 1
NESTING_EVENT_ID: NULL
*************************** 3. row ***************************
       THREAD_ID: 13848
        EVENT_ID: 353
    END_EVENT_ID: 364
        SQL_TEXT: select @@version_comment limit 1
NESTING_EVENT_ID: NULL
3 rows in set (0.02 sec)
From the results above, we are mostly interested in the EVENT_ID and END_EVENT_ID values from the second row; these give us the stage events of this particular query from the events_stages_history_long table.

mysql> SELECT EVENT_NAME, SOURCE, (TIMER_END-TIMER_START)/1000000000 as 'DURATION (ms)'
    -> FROM events_stages_history_long
    -> WHERE THREAD_ID = @my_thread_id AND EVENT_ID BETWEEN 374 AND 392;
+--------------------------------+----------------------+---------------+
| EVENT_NAME                     | SOURCE               | DURATION (ms) |
+--------------------------------+----------------------+---------------+
| stage/sql/init                 | mysqld.cc:998        |        0.0214 |
| stage/sql/checking permissions | sql_parse.cc:5797    |        0.0023 |
| stage/sql/Opening tables       | sql_base.cc:5156     |        0.0205 |
| stage/sql/init                 | sql_select.cc:1050   |        0.0089 |
| stage/sql/System lock          | lock.cc:306          |        0.0047 |
| stage/sql/optimizing           | sql_optimizer.cc:138 |        0.0016 |
| stage/sql/statistics           | sql_optimizer.cc:381 |        0.0058 |
| stage/sql/preparing            | sql_optimizer.cc:504 |        0.0044 |
| stage/sql/executing            | sql_executor.cc:110  |        0.0008 |
| stage/sql/Sending data         | sql_executor.cc:190  |        0.0251 |
| stage/sql/end                  | sql_select.cc:1105   |        0.0017 |
| stage/sql/query end            | sql_parse.cc:5465    |        0.0031 |
| stage/sql/closing tables       | sql_parse.cc:5544    |        0.0037 |
| stage/sql/freeing items        | sql_parse.cc:6969    |        0.0056 |
| stage/sql/cleaning up          | sql_parse.cc:1874    |        0.0006 |
+--------------------------------+----------------------+---------------+
15 rows in set (0.01 sec)
As you can see, the results are pretty close: not exactly the same, but close. Note that SHOW PROFILE shows the Duration in seconds, while the results above are in milliseconds.
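As a variation (an untested sketch, not from the original post), the two lookups can be combined so you do not have to copy the EVENT_ID range by hand: the stage rows of a statement fall between its EVENT_ID and END_EVENT_ID.

```sql
SELECT st.SQL_TEXT, sg.EVENT_NAME, sg.SOURCE,
       (sg.TIMER_END - sg.TIMER_START)/1000000000 AS 'DURATION (ms)'
FROM performance_schema.events_statements_history_long st
JOIN performance_schema.events_stages_history_long sg
  ON sg.THREAD_ID = st.THREAD_ID
 AND sg.EVENT_ID BETWEEN st.EVENT_ID AND st.END_EVENT_ID
WHERE st.THREAD_ID = @my_thread_id
  AND st.SQL_TEXT LIKE 'SELECT * FROM sysbench%'
ORDER BY sg.EVENT_ID;
```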
Some limitations to this method though:
- As we’ve seen, it takes a few hoops to dish out the information we need. Because we have to identify the statement to profile manually, this procedure may not be easy to port into tools like the sys schema or pstop.
- It is only possible if Performance Schema is enabled (by default it’s enabled since MySQL 5.6.6, yay!).
- It does not cover all the metrics available with native profiling, i.e. CONTEXT SWITCHES, BLOCK IO, SWAPS.
- Depending on how busy the server you are running the tests on is, the history tables may be too small, so you either have to increase their size (i.e. the performance_schema_events_stages_history_long_size variable) or lose the history too early. Using ps_history might help in this case, though with a little modification to the queries.
- The resulting Duration per event may vary; I would think this may be due to the additional timer overhead, as described in the performance_timers table. In any case, we hope to get this cleared up when this bug is fixed.
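Related to the second limitation above: besides the Performance Schema itself being enabled, the relevant consumers and instruments must be on for the history tables to be populated. A quick way to check and enable them at runtime (a sketch; adjust to your setup):

```sql
-- Check whether the long-history consumers are enabled:
SELECT NAME, ENABLED FROM performance_schema.setup_consumers
WHERE NAME LIKE '%history_long';

-- Enable the statement/stage history consumers if needed:
UPDATE performance_schema.setup_consumers SET ENABLED = 'YES'
WHERE NAME IN ('events_statements_history_long',
               'events_stages_history_long');

-- And make sure the stage instruments are collected and timed:
UPDATE performance_schema.setup_instruments
SET ENABLED = 'YES', TIMED = 'YES'
WHERE NAME LIKE 'stage/%';
```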
Data inconsistencies in replication environments are pretty common. There are lots of posts that explain how to fix them using pt-table-checksum and pt-table-sync. Usually we only care about the data, but from time to time we receive this question in support: How can I check the table definition consistency between servers?
Replication also allows us to have different table definitions between master and slaves. For example, there are cases where you need some indexes on slaves for querying purposes that are not really needed on the master. There are other cases where those differences are just a mistake that needs to be fixed.
mysqldiff, included in Oracle’s MySQL Utilities, can help us find those differences and get the information we need to fix them. In this post I’m going to show you how to use it with an example.

Find table definition inconsistencies
mysqldiff allows us to find those inconsistencies by checking the differences between tables on the same server (different databases) or on different servers (also possibly on different databases). In this example I’m going to search for differences in table definitions between two different servers, server1 and server2.
The command line is pretty simple. This is used to compare the tables on the “test” database:

mysqldiff --server1=user@host1 --server2=user@host2 test:test
If the database name is different:

mysqldiff --server1=user@host1 --server2=user@host2 testdb:anotherdb
If the table name is different:

mysqldiff --server1=user@host1 --server2=user@host2 testdb.table1:anotherdb.anothertable
Now I want to check the table definition consistency between two servers. The database’s name is “employees”:

# mysqldiff --force --server1=root:email@example.com:21489 --server2=root:firstname.lastname@example.org:21490 employees:employees
# WARNING: Using a password on the command line interface can be insecure.
# server1 on 127.0.0.1: ... connected.
# server2 on 127.0.0.1: ... connected.
# Comparing `employees` to `employees`                             [PASS]
# Comparing `employees`.`departments` to `employees`.`departments` [FAIL]
# Object definitions differ. (--changes-for=server1)
#

--- `employees`.`departments`
+++ `employees`.`departments`
@@ -1,6 +1,6 @@
 CREATE TABLE `departments` (
   `dept_no` char(4) NOT NULL,
-  `dept_name` varchar(40) NOT NULL,
+  `dept_name` varchar(256) DEFAULT NULL,
   PRIMARY KEY (`dept_no`),
   UNIQUE KEY `dept_name` (`dept_name`)
 ) ENGINE=InnoDB DEFAULT CHARSET=latin1

# Comparing `employees`.`dept_emp` to `employees`.`dept_emp`         [PASS]
# Comparing `employees`.`dept_manager` to `employees`.`dept_manager` [PASS]
# Comparing `employees`.`employees` to `employees`.`employees`       [FAIL]
# Object definitions differ. (--changes-for=server1)
#

--- `employees`.`employees`
+++ `employees`.`employees`
@@ -5,5 +5,6 @@
   `last_name` varchar(16) NOT NULL,
   `gender` enum('M','F') NOT NULL,
   `hire_date` date NOT NULL,
-  PRIMARY KEY (`emp_no`)
+  PRIMARY KEY (`emp_no`),
+  KEY `last_name` (`last_name`,`first_name`)
 ) ENGINE=InnoDB DEFAULT CHARSET=latin1

# Comparing `employees`.`salaries` to `employees`.`salaries` [PASS]
# Comparing `employees`.`titles` to `employees`.`titles`     [PASS]
Compare failed. One or more differences found.
There are at least two differences: one in the departments table and another one in the employees table. The output is similar to diff. By default the tool stops after finding the first difference; that’s why we use --force, to tell the tool to continue checking all the tables.
It shows us that on departments the dept_name column is varchar(40) on server1 and varchar(256) on server2. The employees table has a KEY (last_name, first_name) on server2 that is not present on server1. Why is it taking server2 as a reference? Because of this line:

# Object definitions differ. (--changes-for=server1)
So, the changes shown in the diff are for server1. If you want server2 to be the one changed, with server1 used as the reference, then --changes-for=server2 would be needed.
In some cases the diff output is not really useful. We actually need a SQL query to do the changes on the server. We just need to add --difftype=sql to the command line:

# mysqldiff --force --difftype=sql --server1=root:email@example.com:21489 --server2=root:firstname.lastname@example.org:21490 employees:employees
[...]
# Comparing `employees`.`departments` to `employees`.`departments` [FAIL]
# Transformation for --changes-for=server1:

ALTER TABLE `employees`.`departments`
  DROP INDEX dept_name,
  ADD UNIQUE INDEX dept_name (dept_name),
  CHANGE COLUMN dept_name dept_name varchar(256) NULL;

[...]
# Comparing `employees`.`employees` to `employees`.`employees` [FAIL]
# Transformation for --changes-for=server1:

ALTER TABLE `employees`.`employees`
  DROP PRIMARY KEY,
  ADD PRIMARY KEY(`emp_no`),
  ADD INDEX last_name (last_name,first_name);
As we can see, the tool is not perfect. There are two problems here:
1- On the “departments” table it drops a UNIQUE key that is present on both servers, only to add it again. A waste of time and resources.
2- On the “employees” table it drops and recreates the PRIMARY KEY, again something that is not needed at all.
I have created a bug report, but this also teaches us a good lesson: don’t just copy and paste commands without double-checking them first.

What does mysqldiff run under the hood?
Mostly queries on INFORMATION_SCHEMA. These are the ones used to check inconsistencies on departments:

SHOW CREATE TABLE `departments`;

SELECT TABLE_SCHEMA, TABLE_NAME, ENGINE, AUTO_INCREMENT, AVG_ROW_LENGTH,
       CHECKSUM, TABLE_COLLATION, TABLE_COMMENT, ROW_FORMAT, CREATE_OPTIONS
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA = 'employees' AND TABLE_NAME = 'departments';

SELECT ORDINAL_POSITION, COLUMN_NAME, COLUMN_TYPE, IS_NULLABLE,
       COLUMN_DEFAULT, EXTRA, COLUMN_COMMENT, COLUMN_KEY
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'employees' AND TABLE_NAME = 'departments';

SELECT PARTITION_NAME, SUBPARTITION_NAME, PARTITION_ORDINAL_POSITION,
       SUBPARTITION_ORDINAL_POSITION, PARTITION_METHOD, SUBPARTITION_METHOD,
       PARTITION_EXPRESSION, SUBPARTITION_EXPRESSION, PARTITION_DESCRIPTION
FROM INFORMATION_SCHEMA.PARTITIONS
WHERE TABLE_SCHEMA = 'employees' AND TABLE_NAME = 'departments';

SELECT CONSTRAINT_NAME, COLUMN_NAME, REFERENCED_TABLE_SCHEMA,
       REFERENCED_TABLE_NAME, REFERENCED_COLUMN_NAME
FROM INFORMATION_SCHEMA.KEY_COLUMN_USAGE
WHERE TABLE_SCHEMA = 'employees' AND TABLE_NAME = 'departments'
  AND REFERENCED_TABLE_SCHEMA IS NOT NULL;

(The TABLES, COLUMNS and PARTITIONS queries are then run a second time, once per server.)
As a summary, it checks partitions, row format, collation, constraints and so on.

Conclusion
There are different tools for different purposes. We can check data consistency with pt-table-checksum/pt-table-sync, and also the table definitions with mysqldiff.
The post Checking table definition consistency with mysqldiff appeared first on MySQL Performance Blog.
If you haven’t already heard, on the Tuesday morning of the 2015 Percona Live MySQL Conference and Expo it was announced that Tokutek is now part of the Percona family. This means TokuDB® for MySQL, and TokuMX™ for MongoDB are Percona products now; and that the Tokutek team is now part of the Percona team.
Percona’s well-deserved reputation for unparalleled customer service and support in the MySQL market makes them the perfect home for Tokutek’s ground-breaking products. And with the Tokutek acquisition, Percona can expand and extend their activities and offerings into the MongoDB market.
This is a win/win for NoSQL and MySQL fans alike.

More About Tokutek
Tokutek is the company that productized a new and revolutionary form of database indexing designed specifically for modern, Big Data applications. Based on data science research on new methods for high-performance data processing for data sets that no longer fit in memory, Fractal Tree® indexing is the secret sauce inside TokuDB and TokuMX.
Unlike the 40-year-old B-tree indexing found in other MySQL and MongoDB solutions, Fractal Tree indexing enables: up to 50x better performance; as much as 90% data compression; and 95% better write-optimization. That translates into significant customer satisfaction gains as well as major cost savings.
In addition, drawing upon their experience in the MySQL world, Tokutek developers introduced full ACID and MVCC transaction compliance, better concurrency, and an improved failover protocol to the MongoDB marketplace with TokuMX. And that means better reliability for mission-critical big data applications built with MongoDB.

Next Steps
The Tokutek team is very excited to be joining the Percona team as we move into the next phase of growth on the MySQL and NoSQL market.
For now, if you want to learn more about TokuDB and TokuMX please visit www.tokutek.com. (In the coming weeks, the Tokutek site will be folded into the Percona site.)
If you want to strike up a conversation about enterprise subscriptions for either product drop us a line at email@example.com.
Vice President, Percona Sales
It is my pleasure to announce that Percona has acquired Tokutek and will take over development and support for TokuDB® and TokuMX™ as well as the revolutionary Fractal Tree® indexing technology that enables those products to deliver improved performance, reliability and compression for modern Big Data applications.
At Percona we have been working with the Tokutek team since 2009, helping to improve performance and scalability. The TokuDB storage engine has been available for Percona Server for about a year, so joining forces is quite a natural step for us.
Fractal Tree indexing technology—developed through years of data science research at MIT, Stony Brook University and Rutgers University—is the new-generation data structure which, for many workloads, leapfrogs traditional B-tree technology, which was invented in 1972 (over 40 years ago!). It is also often superior to LSM indexing, especially for mixed workloads.
But as we all know in software engineering, an idea alone is not enough. There are hundreds of databases which have data structures based on essentially the same B-Tree idea, but their performance and scalability differs dramatically. The Tokutek engineering team has spent more than 50 man years designing, implementing and polishing this technology, which resulted (in my opinion) in the only production-ready Open Source transactional alternative to the InnoDB storage engine in the MySQL space – TokuDB; and the only viable alternative distribution of MongoDB – TokuMX.
Designed for the Modern World – TokuDB and TokuMX were designed with modern database workloads, modern hardware and modern operating system properties in mind, which allowed for a much cleaner and more scalable architecture, leading to great performance and scalability.
Compression at Speed – Compression was an early part of the design, so a very high level of compression can be achieved with low performance overhead. In fact, chances are that with fast compression you will get better performance with compression enabled.
Great Read/Write Balance – Databases (or storage engines) are often classified as read-optimized or write-optimized. You have most likely heard about the much better insert speed with Fractal Tree indexing, both for MySQL and MongoDB, but you may not know that this is achieved with read performance in the same ballpark, or better, for many workloads. The difference is just not so drastic.
Multiple Clustered Keys – This is a great feature which, together with compression and low-cost index maintenance, allows TokuDB and TokuMX to reach much better performance for performance-critical queries by clustering the data needed by such queries together.
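As an illustration of a clustered secondary key (a sketch using TokuDB’s engine-specific CLUSTERING KEY extension; table and column names are made up):

```sql
CREATE TABLE orders (
  id BIGINT NOT NULL AUTO_INCREMENT,
  customer_id BIGINT NOT NULL,
  created_at DATETIME NOT NULL,
  total DECIMAL(10,2) NOT NULL,
  PRIMARY KEY (id),
  -- A clustering key stores a copy of the whole row alongside
  -- the secondary index entries:
  CLUSTERING KEY ck_customer (customer_id, created_at)
) ENGINE=TokuDB;

-- Queries by customer read every needed column directly from the
-- clustering key, with no lookup back into the primary key:
SELECT created_at, total FROM orders
WHERE customer_id = 42 ORDER BY created_at;
```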
Messages – A conventional data structure, such as a B-tree or a hash table, is essentially a way data is stored and operations are performed on it. Fractal Tree indexing operates with a different paradigm, focused around “messages” being delivered towards the data to perform the operations in question. This allows it to do a lot of clever stuff, such as implementing more complex operations with a single message, merging multiple messages together to optimize performance, and using messages for internal purposes such as low-overhead online optimization, table structure changes, etc.
Low Overhead Maintenance – One of the obvious uses of such messages is low-overhead maintenance. The InnoDB storage engine allows you to add a column “online,” which internally requires a full table rebuild, demanding a lot of time and resources to copy the table. TokuDB, however, can use a “broadcast message” to add the column, which becomes available almost immediately and gradually propagates physically as data is modified. It is quite a difference!
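A sketch of the difference described above (hypothetical table; behavior depends on version and configuration):

```sql
-- On a TokuDB table, this column addition is handled via a
-- broadcast message: the statement returns almost immediately,
-- and the new column is physically propagated later, as rows
-- are modified. On InnoDB (pre-5.6), the same statement would
-- trigger a full table rebuild.
ALTER TABLE t ADD COLUMN note VARCHAR(64) DEFAULT NULL;
```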
Smart No-Read Updates – Messages allow you to do smart, complex updates without reading the data, dramatically improving performance. For example, this is used to implement “Read Free Replication.”
Optimized In-Memory Data Structures – You may have heard a lot about in-memory databases, which are faster because they use data structures optimized for the properties of memory, rather than just caching pages from disk as, for example, MyISAM and InnoDB do. TokuDB and TokuMX offer you the best of both worlds by using memory-optimized data structures for resident data and disk-optimized data structures when data is pushed to disk.
Optimized IO – Whether you’re using legacy spinning media or solid state storage, you will appreciate that TokuDB has optimized IO – doing less, and more sequential, IO. This helps spinning media performance and dramatically reduces wear on flash, so you can improve the longevity of your media or use lower-cost storage.
Between the Tokutek engineering team and Percona we have a lot of ideas on how to take this technology even further, so it is the technology of choice for large portions of modern database workloads in the MySQL and MongoDB space. We are committed to working together to advance the limits of Open Source databases (relational or not)!
Interested to check out whether TokuDB or TokuMX is right for your application? Please contact us at firstname.lastname@example.org.
Percona is pleased to announce the availability of Percona Toolkit 2.2.14. Released April 14, 2015. Percona Toolkit is a collection of advanced command-line tools to perform a variety of MySQL server and system tasks that are too difficult or complex for DBAs to perform manually. Percona Toolkit, like all Percona software, is free and open source.
This release is the current GA (Generally Available) stable release in the 2.2 series. It includes multiple bug fixes for pt-table-checksum with better support for Percona XtraDB Cluster, various other fixes, as well as continued preparation for MySQL 5.7 compatibility. Full details are below. Downloads are available here and from the Percona Software Repositories.
- pt-slave-find can now resolve the IP address and show the slave’s hostname. This can be done with the new --resolve-address option.
- pt-table-sync can now ignore the tables whose names match a specific Perl regex with the new --ignore-tables-regex option.
- Fixed bug 925781: Inserting non-BMP characters into a column with the utf8 charset would cause an Incorrect string value error when running pt-table-checksum.
- Fixed bug 1368244: pt-online-schema-change --alter-foreign-keys-method=drop-swap was not atomic and thus it could be interrupted. Fixed by disabling common interrupt signals during the critical drop-rename phase.
- Fixed bug 1381280: pt-table-checksum was failing on a BINARY field in a primary key. Fixed by implementing a new --binary-index flag to optionally create the checksum table using the BLOB data type.
- Fixed bug 1421405: Running pt-upgrade against a log with many identical (or similar) queries was producing repeated sections with the same fingerprint.
- Fixed bug 1402730: pt-duplicate-key-checker was not checking for duplicate keys when --verbose option was set.
- Fixed bug 1406390: A race condition was causing pt-heartbeat to crash with sleep argument error.
- Fixed bug 1417558: pt-stalk when used along with --collect-strace didn’t write the strace output to the expected destination file.
- Fixed bug 1421025: Missing dependency for perl-TermReadKey RPM package was causing toolkit commands to fail when they were run with --ask-pass option.
- Fixed bug 1421781: pt-upgrade would fail when log contained SELECT...INTO queries. Fixed by ignoring/skipping those queries.
- Fixed bug 1425478: pt-stalk was removing non-empty files that were starting with an empty line.
- Fixed bug 1419098: Fixed bad formatting in the pt-table-checksum documentation.
I’ve started my long journey from Florianópolis, Brazil, to Santa Clara, California, and I type these words while waiting for a connecting flight. Next Wednesday, Daniel Guzmán Burgos and I will be presenting at the Percona Live MySQL Conference and Expo (PLMCE).
I’m so excited about the new MySQL 101 program that has been added to this year’s event! Over the years I’ve been working as a Support Engineer at Percona, I’ve heard two very distinct types of comments, among others, from customers and the community in general about PLMCE:
1) They went and it was awesome, but they found it hard to follow, as most of the content in the program was too high-level for them;
2) they wanted to attend the event but just didn’t feel experienced enough with MySQL yet, so they would wait a year or two before committing to it.
Sometimes I’ve also been asked: “Don’t you guys organize a similar conference, but for beginners?” Not exactly, I’d tell them, pointing them to the on-demand training that Percona offers, but this isn’t always a good fit for everybody.
Well, I’m glad that this request was reconsidered this year and that we now have an intensive MySQL 101 2-day program: “You send us developers and admins, and we’ll send you back MySQL DBAs.”
Daniel and I will be talking about the use of a few key tools from the Percona Toolkit from the viewpoint of the day-to-day operations of a MySQL DBA – we’ll showcase how some of them can be integrated with Nagios to improve monitoring. The goal: “stay ahead of MySQL operational problems,” which is also the basis of the title of our talk.
Percona Toolkit is a rich collection of more than 30 command-line tools for MySQL, Percona Server and MariaDB that can help database administrators perform and automate a variety of database and system tasks. Some of the tools focus on diagnostics, some on performance improvement, and some are actually used to fix things. I just love them; they make my day-to-day work with MySQL that much easier!
I hope to see you in Santa Clara this week!
The post Staying ahead of MySQL operational problems at Percona Live appeared first on MySQL Performance Blog.
A few years back Deva wrote about how to use tcpdump on very busy hosts. That post sparked my interest in exploring how to measure the impact of tcpdump on very busy hosts. In this post, I want to highlight how much of an impact there really is and what options you have to make query collection much more effective.
Some things you need to know:
- The test is a sysbench read-only workload: 8 tables, 8 threads, 1,000,000 rows each, with a 16G buffer pool. The dataset is fully in memory.
- sysbench is run on the same host; on a 1Gbps connection sysbench can saturate the network and therefore affect my network test with netcat, so I decided to run it locally.
- There are 13 tests, 5 minutes each with a 1-minute interval, varying in how the dump file is captured.
- The first one, as baseline, is the MySQL slow query log:
A: mysql -e 'set global long_query_time=0, slow_query_log=1; select sleep(300); set global long_query_time=1, slow_query_log=0;'
- The second group is tcpdump with the -w option, which means tcpdump itself is writing to the capture file:
B: $DUMPCMD -i any -G 300 -W 1 -Z root -w tcpdump.pcap port 3306
C: $DUMPCMD -i any -G 300 -W 1 -Z root -w tcpdump.pcap 'port 3306 and tcp & 7 == 2 and tcp & 7 == 2'
D: $DUMPCMD -i any -G 300 -W 1 -Z root -w tcpdump.pcap 'port 3306 and ( tcp & 7 == 2 or tcp & 7 == 2 )'
- The third group uses “packet-buffered” output (the -U option) to see if there is an improvement in response time:
E: $DUMPCMD -i any -G 300 -W 1 -Z root -U -w tcpdump.pcap port 3306
F: $DUMPCMD -i any -G 300 -W 1 -Z root -U -w tcpdump.pcap 'port 3306 and tcp & 7 == 2 and tcp & 7 == 2'
G: $DUMPCMD -i any -G 300 -W 1 -Z root -U -w tcpdump.pcap 'port 3306 and ( tcp & 7 == 2 or tcp & 7 == 2 )'
- The next group streams the capture to a remote location via netcat:
H: $DUMPCMD -i any -G 300 -W 1 -Z root -w - port 3306 | nc remote_ip 33061
I: $DUMPCMD -i any -G 300 -W 1 -Z root -U -w - port 3306 | nc remote_ip 33062
J: $DUMPCMD -i any -G 300 -W 1 -Z root -U -w - 'port 3306 and tcp & 7 == 2 and tcp & 7 == 2' | nc remote_ip 33063
- The last group, the one most of us are probably accustomed to, pipes the dumped packets to a file:
K: timeout -s KILL 300 $DUMPCMD -i any port 3306 > tcpdump.pcap
L: timeout -s KILL 300 $DUMPCMD -i any 'port 3306 and tcp & 7 == 2 and tcp & 7 == 2' > tcpdump.pcap
M: timeout -s KILL 300 $DUMPCMD -i any 'port 3306 and ( tcp & 7 == 2 or tcp & 7 == 2 )' > tcpdump.pcap
- $DUMPCMD is defined as:
tcpdump -s 65535 -x -nn -q -tttt
- In each group there is an AND and an OR variation of the port filtering expression. I wanted to see how much additional impact the filter expressions have. As you will see below, they do not have a significant impact on performance, but they do affect the number of queries captured.
I’ve graphed the sysbench data during the test and labeled each test for easy comparison on the graphs.
Findings
- We see that piping the pcap data as decoded packets (K, L, M) has significant overhead in terms of number of captured queries, response time and read requests completed.
- Using the slow log has about 30% overhead in response time and nearly a 20% drop in throughput, but it has the highest number of queries captured.
- Writing captured packets directly to a binary file using the -w option has the lowest overhead in response time, around 10%. Throughput drops depending on how much filtering is involved, though, and there are noticeable stalls when the operating system flushes the page cache. This side effect causes sysbench to drop to 0 reads or even reach response times of several seconds!
- Streaming packets to a remote server with sufficient network bandwidth and IO performance, combined with the -w option to capture binary data, produces 20-25% overhead in response time and a 10-15% drop in throughput, with no stalls and a number of queries captured close to the slow query log.
Use the tcpdump -w option in all cases and decode later. If you are looking for an overall view of ALL your queries, streaming tcpdump data to a remote host is also ideal. If you have low bandwidth, though, i.e. 100Mbps, this might not be enough, as 5 minutes of binary tcpdump data produced a 31G file. That is a 105MBps requirement! In that case, consider writing to a separate partition with enough IO.
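As a quick sanity check of that bandwidth figure, the arithmetic can be sketched in a few lines (the 31G/5-minute capture size comes from the test above):

```python
# Bandwidth needed to stream a 5-minute, 31GB binary tcpdump capture.
capture_gb = 31
window_sec = 5 * 60

mb_per_sec = capture_gb * 1024 / window_sec   # GB -> MB, spread over the window
print(round(mb_per_sec, 1))                   # ~105.8 MB/s sustained

# For comparison, a 100Mbps link tops out around 12.5 MB/s,
# roughly an order of magnitude short of what this workload produces.
print(round(100 / 8, 1))
```

So on anything slower than roughly a gigabit link, writing the capture to a fast local partition is the safer option.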
If you are using Percona Server or MariaDB and are only looking to capture a portion of your workload, i.e. table scans, temporary tables on disk, or to rate-limit the collection, the extended slow query logging capability in these versions is also an excellent way to capture the data you need.
The post Measuring the impact of tcpdump on Very Busy Hosts appeared first on MySQL Performance Blog.
Recently, I was working on an issue related to locks and deadlocks with InnoDB tables, and I found very interesting details about how InnoDB locks and deadlocks work with or without an index under different isolation levels.
Here, I would like to describe a small test case of how SELECT ... FOR UPDATE (with and without LIMIT) behaves with INSERT/UPDATE under the READ-COMMITTED and REPEATABLE-READ isolation levels. I’m creating a small table, data_col, with a few records. Initially, this test case was written by Bill Karwin to explain the details to a customer, but here I have used a slightly modified version.
CREATE TABLE data_col (dataname VARCHAR(10), period INT, expires DATE, host VARCHAR(10));
INSERT INTO data_col VALUES ('med1', 1, '2014-01-01 00:00:00', 'server1');
INSERT INTO data_col VALUES ('med2', 1, '2014-02-15 00:00:00', 'server2');
INSERT INTO data_col VALUES ('med3', 1, '2014-03-20 00:00:00', 'server3');
INSERT INTO data_col VALUES ('med4', 1, '2014-04-10 00:00:00', 'server4');
INSERT INTO data_col VALUES ('med5', 1, '2014-05-01 00:00:00', 'server5');
Case 1: No index on expires, tx_isolation=READ-COMMITTED.
Session 1: SET tx_isolation='READ-COMMITTED'; START TRANSACTION;
Session 2: SET tx_isolation='READ-COMMITTED'; START TRANSACTION;
Session 1: SELECT * FROM data_col WHERE expires < '2014-03-01' ORDER BY expires LIMIT 1 FOR UPDATE;
Session 2: INSERT INTO data_col VALUES ('med6', 1, '2014-06-03 00:00:00', 'server6'); /* success */
Session 1: SELECT * FROM data_col WHERE expires < '2014-07-01' ORDER BY expires LIMIT 1 FOR UPDATE; /* hangs */
Session 2: SELECT * FROM data_col WHERE expires < '2014-07-01' ORDER BY expires LIMIT 1 FOR UPDATE; /* deadlocks */
With READ-COMMITTED, even if Session 1 locks records matching the condition “expires < '2014-03-01'”, Session 2 can insert the record, because Session 1 is not using gap locks (lock_mode X locks rec but not gap waiting) and we can insert/update records outside the set Session 1 examined. But when Session 1 tries to acquire locks on a higher range (“expires < '2014-07-01'”), it will hang, and if we do the same thing from Session 2, it will turn into a deadlock.
When there is no primary key, InnoDB creates its own clustered index, which is GEN_CLUST_INDEX.
Case 2: No index on expires, tx_isolation=REPEATABLE-READ.
Session 1: SET tx_isolation='REPEATABLE-READ'; START TRANSACTION;
Session 2: SET tx_isolation='REPEATABLE-READ'; START TRANSACTION;
Session 1: SELECT * FROM data_col WHERE expires < '2014-03-01' ORDER BY expires LIMIT 1 FOR UPDATE;
Session 2: INSERT INTO data_col VALUES ('med6', 1, '2014-06-03 00:00:00', 'server6'); /* hangs */
With REPEATABLE-READ, we can see that when Session 1 locks records matching the condition “expires < '2014-03-01'”, Session 2 can’t insert the record and waits for the lock release (lock_mode X insert intention waiting) from Session 1, because Session 1 is using gap locks. Here, insert intention is one type of gap lock: this lock signals the intent to insert in such a way that multiple transactions inserting into the same index gap need not wait for each other if they are not inserting at the same position within the gap.
This is why, for some scenarios/business logic, REPEATABLE-READ is a better isolation level to prevent deadlocks, by using more row locks (including gap locks).
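The compatibility rule described above can be sketched as a toy model (purely illustrative; the names and structure here are made up, and this is not InnoDB's actual lock manager):

```python
# Toy compatibility check for gap locks vs. insert-intention locks.
# Purely illustrative; InnoDB's real lock manager is far more involved.

def conflicts(held, requested):
    """Each lock is (kind, gap_id, position): kind is 'gap' or
    'insert_intention'; position matters only for insert intentions."""
    h_kind, h_gap, h_pos = held
    r_kind, r_gap, r_pos = requested
    if h_gap != r_gap:
        return False  # locks on different gaps never interact
    if h_kind == 'gap' and r_kind == 'insert_intention':
        return True   # a held gap lock blocks inserts into that gap
    if h_kind == 'insert_intention' and r_kind == 'insert_intention':
        # concurrent inserters wait only if aiming at the same position
        return h_pos == r_pos
    return False

# Session 1 holds a gap lock; Session 2's INSERT into the gap must wait:
print(conflicts(('gap', 'g1', None), ('insert_intention', 'g1', 3)))

# Two inserters into the same gap at different positions do not block:
print(conflicts(('insert_intention', 'g1', 2),
                ('insert_intention', 'g1', 5)))
```

This mirrors the behavior seen in the sessions above: the gap lock taken by the FOR UPDATE blocks the INSERT, while independent inserts into the same gap would not block each other.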
Case 3: Added Primary Key on dataname and Index on expires, tx_isolation=READ-COMMITTED.
Session 1: SET tx_isolation='READ-COMMITTED'; START TRANSACTION;
Session 2: SET tx_isolation='READ-COMMITTED'; START TRANSACTION;
Session 1: SELECT * FROM data_col WHERE expires < '2014-03-01' ORDER BY expires LIMIT 1 FOR UPDATE;
Session 2: INSERT INTO data_col VALUES ('med13', 1, '2014-06-13 00:00:00', 'server13'); /* success */
Session 1: SELECT * FROM data_col WHERE expires < '2014-07-01' ORDER BY expires LIMIT 1 FOR UPDATE; /* success */
Session 2: SELECT * FROM data_col WHERE expires < '2014-07-01' ORDER BY expires LIMIT 1 FOR UPDATE; /* hangs */
If there is an index on the “expires” column, which we use to filter the records, it shows different behavior with READ-COMMITTED. In Case 1, we were not able to acquire locks on the higher range (“expires < '2014-07-01'”), while here we can do that with the index on expires. So when Session 1 has already locked the rows, Session 2 can’t acquire the same locks and will wait for Session 1 to release them.
But if we remove LIMIT 1 from SELECT ... FOR UPDATE, then it behaves the same as Case 1.
----------------------- Just removed LIMIT 1 -----------------------
Session 1: SET tx_isolation='READ-COMMITTED'; START TRANSACTION;
Session 2: SET tx_isolation='READ-COMMITTED'; START TRANSACTION;
Session 1: SELECT * FROM data_col WHERE expires < '2014-03-01' ORDER BY expires FOR UPDATE;
Session 2: INSERT INTO data_col VALUES ('med14', 1, '2014-06-04 00:00:00', 'server14'); /* success */
Session 1: SELECT * FROM data_col WHERE expires < '2014-07-01' ORDER BY expires FOR UPDATE; /* hangs */
Session 2: SELECT * FROM data_col WHERE expires < '2014-07-01' ORDER BY expires FOR UPDATE; /* deadlocks */
Case 4: Added Primary Key on dataname and Index on expires, tx_isolation=REPEATABLE-READ.
Session 1: SET tx_isolation='REPEATABLE-READ'; START TRANSACTION;
Session 2: SET tx_isolation='REPEATABLE-READ'; START TRANSACTION;
Session 1: SELECT * FROM data_col WHERE expires < '2014-07-01' ORDER BY expires LIMIT 1 FOR UPDATE;
Session 2: INSERT INTO data_col VALUES ('med15', 1, '2014-06-10 00:00:00', 'server15'); /* success */
Here, unlike Case 2, we are able to INSERT the record from Session 2, as Session 1 is not using gap locks.
-------------------- This succeeded, but if I remove the LIMIT 1: --------------------
Session 1: SET tx_isolation='REPEATABLE-READ'; START TRANSACTION;
Session 2: SET tx_isolation='REPEATABLE-READ'; START TRANSACTION;
Session 1: SELECT * FROM data_col WHERE expires < '2014-07-01' ORDER BY expires FOR UPDATE;
Session 2: INSERT INTO data_col VALUES ('med11', 1, '2014-06-11 00:00:00', 'server11'); /* hangs */
But here, Session 1 uses gap locks, so Session 2 can’t insert the record until the lock is released. To learn more about deadlocks, I would suggest reading the blog post “How to deal with MySQL deadlocks” by my colleague Peiran Song.
There are plenty of blog posts describing InnoDB locks, but a few things are still missing from the manual. Here is a list of some of those bugs.
- http://bugs.mysql.com/bug.php?id=71638 – “Manual does not explain “insert intention” lock mode properly”
- http://bugs.mysql.com/bug.php?id=71736 – “Manual does not explain locks set by UPDATE properly”
- http://bugs.mysql.com/bug.php?id=71916 – “Manual does not explain locks set for UPDATE … WHERE PK=’const’ properly”
- http://bugs.mysql.com/bug.php?id=71735 – “Manual does not explain locks set by SELECT … FOR UPDATE properly”
Conclusion: We can see from the above test cases that to prevent deadlocks, sometimes we have to use indexes, sometimes a higher isolation level helps, even if that is counter-intuitive, and sometimes application or table schema changes can help.
If you are coming to the Percona Live MySQL Conference and Expo 2015, then visit me for this talk: “Understanding InnoDB locks and deadlocks” (16 April 3:00PM – 3:50PM @ Ballroom A).
The post InnoDB locks and deadlocks with or without index for different isolation level appeared first on MySQL Performance Blog.
After we released pquery to the community, and as we started logging bug reports with pquery testcases, it quickly became clear that pquery binaries with statically compiled-in client libraries would be of great convenience, both for ourselves and for the community.
(If you haven’t heard about pquery yet, read the pquery introduction blog post, come and join the pquery introduction lightning talk at Percona Live (15 April just around 6PM in Hall A), or keep an eye out for some of the upcoming episodes in the MySQL QA Series.)
While we were in the process of creating these binaries (which turned out to be not as straightforward as we thought it would be), we also discovered a rather significant memory de-allocation bug in pquery, which at times caused pquery to crash (segfault). I want to especially thank Sergei for helping get the client libraries compiled into pquery, as well as fixing the segfault memory de-allocation bug and some other problems we found, and Ramesh who helped with binary testing and debugging.
I also want to thank Sveta (previously at Oracle, now at Percona), Umesh (Oracle), and Satya Bodapati (Oracle) for not giving up too easily when we logged some bug reports with dynamically linked (i.e. client libs not included) and at times failing pquery binaries!
Now, the pquery binaries with statically included client libraries are finally ready! We have pquery-ps (with static Percona Server 5.6 client libs), pquery-ms (with static MySQL 5.6 client libs), and pquery-md (with static MariaDB 5.5 client libs). The pquery binaries can also be used to test any other community solution or product. For example, we use them to test our beloved Percona XtraDB Cluster (PXC) using specially developed pquery+Docker+PXC scripts.
Things have also continued to evolve quickly in the pquery framework, as well as in the accompanying reducer.sh ($ bzr branch lp:randgen – available as randgen/util/reducer/reducer.sh), so check out the many updates now! To get it, $ bzr branch lp:percona-qa and start by having a look at pquery-run.sh
You may also like to check out our latest pquery-reach.sh and pquery-reach++.sh – wrappers around most of the major pquery framework tools. They are not as straightforward to use and set up as pquery-run.sh (as they require setup within the sub-scripts they use…), but reviewing pquery-reach.sh will give you a good idea of how to set up pquery-run.sh and get into things.
Stay tuned for the upcoming MySQL QA episodes (link above), and you’ll soon be hunting bugs like Mr. Nuclear! (To meet (the nice) Mr. Nuclear, come and see our lightning talk at Percona Live!)
The post pquery binaries with statically included client libs now available! appeared first on MySQL Performance Blog.
In April and May, Percona will hold and participate in two OpenStack events: OpenStack Live and the OpenStack Summit. Join our talks at these events in Santa Clara and Vancouver for new insights into the MySQL operations of the core of OpenStack as well as the latest information on MySQL guest instances.
Next week (April 13-14), Percona will host OpenStack Live at the Santa Clara Convention Center. Conveniently located for those in the Bay Area, this two day, user-focused event will cover many core aspects of OpenStack including Nova, Swift, Neutron, and Trove as well as related technologies like Ceph and Docker. (OpenStack Live passes include access to keynotes and the expo hall at the Percona Live MySQL Conference.)
OpenStack Live will be a packed two days with 6 hands-on tutorials, 18 sessions, and keynotes and panel discussions featuring speakers from Facebook, EMC, ObjectRocket, Percona, Yahoo, VMware, Deep Information Sciences, and special guest, Steve Wozniak. Yes, the real Steve Wozniak.
Of course I’m looking forward to my session, “An introduction to Database as a Service with an emphasis on OpenStack using Trove,” with Amrith Kumar, Founder and CTO of Tesora, on Tuesday from 11:30am – 12:20pm in Room 209. Some other topics you shouldn’t miss include:
Tutorial: Deploying, Configuring and Operating OpenStack Trove
Monday, April 13: 9:30am – 12:30pm in Room 203
Amrith Kumar (Tesora) and Sriram Kalyanasundaram (Tesora) will lead a 3 hour hands-on tutorial on Trove DBaaS. It’s rare to get such a focused tutorial on this key component of OpenStack and led by Tesora, major contributors to the Trove project. If you are even considering implementing Trove for your cloud, this is a must-attend class.
Tutorial: How to Get Your Groove on with OpenStack Swift Object Storage
Monday, April 13: 1:30pm – 4:30pm in Room 204
John Dickinson (OpenStack Swift), Manzoor Brar (SwiftStack), and Sergei Glushenko (Percona) will lead the audience on a journey into Swift, OpenStack’s object store project. John, the Swift “Project Team Lead” (PTL), and Manzoor will show you how to deploy a Swift cluster, use it with real applications, and, with the assistance of Sergei from Percona, how to properly backup MySQL databases to a Swift cluster.
Sessions which I’d recommend attending are:
Session: MySQL and OpenStack deep dive
Tuesday, April 14: 11:30am – 12:20pm in Room 204
Peter Boros (Percona)
Session: Deploying an OpenStack Cloud at Scale at Time Warner Cable
Tuesday, April 14: 1:20pm – 2:10pm in Room 203
Matthew Fischer (Time Warner Cable), Clayton O’Neill (Time Warner Cable)
Session: Designing a highly resilient Network Infrastructure for OpenStack Clouds
Tuesday, April 14: 3:50pm – 4:40pm in Room 203
Pere Monclus (PLUMgrid)
This is only a small sample of the tutorials and talks which you will hear at OpenStack Live. If you’re in the Bay Area next week and are interested in OpenStack or just want to learn more about some of the core components, register today to join us in Santa Clara from April 13-14. It will be well worth the investment.
As a reminder, getting the most out of all tutorials at OpenStack Live requires that you bring your own laptop and are ready to dig into the technologies. Review the session descriptions for any materials that should be downloaded or installed prior to the event.
Talks at OpenStack Summit Vancouver
Next month, Percona will join thousands in Vancouver for the next OpenStack Summit. In addition to helping those developing and running OpenStack to operate, optimize, and achieve high availability of the MySQL core, Percona MySQL experts will speak in two sessions.
Deep Dive into MySQL replication with OpenStack Trove, and Kilo
George Lorch (Percona) and Amrith Kumar (Tesora) will present an in depth exploration of MySQL replication with Trove and Kilo capabilities. This is a great way for anyone interested in OpenStack DBaaS to find out the latest information.
Core Services MySQL Database Backup and Recovery to Swift
Most OpenStack services rely heavily on MySQL, but how do you mitigate data loss in the event of a failed upgrade or unplanned outage? Kenny Gryp (Percona) and Chris Nelson (SwiftStack), experts in MySQL and storage, will discuss this challenge and provide advice and strategies for a fast, complete recovery from highly available storage sources.
The team from Percona will again be found in the exhibit hall at booth T37. We’re looking forward to seeing old friends and meeting new ones that rely on Percona software, such as the Galera-based Percona XtraDB Cluster and Percona XtraBackup, in their OpenStack clouds.
The post More on OpenStack Live and our talks at OpenStack Summit Vancouver appeared first on MySQL Performance Blog.
Two years ago Ovais Tariq explained in detail what kinds of problems existed before MySQL introduced metadata locks in 5.5.3 and how these locks help to prevent them. Still, some implications of metadata locking in MySQL remain unclear for users – DBAs and even software developers who target recent MySQL versions. So I’ve decided to include a slide or two on it in the presentation about InnoDB locks and deadlocks I plan to give (with my colleague Nilnandan Joshi) on April 16 at Percona Live 2015.
I decided to do this because recently I got an issue to work on where it was claimed that the behavior of SELECT blocking TRUNCATE TABLE was wrong, just because the transaction isolation level was set to READ COMMITTED and thus there should be no locks set by SELECT, and a transaction should not even start, no matter what the value of autocommit is (it was explicitly set to 0 by smart software).
The MySQL manual clearly says:
“To ensure transaction serializability, the server must not permit one session to perform a data definition language (DDL) statement on a table that is used in an uncompleted explicitly or implicitly started transaction in another session. The server achieves this by acquiring metadata locks on tables used within a transaction and deferring release of those locks until the transaction ends. A metadata lock on a table prevents changes to the table’s structure. This locking approach has the implication that a table that is being used by a transaction within one session cannot be used in DDL statements by other sessions until the transaction ends.”
So, the real challenge was to show these metadata locks still set in a transaction that started implicitly, by SELECT immediately following SET autocommit=0 in a session. It was a good chance to check how metadata locks are exposed in MySQL 5.7 via Performance Schema, so I’ve set up a simple test.
First of all, I’ve enabled instrumentation for metadata locks:

[openxs@centos 5.7]$ bin/mysql --no-defaults -uroot -proot
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 4
Server version: 5.7.6-m16 MySQL Community Server (GPL)
Copyright (c) 2000, 2015, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> UPDATE performance_schema.setup_consumers SET ENABLED = 'YES' WHERE NAME = 'global_instrumentation';
Query OK, 0 rows affected (0.00 sec)
Rows matched: 1  Changed: 0  Warnings: 0
mysql> UPDATE performance_schema.setup_instruments SET ENABLED = 'YES' WHERE NAME = 'wait/lock/metadata/sql/mdl';
Query OK, 0 rows affected (0.00 sec)
Rows matched: 1  Changed: 0  Warnings: 0
Then I’ve set up a simple test based on the details from the issue (I created the InnoDB table t and added a row to it before this):

mysql> set session autocommit=0;
Query OK, 0 rows affected (0.00 sec)
mysql> select @@autocommit, @@tx_isolation;
+--------------+----------------+
| @@autocommit | @@tx_isolation |
+--------------+----------------+
|            0 | READ-COMMITTED |
+--------------+----------------+
1 row in set (0.00 sec)
mysql> select * from t limit 1;
+----+------+
| id | val  |
+----+------+
|  1 |    1 |
+----+------+
1 row in set (0.00 sec)
Now, from another session I tried to TRUNCATE the table before the first session got a chance to do an explicit or implicit COMMIT (in the issue I mentioned, the software just had not cared to do this, assuming a transaction had not started; it worked really well that way with MySQL 5.1):

mysql> truncate table t;
I was not surprised that TRUNCATE hung. The manual clearly says that until the transaction is committed we do not release metadata locks. But let’s check them in Performance Schema (from the first session, where we executed the SELECT):

mysql> select * from performance_schema.metadata_locks\G
*************************** 1. row ***************************
          OBJECT_TYPE: TABLE
        OBJECT_SCHEMA: test
          OBJECT_NAME: t
OBJECT_INSTANCE_BEGIN: 140450128308592
            LOCK_TYPE: SHARED_READ
        LOCK_DURATION: TRANSACTION
          LOCK_STATUS: GRANTED
               SOURCE: sql_parse.cc:5585
      OWNER_THREAD_ID: 27
       OWNER_EVENT_ID: 17
*************************** 2. row ***************************
          OBJECT_TYPE: GLOBAL
        OBJECT_SCHEMA: NULL
          OBJECT_NAME: NULL
OBJECT_INSTANCE_BEGIN: 140450195436144
            LOCK_TYPE: INTENTION_EXCLUSIVE
        LOCK_DURATION: STATEMENT
          LOCK_STATUS: GRANTED
               SOURCE: sql_base.cc:5224
      OWNER_THREAD_ID: 30
       OWNER_EVENT_ID: 8
*************************** 3. row ***************************
          OBJECT_TYPE: SCHEMA
        OBJECT_SCHEMA: test
          OBJECT_NAME: NULL
OBJECT_INSTANCE_BEGIN: 140450195434272
            LOCK_TYPE: INTENTION_EXCLUSIVE
        LOCK_DURATION: TRANSACTION
          LOCK_STATUS: GRANTED
               SOURCE: sql_base.cc:5209
      OWNER_THREAD_ID: 30
       OWNER_EVENT_ID: 8
*************************** 4. row ***************************
          OBJECT_TYPE: TABLE
        OBJECT_SCHEMA: test
          OBJECT_NAME: t
OBJECT_INSTANCE_BEGIN: 140450195434368
            LOCK_TYPE: EXCLUSIVE
        LOCK_DURATION: TRANSACTION
          LOCK_STATUS: PENDING
               SOURCE: sql_parse.cc:5585
      OWNER_THREAD_ID: 30
       OWNER_EVENT_ID: 8
*************************** 5. row ***************************
          OBJECT_TYPE: TABLE
        OBJECT_SCHEMA: performance_schema
          OBJECT_NAME: metadata_locks
OBJECT_INSTANCE_BEGIN: 140450128262384
            LOCK_TYPE: SHARED_READ
        LOCK_DURATION: TRANSACTION
          LOCK_STATUS: GRANTED
               SOURCE: sql_parse.cc:5585
      OWNER_THREAD_ID: 27
       OWNER_EVENT_ID: 18
5 rows in set (0.00 sec)
Note the SHARED_READ lock set on table t and the EXCLUSIVE lock pending on the same table above. TRUNCATE is blocked (as DDL).
Note also the locks related to our SELECT from the metadata_locks table in the output. Yes, access to Performance Schema is also protected with metadata locks!
We can get a nice view of all metadata locks from other sessions, excluding our current one, and also compare with all we could get about them before MySQL 5.7 (just a thread state in the SHOW PROCESSLIST output):
As soon as I complete the transaction where the SELECT was executed, TRUNCATE completes and we see no pending metadata locks:

mysql> commit;
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT OBJECT_TYPE, OBJECT_SCHEMA, OBJECT_NAME, LOCK_TYPE, LOCK_STATUS, THREAD_ID, PROCESSLIST_ID, PROCESSLIST_INFO FROM performance_schema.metadata_locks INNER JOIN performance_schema.threads ON THREAD_ID = OWNER_THREAD_ID WHERE PROCESSLIST_ID <> CONNECTION_ID();
Empty set (0.01 sec)
mysql> select * from t;
Empty set (0.00 sec)
To summarize, MySQL 5.7 allows you to study all metadata locks in detail. They are set for both transactional and non-transactional tables, but remember that if you use autocommit=0 or start a transaction explicitly, they are released only when the commit happens, implicit or explicit. If you want a single-statement SELECT not to block any DDL after it completes, make sure to COMMIT immediately or use autocommit=1.
We can surely call the behavior of metadata locks in this case a “bug” or file a “feature request” to change it, but for now any software that is supposed to work with MySQL 5.5.3+ should just take all the implications of metadata locks into account.
Every day, hundreds of millions of dollars are wasted by allowing improperly tuned or misconfigured systems, misunderstood infrastructure, and inefficient IT operations to live and thrive in data centers around the globe. There are both direct and indirect costs associated with allowing these unhealthy systems to continue to exist. Let’s look at some.
The setup:
Let us start with a small example: a small database setup with a single master and slave, and a database size of, let’s say, 500GB. Traffic is steady, and let’s say this translates into 500 IOPS on the master. You have chosen to host this on Amazon’s AWS. A common way of ensuring backups occur in AWS is to set up EBS snapshots of the slave. In terms of usage, let us assume your CPU is about 50% used and you have about 20GB of hot data that needs to stay in memory for the database.
If we look at what this would take to support in EC2 you are looking roughly at this:
- 2 c3.4xlarge servers (16 vCPU, 30GB of memory)
- Master-Slave Set
- with 1TB of Provisioned IOPS SSD, over 2 volumes
- with 500 IOPS on the master, 125 IOPS on the slave
- estimated 7TB of storage for snapshots
This calculator gives us an estimated cost of $3,144.28 per month, or roughly $38,000 a year in hosting fees. Note that you can choose other tiers of service, or reserved or spot instances, to get different pricing.

Regular, steady growth:
Now let's assume your database is growing along with its traffic at about 5% per month (these are rough numbers, I know). After a year your database server would be running out of steam: 86% CPU, 34GB of hot data (so relying more heavily on disk), and just about 850GB of storage space consumed. Moving up to the next tier of servers, and with additional IOPS, your spend jumps to around $4,771.32 per month ($57,000 per year).
When tuning and auditing an environment like the above, we have been able to give some customers up to 50% or more improvement in performance, and we often see a 20-25% reduction in space. Let's be conservative and say we can get a 25% boost in your performance, reduce your 5% monthly growth to 4%, and shrink your database by 10%. Based on that you can stave off upgrading your servers an additional 9 months, saving you almost $15,000 in that first year alone. Over 4 years this customer would end up saving an estimated $75,000 in total AWS spend just from smaller data and performance enhancements.
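A rough sketch of this capacity math (the 850GB upgrade threshold, starting sizes and growth rates are illustrative assumptions from the example above, not a pricing model):

```python
def months_until(start_gb, monthly_growth, capacity_gb):
    """Months of compounding growth until the database outgrows a threshold."""
    months, size = 0, start_gb
    while size < capacity_gb:
        size *= 1 + monthly_growth
        months += 1
    return months

# 500GB growing 5%/month, vs. tuned: 10% smaller data growing 4%/month,
# with 850GB as the (assumed) point where a server upgrade is forced.
untuned = months_until(500, 0.05, 850)   # 11 months
tuned = months_until(450, 0.04, 850)     # 17 months
print(untuned, tuned)
```

In this toy model the tuning buys six extra months before the upgrade; the larger savings quoted above also factor in the 25% performance boost, which this sketch ignores.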
In this case performance enhancements are not the only place to save costs. Moving from EBS snapshots to regular MySQL backups using Percona XtraBackup, keeping one copy on disk and sending those backups to S3, the cost of the environment drops to $2,043.87 per month (from $3,144.28). This means a simple switch of backup methodology can net you about $1,100 a month, or $13,200 a year, off your hosting bill.
Often we are not only reducing the resources needed, but through tuning we can also reduce the number of servers needed to run your application. We had a recent client who saw a 90% reduction in their read-heavy workload and was actually able to turn off servers that used to serve their application. Here is what their savings over the next couple of years would look like:
Here we helped cut this customer's direct costs by two thirds.

Handling spikes:
One thing to keep in mind is that this assumes linear growth in application and database usage, which would mean you can predict when you will need hardware. If your user base is growing steadily and feature additions are controlled, that is possible; in most environments, however, you will not see linear growth. You will probably see something like this:
Understanding this pattern and these spikes is vital to keeping your costs down. See that giant spike up to 2,500? The first reaction for many is to upgrade their hardware first and tune later. Inevitably, any tuning benefit is offset by the already sunk cost of a hardware upgrade that, after the tuning, they may not have needed. Getting in front of that spike and preventing it could have saved tens or even hundreds of thousands of dollars.
Spikes kill performance and cost real dollars, and those spikes may not be easy to find. A few years ago I was working with a Fortune 500 client who had one of these spikes. They had been running perfectly fine with steady but controllable growth for 7 or 8 months; then, in the 9th month, things went very wrong very quickly.
A critical component of their company was to certify professionals through a testing application. During their peak time of season these certifications had stopped completely – delaying certifications for thousands of employees and clients for 2 weeks. I was flown out to help control the bleeding and hopefully fix the problem.
The number of users using the application was the same, the number of page views on the web was steady, but the number of queries to the database skyrocketed. None of the queries had hit their threshold of 1 second to be flagged as problem queries. It turned out to be one query that took 250ms to execute that was causing this company to grind to a halt. That one query ended up being executed 25,000 times per page when certain conditions were met, and those conditions were not met until the 9th month after this application was re-released.
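The arithmetic behind that meltdown is worth spelling out (figures taken from the story above):

```python
# A 250ms query stays invisible under a 1-second slow-query threshold,
# but executed 25,000 times per page it dominates everything.
per_query_s = 0.250
calls_per_page = 25_000

print(per_query_s * calls_per_page)  # 6250.0 seconds of DB time per page view
print(0.050 * calls_per_page)        # 1250.0 seconds even at the indexed 50ms
```

Even after indexing cut the query to 50ms, one page view still demanded over twenty minutes of database time, which is why fixing the code, not just the index, was the real solution.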
This query lay like a trojan horse waiting to destroy this company’s ability to deliver to its customers.
Two lessons can be learned from this. First, even a seemingly well-tuned system may not be. Second, small things matter. In this case fixing the code was the correct solution; however, proper indexing of the tables dropped the query time from 250ms down to 50ms. This was enough relief to allow the certification process to start up again until the code could be fixed. Even a query with a seemingly small impact should still be optimized.
Another source of these performance spikes is a company's application release cycle. Applications are living entities in today's world: they grow, expand and change on a regular basis. In order to stay ahead of any problems you need to have a process and resources in place that can proactively monitor and tune. Every release of new code should go through a rigorous performance review to prevent trojan problems that may cause outages and extra costs down the road.

Indirect and hard to calculate costs:
All of this discussion so far has been around direct hosting costs. There is also a cost to your reputation and your ability to deliver services that meet a customer’s expectations. Customers who come to your site or are using your application can leave in droves due to poor performance. We have seen several customers who lost 50% or more of their user base due to performance problems with the application.
Lost revenue and profits are often much more difficult to quantify, and vary greatly from company to company. This cost, however, is very real. Silicon Valley is littered with the remnants of companies that did not plan to address scale or simply missed important problems in their IT infrastructure. Unfortunately I have worked directly with numerous companies that learned this lesson the hard way. These hidden costs can kill a customer quicker than any competitor or market shift.
One of the biggest hidden costs companies needlessly pay is the cost of downtime.

The cost of downtime:
I was reading a Gartner study that estimated the cost per minute of downtime at $5,600; other studies, like this one, have pegged the cost per minute of downtime at $7,900.
Any way you slice it, being down for even a minute costs you money. If we are conservative in our estimates, the cost of an hour of downtime can easily top $100,000. It's amazing the number of well-established companies that don't have a solid plan for dealing with downtime.
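The math is blunt (the per-minute rates are the industry estimates cited above, not your numbers):

```python
def outage_cost(minutes, cost_per_minute):
    """Direct cost of an outage at a flat per-minute rate."""
    return minutes * cost_per_minute

print(outage_cost(60, 5_600))  # one hour at the Gartner-style estimate
print(outage_cost(60, 7_900))  # one hour at the higher estimate
```

Both figures land well past the $100,000 mark for a single hour of downtime.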
Let's look at some common disaster recovery policies:

Restore from backup:
How quickly can your DBAs get alerted to an outage, then log in to look at it, and finally make the call whether or not to restore? I submit that most people are going to take a few minutes to get an alert (let's say 2). They will then take a few minutes to get to the computer and into the system (let's say 5 minutes). Then they will take at least 10 minutes to try and figure out what's going on. Fast-forward: 17 minutes (minimum) have gone by with nothing to show for it.
Restoring the backup itself could take a few minutes or several hours. Let's just say 40 minutes total. If we use that $7,900 number, you could have just lost $316,000. That's a huge amount that could have easily been avoided. Maybe you know that you're not losing $7,900 a minute; maybe it's only $1,000. That's still $40,000!

Manual failover to a slave:
The time for getting alerted, reacting and taking action does not change in this equation. The original 17 minutes (minimum) to react and start fixing just potentially cost you $134,300.

Automated failover:
Not all automated failovers are created equal. Some solutions can take several minutes or even hours to restore proper service (e.g., the warm-up time of passive cold slaves). Just because you think you are protected does not mean you are. Having the right automated solution can mean minimizing your downtime risk to $10K or less; having the wrong one can be worse than having none at all.
It's important to understand the cost of downtime and pick the proper solution to mitigate it.

Cost of being wrong is high:
These are just a few of the costs that companies can incur by having the improper database and infrastructure setup. Mitigating these costs requires a solid process, a high-level of expertise, and the right resources in place.
On a recent engagement I worked with a customer who makes extensive use of UUID() values for their Primary Key, storing them as char(36), and the row count on this example table has grown to over 1 billion rows.
The table is INSERT-only (no UPDATEs or DELETEs), and the bulk of their retrievals are PK lookups. Lookups by PK were performing acceptably, but they were concerned with the space used by the table as it approached 1TB (running with innodb_file_per_table=1 and Percona Server 5.5).
This schema model presents an increasing burden for backups since they use Percona XtraBackup, and so the question was asked: does their choice of an effectively random Primary Key based on UUID() impact their on-disk storage, and to what extent? And as a neat trick I show towards the end of this post how you can calculate the rate of fragmentation in your table on a regular basis if you’re so inclined. So read on!
For background, the more common approach for a Primary Key in InnoDB is an integer AUTO_INCREMENT value. One of the benefits of an AUTO_INCREMENT PK is that it allows InnoDB to add new entries at the end of the table and keeps the BTREE index from having to be split at any point. More on page splitting in a moment. Note also that this blog post isn't intended to promote one type of model over another; my goal is really to illustrate the impact your choice of PK will have on the data on disk.
A Primary Key serves multiple purposes in InnoDB:
- Ensures uniqueness between rows
- InnoDB saves row data on disk clustered by the Primary Key
- Depending on the type used and the INSERT / UPDATE pattern, provides for either an unfragmented or a severely fragmented Primary Key
I wanted to profile three different Primary Key types:
- integer AUTO_INCREMENT – this key will consume 4 bytes
- binary(16) using Ordered UUID() – as per Karthik’s UUID()-optimised blog post
- char(36) using UUID() – what my customer was using
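For readers unfamiliar with the ordered-UUID() trick: a version-1 UUID stores the low bits of its timestamp first, so consecutive values jump around the keyspace; rearranging the fields so the high timestamp bits lead makes successive values insert in ascending key order. A minimal Python sketch of the rearrangement (the field layout follows RFC 4122; this illustrates the idea and is not Karthik's exact ordered_uuid() function):

```python
import uuid

def ordered_uuid_hex(u):
    """Rearrange a v1 UUID as time_hi + time_mid + time_low + rest,
    so the hex string (or a binary(16) value) sorts in timestamp order."""
    h = u.hex  # 32 hex chars: time_low(8) time_mid(4) time_hi(4) rest(16)
    return h[12:16] + h[8:12] + h[0:8] + h[16:]

a, b = uuid.uuid1(), uuid.uuid1()
assert ordered_uuid_hex(a) < ordered_uuid_hex(b)  # later UUID now sorts later
```

In MySQL the same shuffle is typically done with SUBSTR()/CONCAT() on UUID() and stored via UNHEX() into a binary(16) column.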
I then used the powerful tool innodb_space’s function space-lsn-age-illustrate (from Jeremy Cole’s innodb_ruby project) to plot the LSN (InnoDB’s Log Sequence Number, an always-incrementing value) pages from each table that uses the different Primary Keys via ASCII colour (so hot, right? Thanks Jeremy!!). For reference, the legend indicates that the darker the colour, the “older” the page’s updated version is (the LSN), while as you move across the colour spectrum to pink you’re looking at the most recently written LSN values. What I’m trying to illustrate is that when you use AUTO_INCREMENT or UUID() that has been modified to insert in an ascending order, you get virtually no page splits, and thus consume the minimal amount of database pages. On the left side you’re looking at the page IDs for this table, and the lower the number of pages consumed, the more efficiently packed the table’s data is within those pages.
This is an example of INSERT-only based on a Primary Key of AUTO_INCREMENT. Notice how the darker colours are heavy at the earliest pages and lighter as we get to writing out the higher number pages. Further this table finishes writing somewhere around 700 pages consumed.
As we look at the optimised-UUID() INSERT pattern we see that it too has a very evenly distributed pattern with oldest pages at the start (lowest page IDs) and newest written pages at the end of the table. More pages are consumed however because the Primary Key is wider, so we end somewhere around 1,100 pages consumed.
Finally we arrive at the UUID() INSERT pattern, and as we expected, the fragmentation is extreme and has caused many page splits. This is the behaviour in InnoDB when a record needs to be written into an existing page (since it falls between two existing values) and InnoDB realises that writing this additional value would overcommit the capacity of the page, so it "splits" the page into two pages and writes them both out. The rash of pink in the image below shows us that UUID() causes significant fragmentation because it is causing pages to be split all throughout the table. This is deemed "expensive" since the ibd file is now more than 2x greater than with the UUID()-optimised method, and about 3x greater than with an AUTO_INCREMENT Primary Key.
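The effect is easy to reproduce with a toy model of leaf pages (grossly simplified assumptions: a fixed records-per-page capacity, 50/50 splits mid-table, and a sequential-insert optimisation that starts a fresh page at the right edge, roughly as InnoDB does):

```python
import random

PAGE_CAP = 100  # toy capacity: records per leaf page

def count_pages(keys):
    pages = [[]]
    for k in keys:
        # find the last page whose first key <= k (else the leftmost page)
        i = 0
        for j, p in enumerate(pages):
            if p and p[0] <= k:
                i = j
        pages[i].append(k)
        pages[i].sort()
        if len(pages[i]) > PAGE_CAP:
            if i == len(pages) - 1 and k == pages[i][-1]:
                pages.append([pages[i].pop()])  # append at the right edge
            else:
                mid = len(pages[i]) // 2        # 50/50 page split
                pages.insert(i + 1, pages[i][mid:])
                pages[i] = pages[i][:mid]
    return len(pages)

random.seed(42)
n = 10_000
sequential = count_pages(range(n))                    # AUTO_INCREMENT-like
scattered = count_pages(random.sample(range(n), n))   # UUID()-like
print(sequential, scattered)
```

In this toy model sequential keys pack pages full (100 pages for 10,000 rows) while random keys leave pages roughly 70% full after repeated splits, consuming around 45% more pages, and that is before accounting for the wider 36-byte key.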
Based on this investigation we determined that the true size of the 1 billion row table was about half the size reported by Linux when examining the .ibd file. We happened to have an opportunity to dump and load the table (mysqldump | mysql) and found that on restore the table consumed 450GB of disk, so our estimate was pretty good!
I also wanted to highlight that you can determine for yourself the statistics for data / page splits. As you can see below, the first two PK distributions are very tight, with pages packed up to 90%, while the UUID model leaves you at just slightly over 50%. You can run this against your prepared backups if you use Percona XtraBackup (version 2.1 or later) with the --stats option.

```
[root@mysql]# xtrabackup --stats --datadir=/data/backups/mysql --target-dir=/data/backups/mysql | grep -A5 test | grep -A5 PRIMARY
  table: test/t1, index: PRIMARY, space id: 13, root page: 3, zip size: 0
  estimated statistics in dictionary:
    key vals: 8, leaf pages: 99, size pages: 161
  real statistics:
    level 1 pages: pages=1, data=1287 bytes, data/pages=7%
    leaf pages: recs=60881, pages=99, data=1461144 bytes, data/pages=90%
--
  table: test/t2_uuid_ordered, index: PRIMARY, space id: 14, root page: 3, zip size: 0
  estimated statistics in dictionary:
    key vals: 8, leaf pages: 147, size pages: 161
  real statistics:
    level 1 pages: pages=1, data=3675 bytes, data/pages=22%
    leaf pages: recs=60882, pages=147, data=2191752 bytes, data/pages=91%
--
  table: test/t3_uuid, index: PRIMARY, space id: 15, root page: 3, zip size: 0
  estimated statistics in dictionary:
    key vals: 8, leaf pages: 399, size pages: 483
  real statistics:
    level 2 pages: pages=1, data=92 bytes, data/pages=0%
    level 1 pages: pages=2, data=18354 bytes, data/pages=56%
```
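The data/pages percentages in that output are simply data bytes divided by page capacity, so you can recompute (and monitor) the fill factor yourself. A sketch using the leaf-page figures printed above (assumes the default 16KB InnoDB page size):

```python
PAGE_SIZE = 16 * 1024  # default InnoDB page size in bytes

def fill_factor_pct(data_bytes, pages):
    """Percentage of leaf-page space actually holding row data."""
    return round(100 * data_bytes / (pages * PAGE_SIZE))

# leaf-page numbers from the --stats output above
print(fill_factor_pct(1461144, 99))   # t1, AUTO_INCREMENT -> 90
print(fill_factor_pct(2191752, 147))  # t2, ordered UUID   -> 91
```

A table whose leaf pages sit near 50% fill, as the plain UUID() table does, is a strong hint that page splits are fragmenting it.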
Below are the table definitions along with the scripts I used to generate the data for this post.

```
mysql> SHOW CREATE TABLE t1\G
*************************** 1. row ***************************
       Table: t1
Create Table: CREATE TABLE `t1` (
  `c1` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `c2` char(1) NOT NULL DEFAULT 'a',
  PRIMARY KEY (`c1`),
  KEY `c2` (`c2`)
) ENGINE=InnoDB AUTO_INCREMENT=363876 DEFAULT CHARSET=utf8
1 row in set (0.00 sec)

mysql> SHOW CREATE TABLE t2_uuid_ordered\G
*************************** 1. row ***************************
       Table: t2_uuid_ordered
Create Table: CREATE TABLE `t2_uuid_ordered` (
  `pk` binary(16) NOT NULL,
  `c2` char(1) NOT NULL DEFAULT 'a',
  PRIMARY KEY (`pk`),
  KEY `c2` (`c2`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.00 sec)

mysql> SHOW CREATE TABLE t3_uuid\G
*************************** 1. row ***************************
       Table: t3_uuid
Create Table: CREATE TABLE `t3_uuid` (
  `pk` char(36) NOT NULL,
  `c2` char(1) NOT NULL DEFAULT 'a',
  PRIMARY KEY (`pk`),
  KEY `c2` (`c2`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
```

```
[root@mysql]# cat make_rows.sh
#!/bin/bash
while [ 1 ] ; do
  mysql -D test -e "insert into t1 (c2) values ('d')"
  mysql -D test -e "insert into t2_uuid_ordered (pk, c2) values (ordered_uuid(uuid()), 'a')"
  mysql -D test -e "insert into t3_uuid (pk, c2) values (uuid(), 'a')"
done
```

```
[root@mysql]# cat space_lsn_age_illustrate.sh
#!/bin/bash
cd /var/lib/mysql
echo "T1 - AUTO_INCREMENT"
innodb_space -f test/t1.ibd space-lsn-age-illustrate
echo "T2 - UUID() ORDERED"
innodb_space -f test/t2_uuid_ordered.ibd space-lsn-age-illustrate
echo "T3 - UUID()"
innodb_space -f test/t3_uuid.ibd space-lsn-age-illustrate
```
I hope that this post helps you to better understand the impact of random vs ordered Primary Key selection! Please share with me your thoughts on this post in the comments, thanks for reading!
Note to those attentive readers seeking more information: I plan to write a follow-up post that deals with these same models but from a performance perspective. In this post I tried to be as specific as possible with regard to the disk consumption and fragmentation concerns. I feel it addressed the first part and alluded to this mysterious "fragmentation" beast, but only teased at what that could mean for query response time… Just sit tight, I'm hoping to get a tag-along to this one post-PLMCE!
By the way, come see me speak at the Percona Live MySQL Conference and Expo in Santa Clara, CA the week of April 13th – I'll be delivering 5 talks and moderating one Keynote Panel. I hope to see you there! If you are at PLMCE, attend one of my talks or stop me in the hallway and say "Hi Michael, I read your post, now where's my beer?" – and I'll buy you a cold one of your choice.
- Keynote Panel: The next disruptive technology: “What comes after the Cloud and Big Data?”
- Main Conference – What I learned while migrating MySQL on-premises to Amazon RDS
- MySQL 101 & Main Conference – Running MySQL on AWS
- MySQL 101 – Percona Server Advantage (with Christopher Jeffus)
- MySQL 101 – MySQL & InnoDB Fundamentals & Configuration (with Justin Swanhart)
The post Illustrating Primary Key models in InnoDB and their impact on disk usage appeared first on MySQL Performance Blog.
Years ago when I worked for the MySQL Support organization at the original MySQL AB, we spoke about MySQL Support as insurance and focused on a value proposition similar to that of car insurance. For your car to be fully covered, you must purchase car insurance before the incident happens – in fact most places around the world require automobile insurance. Similarly with many organizations, any production-use technology might be mandated to have its own insurance in the way of 24/7 support.
I think however this is a very one-sided view that does not capture the full value (and ROI) that a MySQL Support contract with Percona provides. Let’s look at the different dimensions of value it provides based on the different support cases we have received throughout the years.
Reduce and Prevent Downtime
If your database goes down, the time to recover will be significantly shorter with a support agreement than without it. Cost of downtime varies widely between organizations. Gartner estimates the average cost of downtime is $5,000 per minute.
With most of our clients, we have found that the cost of preventing or rapidly reducing even one significant downtime event a year more than pays for the cost of support. Even when the client’s in-house team is very experienced, our help is often invaluable as we are exposed to a great variety of incidents from hundreds of companies, so it is much more likely we have encountered the same incident before and have a solution ready. Helping to recover from downtime quickly is a reactive part of support – you can realize even more value by proactively working with support to get advice on your HA options as well as ensure that you’re following the best database backup and security practices.
Having a MySQL Support contract by itself is not enough to prevent all security incidents. MySQL will be only one of the components for a possible attack vector and it takes a lot of everyday work to stay secure. There is nothing that can guarantee complete security. MySQL Support, however, can be an invaluable resource for your security team to learn how to apply security and compliance practices to your MySQL environment and how to avoid typical mistakes.
The cost of data breaches can be phenomenal, and can also impact business reputations much more than downtime or performance issues. Depending on the company size and market, costs will vary; different studies estimate average costs ranging from $640K in direct costs to $3.5M. What everyone seems to agree upon is that security risks and security costs are on the rise, and you can't afford to leave these areas unchecked.
Fix Database Software Bugs
While you might have great DBAs on your team who are comfortable with best practices and downtime recovery, most likely you do not have a development team comfortable with fixing bugs in the database kernel or supporting tools. Being able to get software fixes contributes to downtime reduction as well as all kinds of other things, such as keeping development and operations teams efficient and avoiding complex workarounds.
A large number of the questions we get are performance-related; addressing them provides a better experience for users, saves costs, and minimizes environmental impact by using fewer resources.
Savings vary depending on your application scale and how much it is already optimized. In the best cases, our support team has helped customers make applications more than 10x more efficient. In most cases though, we are able to help make things at least 30% more efficient. If you’re spending $100K or more on your database environment, this benefit alone will make a support agreement well worth it.
This is an important one. Way too often customers do not even give their developers access to support, even though these developers are critical in realizing the full value of their application. Developers working with databases make many decisions about schema design, query writing, and the use of MySQL features such as stored procedures, triggers or foreign keys. Without a MySQL Support contract, developers often have to resort to "Google" to find an answer – and often end up with inapplicable, outdated or simply wrong information, combined with time-consuming trial and error.
With the help of the Percona Support team, developers can learn the proven practices that apply to their specific situation, save a lot of time and get a better application to market faster. Even with a single US-based developer intensively working with MySQL, a support agreement might be well worth the cost based on increased developer efficiency alone. Larger development teams simply cannot afford not to have support.
Your operations staff (DBAs, DevOps, Sysadmins) are in the same boat – if your database environment is significant, chances are you are always looking for ways to save time, make operations more efficient and reduce mistakes. Our Support team can provide you with specific actionable advice for the challenges you’re experiencing.
Chances are we have seen environments similar to yours and know which software, approaches and practices work well and which do not. This all of course contributes to downtime prevention and reduction, but it also helps with team efficiency. With the Percona Support team's help, you will be able to handle operations with a smaller team, or get things done with less experienced staff members.
Percona Support access helps developers not only be more productive, but also results in better application quality because best practices in application database interface design, schema, queries, etc. are followed. The Percona team has supported many applications for many years, so we often will think about problems before you might think about them, such as:
- “How will this design play with replication or sharding?”
- “Will it scale with large amounts of users or data?”
- “How flexible is such a design when the application will inevitably be evolving over years?”
While a better application is hard to quantify, it really is quite important.
Faster Time to Market
Yet another benefit that comes from developers having access to a MySQL Support team is faster time to market. For many agile applications, being able to launch new features faster is even more important than cost savings – this is how businesses can succeed against the competition. At Percona, we love helping businesses succeed.
As you see, there are a lot of ways Percona Support can contribute to the success of your business. Support is much more than “insurance” that you should consider purchasing for compliance reasons. If you’re using MySQL for your applications, Percona Support will provide a great return on investment, allowing you to minimize risks and costs while delivering the highest quality of application or service possible.
Recently Todd Farmer shared an interesting story about the mysql command line prompt in MySQL 5.7: how it was changed to provide more context and why the change was finally reverted. This made me think that, after using the command line client for MongoDB for a while, I would love to see a much more modern mysql shell prompt. Here are a few examples of what a modern command line client can do.

Add dynamic information to the prompt
If you use replication with MongoDB, you have probably noticed a nice feature of the prompt: it is replication aware. What I mean is that for a standalone instance, the prompt is simply:

```
>
```

When you configure this instance to be the primary of a replica set named RS, the prompt automatically becomes:

```
RS:PRIMARY>
```

and for secondaries, you will see:

```
RS:SECONDARY>
```
And of course if an election is triggered and roles are switched, the prompt is updated accordingly to reflect the change.
Such a feature may not be easily transposed to MySQL as the allowed replication topologies are more flexible: for instance with master-master replication or for chained replication, some servers may be both a master and a slave.
The MongoDB shell also lets you customize the prompt, for instance by defining a prompt function that shows the current database and the time. Your prompt will then be changed to something like:

```
test/11:37:51>
```
And the change can be persisted by defining the prompt variable in a ~/.mongorc.js file.
If I could do something similar with MySQL, I would for instance be able to know the replication lag of a slave, or the uptime of MySQL, and so on and so forth. That would be incredibly useful in some situations.

Better help system
Let’s assume I want to manually fail over to a secondary in a MongoDB replica set. I can simply instruct the master that it should no longer be the master. But what is the exact command?
I know that it is rs.xxx because all replica set functions have this format, so let's simply open the shell and type rs. and all the available options show up:

```
> rs.
rs.add(         rs.constructor         rs.propertyIsEnumerable(  rs.syncFrom(
rs.addArb(      rs.debug               rs.prototype              rs.toLocaleString(
rs.apply(       rs.freeze(             rs.reconfig(              rs.toString(
rs.bind(        rs.hasOwnProperty(     rs.remove(                rs.valueOf(
rs.call(        rs.help(               rs.slaveOk(               rs.conf(
rs.initiate(    rs.status(             rs.config(                rs.isMaster(
rs.stepDown(
```
Here again, I’d love to see something similar with MySQL. For instance each time I need to convert a timestamp to a human readable date, I have to look at the documentation to find the exact function name.
Actually I could have used the built-in help system to find the name of the function, but it is much more convenient to look at the documentation.

Conclusion
What’s your opinion on which new features a modern mysql shell should provide? If many people agree on a set of features, it could be worth filing a feature request to get a better mysql command line client in a future version of MySQL.
Percona XtraBackup enables MySQL backups without blocking user queries, making it ideal for companies with large data sets and mission-critical applications that cannot tolerate long periods of downtime. Offered free as an open source solution, Percona XtraBackup drives down backup costs while providing unique features for MySQL backups.
- Decrypting a backup with the wrong key would make the backup unusable and unrecoverable, so innobackupex no longer automatically deletes the *.qp and *.xbcrypt files after --decrypt and --decompress are used. Bug fixed #1413044.
- XtraDB Changed Page Tracking wasn’t working with innobackupex. Bug fixed #1436793.
- Fixed Percona XtraBackup assertion caused by dirty pages remaining in the buffer pool after the log was fully applied. Bug fixed #1368846.
- A backup will now not be prepared, and innobackupex will stop with an error, if the transaction log file is corrupted and was not applied to the intended LSN. Previously this was reported only as a warning. Bug fixed #1414221.
- A new status, log-applied, is introduced for backups prepared with --redo-only, to avoid making a backup unusable by preparing a full or incremental backup without --redo-only and then applying the next incremental on top of it. An incremental backup can now be applied only to a backup in the log-applied state, not to a full-prepared one as was previously possible. Bug fixed #1436790.
Release notes with all the bugfixes for Percona XtraBackup 2.2.10 are available in our online documentation. Bugs can be reported on the launchpad bug tracker. Percona XtraBackup is an open source, free MySQL hot backup software that performs non-blocking backups for InnoDB and XtraDB databases.
The post Percona XtraBackup 2.2.10 for MySQL hot backups is now available (free!) appeared first on MySQL Performance Blog.
Based on Percona Server 5.5.41-37.0 including all the bug fixes in it, Galera Replicator 2.11, and on Codership wsrep API 25.11, Percona XtraDB Cluster 5.5.41-25.11 is now the current 5.5 General Availability release. All of Percona‘s software is open-source and free, and all the details of the release can be found in the 5.5.41-25.11 milestone at Launchpad.
- XtraBackup SST wouldn't stop when MySQL was SIGKILLed. This would prevent MySQL from initiating a new transfer as port 4444 was already in use. Bug fixed #1380697.
- garbd was returning an incorrect return code, i.e. when garbd was already started, the return code was 0. Bugs fixed #1308103 and #1422863.
- wsrep_sst_xtrabackup-v2 script was causing innobackupex to print a false positive stack trace into the log. Bug fixed #1407599.
- MyISAM DDL (CREATE TABLE only) isn’t replicated anymore when wsrep_replicate_myisam is OFF. Note, for older nodes in the cluster, wsrep_replicate_myisam should work since the TOI decision (for MyISAM DDL) is done on origin node. Mixing of non-MyISAM and MyISAM tables in the same DDL statement is not recommended with wsrep_replicate_myisam OFF since if any table in list is MyISAM, the whole DDL statement is not put under TOI (total order isolation). This also doesn’t work if default_storage_engine variable is set to MyISAM (which is not recommended for Percona XtraDB Cluster) and a table is created without the ENGINE option. Bug fixed #1402338.
- Percona XtraDB Cluster now shows a warning if additional utilities, like pv, which may not affect the critical path of SST, are not installed. Bug fixed #1248688.
- wsrep_causal_reads variable was not honored when declared as global. Bug fixed #1361859.
- garbd would not work when cluster address was specified without the port. Bug fixed #1365193.
- garbd was running as root user on Debian. Bug fixed #1392388.
- Errors in garbd init script stop/start functions have been fixed. Bug fixed #1367956.
- If mysqld got killed during the SST it would leave an unclean data directory behind, causing Percona XtraDB Cluster to fail the next time the server was started because the data directory would be corrupted. This was fixed by resuming the startup in case wsrep-recover failed to recover due to a corrupted data directory. The old behavior is still available through the --exit-on-recover-fail command line parameter to mysqld_safe or exit-on-recover-fail under [mysqld_safe] in my.cnf. Bug fixed #1378578.
- gvwstate.dat file was removed on joiner when XtraBackup SST method was used. Bug fixed #1388059.
- xtrabackup-v2 SST did not clean the undo log directory. Bug fixed #1394836.
- stderr of SST/Innobackupex is logged to syslog with appropriate tags if sst-syslog is in [sst] or [mysqld_safe] has syslog in my.cnf. This can be overridden by setting the sst-syslog to -1 in [sst]. Bug fixed #1399134.
- clustercheck can now check whether the node is PRIMARY or not, so that synced nodes which drop out of the PRIMARY component do not take any writes/reads. Bug fixed #1403566.
- Race condition between donor and joiner in Xtrabackup SST Configuration has been fixed. This caused XtraBackup SST to fail when joiner took longer to spawn the second listener for SST. Bug fixed #1405668.
- SST will now fail early if the xtrabackup_checkpoints file is missing on the joiner side. Bug fixed #1405985.
- socat utility was not properly terminated after a timeout. Bug fixed #1409710.
- 10 seconds timeout in Xtrabackup SST Configuration script was not enough for the joiner to delete existing files before it started the socat receiver on systems with big datadir. Bug fixed #1413879.
- Conflict between enforce_storage_engine and wsrep_replicate_myisam for CREATE TABLE has been fixed. Bug fixed #1435482.
- SST processes are now spawned with fork/exec instead of posix_spawn to allow for better cleanup of child processes in event of non-graceful termination (SIGKILL or a crash etc.). Bug fixed #1382797.
- Variable length arrays in WSREP code were causing debug builds to fail. Bug fixed #1409042.
- Signal handling in mysqld has been fixed for SST processes. Bug fixed #1399175.
- Inserts to a table with autoincrement primary key could result in duplicate key error if another node joined or dropped from the cluster during the insert processing. Bug fixed #1366997.
Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!
Please also note that Percona XtraDB Cluster 5.6 series is the latest General Availability series and current GA release is 5.6.22-25.8.
The post Percona XtraDB Cluster 5.5.41-25.11 is now available appeared first on MySQL Performance Blog.