Posted on: 23 April 2010
I'm spending quite a lot of time at the moment working on performance optimisations for Analytics SEO, and learning much more about the subtle differences between the different MySQL storage engines.
We use a mixture of storage engines - as a general rule, small tables which are mostly read and only occassionally updated with small, quick updates use the MyISAM storage engine, while our larger tables which have more frequent and often long-running updates use InnoDB.
Lately, as the load has been increasing, I've been seeing more locks being held for longer. A bit of digging reveals some interesting concurrency issues being caused by the MyISAM tables. With a few pointers from Peter, I recreated the problem like this:
- start a long-running INSERT INTO innodb_table1 SELECT FROM innodb_table2 JOIN myisam_table1
- try a short, fast UPDATE myisam_table1
- try a short, fast SELECT FROM myisam_table1
It turns out that the long-running SELECT in query 1 puts a read lock on the MyISAM table, which blocks the update in query 2, which is fair enough. However, the SELECT in query 3 should still be able to read from the MyISAM table - concurrent SELECTs should be fine. But MySQL queues the SELECT lock behind the UPDATE in query 2, so both queries 2 and 3 get blocked.
The solution turns out to be pretty straightforward - change the storage engine for the MyISAM table to InnoDB. Thanks to InnoDB's MVCC architecture, the update in step 2 and the select in step 3 can both run even though the query in step 1 is updating the entire table.
Now I just need to schedule migrating the other MyISAM tables to InnoDB.