Hello. I have a new blog.

I've moved this blog to the following URL Kerkness.ca. Thank you for visiting, please update your bookmarks.

Friday, April 25, 2008

Getting a performance boost from MySQL

I have to admit I am no database guru but I do know how to organize my information into a decent set of tables. Every table has a primary key and every field is just large enough and the right type for the data it needs to hold. In general my databases always perform well (enough) and I have little issue.

Then while reading up on how to properly configure replication with MySQL I stumbled upon some information on Indexes and I realized that I wasn't using my database as best I could.

If you asked me a week ago what an 'Index' was in relation to a MySQL table I would have told you it was the 'primary key'. Which is only partially true. While the primary key is automatically used as an Index for a table it doesn't have to be the only Index for the table.

What is an Index

An Index is basically a method MySQL (and other databases) use to organize a table to make it easier to search. For example, because the primary key for any table is indexed (and unique) when you search the table for a record matching a primary key value, the database knows it only needs to find 1 matching record. If there was no primary key the database doesn't know how many possible matches it needs to find so it will have to look at every single record to try and find all possible matches.

Knowing this, if you create Indexes for your table on all fields which are frequently searched or are used in Join statements you'll help your database find records more easily.

An Example

I typically use a lot of joins when I query the database. Let's look at how one of these join queries perform without any Indexing ( apart from the default indexing of my primary keys ).

In my database there is a table called 'artist' which contains the names, biography and details on different artists. The artist table has a primary key field call artistid. In the same database there is another table called artwork which contains details on individual pieces of art. The artwork table has a primary key field called artid, but it also contains its own artistid field. The artist and artwork tables have a one-to-many relationship, meaning that a single artist can have many works of art.

Using this database design, you can load a complete listing of works of art by a single artist with the following query.

SELECT artwork.*, artist.* FROM artwork
LEFT JOIN artist ON artwork.artistid=artist.artistid WHERE artwork.artistid=46;
This query will work just fine. However if we want to gain some insight into how much information MySQL needs to process to perform this query we can add the EXPLAIN declaration to the front of the query.
EXPLAIN SELECT artwork.*, artist.* FROM artwork
LEFT JOIN artist ON artwork.artistid=artist.artistid WHERE artwork.artistid=46;
When we run this query at the command prompt MySQL gives the following response.
+----+-------------+---------+-------+---------------+----------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+-------+---------------+----------+---------+-------+------+-------------+
| 1 | SIMPLE | artwork | ALL | NULL | NULL | NULL | NULL | 3106 | Using where |
| 1 | SIMPLE | artist | const | artistid | artistid | 4 | const | 1 | |
+----+-------------+---------+-------+---------------+----------+---------+-------+------+-------------+
This tells us that when looking up information from the artist table the database was able to use the primary key index and looked at a total of 1 rows in the table in order to find the information it needed. That seems pretty efficient. However when looking in the artwork table the database had do a full table scan and looked at 3106 rows to find matching records.

Let's see what happens when we add an Index on artistid to the artwork table.
ALTER TABLE artwork ADD INDEX( artistid );
Now when we run the same EXPLAIN query we get the following details from MySQL
+----+-------------+---------+-------+---------------+----------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+-------+---------------+----------+---------+-------+------+-------------+
| 1 | SIMPLE | artwork | ref | artistid | artistid | 5 | const | 4 | Using where |
| 1 | SIMPLE | artist | const | artistid | artistid | 4 | const | 1 | |
+----+-------------+---------+-------+---------------+----------+---------+-------+------+-------------+
Now doesn't that look much more efficient. This time because the database is keeping an Index on the artistid field in both tables. It only needed to look at 4 rows from the artwork table to find the matching results.

I did some additional performance testing to see how much of a boost adding this one extra Index provided during peak times of database use. Prior to the indexing the query on average took 0.06 seconds, after the indexing it took 0.00. That's just on one tiny little query on a database that has only 3000 records. If you start adding extra indexes to all the tables in your database you should notice a nice boost in speed and performance.

2 comments:

  1. Great book on mysql is High Performance MySQL.

    Goes over tons of optimisations you'd never know about otherwise.

    ReplyDelete
  2. Thanks for the tip. I'll check it out.

    ReplyDelete

Thank you for the comments.