MySQL Query Profiling

When you begin designing, implementing and testing an application, you usually have a very clear picture of which queries are executed where and roughly how often. Following every rule in the book, you design your schema and spec your servers accordingly. However, as the application grows, becomes more dynamic, gains more users and new functionality, gets tweaked and so on, your initial clear overview may end up quite far from the truth. It is therefore a good idea to run a query profiler on your DBs every now and then.

It is really quite simple. The first step is to enable the general log, which you do by putting this in my.cnf:

general_log=1
log_output=TABLE

You can choose to log to both a table and a file, or to just a file for that matter. To log to both, set log_output to TABLE,FILE; statements are then logged to the mysql.general_log table as well as to whatever file you specify with the log= option (there is a sketch of this after the table definition below). If you omit the log_output variable in MySQL 5.0 the default is to log to a file; this has changed in versions from 5.1.6 onwards, where the table is the default. There is obviously a slight overhead in this, as each statement, along with some other information, is written out on every query. The structure of the general_log table looks like this:

CREATE TABLE `general_log` (
  `event_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `user_host` mediumtext,
  `thread_id` int(11) DEFAULT NULL,
  `server_id` int(11) DEFAULT NULL,
  `command_type` varchar(64) DEFAULT NULL,
  `argument` mediumtext
) ENGINE=CSV DEFAULT CHARSET=utf8 COMMENT='General log'
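
If you want both destinations at once, a my.cnf sketch could look like the following (general_log_file is the 5.1 variable name for the log file; adjust the path to your own setup):

general_log=1
log_output=TABLE,FILE
general_log_file=/var/log/mysql/general.log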
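
Since the log ends up in an ordinary table, you can also peek at it directly before dumping anything; a quick sanity check could be something like:

SELECT event_time, user_host, argument
FROM mysql.general_log
WHERE command_type = 'Query'
ORDER BY event_time DESC
LIMIT 10;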

To get MySQL logging without having to restart the server, you simply issue these commands:

SET GLOBAL log_output='TABLE';
SET GLOBAL general_log='ON';
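
To double-check that the settings took effect, you can inspect the server variables, for example:

SHOW GLOBAL VARIABLES LIKE 'general_log';
SHOW GLOBAL VARIABLES LIKE 'log_output';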

Setting the variables at runtime has the same effect as the my.cnf configuration (note that this is only possible in 5.1). Then let it run for an adequate amount of time; an hour or two during or around peak hours is usually enough to get a good, statistically significant data set. When you feel that you have enough queries and that they portray normal usage, dump the general_log table like this:

mkdir /dump
# mysqld itself writes the data file, so it needs write access to the directory
chown mysql:mysql /dump
mysqldump --skip-lock-tables --tab=/dump --fields-terminated-by=',' --fields-enclosed-by='"' mysql general_log

This will give you two files in the /dump directory: general_log.sql and general_log.txt. The latter is a CSV formatted file, which we will feed to a program called MyProfi, a zipped PHP script. You run the script on general_log.txt, like this for instance:

/usr/bin/php parser.php -csv -top 20 -type "select, update, delete, insert" /dump/general_log.txt

This will give you the top 20 most common queries executed on your DBs. You can of course change the -type argument to just "select" to get the 20 most common select queries, and so on. Very useful!
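
One last thing: once you are done profiling, remember to switch the log off again and empty the table, so you do not keep paying the logging overhead:

SET GLOBAL general_log='OFF';
TRUNCATE TABLE mysql.general_log;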

Mar 5th, 2008