Resyncing Slaves With Slaves

When dealing with replicated setups where two or more slaves share a master, a lot of people seem to overlook the obvious: you don’t need to take your master down to resync a slave. I was hoping I wouldn’t need to post about this, but I see people taking down their masters when they have perfectly healthy slaves far too often to let it slide.

You’ve got everything you need on the other slave(s). Provided it’s in good health, you’ve got all the data plus the master’s binlog file name and position. Run SHOW SLAVE STATUS\G on the healthy slave and take note of Relay_Master_Log_File and Exec_Master_Log_Pos; these correspond to what SHOW MASTER STATUS\G would show on the master itself, minus the replication lag, which is irrelevant here. Then take your data snapshot from the healthy slave at that point (so the coordinates match the data you copy), load it onto the slave you’re rebuilding, and use those values in the CHANGE MASTER TO statement (obviously setting MASTER_HOST to the real master, not the other slave).
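For illustration, here is roughly what that looks like; the host name, binlog file and position below are made-up values, so substitute whatever the healthy slave reports:

mysql> SHOW SLAVE STATUS\G
...
    Relay_Master_Log_File: mysql-bin.000123
      Exec_Master_Log_Pos: 456789
...

Then, on the slave being rebuilt, once the data has been copied over:

mysql> CHANGE MASTER TO
    ->   MASTER_HOST='master.example.com',
    ->   MASTER_USER='repl',
    ->   MASTER_PASSWORD='secret',
    ->   MASTER_LOG_FILE='mysql-bin.000123',
    ->   MASTER_LOG_POS=456789;
mysql> START SLAVE;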

Happy higher availability!

Jul 5th, 2008

MySQL Load Balancing With Mylbhelper

When your app outgrows its single DB server, the next logical step is attaching a slave to your current DB server to spread the read load. To do this efficiently, you will need a load balancer. If your app and its growth are somewhat normal, you will at this point already have at least two front-end servers, and chances are that these are load balanced as well. So when it’s time to load balance your DB servers, you’ll already have the means to do this at hand.

In the highly likely, yet unfortunate, event that you have a load balancer without MySQL capabilities, you can always set up a generic TCP cluster (all traffic on port 3306 simply gets forwarded to the backends). The downside of this is that there are no L7 checks, something you really want when load balancing your backend. The best you can do is L4 (is there anything accepting connections on this port?). Needless to say, there are a lot of problems that can impact your application without making MySQL stop listening on its port: table corruption, accidentally dropped tables, permission issues, max_connections being hit, privilege problems and so on.

A customer of ours was in this particular situation. They had a very decent hardware load balancer for their webservers with capacity to spare. So they ended up load balancing the mysql instances through the same device and using a piece of software I’ve written called mylbhelper.

In a nutshell, it runs as a daemon and periodically runs a custom query against the local DB server; if that query fails in any way, shape or form, it executes a custom script. The script that ships with mylbhelper blocks L4 access (i.e. firewalls port 3306) so that the load balancer stops sending traffic to the node. You can of course write your own scripts. Once mylbhelper has successfully executed the predefined query again (twice, to avoid flapping), another script runs; the shipped one simply removes the firewall rule put in place.
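The check query itself is entirely up to you. As a purely hypothetical example, something along these lines exercises privileges and actual table access, which a bare TCP connect test never will (the healthcheck.heartbeat table is made up; use whatever small table you trust to be present):

SELECT id FROM healthcheck.heartbeat LIMIT 1;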

Oh, and it’s written in C, so you’ll need libmysql in order to compile and run it. It runs on any POSIX compliant system and is released under the BSD license.

Jun 29th, 2008

Tool Tip: Mysqlsniff

mysqlsniff is a tool that I find very useful and use a lot, but it doesn’t seem to be as widely known as it deserves to be. I often see people ask “how can I see what queries are being run on my server?”, to which the answer almost always is “enable general logging or run SHOW PROCESSLIST”. That’s all fine and well in some cases, but general logging requires restarting the server (unless you’re on 5.1) and SHOW PROCESSLIST is just point-in-time. They can both help, but aren’t ideal in all situations. Sure, SHOW PROCESSLIST is good for spotting that long running query, so it’s obviously not altogether useless, but as a tool for getting an overview of query frequency and the like it’s rather limited.
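As an aside, on 5.1 the general log can at least be toggled at runtime, roughly like this (the file path is just an example):

mysql> SET GLOBAL general_log_file = '/var/log/mysql/general.log';
mysql> SET GLOBAL general_log = 'ON';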

With querysniffer you get a real-time view of the queries hitting your server. It’s a simple Perl script and is easy enough to get going with. On RedHat/CentOS, you’d go about it like this:

wget http://iank.org/querysniffer/mysqlsniff-0.10.pl.txt -O mysqlsniff.pl
yum install libpcap-devel
cpan -i Net::PcapUtils
cpan -i NetPacket::Ethernet
perl mysqlsniff.pl eth0

And now you should see any query sent to the server over eth0 on your terminal.

Jun 18th, 2008

Wet Dream Finally Coming Through?

I’m one of the people who don’t believe replication is the worst thing since bad hair day was invented. Flawed? Absolutely! But used in the right situation and catered for in setup and application, it can take you quite far!

At the peak of the “scalability hype”, where everyone pretends to be a mini-LiveJournal or Flickr, the vast number of companies of a more humble size out there is often forgotten. The type of companies that easily get away with a decent master and a slave or two hanging off of it, and most likely will for a very long time to come.

So, why is it flawed? My major beef with it is that it’s not self-healing. Sure, you can monitor, script and re-jig things to a certain extent. But this is why I was thrilled when I read the MySQL Forge suggestions for Google Summer of Code. One of the suggestions is to enable self-healing replication using components, or at least concepts, from maatkit and Google’s MMM.

Interesting! While I can’t see it becoming completely foolproof, I’m sure it would help in the majority of scenarios I’ve seen where replication has broken.

As the linked forum post says, it is a bit of a shame that the tools to make this happen have been conceived by people outside of MySQL, when it really should have been part of the server’s implementation a long time ago!

Let’s just hope someone talented with some spare time steps up to the challenge!

Jun 3rd, 2008

Paramy - Import Dumps in a Flash

Following an interesting and lengthy discussion one evening on whether multi-threaded data importing should be part of the client or the server, Domas went ahead, scratched his itch to write something and came up with paramy.

Basically it’s a multi-threaded client. Most servers these days have, or certainly should have, multiple disks, multiple CPU cores and reasonably fast storage. So using a single-threaded client to insert those hundreds of thousands or millions of records doesn’t make that much sense today; there’s quite a lot of time to be saved. I’ve run some tests on MySQL 5.1.24 and compared the results with those from the stock mysql client:

1 million rows, 15k SAS drives in RAID 10, 4 cores @ 2.33GHz, 6GB of RAM, 4 threads for paramy. The table is int, char(8), char(8), on InnoDB plugin 1.0.1 (given enough time, I’ll compare these to MySQL 5.0 and 4.0 as well at a later stage).
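For reference, the test table would have looked something like this (the column names and the primary key are my guesses; only the types are given above):

CREATE TABLE paramy_test (
  id int NOT NULL,
  a char(8),
  b char(8),
  PRIMARY KEY (id)
) ENGINE=InnoDB;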

Extended insert format:

paramy: 58786.74 inserts/s - 0m4.977s
stock: 25898.35 inserts/s - 0m9.588s
-------------------------------------
51% gain

Single row format:

paramy: 24329.54 inserts/s - 0m25.221s
stock: 16642.02 inserts/s - 0m52.753s
-------------------------------------
47% gain

So, effectively cutting the insert time in half. It can’t handle auto-incremented fields very well yet, but Domas’ working code (not publicly available yet as far as I know) handles these with 1.3 core speed.

Be careful not to overdo it though, as more threads don’t necessarily equal more throughput. Take CPU cores and disks into account when setting the thread count. I ran paramy with a few different numbers of threads; as stated above, these tests were run on a 4 core system:

n threads, single row format:
2 threads: 0m39.700s
4 threads: 0m25.221s
6 threads: 0m31.889s
8 threads: 0m32.761s

I tested paramy on an RHEL4 system and ran into an issue to begin with. After reporting it, we concluded it was related to the LOCK TABLES statements, so after a swish grep -v "LOCK TABLES" paramy.sql > paramy_nolock.sql, it was good to go.

All in all - this tool is bound to save me hours and hours of time in the future! Good work indeed! 

May 30th, 2008

InnoDB Plugin Compression With Benchmarks

Following the announcement of InnoDB Plugin 1.0.1, I was keen to find some time to give it a try (yes, I’ve been busy and this post has been sitting in edit for some time now). I first tried the shortcut and downloaded the precompiled plugin, but got linker errors and messages about the wrong API (turns out it’s built against 5.1.23 and I was trying it on 5.1.22). I also had an issue with the RHEL4 specific RPMs and the glibc specific plugin. Seeing where this was going, I figured it was quicker to recompile the lot from source, which turned out to be a painless process.

As I am quite intrigued by the compression facility, shortly after the compilation I proceeded to try creating a table:

mysql> CREATE TABLE compressed_4 (id int(11) primary key, txt char(8)) 
Engine=InnoDB ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=4;
Query OK, 0 rows affected, 5 warnings (0.00 sec)

mysql> show warnings;
+---------+------+----------------------------------------------------------------+
| Level   | Code | Message                                                        |
+---------+------+----------------------------------------------------------------+
| Warning | 1478 | InnoDB: KEY_BLOCK_SIZE requires innodb_file_per_table.         |
| Warning | 1478 | InnoDB: KEY_BLOCK_SIZE requires innodb_file_format > Antelope. |
| Warning | 1478 | InnoDB: ignoring KEY_BLOCK_SIZE=4.                             |
| Warning | 1478 | InnoDB: ROW_FORMAT=COMPRESSED requires innodb_file_per_table.  |
| Warning | 1478 | InnoDB: assuming ROW_FORMAT=COMPACT.                           |
+---------+------+----------------------------------------------------------------+
5 rows in set (0.00 sec)

Primary mistake: not reading the docs properly. So after sticking innodb_file_per_table=1 and innodb_file_format="Barracuda" into my.cnf and restarting MySQL, the creation worked fine. I created four tables, three with compression and one without:

CREATE TABLE compressed_4 (id int(11) PRIMARY KEY, txt char(8)) Engine=InnoDB ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=4;
CREATE TABLE compressed_8 (id int(11) PRIMARY KEY, txt char(8)) Engine=InnoDB ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8;
CREATE TABLE compressed_1 (id int(11) PRIMARY KEY, txt char(8)) Engine=InnoDB ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=1;
CREATE TABLE noncompressed_4 (id int(11) PRIMARY KEY, txt char(8)) Engine=InnoDB;

I then proceeded to insert 50000 rows and inspect the on-disk size:

# du /var/lib/mysql_i/testing/*.ibd
3076    /var/lib/mysql_i/testing/compressed_1.ibd
4100    /var/lib/mysql_i/testing/compressed_4.ibd
6156    /var/lib/mysql_i/testing/compressed_8.ibd
10256   /var/lib/mysql_i/testing/noncompressed_4.ibd

So far so good! Down to roughly 30% of the uncompressed size with a 1K page size (referred to as KEY_BLOCK_SIZE at table creation, since MySQL doesn’t allow storage engines to add their own syntax; another FIXME for our friends at MySQL perhaps, at least the option should be there in my opinion).

But what is the performance like? Both the insert and select benchmarks were run against a ramdisk, as I want to know the impact on the CPUs and the difference between the compression rates, not the performance of the disks:

1K insert time: 7.4322438240051 seconds.
4K insert time: 5.1487679481506 seconds.
8K insert time: 4.8629088401794 seconds.
NC insert time: 3.5483801364899 seconds. (No Compression)

Pretty much what I expected for the inserts.

Selects are another story altogether though:

1K select time: 1.8464620113373 seconds.
4K select time: 1.7520101070404 seconds.
8K select time: 1.6863808631897 seconds.
NC select time: 1.631724023819 seconds. (No Compression)

1K select time: 1.5653350353241 seconds.
4K select time: 1.8094820976257 seconds.
8K select time: 1.7305459976196 seconds.
NC select time: 1.8121519088745 seconds. (No Compression)

1K select time: 1.7385520935059 seconds.
4K select time: 1.8407368659973 seconds.
8K select time: 1.8273019790649 seconds.
NC select time: 1.7326860427856 seconds. (No Compression)

Rather random, from the looks of things. I’m not entirely sure if that’s down to poor benchmarking tools or if I’ve overlooked something about how the select queries are performed. Yes, the query cache was disabled and the server was rebooted between the runs. I was so perplexed that I ran the tests on two other machines, both with and without a ramdisk for storage, but the variation is there regardless. Does anyone have some deeper insight into how the bits and bytes of this work?

My only half-feasible theory at this point is that reading the data takes 2-2.5x the time it takes to decompress it. So in a scenario comparing 1K and 4K pages: while MySQL reads the additional 3K, it’s already well on the way to decompressing the 1K page (since it’s obviously quicker to read 1K than 4K), and at the end of the day they add up to pretty much the same amount of time, just spent in different places (CPU or storage).

So as an example - say a SELECT takes 1 second: For the 1K page, 0.3s would be spent reading it and 0.7s would be spent decompressing it. For the 4K page, 0.7s would be spent reading it and 0.3s would be spent decompressing it.

I suppose this could be tested with a really slow single core CPU and fast storage, or vice versa.

May 29th, 2008

The Cost of Table Locking and Index Flushing in MySQL

This post may not apply to the large majority of applications out there, but I think it’s interesting enough. When doing sequential bulk inserts into MyISAM tables, you may think you’re getting rather good performance, numbers-wise. On low end commodity hardware, with not too wide tables, you can easily get up into the thousands of inserts per second.

When you perform an INSERT on a MyISAM table, you first implicitly acquire a full table write lock, then perform the insert, followed by a flush of the index. The index flush is done as the lock is released, so if you insert 1000 rows, you acquire a lock and flush the index 1000 times. This is of course a bit of an overhead.

Luckily, if your application allows for it, mitigating the overhead is very simple and effective. If you acquire a table write lock at the beginning of the session, run your inserts and then release the lock, you get rid of quite a big chunk of code execution. I’ve benchmarked this, using a simple int,varchar table with an index on the int column, and the difference is quite noticeable:

Inserts per second, implicit locking: 4272 rows per second

Inserts per second, explicit locking: 5186 rows per second

So in this case, we gained just over 900 inserts per second, simply by prepending the torrent of inserts with LOCK TABLES a WRITE; and appending UNLOCK TABLES; at the end.

In contrast, using MySQL’s extended insert functionality, these numbers can be increased even more drastically. The same benchmark, but with 310 rows inserted with each INSERT statement, yields these figures:

Inserts per second, implicit locking: 138666 rows per second

Inserts per second, explicit locking: 159786 rows per second

So a combination of extended inserts and explicit table locking may give your insert-heavy application quite a boost! For obvious reasons, this does not really apply to InnoDB, which locks rows rather than whole tables.
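To make the pattern concrete, here is a minimal sketch; the table and the data are made up, and a MyISAM table is assumed:

LOCK TABLES log_entries WRITE;
INSERT INTO log_entries (id, msg) VALUES
    (1, 'first'),
    (2, 'second'),
    (3, 'third');
INSERT INTO log_entries (id, msg) VALUES
    (4, 'fourth'),
    (5, 'fifth');
UNLOCK TABLES;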

May 6th, 2008

MySQL's non-GPL Adventure

I’m a bit late on this piece of news, and it’s been covered by more or less everyone who owns a blog. The news, in short, is that MySQL has decided not to release their online backup tool/driver to the masses; it will only be available to paying Enterprise customers. A lot of people have written about it (Jeremy Cole, Vadim Tkachenko and Don MacAskill among them), and they all have valid points in both directions.

The reason why I’m rather late with this is that I’m not entirely sure what I think about it. On one side, I can see their point, since they are, after all, a revenue driven company; MySQL/Sun employees need food on their table as well. Selling additional services, and to a certain degree also additional software, is a model which has been around for a long time. That said, it’s not necessarily a good model by default. The Fedora/RedHat model is frequently referenced as a successful one, and I believe it would be ideal for MySQL as well: let the masses break it, fix it, and hand it over to customers willing to pay for the reassurance and increased stability.

On the other side, I’m really pro open source, and any diversion away from it worries me. The main problem with this approach is that the backup tool will not have been tested by the general masses, only by the QA team. While the QA team might do their job very well, they can’t come close to the quality of testing that hundreds of thousands of people can do together. I’m not particularly fussed about not being able to use the backup tool everywhere; I tend to roll my own, or use something a brilliant mind has GPLed. My biggest concern is the quality of the software and the potential of taking this further, extending the withholding of features into the more critical parts of MySQL (storage engines, further developed optimizers etc.). Let’s hope that day never comes!

A mid-way compromise would be to release one version of the tool under a license which allows closed forks, get the bugreports, fix it and release the finalised version to the enterprise customers. However, this is probably the least attractive option even though it’d give the highly skilled community a base to fork the software from and apply our own fixes.

All in all, considering what MySQL is still contributing to the community, I will continue to be a fan.

Apr 18th, 2008

Queries Per Second

It’s obviously very important to know how your application and DBs are performing during business as usual. Otherwise you might find yourself facing performance degradation with no clue what it’s down to. One important aspect of this is knowing how many queries per second your DBs are processing. If you know that they normally do 8000 queries per second and you suddenly see an average of 12000, you know where to start looking. But if you don’t know what your normal load is, 12000 might strike you as normal and you may initially overlook the cause of the slowdown.

I’ve written a small script which gathers the average number of queries per second over a given period of time. For the heck of it, I wrote it in Perl, and I make no claims of being extremely knowledgeable in that language, so you might be horrified at what you’re about to see:

#!/usr/bin/perl
use POSIX;
use integer;
use DBI;

$user = "root";
$pass = "";
@hosts = qw(
    10.0.0.201
    10.0.0.202
    10.0.0.203
    10.0.0.204
    10.0.0.205
);
###################################################################
# Fetch the Questions counter from each host and push it onto @queries
sub getQueries {
    foreach (@hosts) {
        $db  = "test";
        $cs  = "dbi:mysql:database=$db;host=$_";
        $dbh = DBI->connect($cs, $user, $pass) or die $DBI::errstr;
        # GLOBAL scope so we count the whole server, not just this connection
        $sth = $dbh->prepare("SHOW GLOBAL STATUS LIKE 'Questions'");
        $sth->execute();
        $sth->bind_columns(\$var, \$queries);
        $sth->fetch();
        chomp($queries);
        push(@queries, $queries);
        $sth->finish();
        $dbh->disconnect;
    }
}

$interval = 5;

# Sample the counters twice, $interval seconds apart
getQueries();
@queries2 = @queries;
@queries  = ();
sleep $interval;
getQueries();

# Print the per-host averages and the grand total
$i   = 0;
$tot = 0;
print "==========================================\n";
while ($i != $#queries + 1) {
    $res = ($queries[$i] - $queries2[$i]) / $interval;
    $tot = $tot + ceil($res);
    print $hosts[$i] . " = " . $res . "\n";
    $i++;
}
print "Total: " . $tot;
print "\n==========================================\n";

You might want to change the user, pass and IP addresses at the top of the script. The output looks like this:

# perl qps.pl
==========================================
10.0.0.201 = 1752
10.0.0.202 = 1908
10.0.0.203 = 1719
10.0.0.204 = 2031
10.0.0.205 = 1809
Total: 9219
==========================================

I might post a bit later on other tools you can use to get to know your setup a bit better.

Apr 1st, 2008

MySQL Query Cache vs. Memcached? Ridiculous!

Reading around on mailing lists, blog comments, forums and so on about methods of effective caching, I’ve seen people dismiss any form of external caching, such as memcached, with the argument that MySQL has got a query cache anyway.

This is what I have to say about that:

mysql> show status like "QCache_hits";
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| Qcache_hits   | 0     |
+---------------+-------+
1 row in set (0.01 sec)

mysql> select txt from a where id=2;
+------+
| txt  |
+------+
| data |
+------+
1 row in set (0.00 sec)
mysql> select txt from a where id=2;
+------+
| txt  |
+------+
| data |
+------+
1 row in set (0.00 sec)
mysql> show status like "QCache_hits";
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| Qcache_hits   | 1     |
+---------------+-------+
1 row in set (0.01 sec)
mysql> insert into a values (5,"other unrelated data");
Query OK, 1 row affected (0.00 sec)
mysql> select txt from a where id=2;
+------+
| txt  |
+------+
| data |
+------+
1 row in set (0.00 sec)
mysql> show status like "QCache_hits";
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| Qcache_hits   | 1     |
+---------------+-------+
1 row in set (0.00 sec)

In plain English: they’re not really comparable; they don’t even do the same thing. The MySQL query cache is invalidated as soon as the table is modified in any shape or form, be it by an UPDATE, INSERT or DELETE. This is not a problem with memcached, where one operation doesn’t implicitly change the state of an already cached object. Sure, the MySQL query cache is a really good feature to have, despite its limitations and quirks. However, if you’re working on a typical OLTP site and find yourself at a point where you are considering (or are more or less forced into) implementing caching, you probably have frequent enough table modifications that the query cache won’t give you a lot of benefit.

Also, if you run MySQL in a multi-server setup, there is no distribution of the query cache. So if your application/load balancer decides to send the same query to another instance the next time around, you lose out there as well. This is not true with multiple instances of memcached, thanks to its brilliantly simple means of clustering.

There are obviously more arguments for this, but I believe this is enough to get the point across.

Mar 19th, 2008