Recently, "virtual" hard disks have started appearing more and more often in cloud and hosting services. The hoster's technical support will swear that the "virtual" disk is as fast as a dozen RAID 10 arrays (a RAID 100 ;-)) and sustains hundreds or even thousands of IOPS - yet the customers' MySQL noticeably slows down. How do you prove this to the hoster?
The problem is that measuring the "speed" of a virtual hard disk from inside a virtual machine is not easy, since it is unclear what exactly to measure in the first place, with what, and why. Yet it has to be done, to convince the administrators of the virtual setup that the cause is not the application or the MySQL configuration - and that they should, as the saying goes, "wash their hands" and sit down with the manual to their storage system.
In this article I will illustrate a simple method for finding the "tipping point" of a virtual hard disk using tools available in any distribution - sysbench and iostat. We will also measure the "tipping point" of Amazon's EBS virtual disks, well known for their latency - both regular EBS and Provisioned IOPS EBS (1000 and 2000 IOPS).
Theory
To hell with it! Too many dimensions and too many words - sequential read/write, random read/write, rewrite, the influence of the kernel's file cache, queue tuning, file system options and architecture... Let's keep it simple - we will emulate a multithreaded load on the virtual disk similar to the one a MySQL server creates, and watch how the disk, or whatever is hiding behind it in the network, breathes.
And so it does not get boring, let's add a little gnu.org-style romance :-)
How MySQL Loads the Disk
For InnoDB in a typical web application, if the data set does not fit into RAM (which is pretty much why databases were invented ;-)), data will mostly be read from disk in random order (buffer pool pages and clustered indexes stored on disk) and written sequentially (the transaction logs and the binary log).
Periodically the buffer pool is flushed from RAM to disk - a random write. Let's rule out right away the absence of a battery-backed write cache in the virtual storage - it has to be there, otherwise MySQL in ACID mode (innodb-flush-log-at-trx-commit = 1) will simply die of grief.
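You can quickly check which flush mode the server is actually running in - a hypothetical one-liner, assuming a local server with credentials already configured:
mysql -e "SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit'"   # assumes a local server and ~/.my.cnf credentials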
It is important to understand here that MySQL executes client requests in parallel threads - so the disk will, accordingly, be loaded by many threads at once.
The Load
Let's start with a regular Amazon EBS disk. We will load the virtual disk with sysbench (on CentOS it is available as a package and is also easy to build from source):
yum install sysbench
mkdir -p /mount/disk_ebs/mysql/test_folder
cd /mount/disk_ebs/mysql/test_folder
sysbench --test=fileio --file-total-size=16G prepare
An important point - make the total size of the test files (16G) at least 2 times larger than the virtual machine's RAM (do not ask why exactly 2 times ;-), the more the better). This is to reduce the influence of the operating system's file cache - it is better to regenerate the test files before each run (or to create several sets of test files and switch between them).
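For example, something along these lines (a sketch reusing the folder from above; cleanup and prepare are standard sysbench fileio modes):
free -m                                                # check the VM's RAM to size the test files
sysbench --test=fileio --file-total-size=16G cleanup   # drop the old test files...
sysbench --test=fileio --file-total-size=16G prepare   # ...and regenerate them before the next run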
Now let's create a load of N threads, emulating roughly N database clients running queries (yes, I know the database also has internal service threads, but let's not complicate things). Say we expect the database to serve 10 Apache workers at once:
sysbench --num-threads=10 --test=fileio --file-test-mode=rndrw --max-time=180 --file-rw-ratio='2' --file-total-size=16G --max-requests=1000000 run
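To find the tipping point you can simply repeat this run with different thread counts while keeping iostat open in a second console; a rough sketch:
for n in 1 5 10 20 50; do
    echo "=== $n threads ==="
    sysbench --num-threads=$n --test=fileio --file-test-mode=rndrw --max-time=180 \
             --file-rw-ratio='2' --file-total-size=16G --max-requests=1000000 run
done
# ...and in another terminal: iostat -xm 10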
What to measure?
Now the fun part. We are not interested in the numbers sysbench itself reports - it only creates the load. We only look at how the virtual disk feels under that load:
iostat -xm 10

Device:   rrqm/s  wrqm/s     r/s    w/s   rMB/s   wMB/s avgrq-sz avgqu-sz   await  svctm  %util
xvdm        0.00    0.00  120.50   0.00    2.05    0.00    34.79    10.50   87.47   8.30 100.00
The disk is "swamped" with requests most of the time, as seen from "%util = 100" - the disk driver accumulates requests in a queue and "feeds" them to the device as it becomes ready (some confuse this figure with saturation of the bus bandwidth to the disk, which it of course is not). In short: if the driver has to wait most of the time, the disk is loaded to capacity.
The average service time of one request, "svctm", is 8.3 ms. That is a lot, but acceptable for an Amazon disk. There is nothing criminal here - just ordinary physics.
The average time a request waits to be serviced, "await", is 87.47 ms, and the average queue length in the disk driver, "avgqu-sz", is 10.5. That is a lot - about 100 ms of waiting to serve a single request! As you would expect, it works out to roughly the queue length multiplied by the per-request service time: avgqu-sz * svctm = 10.5 * 8.3 ms ≈ 87 ms.
Thus we see that a mere 10 concurrent random read/write requests (the --file-test-mode=rndrw and --file-rw-ratio='2' options) are enough to noticeably slow the virtual hard disk down.
So much so that we have to wait about 100 ms for a single request. And if a web page generates 200 requests to the disk - how long will it take to build? 20 seconds?
It is interesting at what number of threads the Amazon disk starts accumulating requests and servicing each one no faster than, say, 50 ms (anything under 20 ms feels fine - subjectively). We see this already at 5 threads. Of course, that is a weak result; there is no getting by without a software RAID...
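Presumably this is the same command as above, only with --num-threads=5:
sysbench --num-threads=5 --test=fileio --file-test-mode=rndrw --max-time=180 \
         --file-rw-ratio='2' --file-total-size=16G --max-requests=1000000 run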
Device:   rrqm/s  wrqm/s     r/s    w/s   rMB/s   wMB/s avgrq-sz avgqu-sz   await  svctm  %util
xvdm        0.00    0.00  127.50   0.00    2.05    0.00    32.88     5.10   39.78   7.84 100.00
We see that the queue length is 5.1 and a single request takes 39.78 ms.
Testing Amazon's "fast" virtual disks
Relatively recently Amazon announced "fast" disks with a guaranteed number of IOPS (the theory of IOPS is vast, like our country - google it: en.wikipedia.org/wiki/IOPS). We know that ordinary mortal SATA disks do not sustain more than about 100 IOPS (mixed reads and writes), and, sadly, even 15k SAS disks manage no more than about 200 IOPS. We also know that SSDs and various SAN technologies can handle hundreds and even thousands of IOPS - and, clearly, cost much more.
Now let's see at what number of simultaneous threads Amazon's "fast" virtual disks start to give up and pile requests up in the queue. Break a leg!
EBS disk with 1000 IOPS
One thread:
Device:   rrqm/s  wrqm/s     r/s    w/s   rMB/s   wMB/s avgrq-sz avgqu-sz   await  svctm  %util
xvdk        0.00    0.00 1084.50   0.00   27.33    0.00    51.61     3.72    3.43   0.91  99.20
Note the tiny per-request service time on this virtual disk - 0.91 ms. Apparently there are SSDs behind it ;-) The queue length is ~4, and the average time per request is 3.43 ms.
20 threads:
Device:   rrqm/s  wrqm/s     r/s    w/s   rMB/s   wMB/s avgrq-sz avgqu-sz   await  svctm  %util
xvdk        0.00    0.00 1059.50   0.00   26.37    0.00    50.98    55.39   51.97   0.94 100.00
We see that with 20 request threads we have to wait ~50 ms per request, because a queue of about 55 requests builds up.
EBS disk with 2000 IOPS
20 threads:
Device:   rrqm/s  wrqm/s     r/s    w/s   rMB/s   wMB/s avgrq-sz avgqu-sz   await  svctm  %util
xvdl        0.00    0.00 1542.50   0.00   36.29    0.00    48.18    33.20   21.29   0.65 100.00
50 threads:
Device:   rrqm/s  wrqm/s     r/s    w/s   rMB/s   wMB/s avgrq-sz avgqu-sz   await  svctm  %util
xvdl        0.00    0.00 1498.50   0.00   36.63    0.00    50.06    86.17   57.05   0.67 100.00
Measurement results
We see that the EBS disk with 2000 IOPS shows roughly the same latency (~50 ms) at 50 threads as the 1000 IOPS disk does at 20 threads and a regular EBS disk does at 6-7 threads (apparently a regular EBS disk delivers somewhere within 200-300 IOPS).
What else can happen
Virtual hard disks often come with surprises. Sometimes they are simply left half-configured because someone did not finish reading the man page...
Recently I ran into a similar case: under a multi-threaded load test against an empty MySQL server, the svctm value of a "big-expensive-fast-network" virtual disk jumped between 0.5 and 1 ms at night and between roughly 10 and 100 ms during the day (I wanted to try it under a full moon too, but that was too long to wait). MySQL, of course, crawled. The cause turned out to be other projects using the same network storage in parallel, unaware of each other - not the MySQL configuration, which everyone was trying to blame ;-)
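A minimal sketch of how one could catch such a day/night pattern - simply log extended disk statistics around the clock (the log path and intervals are my own assumptions):
while true; do
    echo "=== $(date) ===" >> /tmp/disk_latency.log
    # the second report of "iostat -xm 10 2" covers the last 10 seconds;
    # the first one is the average since boot
    iostat -xm 10 2 >> /tmp/disk_latency.log
    sleep 50
done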
Summary
Using tools that are always at hand, we obtained a way to determine, for a multi-threaded load, the degree of concurrency at which a virtual disk starts queuing requests and takes 50 ms or more to serve fairly typical MySQL-like requests. Now you can estimate how many disks to combine into a RAID to keep latency at, say, 10-20 ms for a given number of clients. Clearly, these are approximate figures, but they will certainly help you move forward - especially if you measure the performance of your disk/RAID and come to your cloud hosting provider with these comparative numbers and a bottle of champagne ;-)
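As a very rough back-of-envelope illustration (my own assumption, essentially Little's law, not a formula from this article, and ignoring RAID write penalties and caching): await ≈ concurrency / number_of_disks * svctm, so the number of disks needed is roughly concurrency * svctm / target_await.
awk 'BEGIN {
  concurrency = 10;   # parallel requests, e.g. 10 Apache workers
  svctm       = 8.3;  # per-request service time of one disk, ms (measured above)
  target      = 20;   # desired await, ms
  printf "approx. disks needed in the RAID: %.1f (round up)\n", concurrency * svctm / target;
}'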