Discussion:
MySQL-over-DRBD Performance
Art Age Software
17 years ago
Permalink
I'm testing a configuration of MySQL 5 running on a 2-node DRBD
cluster. It is all configured and seems to be running fine. However,
upon running the MySQL sql-bench tests I am seeing some surprising
(and alarming) results. I would be interested to hear from anybody who
has configured a similar setup about what sort of performance you
are seeing and what you have done to maximize performance of MySQL
over DRBD.

In my case, I am replacing a 3-year old pair of Dell 2850's (4GB,
Dual-Proc/Single-Core) with a pair of new Dell 2950's (8GB,
Dual-Proc/Quad-Core). Clearly, I expect to see an overall performance
boost from the new servers. And for most operations I am seeing better
performance. However for writes, I am seeing **worse** performance.
**But only when the database is located on the DRBD device.**

Here is some sample result data from the sql-bench insert test:

Old Servers:
Database on Local Storage
insert test: 212.00 sec.
Database on DRBD Device
insert test: 998.00 sec.
----------------------------
DRBD Overhead: 786 sec. = 370%

New Servers:
Database on Local Storage
insert test: 164.00 sec. (22% better than old servers)
Database on DRBD Device
insert test: 1137.00 sec. (14% *worse* than old servers)
----------------------------
DRBD Overhead: 973 sec. = 590%

As you can see, the new servers performed better when writing locally,
but performed worse when writing to the DRBD device. I have tested the
local write performance on both primary and secondary, and all
hardware and software config is identical for both nodes. So I believe
this rules out local I/O subsystem as the culprit and points to either
DRBD or something in the TCP networking stack as the problem.

The dedicated GigE link connecting the DRBD peers seems to be
operating well. I performed a full resync with the syncer rate set at
100 MB/s, and DRBD reported throughput very close to 100 MB/s during the
entire sync process.
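(For reference, that rate was set via the syncer section of drbd.conf; a
minimal snippet looks roughly like this, with the resource name being just a
placeholder:)

resource r0 {
  syncer {
    rate 100M;
  }
}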

Here are the main differences (that I know about) between the old and
new configs (other than hardware):

1. Old config runs DRBD version 0.7.13
New config runs DRBD version 8.0.6

2. Old config runs a single GigE cross-connect between servers for
DRBD traffic.
New config runs a bonded dual-GigE cross-connect between servers for
DRBD traffic.

So, a couple of questions:

1) I have read that DRBD should impose no more than about a 30%
performance penalty on I/O when a dedicated gigabit ethernet
connection is used for the DRBD link. I'm seeing inserts take **7
times longer** to the DRBD device as compared to the local disk. Can
that be right?

2) Why is performance of the new configuration worse than the old
configuration when DRBD is involved (when it is clearly better when
DRBD is *not* involved)? Is DRBD 8.x generally slower than DRBD 0.7.x?
Or might there be something else going on?

In general, I'm a bit at a loss here and would appreciate any input
that might shed some light.

Thanks,

Sam
Lars Ellenberg
17 years ago
Permalink
...
you can try and pin drbd threads (using taskset) as well as the interrupt
handler for your NIC (/proc/interrupts, /proc/irq/<nr>/smp_affinity)
to one dedicated cpu, and see if that changes things.

it used to help in similar weird cases.
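
a sketch of what that looks like (interface name and IRQ number here are
just placeholders, look them up on your box):

grep eth1 /proc/interrupts             # find the IRQ of the replication NIC
echo 1 > /proc/irq/90/smp_affinity     # bind that IRQ to CPU0 (mask 0x01)
taskset -p 0x01 $(pidof drbd0_worker)  # pin a drbd kernel thread to the same CPU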

you can also verify whether the network
bonding introduces additional latency.
...
--
: commercial DRBD/HA support and consulting: sales at linbit.com :
: Lars Ellenberg Tel +43-1-8178292-0 :
: LINBIT Information Technologies GmbH Fax +43-1-8178292-82 :
: Vivenotgasse 48, A-1120 Vienna/Europe http://www.linbit.com :
__
please use the "List-Reply" function of your email client.
Oliver Hookins
17 years ago
Permalink
...
I've also set up a very similar cluster, and we have seen some performance
degradation due to DRBD, but I haven't measured it numerically, and it hasn't
impacted the cluster to the extent that we feel it warrants investigation.
That being said, we know that write performance isn't what we want it to be
but that has largely been attributed to running with 'sync_binlog' turned on
in MySQL. We are planning to mitigate this with a battery backed RAID card
so we can enable write-back caching without fear of data loss.
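
For reference, the my.cnf settings involved look roughly like this (the
values shown are just the fully durable settings, not a tuning
recommendation):

[mysqld]
# fsync the binary log after every transaction
sync_binlog = 1
# fsync the InnoDB log at every commit
innodb_flush_log_at_trx_commit = 1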

In some tests we've performed with write queries we've seen quite significant
performance gains, in the hundreds of percent faster with the write-back
caching.
--
Regards,
Oliver Hookins
Anchor Systems
Art Age Software
17 years ago
Permalink
...
Thanks for your reply. My storage subsystem on both nodes is a PERC
RAID-10 with battery-backed write-back cache enabled. The database is
all InnoDB and sync_binlog is enabled. I was under the impression that
sync_binlog should be enabled regardless of the presence of
battery-backed cache. Is this not the case?

It seems that there is a lot of uncertainty out there regarding how to
best configure MySQL on DRBD. Some sort of "best practices" document
with configuration settings for DRBD and MySQL would be a huge help in
this regard. I realize there are a lot of variables to take into
consideration. But having some solid tested examples to start from
would be quite helpful. Anyone up to the task? :)

Sam

Florian Haas
17 years ago
Permalink
Post by Art Age Software
It seems that there is a lot of uncertainty out there regarding how to
best configure MySQL on DRBD. Some sort of "best practices" document
with configuration settings for DRBD and MySQL would be a huge help in
this regard. I realize there are a lot of variables to take into
consideration. But having some solid tested examples to start from
would be quite helpful. Anyone up to the task? :)
That would be me, I suppose. :-)

About the performance issues you mentioned: the performance tests I've
conducted were run using mysqlslap, and backed up with some lower-level
latency tests, but the results were initially similar. Going back to your old-server numbers:
Post by Art Age Software
Database on Local Storage
insert test: 212.00 sec.
Database on DRBD Device
insert test: 998.00 sec.
----------------------------
DRBD Overhead: 786 sec. = 370%
Eliminate competition for CPU resources between DRBD and MySQL by pinning
DRBD's kernel threads to one CPU core and mysqld to the other, like so
(assuming drbd0 is the device you're running your MySQL databases on):

for thread in worker asender receiver; do
    taskset -p 0x01 $(pidof drbd0_$thread)
done
taskset -p 0x02 $(pidof mysqld)

If you're unfamiliar with taskset, be sure to read its man page to understand
what those CPU affinity masks mean.
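
Running taskset -p with just a PID reports the current mask, which is a quick
way to confirm the pinning took effect (the PID and output here are only an
example):

taskset -p $(pidof mysqld)
# pid 4321's current affinity mask: 2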

Do that, then re-run your tests (on your old server) and share your results.

That tweak alone reduced my DRBD overhead for mysqlslap from 224% to 57% on
the test system I have at my disposal.
Post by Art Age Software
Database on Local Storage
insert test: 164.00 sec. (22% better than old servers)
Database on DRBD Device
insert test: 1137.00 sec. (14% *worse* than old servers)
----------------------------
DRBD Overhead: 973 sec. = 590%
Yup. Given the CPU resource competition issues described earlier, these
are probably being exacerbated by the fact that there are now 8 logical
CPUs (cores) available, versus 2 on your old server. You can do one of two
things here:

1. Pin the DRBD threads to one core, and allocate the others to MySQL.
2. Pin the DRBD threads to one core, and allocate only a few of the others to
MySQL. I've heard some SMP issues are present in InnoDB; more cores doesn't
necessarily mean better performance.
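
For option 1 on a box with 8 logical CPUs, the masks might look like this
(just a sketch; core numbering varies between systems, so check /proc/cpuinfo
before copying it):

# CPU0 (mask 0x01) for the drbd0 kernel threads
for thread in worker asender receiver; do
    taskset -p 0x01 $(pidof drbd0_$thread)
done
# CPUs 1-7 (mask 0xfe) for mysqld
taskset -p 0xfe $(pidof mysqld)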

There are other settings that can be tweaked (I can come down to about 26%
overhead on my system, which is about as low as the network will let me), but
I'd be interested to learn whether you can confirm my results with regard to
CPU affinity tweaking.

Cheers,
Florian
--
: Florian G. Haas
: LINBIT Information Technologies GmbH
: Vivenotgasse 48, A-1120 Vienna, Austria

Please note: when replying, there is no need to CC my
personal address. Replying to the list is fine. Thank you.
Art Age Software
17 years ago
Permalink
...
Thank you very much for the suggestions, Florian. I will experiment
with the CPU affinity today and report my findings back to the list.
Unfortunately, I can only do so on the new servers as the old ones are
currently running in production and I can't risk breaking them. :)

Sam
Art Age Software
17 years ago
Permalink
...
OK, so I have re-run the benchmarks after pinning the DRBD threads to
one core in the first CPU and the mysqld process to 4 cores in the
second CPU like so:

taskset -p 0x80 (pids of drbd threads)
taskset -p 0x0f (pid of mysqld)

The benchmark result improved, but not as dramatically as I had hoped:

Database on DRBD Device
insert test: 938.00 sec.

Should I have expected a more dramatic improvement? What else can I do
to get to the bottom of the poor performance of DRBD in my setup?
Thoughts?

Sam
Todd Denniston
17 years ago
Permalink
...
out of curiosity, does spreading the drbd threads across multiple cores in
the first CPU work better or worse than pinning them all to just one core in
that CPU?
i.e., (_assuming_ CPU0 holds cores 0x80, 0x40, 0x20 & 0x10)
taskset -p 0x80 drbd_pid_1
taskset -p 0x40 drbd_pid_2
taskset -p 0x20 drbd_pid_3
taskset -p 0x10 drbd_pid_4

taskset -p 0x80 drbd_pid_5
taskset -p 0x40 drbd_pid_6
taskset -p 0x20 drbd_pid_7
taskset -p 0x10 drbd_pid_8 ...


and did you try Florian's second suggestion of pinning all of the MySQLs to
_one_core_ in _one_CPU_?
taskset -p 0x02 (pid of mysqld)

That is, would you be kind enough to run two or three more test sets?
--
Todd Denniston
Crane Division, Naval Surface Warfare Center (NSWC Crane)
Harnessing the Power of Technology for the Warfighter
Lars Ellenberg
17 years ago
Permalink
...
please do
one-node# ping -w 10 -f -s 4100 replication-link-ip-of-other-node
and show me the output.
--
: Lars Ellenberg http://www.linbit.com :
: DRBD/HA support and consulting sales at linbit.com :
: LINBIT Information Technologies GmbH Tel +43-1-8178292-0 :
: Vivenotgasse 48, A-1120 Vienna/Europe Fax +43-1-8178292-82 :
__
please use the "List-Reply" function of your email client.
Lars Ellenberg
17 years ago
Permalink
Post by Lars Ellenberg
please do
one-node# ping -w 10 -f -s 4100 replication-link-ip-of-other-node
and show me the output.
also,

1)
drbdadm disconnect your-resource-name
drbd now StandAlone Primary/Unknown
dd if=/dev/zero bs=4096 count=10000 of=/some/file/on/your/drbd oflag=dsync

2)
drbdadm adjust all
wait for the resync
drbd now Connected Primary/Secondary
dd if=/dev/zero bs=4096 count=10000 of=/some/file/on/your/drbd oflag=dsync

3)
dd if=/dev/zero bs=4096 count=10000 of=/some/file/NOT/on/your/drbd oflag=dsync

do each dd command several times,
do this when nothing else happens on the box.

the important part here is the "dsync" flag.
if your dd does not know about that, upgrade.

the dd bs=4096 count=10000 oflag=dsync
(many small requests, each single one of them synchronous by itself)
is to give an idea for the average latency of one single write request,
* with drbd disconnected
* with drbd connected
* without drbd (should be as close as possible to the lower level
device of drbd, preferably on the same hardware)

the "ping -w 10 -f -s 4100" is to give an idea for the
average round-trip time between your nodes.

together, it gives an expectation
of what should theoretically be possible to reach.
--
: Lars Ellenberg http://www.linbit.com :
: DRBD/HA support and consulting sales at linbit.com :
: LINBIT Information Technologies GmbH Tel +43-1-8178292-0 :
: Vivenotgasse 48, A-1120 Vienna/Europe Fax +43-1-8178292-82 :
__
please use the "List-Reply" function of your email client.
Art Age Software
17 years ago
Permalink
...
OK, here are the results:

Test 1: DRBD Disconnected

[node1 ~]$ dd if=/dev/zero bs=4096 count=10000 of=/drbd/tmp/testfile oflag=dsync
10000+0 records in
10000+0 records out
40960000 bytes (41 MB) copied, 3.14315 seconds, 13.0 MB/s
[node1 ~]$ dd if=/dev/zero bs=4096 count=10000 of=/drbd/tmp/testfile oflag=dsync
10000+0 records in
10000+0 records out
40960000 bytes (41 MB) copied, 3.05737 seconds, 13.4 MB/s
[node1 ~]$ dd if=/dev/zero bs=4096 count=10000 of=/drbd/tmp/testfile oflag=dsync
10000+0 records in
10000+0 records out
40960000 bytes (41 MB) copied, 3.08115 seconds, 13.3 MB/s
[node1 ~]$ dd if=/dev/zero bs=4096 count=10000 of=/drbd/tmp/testfile oflag=dsync
10000+0 records in
10000+0 records out
40960000 bytes (41 MB) copied, 3.17052 seconds, 12.9 MB/s
[node1 ~]$ dd if=/dev/zero bs=4096 count=10000 of=/drbd/tmp/testfile oflag=dsync
10000+0 records in
10000+0 records out
40960000 bytes (41 MB) copied, 3.0727 seconds, 13.3 MB/s


Test 2: DRBD Connected

[node1 ~]$ dd if=/dev/zero bs=4096 count=10000 of=/drbd/tmp/testfile oflag=dsync
10000+0 records in
10000+0 records out
40960000 bytes (41 MB) copied, 11.8043 seconds, 3.5 MB/s
[node1 ~]$ dd if=/dev/zero bs=4096 count=10000 of=/drbd/tmp/testfile oflag=dsync
10000+0 records in
10000+0 records out
40960000 bytes (41 MB) copied, 11.9506 seconds, 3.4 MB/s
[node1 ~]$ dd if=/dev/zero bs=4096 count=10000 of=/drbd/tmp/testfile oflag=dsync
10000+0 records in
10000+0 records out
40960000 bytes (41 MB) copied, 12.2863 seconds, 3.3 MB/s
[node1 ~]$ dd if=/dev/zero bs=4096 count=10000 of=/drbd/tmp/testfile oflag=dsync
10000+0 records in
10000+0 records out
40960000 bytes (41 MB) copied, 11.203 seconds, 3.7 MB/s
[node1 ~]$ dd if=/dev/zero bs=4096 count=10000 of=/drbd/tmp/testfile oflag=dsync
10000+0 records in
10000+0 records out
40960000 bytes (41 MB) copied, 11.212 seconds, 3.7 MB/s


Test 3: Non-DRBD

[node1 ~]$ dd if=/dev/zero bs=4096 count=10000 of=/tmp/testfile oflag=dsync
10000+0 records in
10000+0 records out
40960000 bytes (41 MB) copied, 3.14307 seconds, 13.0 MB/s
[node1 ~]$ dd if=/dev/zero bs=4096 count=10000 of=/tmp/testfile oflag=dsync
10000+0 records in
10000+0 records out
40960000 bytes (41 MB) copied, 2.98458 seconds, 13.7 MB/s
[node1 ~]$ dd if=/dev/zero bs=4096 count=10000 of=/tmp/testfile oflag=dsync
10000+0 records in
10000+0 records out
40960000 bytes (41 MB) copied, 2.95751 seconds, 13.8 MB/s
[node1 ~]$ dd if=/dev/zero bs=4096 count=10000 of=/tmp/testfile oflag=dsync
10000+0 records in
10000+0 records out
40960000 bytes (41 MB) copied, 2.90936 seconds, 14.1 MB/s
[node1 ~]$ dd if=/dev/zero bs=4096 count=10000 of=/tmp/testfile oflag=dsync
10000+0 records in
10000+0 records out
40960000 bytes (41 MB) copied, 3.04481 seconds, 13.5 MB/s
Art Age Software
17 years ago
Permalink
Post by Lars Ellenberg
please do
one-node# ping -w 10 -f -s 4100 replication-link-ip-of-other-node
and show me the output.
Lars,

Thanks for your help on this. Here is the output of the ping test.

[node1 ~]$ ping -w 10 -f -s 4100 node2
--- ping statistics ---
46900 packets transmitted, 46899 received, 0% packet loss, time 10000ms
rtt min/avg/max/mdev = 0.159/0.184/20.047/0.154 ms, pipe 2, ipg/ewma
0.213/0.193 ms

[node2 ~]$ ping -w 10 -f -s 4100 node1
--- ping statistics ---
48061 packets transmitted, 48060 received, 0% packet loss, time 10001ms
rtt min/avg/max/mdev = 0.154/0.180/20.333/0.172 ms, pipe 2, ipg/ewma
0.208/0.183 ms
Lars Ellenberg
17 years ago
Permalink
...
you have a very interesting maximum and a huge deviation there.

but, let's use the 0.180 ms average rtt of 4k packets.

averages from the dd commands below are

drbd disconnected: 0.310 ms per 4k request
drbd connected:    1.170 ms per 4k request
non-drbd:          0.300 ms per 4k request

I've also already seen non-drbd be slower than
drbd-unconnected on the same hardware,
there are funny effects in play.
but in your case they are within 3% of each other, which is expected.

however your drbd-connected seems bad.
from ping rtt and non-drbd numbers we'd expect that
latency of drbd connected should be ~ 0.480 ms.
your measurement indicates it is worse than this
expectation by a factor of 2.5.
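
for reference, the arithmetic behind those figures (the times are the
averages of the dd runs quoted above):

# per-request latency = elapsed seconds / number of writes
#   drbd disconnected: ~3.10 s / 10000 = 0.310 ms per 4k write
#   drbd connected:    ~11.7 s / 10000 = 1.170 ms per 4k write
#   non-drbd:          ~3.00 s / 10000 = 0.300 ms per 4k write
# expected connected latency ~= non-drbd latency + 4k rtt
#                             = 0.300 ms + 0.180 ms = 0.480 ms
# measured vs expected: 1.170 / 0.480 ~= 2.4
echo "scale=3; 11.7 * 1000 / 10000" | bc   # prints 1.170 (ms per connected write)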

in all setups I have tuned so far,
the actual (measured) latency of drbd,
and the rough estimate given by said ping and dd commands
are very close.

so I suspect your secondary's ("node2") io subsystem is slower.
please verify.

other than that, pinning of drbd related threads to one CPU,
preferably the same where you pinned the NIC driver irq to,
could help to reduce latency.
...
--
: Lars Ellenberg http://www.linbit.com :
: DRBD/HA support and consulting sales at linbit.com :
: LINBIT Information Technologies GmbH Tel +43-1-8178292-0 :
: Vivenotgasse 48, A-1120 Vienna/Europe Fax +43-1-8178292-82 :
__
please use the "List-Reply" function of your email client.
Art Age Software
17 years ago
Permalink
Lars,

Thanks for looking at this...
you have a very interesting maximum and a huge deviation there.
Does this reflect a problem with my TCP stack? What might be causing
the huge max?
but, let's use the 0.180 ms average rtt of 4k packets.
averages from the dd commands below are
drbd disconnected: 0.310 ms per 4k request
drbd connected 1.170 ms per 4k request
non-drbd 0.300 ms per 4k request
I've also already seen non-drbd be slower than
drbd-unconnected on the same hardware,
there are funny effects in play.
but they are close within 3%, this is expected.
Hmmm, it looks to me like non-drbd is **faster** than
drbd-disconnected from my numbers, which I would expect. Am I
mis-reading?
however your drbd-connected seems bad.
from ping rtt and non-drbd numbers we'd expect that
latency of drbd connected should be ~ 0.480 ms.
your measurement indicates it is worse than this
expectation by a factor of 2.5.
Yes, this is the crux of the problem I am experiencing - now confirmed
with MySQL out of the equation.
in all setups I have tuned so far,
the actual (measured) latency of drbd,
and the rough estimate given by said ping and dd commands
are very close.
so I suspect your secondary's ("node2") io subsystem is slower.
please verify.
The 2 nodes are identical - right down to the io subsystem (identical
RAID-10 hardware with battery-backed write-back cache enabled and
identical model hard drives).
other than that, pinning of drbd related threads to one CPU,
preferably the same where you pinned the NIC driver irq to,
could help to reduce latency.
I have not pinned NIC driver IRQs. (I don't know how.) I have pinned
the DRBD-related threads to a single CPU core and the test results
reflect that configuration.

I'm really at a loss here. Do you have any other suggestions for
getting to the bottom of this?

Should I disable the irqbalance daemon? (I tried it and it seemed to make
no difference).

Should I disable SELinux?

Thanks,

Sam
Lars Ellenberg
17 years ago
Permalink
Post by Art Age Software
Lars,
Thanks for looking at this...
you have a very interesting maximum and a huge deviation there.
Does this reflect a problem with my TCP stack? What might be causing
the huge max?
sorry, my crystal ball is in for cleaning atm.
...
what I meant is that they are close, and that I have seen on various
setups either the exact same values, or one or the other showing less
latency. yes, obviously in _your_ setup, non-drbd appears to have less
latency, as is naively expected anyway. but I wanted to point out that
sometimes you also see the counter-intuitive thing, namely
disconnected drbd showing less latency than the underlying device by
itself. which is where said "funny effects" come into play.
...
did you _measure_ that.
or do you just "know" that.

because we have already had the case where identical hardware showed a factor
of 100 difference in latency. specifically, we just recently had two
DELL MD3000s, one of which was showing 0.7 ms while the other had
70.x ms. and yes, the additional latency was attributed to drbd at
first, as well. only that on the same boxes we also had other storage,
and there it behaved nicely. we have not been able to track down the
cause, but believe that resetting the storage made the effect
"go away".

so do measure.
Post by Art Age Software
Should I disable SELinux?
that is an interesting question.
I have no idea in how far this could affect latency.
--
: Lars Ellenberg Tel +43-1-8178292-55 :
: LINBIT Information Technologies GmbH Fax +43-1-8178292-82 :
: Vivenotgasse 48, A-1120 Vienna/Europe http://www.linbit.com :
Art Age Software
17 years ago
Permalink
...
Got it. Thanks for the clarification.
Post by Lars Ellenberg
Post by Art Age Software
The 2 nodes are identical - right down to the io subsystem (identical
RAID-10 hardware with battery-backed write-back cache enabled and
identical model hard drives).
did you _measure_ that.
or do you just "know" that.
Well, I ran the MySQL benchmarks **without** DRBD on both machines and
got similar results. However, I will re-run the dd tests in the other
direction and report back the results.

Sam
Art Age Software
17 years ago
Permalink
Lars,

Here are the results of the dd test on node2. Results look similar...
...
Test 1: DRBD Disconnected

[node2 ~]$ dd if=/dev/zero bs=4096 count=10000 of=/drbd/tmp/testfile oflag=dsync
10000+0 records in
10000+0 records out
40960000 bytes (41 MB) copied, 3.03173 seconds, 13.5 MB/s
[node2 ~]$ dd if=/dev/zero bs=4096 count=10000 of=/drbd/tmp/testfile oflag=dsync
10000+0 records in
10000+0 records out
40960000 bytes (41 MB) copied, 3.00959 seconds, 13.6 MB/s
[node2 ~]$ dd if=/dev/zero bs=4096 count=10000 of=/drbd/tmp/testfile oflag=dsync
10000+0 records in
10000+0 records out
40960000 bytes (41 MB) copied, 2.99447 seconds, 13.7 MB/s
[node2 ~]$ dd if=/dev/zero bs=4096 count=10000 of=/drbd/tmp/testfile oflag=dsync
10000+0 records in
10000+0 records out
40960000 bytes (41 MB) copied, 2.9992 seconds, 13.7 MB/s
[node2 ~]$ dd if=/dev/zero bs=4096 count=10000 of=/drbd/tmp/testfile oflag=dsync
10000+0 records in
10000+0 records out
40960000 bytes (41 MB) copied, 3.0024 seconds, 13.6 MB/s



Test 2: DRBD Connected

[node2 ~]$ dd if=/dev/zero bs=4096 count=10000 of=/drbd/tmp/testfile oflag=dsync
10000+0 records in
10000+0 records out
40960000 bytes (41 MB) copied, 18.553 seconds, 2.2 MB/s
[node2 ~]$ dd if=/dev/zero bs=4096 count=10000 of=/drbd/tmp/testfile oflag=dsync
10000+0 records in
10000+0 records out
40960000 bytes (41 MB) copied, 17.0005 seconds, 2.4 MB/s
[node2 ~]$ dd if=/dev/zero bs=4096 count=10000 of=/drbd/tmp/testfile oflag=dsync
10000+0 records in
10000+0 records out
40960000 bytes (41 MB) copied, 12.2118 seconds, 3.4 MB/s
[node2 ~]$ dd if=/dev/zero bs=4096 count=10000 of=/drbd/tmp/testfile oflag=dsync
10000+0 records in
10000+0 records out
40960000 bytes (41 MB) copied, 12.1346 seconds, 3.4 MB/s
[node2 ~]$ dd if=/dev/zero bs=4096 count=10000 of=/drbd/tmp/testfile oflag=dsync
10000+0 records in
10000+0 records out
40960000 bytes (41 MB) copied, 12.0736 seconds, 3.4 MB/s


Test 3: Non-DRBD

[node2 ~]$ dd if=/dev/zero bs=4096 count=10000 of=/tmp/testfile oflag=dsync
10000+0 records in
10000+0 records out
40960000 bytes (41 MB) copied, 3.14032 seconds, 13.0 MB/s
[node2 ~]$ dd if=/dev/zero bs=4096 count=10000 of=/tmp/testfile oflag=dsync
10000+0 records in
10000+0 records out
40960000 bytes (41 MB) copied, 3.09048 seconds, 13.3 MB/s
[node2 ~]$ dd if=/dev/zero bs=4096 count=10000 of=/tmp/testfile oflag=dsync
10000+0 records in
10000+0 records out
40960000 bytes (41 MB) copied, 3.0784 seconds, 13.3 MB/s
[node2 ~]$ dd if=/dev/zero bs=4096 count=10000 of=/tmp/testfile oflag=dsync
10000+0 records in
10000+0 records out
40960000 bytes (41 MB) copied, 3.08517 seconds, 13.3 MB/s
[node2 ~]$ dd if=/dev/zero bs=4096 count=10000 of=/tmp/testfile oflag=dsync
10000+0 records in
10000+0 records out
40960000 bytes (41 MB) copied, 3.06288 seconds, 13.4 MB/s
Art Age Software
17 years ago
Permalink
I have run some additional tests:

1) Disabled bonding on the network interfaces (both nodes). No
significant change.

2) Changed the DRBD communication interface. Was using a direct
crossover connection between the on-board NICs of the servers. I
switched to Intel Gigabit NIC cards in both machines, connecting
through a Gigabit switch. No significant change.

3) Ran a file copy from node1 to node2 via scp. Even with the
additional overhead of scp, I get a solid 65 MB/sec. throughput.

So, at this stage I have seemingly ruled out:

1) Slow IO subsystem (both machines measured and check out fine).

2) Bonding driver (additional latency)

3) On-board NICs (hardware/firmware problem)

4) Network copy speed.

What's left? I'm stumped as to why DRBD can only do about 3.5 MB/sec.
on this very fast hardware.

Sam
Matteo Tescione
17 years ago
Permalink
Sorry if already asked, but are you using hardware raid or software raid? If
so, is it raid 5/6? I discovered a huge performance hole, much like the one you
report, with that kind of setup. Search the list for previous posts about how
the performance problem was solved.

Regards,

--
#Matteo Tescione
#RMnet srl
...
Art Age Software
17 years ago
Permalink
Hardware RAID-10. There is no problem with the disks. We have measured
raw I/O performance through the RAID on both nodes.
...
Matteo Tescione
17 years ago
Permalink
Ok, can you show the output of iostat -x -m 1 during your drbd test case
and, if possible, on your raw raid subsystem?
Look at svctm, await and %util; they all help you investigate further.
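
A sketch of how to capture that while the dd test runs (the log path and the
60-second duration are arbitrary):

# log extended device stats once per second in the background
iostat -x -m 1 60 > /tmp/iostat-during-dd.log &
dd if=/dev/zero bs=4096 count=10000 of=/drbd/tmp/testfile oflag=dsync
wait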
Regards,
--matteo
...
Lars Ellenberg
17 years ago
Permalink
Post by Art Age Software
1) Disabled bonding on the network interfaces (both nodes). No
significant change.
2) Changed the DRBD communication interface. Was using a direct
crossover connection between the on-board NICs of the servers. I
switched to Intel Gigabit NIC cards in both machines, connecting
through a Gigabit switch. No significant change.
3) Ran a file copy from node1 to node2 via scp. Even with the
additional overhead of scp, I get a solid 65 MB/sec. throughput.
this is streaming.
completely different than what we measured below.
Post by Art Age Software
1) Slow IO subsystem (both machines measured and check out fine).
2) Bonding driver (additional latency)
3) On-board NICs (hardware/firmware problem)
4) Network copy speed.
What's left? I'm stumped as to why DRBD can only do about 3.5 MB/sec.
on this very fast hardware.
doing one-by-one synchronous 4k writes, which are latency bound.
if you do streaming writes, it probably gets back up to your 65 MB/sec.
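
an easy way to check the streaming case on the drbd device itself
(path and sizes are just placeholders; conv=fdatasync forces a flush at the
end so the reported rate reflects real disk plus replication throughput):

dd if=/dev/zero bs=1M count=1000 of=/drbd/tmp/streamtest conv=fdatasync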

sorry, latency tuning can be complex,
and is not easily covered by "general advice".
--
: Lars Ellenberg http://www.linbit.com :
: DRBD/HA support and consulting sales at linbit.com :
: LINBIT Information Technologies GmbH Tel +43-1-8178292-0 :
: Vivenotgasse 48, A-1120 Vienna/Europe Fax +43-1-8178292-82 :
__
please use the "List-Reply" function of your email client.
Art Age Software
17 years ago
Permalink
...
Ok, but we have tested that with and without DRBD using the dd command,
right? So at this point, by all tests performed so far, it looks like
DRBD is the bottleneck. What other tests can I perform that can say
otherwise?
Post by Lars Ellenberg
sorry, latency tuning can be complex,
and is not easily covered by "general advice".
Lars Ellenberg
17 years ago
Permalink
...
sure.
but comparing 3.5 (with drbd) against 13.5 (without drbd) is bad enough,
no need to now compare it with some streaming number (65) to make it
look _really_ bad ;-)
--
: Lars Ellenberg Tel +43-1-8178292-55 :
: LINBIT Information Technologies GmbH Fax +43-1-8178292-82 :
: Vivenotgasse 48, A-1120 Vienna/Europe http://www.linbit.com :
Art Age Software
17 years ago
Permalink
...
Sorry, my intent was not to make DRBD look bad. I think DRBD is
**fantastic** and I just want to get it working properly. My point in
trying the streaming test was simply to make sure that there was
nothing totally broken on the network side. I suppose I should also
try a streaming test to the DRBD device and compare that to the raw
streaming number. And, back to my last question: What other tests can
I perform at this point to narrow down the source of the (latency?)
problem?
Carlos Xavier
17 years ago
Permalink
Hi,
I have been following this thread since I want to do a very similar
configuration.

The system is running on Dell SC1435 servers, each one with 2 dual-core AMD
Opterons and 4GB of RAM.
the network cards are:
01:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5721 Gigabit
Ethernet PCI Express (rev 21)
02:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5721 Gigabit
Ethernet PCI Express (rev 21)
06:00.0 Ethernet controller: Intel Corporation 82572EI Gigabit Ethernet
Controller (Copper) (rev 06)

Right now it is running OCFS2 over DRBD and we don't have the MySQL database
on it yet. I ran the commands to see the write throughput on the
disk. As you can see below, when DRBD is up and connected the
throughput falls to a little less than half of the value we got with it
disconnected.

DRBD and OCFS2 cluster connected

***@apolo1:~# dd if=/dev/zero bs=4096 count=10000 of=/clusterdisk/testfile
oflag=dsync
10000+0 records in
10000+0 records out
40960000 bytes (41 MB) copied, 3.89017 s, 10.5 MB/s


DRBD connected and OCFS2 remote disconnected
***@apolo1:~# dd if=/dev/zero bs=4096 count=10000 of=/clusterdisk/testfile
oflag=dsync
10000+0 records in
10000+0 records out
40960000 bytes (41 MB) copied, 3.65195 s, 11.2 MB/s

DRBD remote stopped and OCFS2 local mounted
***@apolo1:~# dd if=/dev/zero bs=4096 count=10000 of=/clusterdisk/testfile
oflag=dsync
10000+0 records in
10000+0 records out
40960000 bytes (41 MB) copied, 1.50187 s, 27.3 MB/s

Regards,
Carlos.


----- Original Message -----
From: "Art Age Software" <artagesw-***@public.gmane.org>
To: <drbd-user-63ez5xqkn6DQT0dZR+***@public.gmane.org>
Sent: Thursday, December 20, 2007 7:35 PM
Subject: Re: [DRBD-user] MySQL-over-DRBD Performance
...
Art Age Software
17 years ago
Permalink
Well, at least you are getting much better performance than I am getting.

I don't understand why even my local write performance is so much
worse than yours. What sort of disk subsystem are you using?
...
Carlos Xavier
17 years ago
Permalink
I'm sorry for the big delay in answering, I was on vacation.

I have 2 clusters running, one with Dell PowerEdge SC 1435 servers and the
BCM5785 controller, and another on Dell PowerEdge 1900 servers with the SAS1068
controller. On both systems the disks used are SATA disks (WDC WD2500JS).


----- Original Message -----
From: "Art Age Software" <artagesw-***@public.gmane.org>
To: <drbd-user-63ez5xqkn6DQT0dZR+***@public.gmane.org>
Sent: Friday, December 21, 2007 6:05 PM
Subject: Re: [DRBD-user] MySQL-over-DRBD Performance
...
Carlos Xavier
17 years ago
Permalink
Hi Sam,
I'm sorry for being late once again, I'm still on vacation.

----- Original Message -----
From: "Art Age Software" <artagesw-***@public.gmane.org>
To: "Carlos Xavier" <cbastos-y1ricOmiHYbtqW+***@public.gmane.org>
Sent: Friday, January 25, 2008 6:28 PM
Subject: Re: [DRBD-user] MySQL-over-DRBD Performance
Hi Carlos,
Do your Dell PowerEdge SC 1435 servers use the Dell-supplied SAS 5/iR
adapter? That is what I am using, and my benchmark results are
abysmal.
The system does not have any optional controller. The controller in use is
provided by the chipset of the mainboard.
The controller is a Broadcom BCM5785 [HT1000]
The following results are direct to disk (no DRBD).
# dd if=/dev/zero bs=4096 count=10000 of=/tmp/testfile
10000+0 records in
10000+0 records out
40960000 bytes (41 MB) copied, 0.178766 seconds, 229 MB/s
# dd if=/dev/zero bs=4096 count=10000 of=/tmp/testfile oflag=dsync
10000+0 records in
10000+0 records out
40960000 bytes (41 MB) copied, 275.086 seconds, 149 kB/s
What do you make of that?
I can't reproduce your tests right now, since the system is in production, but
soon I'll be able to. We are migrating the system.
Thanks,
Sam
Regards,
Carlos.
...
Ben Clewett
17 years ago
Permalink
Post by Carlos Xavier
Hi Sam,
I'm sorry for being late once again, I'm still on vacation.
Sent: Friday, January 25, 2008 6:28 PM
Subject: Re: [DRBD-user] MySQL-over-DRBD Performance
Hi Carlos,
Do your Dell PowerEdge SC 1435 servers use the Dell-supplied SAS 5/iR
adapter? That is what I am using, and my benchmark results are
abysmal.
The system does not have any optional controller. The controller in use is
provided by the chipset of the mainboard.
The controller is a Broadcom BCM5785 [HT1000]
Hi Carlos,

Myself and other people on this mailing list have reported problems
with Broadcom NICs. The solution in my case was to upgrade the firmware
*and* the drivers to the latest available, which fixed this problem for me.

I believe some Dell users also removed the TOC jumper on their NICs to
achieve the same result.

Let us know...

Ben
...
Art Age Software
17 years ago
Permalink
...
This is a bit over my head - but I will look into it.
Post by Lars Ellenberg
you can also verify whether the network bonding introduces additional latency.
Yes, I was planning to investigate this further. But since performance
of the resync was fine, it did not seem that bonding was causing any
issues.
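
A simple check is to run the same flood ping over the bond and over a single
link and compare the round-trip times (interface names and addresses here are
just placeholders):

ping -w 10 -f -s 4100 -I bond0 peer-ip-on-bond
ping -w 10 -f -s 4100 -I eth2 peer-ip-on-direct-link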

What write performance overhead should I expect from DRBD in a
configuration like mine? (And thanks for the suggestions.)

Sam