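Today in a QQ group, someone insisted that you should never use drbd for MySQL HA because it hurts performance badly. drbd certainly has some performance cost, since every write has to be shipped over the network to the peer node. I happen to have a drbd setup, so I took the opportunity to measure how large the impact really is, as a reference for others.
My test environment is a two-node MySQL HA cluster built on drbd + pacemaker + corosync. Both hosts are ordinary, outdated PCs with 2 GB of RAM and a 2-core CPU.
1. First, use sysbench to initialize the data on the cluster in its normal state: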
[root@topdb ]# sysbench --test=oltp --oltp-table-size=5000000 --mysql-host=192.168.1.163 --mysql-port=3306 --mysql-user=root --mysql-password=123456 --mysql-db=mcldb --db-driver=mysql prepare
During initialization you can watch the load on the primary and the secondary with dstat:
Primary:
----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read writ| recv send| in out | int csw
5 4 41 51 0 0| 80k 19M| 959k 9388k| 0 0 |5697 13k
4 5 44 47 0 1| 88k 15M| 963k 9474k| 0 0 |4845 10k
7 4 36 53 0 0| 0 16M| 326k 9048k| 0 0 |5324 12k
8 3 41 47 0 1| 0 18M| 939k 8708k| 0 0 |5963 12k
7 4 30 59 0 0| 0 21M| 975k 9659k| 0 0 |5763 14k
...
6 5 42 48 0 0| 0 17M|1389k 7702k| 0 0 |5524 13k
10 3 39 48 0 0| 0 17M| 380k 10M| 0 0 |5198 11k
4 3 45 48 0 1| 0 19M| 950k 8993k| 0 0 |6003 14k
5 4 43 48 0 0| 0 13M| 991k 10M| 0 0 |4863 11k
Secondary:
----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read writ| recv send| in out | int csw
9 5 86 0 0 1| 0 8580k|9114k 319k| 0 0 | 12k 14k
4 3 92 0 0 1| 0 7572k|7992k 280k| 0 0 |9422 12k
0 3 96 0 0 1| 0 8348k|8842k 309k| 0 0 | 10k 14k
0 2 97 0 0 1| 0 7544k|7988k 279k| 0 0 |9351 12k
0 3 96 0 0 1| 0 9164k| 10M 345k| 0 0 | 12k 14k
0 3 97 0 0 1| 0 8180k|8232k 293k| 0 0 |9662 13k
...
0 3 97 0 0 1| 0 8544k|9036k 314k| 0 0 | 10k 14k
0 3 96 0 0 1| 0 7672k|8123k 285k| 0 0 |9543 12k
0 3 97 0 0 1| 0 8888k|9448k 327k| 0 0 | 11k 14k^C
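As you can see, the primary writes between 13 and 21 MB per second to disk and sends close to 10 MB per second over the network. The secondary writes about 7-9 MB per second, and its network receive rate is about the same as its write rate.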
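To put a number on "close to 10 MB per second", the dstat samples can be averaged. The following is a minimal sketch: the values are copied from the output above, and treating dstat's `k` and `M` suffixes as 1024-based is an assumption, as are the helper names `to_bytes` and `mean_mib`.

```python
# Network "send" column on the primary, copied from the dstat samples above.
PRIMARY_SEND = ["9388k", "9474k", "9048k", "8708k", "9659k",
                "7702k", "10M", "8993k", "10M"]
# Disk "writ" column on the secondary, copied from the dstat samples above.
SECONDARY_WRITE = ["8580k", "7572k", "8348k", "7544k", "9164k",
                   "8180k", "8544k", "7672k", "8888k"]

# Assumption: dstat suffixes are powers of 1024.
UNIT = {"B": 1, "k": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def to_bytes(value: str) -> int:
    """Convert a dstat-style figure such as '9388k' or '10M' to bytes."""
    if value[-1] in UNIT:
        return int(float(value[:-1]) * UNIT[value[-1]])
    return int(value)  # plain number without a suffix, e.g. '0'


def mean_mib(samples) -> float:
    """Average a list of dstat figures and return the result in MiB/s."""
    total = sum(to_bytes(s) for s in samples)
    return total / len(samples) / 1024 ** 2


if __name__ == "__main__":
    print(f"primary send:    {mean_mib(PRIMARY_SEND):.1f} MiB/s")
    print(f"secondary write: {mean_mib(SECONDARY_WRITE):.1f} MiB/s")
```

Both averages land in the 8-9 MiB/s range, which is already a large share of what a 100 Mbit link can carry.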
Check the size of the sysbench test table:
[root@db163 mcldb]# du -sh sbtest.*
12K sbtest.frm
1.2G sbtest.ibd
2. Test performance with the drbd cluster active:
Use complex mode, i.e. a mix of inserts, deletes, updates and selects:
[root@topdb ~]# sysbench --oltp-auto-inc=off --max-requests=0 --max-time=60 --num-threads=4 --test=oltp --db-driver=mysql --mysql-host=192.168.1.163 --mysql-port=3306 --mysql-user=root --mysql-password=123456 --mysql-db=mcldb --oltp-test-mode=complex run
sysbench 0.4.10: multi-threaded system evaluation benchmark
WARNING: Preparing of "BEGIN" is unsupported, using emulation
(last message repeated 3 times)
Running the test with following options:
Number of threads: 4
Doing OLTP test.
Running mixed OLTP test
Using Special distribution (12 iterations, 1 pct of values are returned in 75 pct cases)
Using "BEGIN" for starting transactions
Not using auto_inc on the id column
Threads started!
Time limit exceeded, exiting...
(last message repeated 3 times)
Done.
OLTP test statistics:
queries performed:
read: 11130
write: 3975
other: 1590
total: 16695
transactions: 795 (13.21 per sec.)
deadlocks: 0 (0.00 per sec.)
read/write requests: 15105 (251.06 per sec.)
other operations: 1590 (26.43 per sec.)
Test execution summary:
total time: 60.1639s
total number of events: 795
total time taken by event execution: 240.4650
per-request statistics:
min: 100.55ms
avg: 302.47ms
max: 889.42ms
approx. 95 percentile: 614.07ms
Threads fairness:
events (avg/stddev): 198.7500/3.90
execution time (avg/stddev): 60.1163/0.03
As you can see, 16695 queries were executed in 1 minute, at 13.21 TPS (not surprising, since my machines are fairly weak).
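The figures in the sysbench report are related to each other in a simple way, which is handy when comparing runs. Here is a small sketch that recomputes them from the raw counters printed above; the variable names are mine, not sysbench's.

```python
# Raw counters copied from the sysbench report above.
TRANSACTIONS = 795
TOTAL_QUERIES = 16695
TOTAL_TIME_S = 60.1639
EVENT_TIME_S = 240.4650   # "total time taken by event execution"
THREADS = 4

# Transactions per second, as reported on the "transactions:" line.
tps = TRANSACTIONS / TOTAL_TIME_S

# Average latency of one transaction in milliseconds ("avg:" line).
avg_ms = EVENT_TIME_S / TRANSACTIONS * 1000

# Each transaction in this run issues the same number of queries.
queries_per_txn = TOTAL_QUERIES / TRANSACTIONS

# With a fixed number of busy threads, throughput is roughly
# threads / average latency.
tps_from_latency = THREADS / (avg_ms / 1000)

if __name__ == "__main__":
    print(f"tps               = {tps:.2f}")
    print(f"avg latency (ms)  = {avg_ms:.2f}")
    print(f"queries per txn   = {queries_per_txn:.1f}")
    print(f"threads / latency = {tps_from_latency:.2f}")
```

The last line makes the point that with only 4 client threads, TPS is determined almost entirely by per-transaction latency, so anything that adds latency to a write (such as a network round trip) shows up directly in TPS.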
3. Test performance in standalone mode:
Take the secondary offline:
[root@db162 ~]# crm
crm(live)# status
Last updated: Mon Jul 28 18:23:32 2014
Last change: Sat Jul 26 10:05:58 2014 via cibadmin on db163
Stack: classic openais (with plugin)
Current DC: db163 - partition with quorum
Version: 1.1.10-14.el6_5.3-368c726
2 Nodes configured, 2 expected votes
7 Resources configured
Online: [ db162 db163 ]
Master/Slave Set: ms_drbd_mysql [drbd_mysql]
Masters: [ db163 ]
Slaves: [ db162 ]
Resource Group: g_mysql
fs_mysql (ocf::heartbeat:Filesystem): Started db163
p_ip_mysql (ocf::heartbeat:IPaddr2): Started db163
mysqld (lsb:mysqld): Started db163
Clone Set: cl_ping [p_ping]
Started: [ db162 db163 ]
crm(live)# node standby db162
Now check the drbd status on the primary:
[root@db163 mcldb]# cat /proc/drbd
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 599f286440bd633d15d5ff985204aff4bccffadd build by phil@Build64R6, 2013-10-14 15:33:06
0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r-----
ns:6074648 nr:0 dw:6075004 dr:294960 al:466 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:308
Replication has indeed stopped. Now run the test again:
[root@topdb ~]# sysbench --oltp-auto-inc=off --max-requests=0 --max-time=60 --num-threads=4 --test=oltp --db-driver=mysql --mysql-host=192.168.1.163 --mysql-port=3306 --mysql-user=root --mysql-password=123456 --mysql-db=mcldb --oltp-test-mode=complex run
sysbench 0.4.10: multi-threaded system evaluation benchmark
WARNING: Preparing of "BEGIN" is unsupported, using emulation
(last message repeated 3 times)
Running the test with following options:
Number of threads: 4
Doing OLTP test.
Running mixed OLTP test
Using Special distribution (12 iterations, 1 pct of values are returned in 75 pct cases)
Using "BEGIN" for starting transactions
Not using auto_inc on the id column
Threads started!
Time limit exceeded, exiting...
(last message repeated 3 times)
Done.
OLTP test statistics:
queries performed:
read: 16394
write: 5851
other: 2340
total: 24585
transactions: 1169 (19.45 per sec.)
deadlocks: 2 (0.03 per sec.)
read/write requests: 22245 (370.14 per sec.)
other operations: 2340 (38.94 per sec.)
Test execution summary:
total time: 60.0990s
total number of events: 1169
total time taken by event execution: 240.2136
per-request statistics:
min: 73.11ms
avg: 205.49ms
max: 741.33ms
approx. 95 percentile: 432.89ms
Threads fairness:
events (avg/stddev): 292.2500/3.34
execution time (avg/stddev): 60.0534/0.03
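So with drbd active, throughput dropped by (1 - 16695/24585) = 32%. However, drbd has not been tuned yet, so next I adjust the drbd parameters and test again.
4. Test performance after tuning drbd:
First bring the drbd secondary back: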
[root@db162 ~]# crm
crm(live)# node online db162
crm(live)# status
Last updated: Mon Jul 28 18:45:16 2014
Last change: Mon Jul 28 18:45:30 2014 via crm_attribute on db162
Stack: classic openais (with plugin)
Current DC: db163 - partition with quorum
Version: 1.1.10-14.el6_5.3-368c726
2 Nodes configured, 2 expected votes
7 Resources configured
Online: [ db162 db163 ]
Master/Slave Set: ms_drbd_mysql [drbd_mysql]
Masters: [ db163 ]
Slaves: [ db162 ]
Resource Group: g_mysql
fs_mysql (ocf::heartbeat:Filesystem): Started db163
p_ip_mysql (ocf::heartbeat:IPaddr2): Started db163
mysqld (lsb:mysqld): Started db163
Clone Set: cl_ping [p_ping]
Started: [ db162 db163 ]
[root@db162 ~]# cat /proc/drbd
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 599f286440bd633d15d5ff985204aff4bccffadd build by phil@Build64R6, 2013-10-14 15:33:06
0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
ns:0 nr:38064 dw:38064 dr:0 al:0 bm:18 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
As you can see, drbd is restored and data replication is running again.
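Instead of reading `/proc/drbd` by eye, the connection state can be checked with a few lines of code. This is a minimal sketch that parses the device line format shown in this post (DRBD 8.4); the function names `parse_drbd_status` and `is_replicating` are hypothetical helpers, not part of any DRBD tooling.

```python
import re


def parse_drbd_status(line: str) -> dict:
    """Extract key:value fields (cs, ro, ds, ...) from a /proc/drbd device line."""
    # Tolerate stray spaces after '/', e.g. "Primary/ Unknown".
    line = re.sub(r"/\s+", "/", line)
    # Fields look like "cs:Connected"; the leading "0:" has no letters and is skipped.
    return dict(re.findall(r"\b([a-z]{2,3}):(\S+)", line))


def is_replicating(fields: dict) -> bool:
    """True when the peer is connected and both disks are up to date."""
    return (fields.get("cs") == "Connected"
            and fields.get("ds") == "UpToDate/UpToDate")


if __name__ == "__main__":
    samples = [
        "0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r-----",
        "0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----",
    ]
    for sample in samples:
        fields = parse_drbd_status(sample)
        print(fields["cs"], "->", "replicating" if is_replicating(fields) else "not replicating")
```

Running a check like this before each benchmark makes it harder to measure the wrong state by accident.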
Tune the drbd parameters:
[root@db163 ~]# vim /etc/drbd.d/global_common.conf
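disk {
on-io-error detach;
disk-flushes no; # disable disk flushes; generally only safe with a battery-backed write cache
}
net {
max-buffers 8000; # raise the buffer count to 8000
max-epoch-size 8000;
sndbuf-size 0; # let the send buffer size be tuned automatically
}
syncer {
rate 10M; # my network is 100 Mbit, so 10M is enough
al-extents 257; # set the activity log to 257 extents
}
common {
protocol C; # keep protocol C: a write completes only after both the local and the remote disk have confirmed it; this is the safest and strictest mode
}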
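A note on the `rate` setting: it limits background resynchronization, not normal replication. The DRBD User's Guide suggests, as a rule of thumb, about 30% of the available replication bandwidth, where "available" is the slower of the network and the disk. The sketch below applies that rule to this setup; the function name `suggested_resync_rate` is hypothetical, and the disk figure is simply the standalone dd result from this post used as a rough stand-in.

```python
def suggested_resync_rate(net_mb_s: float, disk_mb_s: float,
                          fraction: float = 0.3) -> float:
    """Rule-of-thumb resync rate in MB/s: a fraction of the slowest component."""
    return fraction * min(net_mb_s, disk_mb_s)


if __name__ == "__main__":
    net = 100 / 8    # 100 Mbit/s link, expressed in MB/s
    disk = 52.8      # assumption: standalone dd throughput as a rough disk figure
    rate = suggested_resync_rate(net, disk)
    print(f"rule-of-thumb resync rate: {rate:.2f} MB/s")
```

By that rule the 10M used here is on the high side for a 100 Mbit link, which matters only while a resync is running alongside live traffic.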
Copy the file to the secondary as well:
[root@db163 ~]# scp /etc/drbd.d/global_common.conf db162:/etc/drbd.d/
global_common.conf 100% 2181 2.1KB/s 00:00
Apply the new configuration online on both nodes:
[root@db163 ~]# drbdadm adjust all
[root@db162 ~]# drbdadm adjust all
Test again:
[root@topdb ~]# sysbench --oltp-auto-inc=off --max-requests=0 --max-time=60 --num-threads=4 --test=oltp --db-driver=mysql --mysql-host=192.168.1.163 --mysql-port=3306 --mysql-user=root --mysql-password=123456 --mysql-db=mcldb --oltp-test-mode=complex run
sysbench 0.4.10: multi-threaded system evaluation benchmark
WARNING: Preparing of "BEGIN" is unsupported, using emulation
(last message repeated 3 times)
Running the test with following options:
Number of threads: 4
Doing OLTP test.
Running mixed OLTP test
Using Special distribution (12 iterations, 1 pct of values are returned in 75 pct cases)
Using "BEGIN" for starting transactions
Not using auto_inc on the id column
Threads started!
Time limit exceeded, exiting...
(last message repeated 3 times)
Done.
OLTP test statistics:
queries performed:
read: 16366
write: 5845
other: 2338
total: 24549
transactions: 1169 (19.45 per sec.)
deadlocks: 0 (0.00 per sec.)
read/write requests: 22211 (369.46 per sec.)
other operations: 2338 (38.89 per sec.)
Test execution summary:
total time: 60.1174s
total number of events: 1169
total time taken by event execution: 240.1222
per-request statistics:
min: 70.51ms
avg: 205.41ms
max: 685.97ms
approx. 95 percentile: 413.63ms
Threads fairness:
events (avg/stddev): 292.2500/5.45
execution time (avg/stddev): 60.0306/0.05
This time the TPS is 19.45, the same as in the standalone test, so drbd now has almost no impact on performance.
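The three runs can be put side by side with a few lines of arithmetic. This sketch uses only the totals and TPS values from the sysbench reports in this post; the run labels and the helper `loss_pct` are my own naming.

```python
# Totals copied from the three sysbench reports in this post.
RUNS = {
    "drbd, untuned": {"queries": 16695, "tps": 13.21},
    "standalone":    {"queries": 24585, "tps": 19.45},
    "drbd, tuned":   {"queries": 24549, "tps": 19.45},
}


def loss_pct(value: float, baseline: float) -> float:
    """Relative loss versus the baseline, in percent."""
    return (1 - value / baseline) * 100


if __name__ == "__main__":
    base = RUNS["standalone"]
    for name, run in RUNS.items():
        q_loss = loss_pct(run["queries"], base["queries"])
        t_loss = loss_pct(run["tps"], base["tps"])
        print(f"{name:14s} queries lost: {q_loss:5.2f}%   tps lost: {t_loss:5.2f}%")
```

The untuned cluster loses about 32% on both measures, while the tuned cluster is within a fraction of a percent of the standalone run.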
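5. dd test:
Database workloads consist mostly of small, random I/O, which does not exercise throughput, so I also tested write and transfer speed with large blocks: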
[root@db163 data]# dd if=/dev/zero of=/data/dd.test bs=1M count=4096
4096+0 records in
4096+0 records out
4294967296 bytes (4.3 GB) copied, 341.432 s, 12.6 MB/s
Here /data is my drbd disk. With drbd replicating over the network, the speed is 12.6 MB/s.
Now take the secondary node offline and test again:
[root@db162 ~]# crm node standby
[root@db163 data]# cat /proc/drbd
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 599f286440bd633d15d5ff985204aff4bccffadd build by phil@Build64R6, 2013-10-14 15:33:06
0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r-----
ns:10309484 nr:0 dw:10407952 dr:533640 al:1514 bm:18 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:65724
[root@db163 data]# dd if=/dev/zero of=/data/dd.test2 bs=1M count=4096
4096+0 records in
4096+0 records out
4294967296 bytes (4.3 GB) copied, 81.3911 s, 52.8 MB/s
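The speed is 52.8 MB/s. This is expected, because in standalone mode the dd writes go to the page cache. In the clustered state, 12.6 MB/s is already the full capacity of the 100 Mbit link (about 12.5 MB/s), so write speed is limited by bandwidth rather than by the disk. With a gigabit internal network in production, the network should not be a bottleneck and the primary's performance should be largely unaffected. The drbd module has also been part of the mainline Linux kernel since 2.6.33, so drbd performance really is acceptable.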
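The bandwidth argument is easy to check numerically. The sketch below converts link speed to MB/s and recomputes the two dd rates from the byte counts and durations printed above; dd reports decimal megabytes (10^6 bytes), and the helper names are mine.

```python
def link_mb_per_s(mbit_per_s: float) -> float:
    """Theoretical link capacity in MB/s (8 bits per byte, no protocol overhead)."""
    return mbit_per_s / 8


def dd_rate_mb_per_s(total_bytes: int, seconds: float) -> float:
    """Throughput as dd prints it: decimal megabytes per second."""
    return total_bytes / seconds / 1_000_000


if __name__ == "__main__":
    size = 4294967296  # bytes written by each dd run (bs=1M count=4096)
    replicated = dd_rate_mb_per_s(size, 341.432)
    standalone = dd_rate_mb_per_s(size, 81.3911)

    print(f"100 Mbit link:  {link_mb_per_s(100):6.1f} MB/s")
    print(f"1 Gbit link:    {link_mb_per_s(1000):6.1f} MB/s")
    print(f"dd replicated:  {replicated:6.1f} MB/s")
    print(f"dd standalone:  {standalone:6.1f} MB/s")
    print(f"link utilisation while replicating: {replicated / link_mb_per_s(100):.0%}")
```

The replicated dd run sits right at the 100 Mbit ceiling, while a gigabit link offers roughly 125 MB/s before overhead, more than the standalone dd figure measured here.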