It's already been two years since I had the chance to burn in and install an HP ProLiant DL320 G5p server. Even though I took notes back then, I never found the time until now, when suddenly a good reason came along... So I'll write a bit about the hardware, the installation of Debian Lenny with Xen paravirtualization, and the performance tests I ran back then. Someone might find the numbers useful as a reference.
Here are the HW parts:
1x SERV HP DL320G5p QC-3210, 2.13GHz/1333 2x1GB SAS/SATA, Rack
4x HP 2GB UB PC2-6400 1x2GB Kit (ML110G5, ML310G5, DL320G5p)
3x HDD HP 72GB DP 3.5" Hot Plug, SAS 15k
1x HDD HP 500GB 3,5" Hot Plug, SATA 7.2K
2x HP SC44Ge PCI-Ex HBA
1x HP DL320G5p iLO Port Opt Kit
1x HP DL1U 4 Drive Cage
1x HP HBA SAS-SATA 4x1LN Cable Kit
or in short - quad-core 2.13GHz Xeon CPU, 8GB RAM, 3x 72GB SAS + 500GB SATA HDD, with an iLO hardware remote console.
neo:~# lspci
00:00.0 Host bridge: Intel Corporation 3200/3210 Chipset DRAM Controller (rev 01)
00:01.0 PCI bridge: Intel Corporation 3200/3210 Chipset Host-Primary PCI Express Bridge (rev 01)
00:06.0 PCI bridge: Intel Corporation 3210 Chipset Host-Secondary PCI Express Bridge (rev 01)
00:1c.0 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express Port 1 (rev 02)
00:1c.2 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express Port 3 (rev 02)
00:1c.3 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express Port 4 (rev 02)
00:1c.4 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express Port 5 (rev 02)
00:1c.5 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express Port 6 (rev 02)
00:1d.0 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #2 (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #3 (rev 02)
00:1d.3 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #6 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #1 (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 92)
00:1f.0 ISA bridge: Intel Corporation 82801IR (ICH9R) LPC Interface Controller (rev 02)
00:1f.2 IDE interface: Intel Corporation 82801IR/IO/IH (ICH9R/DO/DH) 4 port SATA IDE Controller (rev 02)
00:1f.5 IDE interface: Intel Corporation 82801I (ICH9 Family) 2 port SATA IDE Controller (rev 02)
01:02.0 VGA compatible controller: ATI Technologies Inc ES1000 (rev 02)
01:04.0 System peripheral: Compaq Computer Corporation Integrated Lights Out Controller (rev 03)
01:04.2 System peripheral: Compaq Computer Corporation Integrated Lights Out Processor (rev 03)
01:04.4 USB Controller: Hewlett-Packard Company Proliant iLO2 virtual USB controller
01:04.6 IPMI SMIC interface: Hewlett-Packard Company Proliant iLO2 virtual UART
02:00.0 PCI bridge: Broadcom EPB PCI-Express to PCI-X Bridge (rev b5)
03:04.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5715 Gigabit Ethernet (rev a3)
03:04.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5715 Gigabit Ethernet (rev a3)
15:00.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068E PCI-Express Fusion-MPT SAS (rev 08)
Putting the hardware together was quite straightforward; only later did I realize I had to rearrange the cabling and put the SATA disk first and only then the SAS ones, when I found out that the write performance of the SATA disk was 11.3 MB/s through the SC44Ge PCI-Ex controller versus 31.6 MB/s through the on-board Intel ICH9 controller. So I left the three SAS disks connected to the SC44Ge and the single SATA disk connected to the mainboard.
To install the base Debian Lenny system I didn't have to do any special tricks; I just used the virtual CDROM and went through the installer.
neo:~# fdisk -l /dev/sda

Disk /dev/sda: 73.4 GB, 73407865856 bytes
255 heads, 63 sectors/track, 8924 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x000c51f0

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          31      248976   fd  Linux raid autodetect
/dev/sda2              32        8924    71433022+  fd  Linux raid autodetect

neo:~# fdisk -l /dev/sdb

Disk /dev/sdb: 73.4 GB, 73407865856 bytes
255 heads, 63 sectors/track, 8924 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x000da224

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1          31      248976   fd  Linux raid autodetect
/dev/sdb2              32        8924    71433022+  fd  Linux raid autodetect

neo:~# fdisk -l /dev/sdc

Disk /dev/sdc: 73.4 GB, 73407865856 bytes
255 heads, 63 sectors/track, 8924 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x000ee4f3

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1   *           1          31      248976   fd  Linux raid autodetect
/dev/sdc2              32        8924    71433022+  8e  Linux LVM

neo:~# fdisk -l /dev/sdd

Disk /dev/sdd: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00017cb0

   Device Boot      Start         End      Blocks   Id  System
/dev/sdd1               1       60801   488384001   8e  Linux LVM
The first partitions of the three SAS disks (sda, sdb, sdc) form a RAID1 array for /boot with the GRUB loader. The rest of the first two disks (sda, sdb) is another RAID1 array used as a PV for LVM, while the third SAS disk and the SATA disk (sdc, sdd) are stand-alone PVs. That gives three volume groups in total: vg00 and vg01 with a capacity of 68GB each, and vg02 with 465GB. Each has a different characteristic: vg00 is protected against a single disk failure, vg01 is a stand-alone fast 15k SAS disk, and vg02 is a stand-alone SATA disk with big capacity.
neo:~# pvdisplay
  --- Physical volume ---
  PV Name               /dev/md1
  VG Name               vg00
  PV Size               68.12 GB / not usable 2.69 MB
  Allocatable           yes
  PE Size (KByte)       4096
  Total PE              17439
  Free PE               2819
  Allocated PE          14620
  PV UUID               FiVmLS-7f3H-0S9x-7YjQ-bKnE-7M1t-E1aYIw

  --- Physical volume ---
  PV Name               /dev/sdc2
  VG Name               vg01
  PV Size               68.12 GB / not usable 2.81 MB
  Allocatable           yes
  PE Size (KByte)       4096
  Total PE              17439
  Free PE               11071
  Allocated PE          6368
  PV UUID               Jxx0zz-8hBR-jfUW-ZhaQ-QXCY-J9sb-qJwqoJ

  --- Physical volume ---
  PV Name               /dev/sdd1
  VG Name               vg02
  PV Size               465.76 GB / not usable 1.50 MB
  Allocatable           yes
  PE Size (KByte)       4096
  Total PE              119234
  Free PE               73858
  Allocated PE          45376
  PV UUID               wMdrbc-s7G0-rsHj-O5CS-6JH8-MJTf-lHCKcy
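For reference, a roughly equivalent layout could be built by hand like this. It's only a sketch - in my case the Debian installer created all of it, and the logical volume name at the end is made up for illustration:

# /boot mirrored across all three SAS disks, the data mirror across the first two
mdadm --create /dev/md0 --level=1 --raid-devices=3 /dev/sda1 /dev/sdb1 /dev/sdc1
mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sda2 /dev/sdb2

# one PV and one VG per storage class
pvcreate /dev/md1 /dev/sdc2 /dev/sdd1
vgcreate vg00 /dev/md1       # mirrored SAS, survives one disk failure
vgcreate vg01 /dev/sdc2      # single fast SAS 15k disk
vgcreate vg02 /dev/sdd1      # single big SATA disk

# logical volumes for the guests are then carved out as needed, e.g.
lvcreate -L 10G -n example-root vg01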
The hardware remote console (iLO) is an independent piece of hardware inside the server chassis, sharing only the power supply. Using this console the server can be powered on or off, and it also gives access to the "screen", the "keyboard" and a virtual USB CDROM of the machine, so the server can be reinstalled remotely from anywhere.
Why paravirtualization, even when the hardware was capable of full virtualization? Even though the guest systems have to run modified Xen domU kernels, paravirtualization brings the advantage that individual partitions (logical volumes) from the dom0 host system can be used directly in the guest domU systems. So there is no need to partition and set up LVM inside the domUs again. It is easy to shut a domU down, mount its partition in dom0 and do maintenance on it, for example a resize.
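For illustration, a domU config along these lines hands dom0 logical volumes straight to the guest as block devices. This is only a sketch - the guest name, LV names and memory size are made up, the kernel paths are the stock Lenny Xen kernel:

# /etc/xen/example.cfg - paravirtualized guest using dom0 LVs directly (names are illustrative)
kernel  = '/boot/vmlinuz-2.6.26-2-xen-686'
ramdisk = '/boot/initrd.img-2.6.26-2-xen-686'
vcpus   = 1
memory  = 1024
name    = 'example'
vif     = [ 'bridge=br0' ]
disk    = [ 'phy:/dev/vg01/example-root,xvda1,w',
            'phy:/dev/vg01/example-swap,xvda2,w' ]
root    = '/dev/xvda1 ro'

Maintenance from dom0 is then just "xm shutdown example", mounting /dev/vg01/example-root somewhere (or lvextend plus resize2fs for a resize), and "xm create example.cfg" again.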
I verified that the system can use the speed of the disks in the three volume groups independently by running the badblocks read/write check first on one volume group and then on all three at the same time, without seeing any significant difference. So this hardware with 4 cores has the potential to run 4 truly independent machines at once; with my set-up it is 3, as two of the disks are joined in RAID1.
testing with hdparm -t:

neo:~# hdparm -t /dev/sda
/dev/sda:
 Timing buffered disk reads:  352 MB in  3.00 seconds = 117.26 MB/sec
neo:~# hdparm -t /dev/sdb
/dev/sdb:
 Timing buffered disk reads:  352 MB in  3.00 seconds = 117.33 MB/sec
neo:~# hdparm -t /dev/sdc
/dev/sdc:
 Timing buffered disk reads:  354 MB in  3.01 seconds = 117.44 MB/sec
neo:~# hdparm -t /dev/sdd
/dev/sdd:
 Timing buffered disk reads:  322 MB in  3.01 seconds = 106.94 MB/sec
---
/dev/mapper/vg00-mirror--sas:
 Timing buffered disk reads:  352 MB in  3.00 seconds = 117.25 MB/sec
/dev/mapper/vg01-single--sas:
 Timing buffered disk reads:  354 MB in  3.01 seconds = 117.59 MB/sec
/dev/mapper/vg02-single--sata:
 Timing buffered disk reads:  324 MB in  3.02 seconds = 107.42 MB/sec
SAS RAID1 disk sync:
### sas disks sync
md1 : active raid1 sda2[0] sdb2[2]
      71432896 blocks [2/1] [U_]
      [>....................]  recovery =  2.9% (2091456/71432896) finish=10.4min speed=110076K/sec

----total-cpu-usage---- --dsk/sda-----dsk/sdb-----dsk/sdc-----dsk/sdd-- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read  writ: read  writ: read  writ: read  writ| recv  send|  in   out | int   csw
  0   0 100   0   0   0| 110M    0 :   0   110M:   0     0 : 102M    0 |2534B 3590B|   0     0 |8306   14k
  0   0 100   0   0   0| 109M    0 :   0   109M:   0     0 : 100M    0 | 384B 3476B|   0     0 |8217   13k
bad blocks check - only one disk running:

neo:~# time badblocks -s -w /dev/mapper/vg01-single--sas
Testing with pattern 0xaa: done
Reading and comparing: done
Testing with pattern 0x55: done
Reading and comparing: done
Testing with pattern 0xff: done
Reading and comparing: done
Testing with pattern 0x00: done
Reading and comparing: done

real    134m39.806s
user    3m0.267s
sys     0m16.969s
bad blocks check - all three VGs at once (output stripped, only the real time left):
neo:/mnt# time badblocks -s -w /dev/mapper/vg00-mirror--sas
real    135m0.663s

neo:~# time badblocks -s -w /dev/mapper/vg01-single--sas
real    135m4.589s

neo:~# time badblocks -s -w /dev/mapper/vg02-single--sata
real    255m34.293s
bad blocks check - all three VGs at once (output stripped, only the real time left), this time from inside the Xen virtual machines:
mirror:~# time badblocks -s -w /dev/sdb1
real    135m31.627s

sas:~# time badblocks -s -w /dev/sdb1
real    135m28.460s

sata:~# time badblocks -s -w /dev/sdb1
real    257m33.197s
Note that there is no difference in speed between a single disk and the same two disks in software RAID1, and practically no overhead from running the check inside the Xen domUs either.
read speeds SAS vs SATA with dd:

sas:~# time dd if=/dev/sdb1 of=/dev/null bs=1M count=10000 skip=10000
10485760000 bytes (10 GB) copied, 88.9769 s, 118 MB/s    # read
sata:~# time dd if=/dev/sdb1 of=/dev/null bs=1M count=10000 skip=10000
10485760000 bytes (10 GB) copied, 93.1479 s, 113 MB/s
write speeds SAS vs SATA with dd:

sas:~# time dd if=/dev/zero of=/dev/sdb1 bs=1M count=10000
10485760000 bytes (10 GB) copied, 87.8604 s, 119 MB/s    # through SC44Ge
sata:~# time dd if=/dev/zero of=/dev/sdb1 bs=1M count=10000
10485760000 bytes (10 GB) copied, 927.064 s, 11.3 MB/s

# through the on-board controller, later
sata:~# time dd if=/dev/zero of=/dev/sdb1 bs=1M count=10000
10485760000 bytes (10 GB) copied, 305.611 s, 34.3 MB/s
It turned out that the read speeds of the SAS and SATA disks were nearly equal, while the SATA write speed was roughly one third. Let's try to verify this against the badblocks run times. During a run there is the same amount of data to read and to write, so for the SAS disk it was 135min / 2 = 67.5min for each operation. If the SATA writes were 3x slower, the total time would be 67.5m + 3 x 67.5m = 270m, which is actually close to the real run time of 257min. I'm pretty sure some mathematician will beat me up for this proof, but...
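The same back-of-the-envelope estimate as a few lines of shell, using the numbers from the runs above (bash integer arithmetic, so the 67.5 gets rounded down):

sas_total=135                          # minutes for read + write on the SAS volume
per_op=$(( sas_total / 2 ))            # ~67 min of reading and ~67 min of writing
sata_est=$(( per_op + 3 * per_op ))    # reads at SAS speed, writes roughly 3x slower
echo "estimated SATA run: ~${sata_est} min, measured: ~257 min"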
The two Ethernet ports are configured in active-backup bonding mode, so both ports can be plugged into one or two switches while only one of them is communicating. When a switch goes down (power loss, port shut-down) the other port takes over. The bond0 virtual interface created from eth0 and eth1 is then put into a bridge, so that the virtual machines can get a public IP from the same pool as the physical interface. Here is the system configuration:
# apt-get install bridge-utils ifenslave-2.6
# echo bonding >> /etc/modules
# echo "alias bond0 bonding" >> /etc/modprobe.d/aliases
# echo "options bonding mode=active-backup miimon=100 max_bonds=1" >> /etc/modprobe.d/aliases
# vim /etc/network/interfaces
auto br0
iface br0 inet static
    address 62.40.64.245
    netmask 255.255.255.240
    broadcast 62.40.64.247
    gateway 62.40.64.241
    bridge_ports bond0
    bridge_fd 0
    bridge_stp off
    pre-up ifconfig bond0 up
    pre-up ifconfig eth0 up
    pre-up ifconfig eth1 up
    pre-up ifenslave bond0 eth0 eth1
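A quick way to sanity-check the failover is to watch the bonding and bridge state while pulling cables; just the commands, the output is omitted here:

# which slave is currently active, link status, failure counters
cat /proc/net/bonding/bond0

# bridge membership - bond0 plus the domU vif interfaces should show up here
brctl show br0

# switch the active slave by hand (-c = change active) to test the takeover
ifenslave -c bond0 eth1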
I've crimped a 1Gb cross-over Ethernet cable and used netperf and ifstat to measure the throughput between my laptop and the server.
ifstat (one way and then the other way):
        eth0
  Kbps in  Kbps out
 961336.7  13976.19
 961350.0  14070.64
 961197.4  14146.69
 961434.5  14138.59

        eth0
  Kbps in  Kbps out
 20822.81  958567.5
 20731.41  953624.3
 20842.09  959024.5
 20700.29  952988.1
netperf (both sides were running netserver and the netperf client at the same time):
sas:~# netperf -l 60 -H 62.40.64.241
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 62.40.64.241 (62.40.64.241) port 0 AF_INET : demo
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    60.02     936.60

ant:~# netperf -l 60 -H 62.40.64.242
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 62.40.64.242 (62.40.64.242) port 0 AF_INET : demo
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    60.01     107.55
And the CPU, 4x of these cores:
neo:~# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Xeon(R) CPU X3210 @ 2.13GHz
stepping        : 11
cpu MHz         : 2128.046
cache size      : 4096 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu de tsc msr pae cx8 apic mtrr cmov pat clflush acpi mmx fxsr sse sse2 ss ht constant_tsc pni ssse3
bogomips        : 4262.53
clflush size    : 64
power management:
The server has been running for over 2 years now, hosting a couple of virtual machines - some stand-alone with their own public IP, some private ones behind an nginx reverse proxy. So far so good! :-)
PS: it took me 3.5 hours to write this blog entry...