I'm interested in storage benchmarks for various configurations in order to figure out what's best for a virtualization environment. The virtualization platform will be Proxmox, which is currently my choice as the most manageable virtualization platform with plenty of features.
I want to look at the following configuration options, which may have an impact on performance:
thin provisioning
Thin provisioning is the method of offering virtually unlimited space while only providing physical space in the amount that is actually used. So you can define multiple TB of disk capacity and only have a 250 GB SSD at the backend. If that backend device fills up, you can add more storage when you need it. It's especially helpful with SSDs because they are still considerably more expensive, so you do not want to spend thousands of $ when you in fact do not need to. Furthermore, there are big differences between SSD products: SSDs for desktop use may be quite cheap, but SSDs for servers, which are written to heavily, are much more expensive.
normal consumer SSD: a 500 GB M.2 SSD starts at about 80 € (total lifetime write capacity: 300 TB = 600 full writes)
datacenter SSD: a 375 GB Intel Optane SSD DC P4800X PCIe costs about 1200 € (total lifetime write capacity: 20.5 PB = 57,000 full writes)
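As a rough sketch of how thin provisioning can be set up with LVM (the volume group name vg0 and the sizes here are just placeholders, not my actual setup):
Code:
# create a 200 GB thin pool inside an existing volume group
lvcreate --size 200G --thinpool tpool vg0
# create a 2 TB thin volume on top of it; blocks are only allocated when written
lvcreate --virtualsize 2T --thin --name vm-disk-1 vg0/tpool
# watch the actual usage of the pool and the thin volume
lvs -a vg0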
filesystem and lvm
Many filesystems have interesting features that are helpful beyond pure performance, as well as drawbacks one would rather avoid:
PRO: zfs and btrfs have checksums and self-healing against data corruption.
PRO: zfs and lvm provide methods for thin provisioning.
PRO: ext4 is easy to use, a simple fire-and-forget filesystem.
PRO: btrfs has enormous flexibility.
PRO: lvm has the flexibility to change configurations without downtime.
CON: ext3 has quite long filesystem check times.
transparent compression
Transparent compression is a layer which reduces the amount of data written to/read from the raw disk and thus may increase speed at the cost of CPU power.
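For example, enabling compression is a one-liner on both zfs and btrfs (pool, dataset and mount point names below are placeholders):
Code:
# zfs: enable lz4 compression on a dataset
zfs set compression=lz4 tank/vmdata
# btrfs: mount with transparent zstd compression
mount -o compress=zstd /dev/sdb /mnt/btrfs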
multi disk technology(technology, raidlevel)
There are different multi disk technologies available: Linux software RAID (md), LVM, btrfs RAID and zfs RAID. They combine the speed of multiple devices and add redundancy to be able to cope with device failures without data loss.
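To illustrate, two of these look roughly like this (device names and array/pool names are placeholders for my test disks):
Code:
# Linux software RAID: create a RAID-5 array from three disks
mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sdb /dev/sdc /dev/sdd
# zfs: create a RAIDZ pool over the same kind of layout
zpool create tank raidz /dev/sdb /dev/sdc /dev/sdd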
ssd caching
SSD caching can accelerate slower HDDs by putting frequently used data onto the fast SSD as a read cache, or by storing data to be written on the SSD first and syncing it to the slower hard disks in the background. Data security is not lost, because data written to the SSD is already persistent.
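One way to do this on Linux is bcache; a minimal sketch, assuming placeholder device names (I have not decided yet whether I'll use bcache or LVM cache for the later tests):
Code:
# create a backing device (slow HDD) and a caching device (SSD) and attach them
make-bcache -B /dev/sdb -C /dev/nvme0n1p1
# the combined device then shows up as /dev/bcache0; switch it to writeback caching
echo writeback > /sys/block/bcache0/bcache/cache_mode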
ceph - no option here
Ceph is a very interesting technology. I'm not considering it here, because the money needed to get it running with good performance is a lot higher than with plain disks and SSDs. You need at least 10 G networking, or better, which is a lot more costly than 1 G. You need fully equipped SSD storage, which is more expensive too. A big plus with Ceph is that you get redundant network storage, so you can immediately start virtual machines on other nodes if a compute node crashes. If money is no problem, and maximum performance is not required, Ceph would be an excellent choice. I have a 3-node cluster with Ceph up and running here. It works like a charm, administration is easy and performance is fine.
In the following posts, I'll describe my environment and the benchmarking scripts in more detail.
The hard disks are SAS and attached to the Adaptec RAID controller as single disks. One Intel SSD serves as the OS filesystem. The other one is attached via a PCIe M.2 SSD adapter. An additional M.2 SSD will be attached later for tests with SSD caching.
For the tests I will make use of fio - flexible I/O tester - one of the currently most popular storage benchmarking tools.
My production scenario will be webhosting, so it will be roughly 25% write and 75% read. I will probably test that later, after the basic read/write tests.
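A fio job for that mixed workload could look roughly like this (file name, queue depth and runtime are placeholders, not my final test parameters):
Code:
fio --name=webhosting-mix --filename=/dev/disk_test1 --direct=1 \
    --rw=randrw --rwmixread=75 --bs=4k --ioengine=libaio \
    --iodepth=16 --runtime=300 --time_based --group_reporting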
First I'm making sure the device names I use are fixed, so my tests will not overwrite the wrong disks. This may happen under Linux because there is no fixed device naming for storage devices; the ordering may be different at every reboot, and it actually is, as I have noticed.
So I check the serial numbers and map the device file names to unique names which I will use from then on.
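A quick way to do this (the symlink name is just an example of the kind of unique name used in the test scripts):
Code:
# show kernel names together with model and serial number
lsblk -o NAME,MODEL,SERIAL,SIZE
# the by-id names are stable across reboots
ls -l /dev/disk/by-id/
# e.g. give a test disk a fixed, unmistakable name
ln -s /dev/disk/by-id/ata-EXAMPLE_SSD_SERIAL123 /dev/disk_test1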
Regarding partitions: I try to avoid them and use whole disks instead, as it makes the procedure simpler.
Would love to see zfs test on that ratio.
Should shine with separate L2ARC devices on SSD, once the cache gets warm.
If you intend to benchmark zfs as well, be sure to limit the ARC size in production scenarios, leaving <insert size> free for large application allocations if required.
For KVM and ZFS inside VM(s), more tuning will be required... I would not recommend it inside virtual machines with additional layers on top of raw(s), qcow(s) or zvol(s).
Containers on the other hand work directly, so it should be interesting to see performance on LXC with a zpool configured with L2ARC and log devices.
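A sketch of both suggestions, assuming a pool called tank and placeholder SSD partitions:
Code:
# limit the ARC to e.g. 8 GiB (value in bytes, applied via the zfs kernel module options)
echo "options zfs zfs_arc_max=8589934592" > /etc/modprobe.d/zfs.conf
# add an L2ARC (read cache) device and a separate log (SLOG) device to the pool
zpool add tank cache /dev/nvme0n1p2
zpool add tank log /dev/nvme0n1p3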
AFAIK transparent compression with snapshots/clones etc. outside btrfs and zfs will be hard to find on Linux filesystems.
So I'm afraid it's XFS or EXT4 all the way, with LVM inside the hypervisors for flexibility.
Stripe it over those rust disks and explore LVM caching a bit (I have not used it, but it's there).
You will have everything but transparent compression @ your disposal.
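For reference, LVM caching would look roughly like this (volume group, LV and device names are placeholders):
Code:
# add the SSD to the existing volume group
vgextend vg0 /dev/nvme0n1
# create a cache pool on the SSD and attach it to the data LV holding the VM images
lvcreate --type cache-pool -L 100G -n cpool vg0 /dev/nvme0n1
lvconvert --type cache --cachepool vg0/cpool vg0/vmdata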
I got some advice from a person who wrote his thesis on the subject of benchmarking:
benchmark with applications that resemble the applications that will be used later.
test with concurrent I/O requests (if that's your scenario, and it almost always is).
test with small block sizes, as this will be the realistic workload for the storage in my case.
test with virtual machines and over the network, so it resembles the I/O when the system is used in production.
I'm already testing with small block sizes. Concurrent jobs testing is running at the moment. Real-world testing will be done at some later point.
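Concurrency in fio is controlled mainly by numjobs and iodepth; a sketch of the kind of job that is running right now (the values are placeholders):
Code:
fio --name=concurrent-randread --filename=/dev/disk_test1 --direct=1 \
    --rw=randread --bs=4k --ioengine=libaio \
    --numjobs=8 --iodepth=32 --runtime=300 --time_based --group_reporting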
--- Post updated at 04:53 PM ---
1. Performance baseline of the system
This benchmark is to demonstrate the actual speed of the system used. It's in no way relevant for the later workload; it's just to make sure the storage system is generally performing without major trouble. A sketch of the fio calls behind these runs follows after the result list below.
Interesting: RAID-5 has slower write speeds than expected. zfs RAIDZ, which is similar in its data distribution, is considerably faster.
Single-Disk, Sequential Read, Single Threaded, Test with 1M Block-Size. Bandwidth in KB/s
Single-Disk, Sequential Write, Single Threaded, Test with 1M Block-Size. Bandwidth in KB/s
Single-Disk, Sequential Read, Single Threaded, Test with 4K Block-Size. IOPS
Single-Disk, Sequential Write, Single Threaded, Test with 4K Block-Size. IOPS
Multi-Disk, Sequential Read, Single Threaded, Test with 1M Block-Size. Bandwidth in KB/s
Multi-Disk, Sequential Write, Single Threaded, Test with 1M Block-Size. Bandwidth in KB/s
Multi-Disk, Sequential Read, Single Threaded, Test with 4K Block-Size. IOPS
Multi-Disk, Sequential Write, Single Threaded, Test with 4K Block-Size. IOPS
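The baseline runs above boil down to single-threaded sequential fio jobs of this shape (the exact parameters here are an approximation, not the full script):
Code:
# 1M sequential read, bandwidth oriented
fio --name=seq-read-1M --filename=/dev/disk_test1 --direct=1 \
    --rw=read --bs=1M --ioengine=libaio --iodepth=1 --numjobs=1 --runtime=60 --time_based
# 4K sequential write, IOPS oriented
fio --name=seq-write-4k --filename=/dev/disk_test1 --direct=1 \
    --rw=write --bs=4k --ioengine=libaio --iodepth=1 --numjobs=1 --runtime=60 --time_based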
--- Post updated at 04:56 PM ---
Originally Posted by hicksd8
So I reckon that RAID3 will be slightly better than RAID5 (unless you're going to use RAID10 with a large number of members).
Why do you think that? I do not understand why RAID3 would perform better. As I understand it, it should be nearly the same; only the parity goes to one dedicated disk, and that disk is used heavily.
EDIT: Ahh, I possibly understand. RAID3 uses byte-level striping instead of the block-level striping of RAID4/5, which may be calculated faster? Unfortunately Linux software RAID does not support RAID3.
--- Post updated at 05:25 PM ---
2. First insight: LVM does not seem to impact read/write throughput or IOPS performance
Check the numbers by comparing the neighboring rows with and without LVM that otherwise have the same specs. The numbers in those pairs differ only very little. I'll still run and watch the LVM performance readings, but I'll not report them any more unless there is something worth mentioning.
Single-Disk, Random Read, Single Threaded, Test with 4K Block-Size. IOPS
Single-Disk, Random Write, Single Threaded, Test with 4K Block-Size. IOPS
Multi-Disk, Random Read, Single Threaded, Test with 4K Block-Size. IOPS
Multi-Disk, Random Write, Single Threaded, Test with 4K Block-Size. IOPS
Multi-Disk, Random Read, Single Threaded, Test with 4K Block-Size. Bandwidth in KB/s
Multi-Disk, Random Write, Single Threaded, Test with 4K Block-Size. Bandwidth in KB/s