Raid0 array stresses only 1 disk out of 3


 
# 1  
Old 04-11-2016
Raid0 array stresses only 1 disk out of 3

Hi there,

I've set up a RAID0 array of 3 identical disks using:
Code:
mdadm --create --verbose /dev/md0 --level=stripe --raid-devices=3 /dev/sdb1 /dev/sdc1 /dev/sdd1
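
For reference, the resulting layout (RAID level, chunk size, member order) can be double-checked with something like:
Code:
mdadm --detail /dev/md0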

I'm using dstat to monitor the disk activity:
Code:
dstat --epoch -D sdb,sdc,sdd --disk-util 30

The results show that the load is not evenly striped across the 3 disks:
Code:
2016-04-11 09:35:30 |   26%   28%   27%
[...]
2016-04-11 10:15:00 |    0%  100%    0%
2016-04-11 10:15:30 |    0%    3%   97%
2016-04-11 10:16:00 |    0%    0%   81%
2016-04-11 10:16:30 |    0%    0%  100%
2016-04-11 10:17:00 |    0%    0%   30%
[...]
2016-04-11 11:28:30 |    0%    0%   55%
2016-04-11 11:29:00 |    0%    0%   49%
2016-04-11 11:29:30 |    0%    0%   31%
2016-04-11 11:30:00 |    0%    0%   73%
2016-04-11 11:30:30 |    0%    0%    4%
2016-04-11 11:31:00 |    0%    0%   99%
[...]
2016-04-11 11:32:00 |    0%    0%   81%
2016-04-11 11:32:30 |    0%    0%   43%
[...]
2016-04-11 11:43:30 |    0%   93%    0%
2016-04-11 11:44:00 |    0%  100%    0%
2016-04-11 11:44:30 |    0%   97%    0%
2016-04-11 11:45:00 |    0%  100%    0%
2016-04-11 11:45:30 |    0%   10%    0%
[...]
2016-04-11 11:51:30 |    0%   79%    0%
2016-04-11 11:52:00 |    0%  100%    0%
2016-04-11 11:52:30 |    1%    9%    1%
2016-04-11 11:53:00 |    0%  100%    0%
2016-04-11 11:53:30 |    0%   98%    0%
2016-04-11 11:54:00 |    0%   30%    0%
2016-04-11 11:54:30 |    1%    1%    1%
2016-04-11 11:55:00 |    2%    3%    2%
[...]
2016-04-11 12:07:30 |    0%   68%    1%
2016-04-11 12:08:00 |    0%  100%    0%
2016-04-11 12:08:30 |    0%  100%    0%
2016-04-11 12:09:00 |    0%   38%    0%
[...]
2016-04-11 12:23:00 |    0%   84%    1%
2016-04-11 12:23:30 |    0%   58%    0%
[...]
2016-04-11 14:17:00 |    0%   43%    0%
2016-04-11 14:17:30 |    0%   99%    0%
2016-04-11 14:18:00 |    0%  100%    0%
2016-04-11 14:18:30 |    1%    6%    1%
[...]
2016-04-11 14:46:30 |    2%    2%    1%
[...]
2016-04-11 14:48:00 |    1%    9%    1%
2016-04-11 14:48:30 |    0%  100%    0%
2016-04-11 14:49:00 |    0%   96%    0%
2016-04-11 14:49:30 |    0%  100%    0%
2016-04-11 14:50:00 |    0%   99%    0%
2016-04-11 14:50:30 |    0%  100%    0%
2016-04-11 14:51:00 |    0%   41%    0%
2016-04-11 14:51:30 |    0%  100%    0%
2016-04-11 14:52:00 |    2%   18%    2%
[...]
2016-04-11 15:23:30 |    3%    5%    3%
[...]
2016-04-12 09:25:30 |    4%    3%    3%

Do you have an explanation?
Thanks for your help.

Santiago

OS: Debian Wheezy 7.4
Disks: ATA Hitachi HUA72302, 2000GB

# 2  
Old 04-11-2016
Hi,

This could be a number of things, but it will most likely revolve around the stripe size.

Regards

Gull04
# 3  
Old 04-12-2016
Hi Gull04,

Thank you for your answer.
Is "stripe size" the same as "chunk size"?

Apparently, mine is 512k:
Code:
cat /proc/mdstat

returns
Code:
Personalities : [raid0]
md0 : active raid0 sdd1[2] sdc1[1] sdb1[0]
      5860543488 blocks super 1.2 512k chunks

unused devices: <none>

How can I identify if this is the source of the problem?

Regards
Santiago
# 4  
Old 04-12-2016
RAID0 "stripes" the data across the three actuators you have, and the stripe size (that's official RAID speak) is the minimum allocation unit. So if the stripe is 2k, the first 2k bytes of a file are written to the first drive, the next 2k to the second drive, and the third 2k to the third drive. It then goes back to the first drive, and so on.
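
As a rough illustration of that mapping (just a sketch: it assumes a plain 3-disk stripe with the 512k chunk size shown above and ignores the md superblock/data offset), you can work out which member a given byte offset of the array lands on:
Code:
#!/bin/bash
# Illustration only: map a byte offset on a 3-disk RAID0 with 512 KiB chunks
# to the index of the member disk that holds it.
CHUNK=$((512 * 1024))   # chunk (stripe unit) size in bytes
NDISKS=3                # number of member disks
for OFFSET in 0 $((600 * 1024)) $((2 * 1024 * 1024)); do
    echo "byte offset $OFFSET -> member disk index $(( (OFFSET / CHUNK) % NDISKS ))"
done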

So it's not difficult to see that writing lots of small files will give unpredictable results, especially if they're less than 2k each. Also, read requests can only be satisfied by reading the drive(s) where the files were written.

So your results are misleading.

If you have a desire to test this then you need to do something like the sketch below.
Create a 4GB file on (ideally) an internal drive that is not part of this RAID0 array. Kick all the users off if you can, then copy this 4GB file to the RAID filesystem and take your measurements whilst that's going on. It won't be precise but it should give you a better set of figures.
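
For instance (the paths are only placeholders for your setup: /mnt/raid0 stands for wherever the md0 filesystem is mounted, and /root/raidtest.bin is just an example name):
Code:
# build a 4GB test file on a disk that is NOT part of the array
dd if=/dev/urandom of=/root/raidtest.bin bs=1M count=4096
# copy it onto the RAID0 filesystem and watch per-disk utilisation meanwhile
cp /root/raidtest.bin /mnt/raid0/ &
dstat --epoch -D sdb,sdc,sdd --disk-util 5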
# 5  
Old 04-12-2016
Quote:
Originally Posted by hicksd8
Create a 4GB file on (ideally) an internal drive not part of this RAID0 array.
Wouldn't it be sufficient to fire 4GB worth of any data (for instance some brand new hexadecimal zeroes freshly out of /dev/zero) at it with dd? Something like:

Code:
dd if=/dev/zero of=/the/raid/somefile bs=1G count=4

True, this will be off by the overhead of reading /dev/zero, but wouldn't that be negligible given the bandwidth of the disks compared to the memory interface (which are some orders of magnitude apart)?
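
One caveat, hedging a bit here: dd writes through the page cache by default, so with plenty of free RAM the per-disk figures can lag well behind the command itself. If GNU dd is available, a variant such as
Code:
dd if=/dev/zero of=/the/raid/somefile bs=1M count=4096 oflag=direct
(or the same command with conv=fdatasync instead) keeps the measurement closer to what the member disks actually see.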

I hope this helps.

bakunin
# 6  
Old 04-12-2016
@bakunin... point taken... good idea.

# 7  
Old 04-14-2016
Hi guys,

Thank you very much for your contributions.

First of all, my problem does not happen any more. I created the raid with sdb, sdc and sdd on April 11 at 09:35.
Until 11:32, sdd was very busy, then until 14:51, sdc was very busy.
Since then (3 days), all 3 disks have been under the same moderate load (0-20%). The server is used by 5 graphic designers manipulating quite large files (100M-2G).

I ran some tests and the results leave me quite puzzled. I created 10 files of 1GB each simultaneously, but all the load went to sda, leaving sdb, sdc and sdd with a moderate load of about 20%.

The command:
Code:
# start 10 parallel 1GB writes onto the /galaxy filesystem
for i in {1..10}; do
  file=$(mktemp /galaxy/XXXXXXX)               # temporary target file on /galaxy
  echo "$file" >> /galaxy/dd.files             # record the file name
  dd if=/dev/zero of="$file" bs=1G count=1 &   # 1GB write in the background (bs=1G buffers 1GB in RAM per dd)
  echo $!     >> /galaxy/dd.pids               # record the background PID
done

The output of dstat:
Code:
----system---- sda--sdb--sdc--sdd-
     time     |util:util:util:util
14-04 15:56:30|  21:   0:   0:   0
14-04 15:57:00| 100:   0:   0:   0
14-04 15:57:30| 101:   0:   0:   0
14-04 15:58:00| 100:   2:   2:   1
14-04 15:58:30| 101:   3:   4:   2
14-04 15:59:00| 102:   4:   5:   4
14-04 15:59:30|  98:   2:   3:   2
14-04 16:00:00| 100:   4:   4:   2
14-04 16:00:30| 103:  16:  16:  15
14-04 16:01:00|  98:  16:  17:  15
14-04 16:01:30| 101:  15:  15:  15
14-04 16:02:00|  99:   9:   8:   8
14-04 16:02:30| 100:   3:   4:   3
14-04 16:03:00| 100:   2:   4:   3
14-04 16:03:30| 104:   4:   4:   3
14-04 16:04:00|  95:   4:   4:   3
14-04 16:04:30| 100:   3:   4:   2
14-04 16:05:00| 101:   3:   4:   3
14-04 16:05:30|  99:  12:  13:  12
14-04 16:06:00| 102:  20:  22:  18
14-04 16:06:30|  98:  17:  19:  18
14-04 16:07:00| 101:   7:   9:   8
14-04 16:07:30|  99:   4:   5:   3
14-04 16:08:00| 102:   4:   5:   3
14-04 16:08:30|  98:   3:   5:   3
14-04 16:09:00| 100:   5:   7:   5
14-04 16:09:30| 101:   5:   5:   4
14-04 16:10:00| 100:   4:   4:   2
14-04 16:10:30| 100:  17:  18:  16
14-04 16:11:01| 105:  16:  20:  16
14-04 16:11:30|  95:  15:  17:  17
14-04 16:12:00| 100:  12:  11:  10
14-04 16:12:30|  34:  15:  16:  14

Is /dev/zero an actual file on sda?
How do you interpret the results?
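
One thing I still plan to double-check (just an idea, not something I have verified yet) is whether /galaxy really sits on the md0 array rather than on sda, for example with:
Code:
df -h /galaxy
grep galaxy /proc/mounts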

Regards
Santiago
 