Visit Our UNIX and Linux User Community


Filesystem mystery: disks are not busy on one machine, very busy on a similar box


 
Thread Tools Search this Thread
Top Forums UNIX for Advanced & Expert Users Filesystem mystery: disks are not busy on one machine, very busy on a similar box
# 1  
Old 08-12-2009
Filesystem mystery: disks are not busy on one machine, very busy on a similar box

Hi,
We have a filesystem mystery on our hands. Given:
  • 2 machines, A and Aa.
  • Machine Aa is the problem machine.
  • Machine A is running Ubuntu, kernel 2.6.22.9 #1 SMP Wed Feb 20 08:46:16 CST 2008 x86_64 GNU/Linux. Machine Aa is running RHEL5.3, kernel 2.6.18-128.el5 #1 SMP Wed Dec 17 11:41:38 EST 2008 x86_64 x86_64 x86_64 GNU/Linux.
  • Both are running the anticipatory i/o scheduler.
  • Both are running two software applications, x and y
  • x reads from the network and writes to disk.
  • y reads from x' file, filters, and writes annotated data about x
  • Both x and y perform their work in partition W. They are the only applications that open files in that partition. I have used lsof and also shut down the applications and immediately unmounted the partition (successfully), so I know those are the only two things with files open in there.

The problem:
  • On machine A, our disk utilization for the device whereupon partition W is mounted is very low. Interactive response is very good.
  • On machine Aa, our disk utilization is very high. Interactive response can sometimes lag; e.g., an ls of a directory in the busy partition can take 10-15 seconds to return.

So, how can we determine what causes the slowness on machine Aa?

Discussion:
Here are some example iostat readings, using
Code:
iostat -x -d <device> 1

First, machine A (the good machine):
Code:
Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
cciss/c1d0        0.00   123.76    0.00  891.09     0.00  8118.81     9.11     1.64    1.84   0.06   4.95

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
cciss/c1d0        0.00   108.91    0.00 1273.27     0.00 11065.35     8.69     1.81    1.42   0.07   8.91

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
cciss/c1d0        0.00  3905.94    0.00 2447.52     0.00 50835.64    20.77    11.80    4.82   0.04  10.89


Now machine Aa. Notice how the w/s values are so much smaller, yet %util is 100:
Code:
Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
cciss/c0d1        0.00   687.00  0.00 568.00     0.00  6144.00    10.82   144.54  292.86   1.76 100.10

Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
cciss/c0d1        0.00   291.00  2.00 355.00   256.00  9104.00    26.22   122.17  425.53   2.80 100.10

Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
cciss/c0d1        0.00  1239.00  1.00 603.00   128.00 15128.00    25.26    95.82  145.84   1.65  99.90

We are at a loss. What can we look for? I have changed the i/o scheduler around (to cpq, noop, deadline); this has made no difference to the stats of Aa. There is perhaps 60 Gig/day written to partition W. While an appreciable amount, the busier machine of the two seems to have no problem with it.

On machine Aa I have done
Code:
dd if=/dev/zero of=/file/in/partitionW bs=1024 count=20000000000

and get 189 MB/sec to partition W. If I try it to the regular root disk, I get 213 MB/sec. So it seems to be fairly fast.

Both machine A and machine Aa have an HP P400i RAID card. The card is set to cache writes, and the cache is 256 Meg Read and 256 Meg Write. Both machines' disks are configured in a RAID 5 hardware array. The devices, then, use HP's cciss driver and there is no software RAID or logical volumes built on top of them.

Machine Aa is an HP DL360 G6. Machine A is a G5 machine (older). But machine Aa has the problem...?

The HP Proliant utilities run on machine Aa show no issues with the drives. There are no orange warning lights on the front of the machine.

Is there anything else we can look at? Thanks.
-Mike Schwager

Last edited by mschwage; 08-12-2009 at 04:46 PM..
# 2  
Old 08-12-2009
Are the machines reading/writing the same directories?. Directory size can really affect performance of ls and other file operations.

Are the files mounted with NSF? If so are the mountpoints off the root directory / ?

IO request queue lengths are huge on the bad box as well.
# 3  
Old 08-13-2009
Quote:
Originally Posted by jim mcnamara
Are the machines reading/writing the same directories?. Directory size can really affect performance of ls and other file operations.

Are the files mounted with NSF? If so are the mountpoints off the root directory / ?

IO request queue lengths are huge on the bad box as well.
Thank for the reply. Here's my replies:
  • No they are not the same size. For the bad machine, the directories were 10 times as large. I have asked the developer to tune his application so the directory sizes are comparable.
  • No they are not mounted on NFS. They are on locally-mounted hardware RAID5 partition.
  • Yes, I expect queue lengths to be long since the machine is backed up writing to the disk.

Now the developer has reduced the number of files being written to. I have run these commands in the directories where the files live:
Code:
touch /tmp/outfile; sleep 10 ; find . -newer /tmp/outfile | wc -l

This will show me how many files have been written to in the last 10 seconds. On the good machine, I am writing to 3000 files in that time period. On the bad machine, I am writing to 125 files. This is an average, more or less.

Now I am still seeing this on the bad machine:
Code:
Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
cciss/c0d1        7.00   180.00 30.00 323.00  3712.00  4032.00    21.94    66.76  189.12   2.18  76.80

Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
cciss/c0d1        0.00  2004.00  7.00 171.00   896.00  8744.00    54.16    73.36  201.26   2.92  52.00

Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
cciss/c0d1        0.00  2445.00  2.00 237.00   256.00 24112.00   101.96   121.27  539.90   4.19 100.10

Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
cciss/c0d1        8.00   147.00 85.00 50.00 10760.00  7968.00   138.73     3.45  246.15   2.38  32.10

Note that at 100.10% utilization, we're doing only writes, and the "await" value is very high- 500 ms. or so. It seems that writes are killing us for some reason.
-Mike

---------- Post updated 08-13-09 at 12:56 PM ---------- Previous update was 08-12-09 at 02:52 PM ----------

OK, we figured it out: our partition is a RAID-5 controlled by an HP P400i RAID controller. We turned on the Drive Write Cache, but the Array Accelerator needs to be turned on for *each logical drive*. We only had it on for drive 1. Our problematic partition is logical drive 2. Once we turned it on, using
Code:
hpacucli ctrl slot=0 ld 2 modify arrayaccelerator=enable

we saw much much much better iostat results. Using the same iostat command as above, we now have:

Code:
Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
cciss/c0d1        0.00   173.20  0.00 77.40     0.00  2008.00    25.94     0.02    0.26   0.02   0.12

Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
cciss/c0d1        0.00   512.00  0.00 95.00     0.00  4856.00    51.12     0.02    0.22   0.03   0.30


Previous Thread | Next Thread
Test Your Knowledge in Computers #21
Difficulty: Easy
James Gosling developed the Java Programming Language.
True or False?

4 More Discussions You Might Find Interesting

1. Programming

using signals instead of busy wait

Hello, First time posting on this board hopefully someone will able to help me. I am looking for some examples of c programs which use a signal to notify the program to begin a certain action instead of having it continually loop to check to see if condition is attained(busy wait). Any... (5 Replies)
Discussion started by: jut26
5 Replies

2. HP-UX

Cannot umount - device busy

Anyone have any idea why I cannot umount a directory even though fuser shows no process attached to it? fuser -cu /data/oracle/GMPSHRDM/export /data/oracle/GMPSHRDM/export: umount /data/oracle/GMPSHRDM/export umount: cannot unmount /data/oracle/GMPSHRDM/export : Device busy umount:... (2 Replies)
Discussion started by: keelba
2 Replies

3. UNIX for Advanced & Expert Users

CD Drive is busy

one problem which seems simple but i've faced many times is when i've used cd & want to eject, however i'm not in /dev/cdrom/... in any of my Xterms but it gives the message that the the CD drive is busy. the only thing i could do is /etc/init.d/volmgt stop & eject the CD manually but it doesn't... (6 Replies)
Discussion started by: nikk
6 Replies

4. UNIX for Dummies Questions & Answers

umount, device busy, but..

I'm trying to unmount a file system, but umount says the device is busy. A fuser -c /myfs tells me that nothing on this fs is in use. Any idea? Oh, and btw, why isn't my signature beeing displayed? Show user's signature is enabled and I have entered one :/ (10 Replies)
Discussion started by: sTorm
10 Replies

Featured Tech Videos