FAULTY DISK replacement HP rx4640


 
# 1  
Old 05-20-2012

Hello,

I'm new to this forum and, as you will see from my question, I'm new to UNIX as well.
One of our customers has an HP rx4640 running UNIX with two mirrored, hot-swappable 300GB disks. They reported that one of the disks is faulty and want us to take care of it. Below is the only log they sent us.

Code:
Fri May 18 17:50:11 2012    STCHK 122 sd_procchk sd_procchk 1 Logical volume 
    /dev/vg00/lvol1 is mirrored but has some stale blocks. Data loss on 
    hardware failure could occur. 
Fri May 18 17:50:12 2012    STCHK 122 sd_procchk sd_procchk 1 Logical volume 
    /dev/vg00/lvol3 is mirrored but has some stale blocks. Data loss on 
    hardware failure could occur. 
Fri May 18 17:50:12 2012    STCHK 122 sd_procchk sd_procchk 1 Logical volume 
    /dev/vg00/lvol4 is mirrored but has some stale blocks. Data loss on 
    hardware failure could occur. 
Fri May 18 17:50:12 2012    STCHK 122 sd_procchk sd_procchk 1 Logical volume 
    /dev/vg00/lvol5 is mirrored but has some stale blocks. Data loss on 
    hardware failure could occur. 
Fri May 18 17:50:12 2012    STCHK 122 sd_procchk sd_procchk 1 Logical volume 
    /dev/vg00/lvol6 is mirrored but has some stale blocks. Data loss on 
    hardware failure could occur. 
Fri May 18 17:50:12 2012    STCHK 122 sd_procchk sd_procchk 1 Logical volume 
    /dev/vg00/lvol7 is mirrored but has some stale blocks. Data loss on 
    hardware failure could occur. 
Fri May 18 17:50:12 2012    STCHK 122 sd_procchk sd_procchk 1 Logical volume 
    /dev/vg00/lvol8 is mirrored but has some stale blocks. Data loss on 
    hardware failure could occur. 
Fri May 18 17:50:12 2012    STCHK 122 sd_procchk sd_procchk 1 Logical volume 
    /dev/vg00/SwapVol2 is mirrored but has some stale blocks. Data loss on 
    hardware failure could occur. 
Fri May 18 17:56:06 2012    STCHK 122 sd_procchk sd_procchk 1 Logical volume 
    /dev/vg00/lvol1 is mirrored but has some stale blocks. Data loss on 
    hardware failure could occur. 
Fri May 18 17:56:06 2012    STCHK 122 sd_procchk sd_procchk 1 Logical volume 
    /dev/vg00/lvol3 is mirrored but has some stale blocks. Data loss on 
    hardware failure could occur. 
Fri May 18 17:56:06 2012    STCHK 122 sd_procchk sd_procchk 1 Logical volume 
    /dev/vg00/lvol4 is mirrored but has some stale blocks. Data loss on 
    hardware failure could occur. 
Fri May 18 17:56:06 2012    STCHK 122 sd_procchk sd_procchk 1 Logical volume 
    /dev/vg00/lvol5 is mirrored but has some stale blocks. Data loss on 
    hardware failure could occur. 
Fri May 18 17:56:06 2012    STCHK 122 sd_procchk sd_procchk 1 Logical volume 
    /dev/vg00/lvol6 is mirrored but has some stale blocks. Data loss on 
    hardware failure could occur. 
Fri May 18 17:56:06 2012    STCHK 122 sd_procchk sd_procchk 1 Logical volume 
    /dev/vg00/lvol7 is mirrored but has some stale blocks. Data loss on 
    hardware failure could occur. 
Fri May 18 17:56:06 2012    STCHK 122 sd_procchk sd_procchk 1 Logical volume 
    /dev/vg00/lvol8 is mirrored but has some stale blocks. Data loss on 
    hardware failure could occur. 
Fri May 18 17:56:06 2012    STCHK 122 sd_procchk sd_procchk 1 Logical volume 
    /dev/vg00/SwapVol2 is mirrored but has some stale blocks. Data loss on 
    hardware failure could occur. 






Code:
# pvdisplay -v /dev/disk/disk13_p2 | grep stale
   00000 stale    /dev/vg00/lvol1         00000 
   00089 stale    /dev/vg00/lvol3         00000 
   00090 stale    /dev/vg00/lvol3         00001 
   00094 stale    /dev/vg00/lvol3         00005 
   00096 stale    /dev/vg00/lvol3         00007 
   00121 stale    /dev/vg00/lvol4         00000 
   00122 stale    /dev/vg00/lvol5         00000 
   00171 stale    /dev/vg00/lvol5         00049 
   00176 stale    /dev/vg00/lvol5         00054 
   00177 stale    /dev/vg00/lvol5         00055 
   00183 stale    /dev/vg00/lvol5         00061 
   00184 stale    /dev/vg00/lvol5         00062 
   00186 stale    /dev/vg00/lvol5         00064 
   00215 stale    /dev/vg00/lvol5         00093 
   00219 stale    /dev/vg00/lvol5         00097 
   00221 stale    /dev/vg00/lvol5         00099 
   00237 stale    /dev/vg00/lvol5         00115 
   00242 stale    /dev/vg00/lvol5         00120 
   00279 stale    /dev/vg00/lvol6         00000 
   00296 stale    /dev/vg00/lvol7         00000 
   00298 stale    /dev/vg00/lvol7         00002 
   00299 stale    /dev/vg00/lvol7         00003 
   00306 stale    /dev/vg00/lvol7         00010 
   00309 stale    /dev/vg00/lvol7         00013 
   00314 stale    /dev/vg00/lvol7         00018 
   00318 stale    /dev/vg00/lvol7         00022 
   00326 stale    /dev/vg00/lvol7         00030 
   00327 stale    /dev/vg00/lvol7         00031 
   00337 stale    /dev/vg00/lvol7         00041 
   00338 stale    /dev/vg00/lvol7         00042 
   00340 stale    /dev/vg00/lvol7         00044 
   00344 stale    /dev/vg00/lvol7         00048 
   00415 stale    /dev/vg00/lvol8         00000 
   00416 stale    /dev/vg00/lvol8         00001 
   00417 stale    /dev/vg00/lvol8         00002 
   00422 stale    /dev/vg00/lvol8         00007 
   00429 stale    /dev/vg00/lvol8         00014 
   00434 stale    /dev/vg00/lvol8         00019 
   00437 stale    /dev/vg00/lvol8         00022 
   00438 stale    /dev/vg00/lvol8         00023 
   00439 stale    /dev/vg00/lvol8         00024 
   00441 stale    /dev/vg00/lvol8         00026 
   00445 stale    /dev/vg00/lvol8         00030 
   00446 stale    /dev/vg00/lvol8         00031 
   00447 stale    /dev/vg00/lvol8         00032 
   00448 stale    /dev/vg00/lvol8         00033 
   00449 stale    /dev/vg00/lvol8         00034 
   00459 stale    /dev/vg00/lvol8         00044 
   00460 stale    /dev/vg00/lvol8         00045 
   00461 stale    /dev/vg00/lvol8         00046 
   00462 stale    /dev/vg00/lvol8         00047 
   00497 stale    /dev/vg00/SwapVol2      00000
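
A listing this long is easier to read as a per-volume count. The following is a minimal sketch, assuming the `pvdisplay -v` line layout shown above (extent number, status, LV name, logical extent); `count_stale` is a made-up helper name, not an HP-UX command.

```shell
#!/bin/sh
# Count stale physical extents per logical volume.
# Expects `pvdisplay -v <pv>` text on stdin; lines of interest look like:
#    00089 stale    /dev/vg00/lvol3         00001
count_stale() {
    awk '$2 == "stale" { n[$3]++ }
         END { for (lv in n) print lv, n[lv] }' | sort
}

# Demo on a few lines copied from the output above.
# On the live system one would pipe the real command instead, e.g.
#   pvdisplay -v /dev/disk/disk13_p2 | count_stale
count_stale <<'EOF'
   00000 stale    /dev/vg00/lvol1         00000
   00089 stale    /dev/vg00/lvol3         00000
   00090 stale    /dev/vg00/lvol3         00001
   00121 stale    /dev/vg00/lvol4         00000
EOF
```

The counts only summarize what `pvdisplay` already reports; they do not say why the extents went stale.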

With my limited knowledge of UNIX, I assumed from this that the disk ID is 13. If so, how do I find which of the two physical disks should be replaced?
And once I identify the problematic disk, are the steps below correct?

1) Check that the disk is not in the root volume group with the lvlnboot -v command
2) Continue with the disk replacement:
Code:
# pvchange -a N /dev/dsk/- 
# <replace the hot-swappable disk> 
# vgcfgrestore -n vg01 /dev/rdsk/-
# vgchange -a y vg01
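
One way to review a sequence like this before touching the system is a dry run that only prints the commands. The sketch below does just that for the four steps quoted above. `print_replace_plan` and its arguments are hypothetical names invented for illustration, and the device names in the demo call are placeholders. It runs nothing, and a mirrored boot disk may need further steps that are not shown here.

```shell
#!/bin/sh
# Dry run only: prints the steps, executes nothing.
# print_replace_plan is a hypothetical helper, not an HP-UX command.
#   $1 = volume group, $2 = block device file, $3 = raw device file
print_replace_plan() {
    vg=$1; blk=$2; raw=$3
    echo "pvchange -a N $blk"
    echo "# <replace the hot-swappable disk>"
    echo "vgcfgrestore -n $vg $raw"
    echo "vgchange -a y $vg"
}

# Placeholder names; substitute the real VG and device files.
print_replace_plan vgXX /dev/dsk/cXtYdZ /dev/rdsk/cXtYdZ
```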

If I'm way off, please let me know: I got all this from "When Good Disks Go Bad" and, as I mentioned, I have very little experience with UNIX.

Any help is appreciated.
Thanks Gjk

Last edited by Scrutinizer; 05-20-2012 at 05:18 PM.. Reason: code tags
# 2  
Old 05-20-2012
I would not touch this if I were you. Ask an experienced HP-UX admin in your company to have a look...
# 3  
Old 05-21-2012
That was my first thought too: not to touch it. But our UNIX admin is not available at the moment, and the customer is expecting a solution tonight. If there is a straightforward procedure, at least for identifying which physical disk it is, please help.
What I have so far is this command:
Code:
 # ioscan -m lun /dev/disk/disk13

Thanks again,
Gjk

Last edited by Scott; 05-23-2012 at 05:40 AM.. Reason: Code tags
# 4  
Old 05-21-2012
I don't know if I can help, because I have VERY little experience with RX boxes. I did manage to find one for home, but someone removed the disks before I got hold of it...
You did not say (or I'm blind...) your OS version!
vg00 is usually the OS volume group (and so the root/boot disks...), even more so when you see no lvol2 in the list, since that is the swap...
Before going further, I would suggest ordering 2 new disks. You may not need them, or perhaps only one, but if things go wrong you may be glad to have 2 with you (believe my experience...).
Once you have them, try to make a recovery tape or a "Golden Image" somehow. If the system manages that, you may be on the way to solving your issue online; if not, try to find the last bootable backup of the system (you may need it...).
Let's say you managed a make_recovery: you can now try stm or xstm (X GUI) and see what the tool diagnoses about your disks.

Then tell us what you found, and we will do a bit of brainstorming...

Good luck!
# 5  
Old 05-21-2012
Thank you for your help, VBE. You are not blind; I just don't know the OS version. I have very limited access to the customer site, so only tonight will I be able to find out the OS version and more about the disk. What I have confirmed is that it is a boot disk. From what the customer tells me, the system is connected to an MSL6030 tape library and they back up the configuration every day. Unfortunately I'm not familiar with the diagnostic tool, but I'll try my best to get as much info as possible and post it.
Thank you very, very much.
# 6  
Old 05-22-2012
Type
Code:
 vgdisplay -v vg00 | grep dsk

to see what disks are in vg00
then again
Code:
vgdisplay -v vg00

And look at the last stanza. You should see something like:
Code:
   --- Physical volumes ---
   PV Name                     /dev/dsk/c2t2d0
   PV Status                   available                
   Total PE                    4340    
   Free PE                     357     
   Autoswitch                  On        
   Proactive Polling           On               

   PV Name                     /dev/dsk/c1t2d0
   PV Status                   available                
   Total PE                    4340    
   Free PE                     357     
   Autoswitch                  On        
   Proactive Polling           On

Hopefully it will tell you which disk is failing... or already dead, e.g.:
Code:
# ioscan -funC disk

Class  I  H/W Path  Driver  S/W State  H/W Type  Description
===================================================================
disk   0  16/5.2.0  sdisk   CLAIMED    DEVICE    TOSHIBA CD-ROM XM-5401TA
                    /dev/cdrom  /dev/dsk/c1t2d0  /dev/rdsk/c1t2d0
disk   1  16/5.5.0  sdisk   CLAIMED    DEVICE    SEAGATE ST39173N
                    /dev/dsk/c1t5d0  /dev/rdsk/c1t5d0
disk   2  16/5.6.0  sdisk   NO_HW      DEVICE    SEAGATE ST39173N  =>  No Hardware...
                    /dev/dsk/c1t6d0  /dev/rdsk/c1t6d0

Also, look at what you have in your /var/adm/syslog/syslog.log! You may find EM- critical messages...
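
For the ioscan check above, picking out the NO_HW entries by eye gets tedious on a box with many disks. Here is a rough sketch, assuming the `ioscan -funC disk` layout shown in the example (field 5 of each "disk" line is the S/W State, followed by one or more lines of device files). `no_hw_disks` is a made-up helper name.

```shell
#!/bin/sh
# List disks that `ioscan -funC disk` reports as NO_HW, with their device files.
# Reads ioscan text on stdin. Field 5 of a "disk" line is the S/W State.
no_hw_disks() {
    awk '$1 == "disk"           { bad = ($5 == "NO_HW"); if (bad) print "NO_HW at", $3; next }
         bad && $1 ~ /^\/dev\// { for (i = 1; i <= NF; i++) print "  " $i }'
}

# Demo on lines from the example above; on the live system:
#   ioscan -funC disk | no_hw_disks
no_hw_disks <<'EOF'
disk 1 16/5.5.0 sdisk CLAIMED DEVICE SEAGATE ST39173N
/dev/dsk/c1t5d0 /dev/rdsk/c1t5d0
disk 2 16/5.6.0 sdisk NO_HW DEVICE SEAGATE ST39173N
/dev/dsk/c1t6d0 /dev/rdsk/c1t6d0
EOF
```

The device files it prints are the names to compare against the PV names from vgdisplay.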
# 7  
Old 05-23-2012
Hello,

I was able last night to gather some more info about the OS and the condition of the disk. I think it needs to be replaced. As you will see below, it is the alternate disk and not the primary. I'm trying to put together a replacement procedure from "When Good Disks Go Bad", but I have some doubts about it. Any help would be really appreciated.

Code:
# uname -a
HP-UX - B.11.31 U ia64 2801820572 unlimited-user license

# model
ia64 hp server rx4640

# ioscan -funC disk
Class     I  H/W Path       Driver S/W State   H/W Type     Description
=======================================================================
disk      5  0/0/3/0.0.0.0  sdisk   CLAIMED     DEVICE       TEAC    DV-28E-N
                           /dev/dsk/c0t0d0   /dev/rdsk/c0t0d0
disk      0  0/1/1/0.1.0    sdisk   CLAIMED     DEVICE       HP 146 GST3146855LC
                           /dev/dsk/c2t1d0     /dev/dsk/c2t1d0s2   /dev/rdsk/c2t1d0    /dev/rdsk/c2t1d0s2
                           /dev/dsk/c2t1d0s1   /dev/dsk/c2t1d0s3   /dev/rdsk/c2t1d0s1  /dev/rdsk/c2t1d0s3
disk      4  0/1/1/1.0.0    sdisk   NO_HW       DEVICE       HP 146 GST3146855LC
                           /dev/dsk/c3t0d0     /dev/dsk/c3t0d0s2   /dev/rdsk/c3t0d0    /dev/rdsk/c3t0d0s2
                           /dev/dsk/c3t0d0s1   /dev/dsk/c3t0d0s3   /dev/rdsk/c3t0d0s1  /dev/rdsk/c3t0d0s3

# lvlnboot -v
Boot Definitions for Volume Group /dev/vg00:
Physical Volumes belonging in Root Volume Group:
    /dev/disk/disk9_p2 -- Boot Disk
    /dev/disk/disk13_p2 
Boot: lvol1    on:     /dev/disk/disk9_p2
            /dev/disk/disk13_p2
Root: lvol3    on:     /dev/disk/disk9_p2
            /dev/disk/disk13_p2
Swap: lvol2    on:     /dev/disk/disk9_p2
            /dev/disk/disk13_p2
Dump: lvol2    on:     /dev/disk/disk9_p2, 0

# pvdisplay -v /dev/dsk/c3t0d0 | more

# lvdisplay -v /dev/vg00/lvol1 | grep "Mirror copies"
Mirror copies               1  
# lvdisplay -v /dev/vg00/lvol2 | grep "Mirror copies"
Mirror copies               1  
# lvdisplay -v /dev/vg00/lvol3 | grep "Mirror copies"
Mirror copies               1  
# lvdisplay -v /dev/vg00/lvol4 | grep "Mirror copies"
Mirror copies               1  
# lvdisplay -v /dev/vg00/lvol5 | grep "Mirror copies"
Mirror copies               1  
# lvdisplay -v /dev/vg00/lvol6 | grep "Mirror copies"
Mirror copies               1  
# lvdisplay -v /dev/vg00/lvol7 | grep "Mirror copies"
Mirror copies               1  
# lvdisplay -v /dev/vg00/lvol8 | grep "Mirror copies"
Mirror copies               1  

# vgdisplay -v /dev/vg00
--- Volume groups ---
VG Name                     /dev/vg00
VG Write Access             read/write     
VG Status                   available                 
Max LV                      255    
Cur LV                      9      
Open LV                     9      
Max PV                      16     
Cur PV                      2      
Act PV                      2      
Max PE per PV               4357         
VGDA                        4   
PE Size (Mbytes)            32              
Total PE                    8694    
Alloc PE                    1994    
Free PE                     6700    
Total PVG                   0        
Total Spare PVs             0              
Total Spare PVs in use      0                     
VG Version                  1.0       
VG Max Size                 2230784m   
VG Max Extents              69712         

   --- Logical volumes ---
   LV Name                     /dev/vg00/lvol1
   LV Status                   available/stale           
   LV Size (Mbytes)            1824            
   Current LE                  57        
   Allocated PE                114         
   Used PV                     2       

   LV Name                     /dev/vg00/lvol2
   LV Status                   available/syncd           
   LV Size (Mbytes)            1024            
   Current LE                  32        
   Allocated PE                64          
   Used PV                     2       

   LV Name                     /dev/vg00/lvol3
   LV Status                   available/stale           
   LV Size (Mbytes)            1024            
   Current LE                  32        
   Allocated PE                64          
   Used PV                     2       

   LV Name                     /dev/vg00/lvol4
   LV Status                   available/stale           
   LV Size (Mbytes)            32              
   Current LE                  1         
   Allocated PE                2           
   Used PV                     2       

   LV Name                     /dev/vg00/lvol5
   LV Status                   available/stale           
   LV Size (Mbytes)            5024            
   Current LE                  157       
   Allocated PE                314         
   Used PV                     2       

   LV Name                     /dev/vg00/lvol6
   LV Status                   available/stale           
   LV Size (Mbytes)            544             
   Current LE                  17        
   Allocated PE                34          
   Used PV                     2       

   LV Name                     /dev/vg00/lvol7
   LV Status                   available/stale           
   LV Size (Mbytes)            3808            
   Current LE                  119       
   Allocated PE                238         
   Used PV                     2       

   LV Name                     /dev/vg00/lvol8
   LV Status                   available/stale           
   LV Size (Mbytes)            2624            
   Current LE                  82        
   Allocated PE                164         
   Used PV                     2       

   LV Name                     /dev/vg00/SwapVol2
   LV Status                   available/stale           
   LV Size (Mbytes)            16000           
   Current LE                  500       
   Allocated PE                1000        
   Used PV                     2       


   --- Physical volumes ---
   PV Name                     /dev/disk/disk9_p2
   PV Status                   available                
   Total PE                    4347    
   Free PE                     3350    
   Autoswitch                  On        
   Proactive Polling           On               

   PV Name                     /dev/disk/disk13_p2
   PV Status                   unavailable              
   Total PE                    4347    
   Free PE                     3350    
   Autoswitch                  On        
   Proactive Polling           On               

# cat bootconf 
l  /dev/disk/disk9_p2
l  /dev/disk/disk13_p2

# setboot
Primary bootpath : 0/1/1/0.0x1.0x0 (/dev/rdisk/disk9)
HA Alternate bootpath : 0/1/1/1.0x0.0x0 (/dev/rdisk/disk13)
Alternate bootpath : 0/1/2/0 (LAN Interface)

Autoboot is ON (enabled)
Hyperthreading : OFF
               : OFF (next boot)


# ioscan -m lun /dev/disk/disk13
Class     I  Lun H/W Path  Driver  S/W State   H/W Type     Health    Description
========================================================================
disk     13  64000/0xfa00/0x5   esdisk  NO_HW       DEVICE       disabled  HP 146 GST3146855LC       
             0/1/1/1.0x0.0x0
                      /dev/disk/disk13      /dev/disk/disk13_p2   /dev/rdisk/disk13     /dev/rdisk/disk13_p2
                      /dev/disk/disk13_p1   /dev/disk/disk13_p3   /dev/rdisk/disk13_p1  /dev/rdisk/disk13_p3

Then, from the output below, I was able to identify it physically:
 # dd if=/dev/dsk/c3t0d0 of=/dev/null bs=1024 
  /dev/dsk/c3t0d0: No such device or address
  dd: cannot open /dev/dsk/c3t0d0

# dd if=/dev/dsk/c2t1d0 of=/dev/null bs=1024
  2339888+0 records in
  2339888+0 records out
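
The same read test can be wrapped in a small loop so that every candidate device is probed the same way. This is only a sketch: `probe_read` is a made-up helper name, and the `count=1024` limit is my addition so that each probe finishes quickly instead of reading the whole disk.

```shell
#!/bin/sh
# Try to read a little data from each path given and report the result.
# A disk that is gone fails at open time, as /dev/dsk/c3t0d0 did above.
probe_read() {
    for dev in "$@"; do
        if dd if="$dev" of=/dev/null bs=1024 count=1024 2>/dev/null; then
            echo "OK   $dev"
        else
            echo "FAIL $dev"
        fi
    done
}

# On the system in this thread the call would be:
#   probe_read /dev/dsk/c2t1d0 /dev/dsk/c3t0d0
```

A FAIL here only means the device could not be opened or read; it does not by itself say which bay the disk sits in.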

Any help will be great,

Thank you.

Last edited by Scott; 05-23-2012 at 05:39 AM.. Reason: Code tags, please...