Operating Systems AIX errpt kept sending errors after disk replacement Post 302471221 by aixlover on Friday 12th of November 2010 09:10:57 AM
errpt kept sending errors after disk replacement

Hi,

The system is a Power6 8204 with external 7031 storage. The OS is AIX 5.3. I replaced a failed disk, hdisk28, and added it back to the volume group. Everything looks fine, but since the replacement errpt has kept sending "Perm DISK OPERATION ERROR".

Other than the error, everything still looks fine with the new disk and the system. I've closed all the serviceable cases and renewed Log Repair in diag, but I still receive the errors.

My questions are: (1) How do I stop errpt from sending the error? (2) How do I troubleshoot any possible disk issue? Please help.

Thank you in advance!
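
A first troubleshooting step for question (2) is to check whether the log holds entries for hdisk28 that are newer than the replacement, or only older ones. The following is a minimal sketch, not a tested AIX procedure. It assumes the standard errpt summary columns (IDENTIFIER, TIMESTAMP as MMDDhhmmYY, T, C, RESOURCE_NAME, DESCRIPTION). `errors_since` is a hypothetical helper name, and the cutoff value in the usage line is only an example.

```shell
#!/bin/sh
# List errpt summary lines for one resource logged at or after a cutoff.
# Reads `errpt` summary output on stdin.
#   $1 = resource name, for example hdisk28
#   $2 = cutoff as YYMMDDhhmm (argument format chosen for this sketch)
errors_since() {
    awk -v res="$1" -v cutoff="$2" '
        $5 == res {
            # errpt prints TIMESTAMP as MMDDhhmmYY; move YY to the front
            # so that a plain numeric comparison orders entries by time.
            ts = substr($2, 9, 2) substr($2, 1, 8)
            if ((ts + 0) >= (cutoff + 0)) print
        }'
}
```

On the AIX host this would be used as `errpt | errors_since hdisk28 1011120000`. If every entry for the disk predates the replacement, the log is showing history rather than new faults. `errpt -a -N hdisk28` prints the detailed report for that resource.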
 

vxrelocd(1M)															      vxrelocd(1M)

NAME
vxrelocd - monitor Veritas Volume Manager for failure events and relocate failed subdisks

SYNOPSIS
/etc/vx/bin/vxrelocd [-o vxrecover_argument] [-O old_version] [-s save_max] [mail_address...]

DESCRIPTION
The vxrelocd command monitors Veritas Volume Manager (VxVM) by analyzing the output of the vxnotify command, and waits for a failure. When a failure occurs, vxrelocd sends mail via mailx to root (by default) or to other specified users and relocates failed subdisks. After completing the relocation, vxrelocd sends more mail indicating the status of each subdisk replacement. The vxrecover utility is then run on volumes with relocated subdisks to restore data. Mail is sent after vxrecover executes.

OPTIONS
-o   The -o option and its argument are passed directly to vxrecover if vxrecover is called. This allows specifying -o slow[=iodelay] to keep vxrecover from overloading a busy system during recovery. The default value for the delay is 250 milliseconds.

-O   Reverts back to an older version. Specifying -O VxVM_version directs vxrelocd to use the relocation scheme in that version.

-s   Before vxrelocd attempts a relocation, a snapshot of the current configuration is saved in /etc/vx/saveconfig.d. This option specifies the maximum number of configurations to keep for each diskgroup. The default is 32.

Mail Notification
By default, vxrelocd sends mail to root with information about a detected failure and the status of any relocation and recovery attempts. To send mail to other users, add the user login name to the vxrelocd startup line in the startup script /sbin/init.d/vxvm-recover, and reboot the system. For example, if the line appears as:

    nohup vxrelocd root &

and you want mail also to be sent to user1 and user2, change the line to read:

    nohup vxrelocd root user1 user2 &

Alternatively, you can kill the vxrelocd process and restart it as vxrelocd root mail_address, where mail_address is a user's login name. Do not kill the vxrelocd process while a relocation attempt is in progress.

The mail notification that is sent when a failure is detected follows this format:

    Failures have been detected by the Veritas Volume Manager:

    failed disks:
    medianame ...

    failed plexes:
    plexname ...

    failed log plexes:
    plexname ...

    failing disks:
    medianame ...

    failed subdisks:
    subdiskname ...

    The Volume Manager will attempt to find spare disks, relocate failed subdisks and then recover the data in the failed plexes.

The medianame list under failed disks specifies disks that appear to have completely failed; the medianame list under failing disks indicates a partial disk failure or a disk that is in the process of failing. When a disk has failed completely, the same medianame list appears under both failed disks and failing disks. The plexname list under failed plexes shows plexes that were detached due to I/O failures that occurred while attempting to do I/O to subdisks they contain. The plexname list under failed log plexes indicates RAID-5 or DRL (dirty region logging) log plexes that have failed. The subdiskname list specifies subdisks in RAID-5 volumes that were detached due to I/O errors.

Spare Space
A disk can be marked as "spare." This makes the disk available as a site for relocating failed subdisks. Disks that are marked as spares are not used for normal allocations unless you explicitly specify them. This ensures that there is a pool of spare space available for relocating failed subdisks and that this space does not get consumed by normal operations. Spare space is the first space used to relocate failed subdisks. However, if no spare space is available, or the available spare space is not suitable or sufficient, free space is also used except for those marked with the nohotuse flag.

See the vxedit(1M) and vxdiskadm(1M) manual pages for more information on marking a disk as a spare or nohotuse.

Nohotuse Space
A disk can be marked as "nohotuse." This excludes the disk from being used by vxrelocd, but it is still available as free space.

See the vxedit(1M) and vxdiskadm(1M) manual pages for more information on marking a disk as a spare or nohotuse.

Replacement Procedure
After mail is sent, vxrelocd relocates failed subdisks (those listed in the subdisks list). This requires finding appropriate spare or free space in the same disk group as the failed subdisk. A disk is eligible as replacement space if it is a valid Veritas Volume Manager disk (VM disk) and contains enough space to hold the data contained in the failed subdisk. If no space is available on spare disks, the relocation uses free space that is not marked nohotuse.

To determine which of the eligible disks to use, vxrelocd first tries the disk that is closest to the failed disk. The value of "closeness" depends on the controller, target, and disk number of the failed disk. A disk on the same controller as the failed disk is closer than a disk on a different controller; a disk under the same target as the failed disk is closer than one under a different target. vxrelocd moves all subdisks from a failing drive to the same destination disk if possible.

If no spare or free space is found, mail is sent explaining the disposition of volumes that had storage on the failed disk:

    Hot-relocation was not successful for subdisks on disk dm_name in volume v_name in disk group dg_name. No replacement was made and the disk is still unusable.

    The following volumes have storage on medianame:

    volumename ...

    These volumes are still usable, but the redundancy of those volumes is reduced. Any RAID-5 volumes with storage on the failed disk may become unusable in the face of further failures.

If any non-RAID-5 volumes were made unusable due to the disk failure, the following message is included:

    The following volumes:

    volumename ...

    have data on medianame but have no other usable mirrors on other disks. These volumes are now unusable and the data on them is unavailable. These volumes must have their data restored.

If any RAID-5 volumes were made unavailable due to the disk failure, the following message is included:

    The following RAID-5 volumes:

    volumename ...

    had storage on medianame and have experienced other failures. These RAID-5 volumes are now unusable and data on them is unavailable. These RAID-5 volumes must have their data restored.

If there is spare space available, a snapshot of the current configuration is saved in /etc/vx/saveconfig.d/dg_name.yymmdd_hhmmss.mpvsh before attempting a subdisk relocation.

Relocation requires setting up a subdisk on the spare or free space not marked with nohotuse and using it to replace the failed subdisk. If this is successful, the vxrecover command runs in the background to recover the data in volumes that had storage on the disk. If the relocation fails, the following message is sent:

    Hot-relocation was not successful for subdisks on disk dm_name in volume v_name in disk group dg_name. No replacement was made and the disk is still unusable.

If any volumes (RAID-5 or otherwise) become unusable due to the failure, the following message is included:

    The following volumes:

    volumename ...

    have data on dm_name but have no other usable mirrors on other disks. These volumes are now unusable and the data on them is unavailable. These volumes must have their data restored.

If the relocation procedure was successful and recovery has begun, the following mail message is sent:

    Volume v_name Subdisk sd_name relocated to newsd_name, but not yet recovered.

After recovery completes, a mail message is sent relaying the result of the recovery procedure. If the recovery is successful, the following message is included in the mail:

    Recovery complete for volume v_name in disk group dg_name.

If the recovery was not successful, the following message is included in the mail:

    Failure recovering v_name in disk group dg_name.

Disabling vxrelocd
If you do not want automatic subdisk relocation, you can disable the hot-relocation feature by killing the relocation daemon, vxrelocd, and preventing it from restarting. However, do not kill the daemon while it is doing the relocation. To kill the daemon, run the command:

    ps -ef

from the command line and find the two entries for vxrelocd. Execute the command:

    kill -9 PID1 PID2

(substituting PID1 and PID2 with the process IDs for the two vxrelocd processes). To prevent vxrelocd from being started again, you must comment out the line that starts up vxrelocd in the startup script /sbin/init.d/vxvm-recover.

FILES
/sbin/init.d/vxvm-recover
    The startup file for vxrelocd.

/etc/vx/saveconfig.d/dg_name.yymmdd_hhmmss.mpvsh
    File where vxrelocd saves a snapshot of the current configuration before performing a relocation.

SEE ALSO
kill(1), mailx(1), ps(1), vxdiskadm(1M), vxedit(1M), vxintro(1M), vxnotify(1M), vxrecover(1M), vxsparecheck(1M), vxunreloc(1M)

VxVM 5.0.31.1                    24 Mar 2008                    vxrelocd(1M)
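
The "Disabling vxrelocd" steps in the man page ask you to pick the two vxrelocd entries out of `ps -ef` by eye. As a rough sketch of that lookup, the helper below prints the matching PIDs. `vxrelocd_pids` is a hypothetical name and not part of VxVM, and the sketch assumes the usual `ps -ef` layout with the PID in the second column and the command starting at or after the eighth.

```shell
#!/bin/sh
# Print the PIDs of vxrelocd processes found in `ps -ef` output on stdin.
# vxrelocd_pids is a hypothetical helper name, not a VxVM command.
vxrelocd_pids() {
    awk 'NR > 1 {
        # Look through the command and its arguments for a word that is
        # exactly vxrelocd or a path ending in /vxrelocd.
        for (i = 8; i <= NF; i++)
            if ($i == "vxrelocd" || $i ~ /\/vxrelocd$/) { print $2; break }
    }'
}
```

It would be run as `ps -ef | vxrelocd_pids`. Review the printed PIDs against the full `ps -ef` listing before passing them to kill, since any other process that has vxrelocd as an argument would also match, and the man page warns against killing the daemon during a relocation.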