Uncorrectable data Error


 
# 1  
Old 05-02-2006

Hi all,
I'm getting an error "Uncorrectable data Error: Block 77e9270" followed by "disk not responding to selection" and then "disk OK".
At a guess I'd assume a corrupt sector.
Anyone know how I can somehow mark that sector to tell the OS to leave it alone?
# 2  
Old 05-03-2006
From Sunsolve:

Quote:
Disk blocks (sectors) can go bad for several reasons. There may be a hardware problem
which can cause the formatting information for that sector to become ruined, or the
sector may have been weak to begin with, or a spike may have been sent to the drive
by a power failure, etc.

Most often, when sector errors are detected, the hard drive is assumed bad and
either reformatted or replaced. This involves locating the drive's tape backup if
one even exists, formatting or replacing the drive, repartitioning, newfs, restore, etc.

This procedure describes a far less drastic approach, which often proves
effective in either the short or the long term, depending on how critical the
drive is and how severe the errors are.

If the following conditions are met, you can try to repair, or "map out" a single
bad sector or a consecutive group of bad sectors. Ensure that:

* fsck was performed on the raw device /dev/rdsk/...

* the filesystem was unmounted when fsck was performed.

* the disk is type SCSI, since this procedure will not work on IDE drives.
IDE drives do not support manually mapping out defective blocks.

* the bad sectors reported by fsck are few and are contained to a small area.

* you know the affected ABSOLUTE BLOCK NUMBER(S). This matters for every slice
except the one that begins on block 0, which is usually slice 0. Slices are
normally addressed by Relative Block Number, where block 0 of a slice is
the slice's starting block on the disk. For example, if slice 0 begins on block 0,
its absolute and relative block numbers are the same. If slice 1 begins on block 3000,
relative block 100 of that slice is actually absolute block 3100. The same approach
can be used to calculate the absolute block number for any slice (a full walkthrough
is beyond the scope of this document).

* the disk in question is NOT under the control of a disk management utility
such as Veritas or DiskSuite. Otherwise, special considerations that are not
discussed here may apply; refer to SRDB 16305.
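
The relative-to-absolute arithmetic in the slice 1 example above can be sketched in a few lines of Python. This is only an illustration, not part of the Sunsolve procedure: the function name absolute_block is hypothetical, and it assumes you already know the slice's starting block (for instance from the "First Sector" column that prtvtoc prints for the disk).

```python
def absolute_block(slice_start: int, relative_block: int) -> int:
    """Return the absolute disk block for a block number relative to a slice.

    slice_start    -- first block of the slice, counted from the start of the disk
    relative_block -- block number as reported relative to that slice
    """
    if slice_start < 0 or relative_block < 0:
        raise ValueError("block numbers must be non-negative")
    return slice_start + relative_block


# Slice 1 starts at block 3000, so its relative block 100 is absolute block 3100.
print(absolute_block(3000, 100))

# Slice 0 starts at block 0, so relative and absolute block numbers are identical.
print(absolute_block(0, 2036256))
```

The result is the number to give wherever the procedure asks for an absolute block.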


Once performed, the repair procedures below may detect either SOFT or HARD sector
errors. The expected behavior for each error type is described here:

SOFT ERROR
----------
When a soft error is detected, the data residing on the affected block(s) will be
automatically moved to an alternate block, then the defective block is either
repaired, or mapped out as a flawed sector.

HARD ERROR
----------
Depending on the severity of the corruption, behavior may be the same as that
described above, or if the sector is too corrupt, the data could be discarded.
In this case, the data would have been unreadable by conventional means (anyway).


WARNING: Disk corruption may be severe to the extent that the data on the block cannot
be salvaged and is therefore discarded. For critical systems, you should take the following
precautions in case the corruption turns out to be severe:

1) You can perform a backup of the entire disk before attempting to repair blocks.

2) You can run the procedure "REPAIR A RANGE OF BAD SECTORS" below TWICE. Run it
first by answering "no" to the following setup question:

analyze> setup Repair defective blocks[yes]? no

If you are satisfied that the desired result will be achieved (for instance, only
correctable soft errors were detected), run the procedure again to actually
repair the blocks by answering "yes" to the setup question:

analyze> setup Repair defective blocks[yes]? yes


Depending on the number of bad sectors reported by fsck, you can repair either a single
sector or a range of sectors. Here are the steps for both:

TO REPAIR A SINGLE BAD SECTOR:


Enter the format utility, select the drive, and repair a single bad absolute sector, for example:

format> repair
Enter absolute block number of defect: 2036256
Ready to repair defect, continue? y
(at this point, format will indicate how and if the block was repaired)
format> q


Now fsck the raw slice and make sure there are no problems. Then you can mount it.

TO REPAIR A RANGE OF BAD SECTORS:

The example below addresses the sector errors displayed at the top of this
document. I have chosen to scan the disk starting at sector 2036240 and ending
at 2036270 to ensure there are no other stray bad sectors. Since our example bad
sectors are a result of fsck'ing slice 0, the "absolute block" is the same as the
"block". If yours is not slice 0, make sure you use the "absolute block(s)":

Start the format utility and select the affected disk:

# format
Searching for disks...done


AVAILABLE DISK SELECTIONS:
0. c0t1d0 <SUN1.05 cyl 2036 alt 2 hd 14 sec 72>
/iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000/sd@1,0
Specify disk (enter its number): 0
selecting c0t1d0
[disk formatted]

format> analyze

analyze> setup
Analyze entire disk[yes]? no
Enter starting block number[0, 0/0/0]: 2036240
Enter ending block number[2052287, 2035/13/7]: 2036270
Loop continuously[no]? no
Enter number of passes[2]: 1
Repair defective blocks[yes]? no/yes (NOTE: Review the "WARNING" paragraph way above)
Stop after first error[no]? no
Use random bit patterns[no]? no
Enter number of blocks per transfer[126, 0/2/0]: 1
Verify media after formatting[yes]? yes
Enable extended messages[no]? no
Restore defect list[yes]? yes
Restore disk label[yes]? yes

analyze> read
Ready to analyze (won't harm SunOS). This takes a long time,
but is interruptible with CTRL-C. Continue? y

pass 0

Error during read: block 2036256 (0x5da1c0) (2003/5/3)
Repairing soft error on 2036256 (2003/5/3)...ok.

Error during read: block 2036257 (0x5da1c1) (2003/5/4)
Repairing soft error on 2036257 (2003/5/4)...ok.

Error during read: block 2036258 (0x5da1c2) (2003/5/5)
Repairing soft error on 2036258 (2003/5/5)...ok.

Error during read: block 2036259 (0x5da1c3) (2003/5/6)
Repairing soft error on 2036259 (2003/5/6)...ok.

2035/13/7

Total of 4 defective blocks repaired.
(0x5da1c0) (2003/5/3)
(0x5da1c1) (2003/5/4)
(0x5da1c2) (2003/5/5)
(0x5da1c3) (2003/5/6)

analyze> quit
format> quit
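
The session above scans a small window (2036240 to 2036270) around the bad blocks rather than the whole disk. A minimal sketch of how such a window could be picked is shown here. The helper name scan_range and the fixed margin are assumptions made for illustration; the original procedure simply chose its start and end values by hand.

```python
def scan_range(bad_blocks, margin, last_block):
    """Return (start, end) block numbers for an analyze pass around bad_blocks.

    bad_blocks -- absolute block numbers reported as bad
    margin     -- extra blocks to scan on each side (an arbitrary choice)
    last_block -- highest valid block on the disk, as shown in the
                  "Enter ending block number" prompt
    """
    if not bad_blocks:
        raise ValueError("at least one bad block is required")
    start = max(0, min(bad_blocks) - margin)        # never go below block 0
    end = min(last_block, max(bad_blocks) + margin)  # never run past the disk
    return start, end


# Bad blocks from the example session, on a disk whose last block is 2052287.
print(scan_range([2036256, 2036257, 2036258, 2036259], 16, 2052287))
```

The two values returned are what you would type at the "Enter starting block number" and "Enter ending block number" prompts in analyze> setup.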


Now re-run fsck on the raw unmounted filesystem again. It should fsck cleanly.

If sector errors keep recurring, you may have a bad disk or disk controller.
At this point hardware has to be replaced.
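
One practical note on block numbers: the error in the original question reports "Block 77e9270", which contains the digit "e" and so cannot be a decimal number, while the repair example in the quoted procedure enters a decimal block number. Assuming the reported value is hexadecimal, a short Python sketch like the following converts it. The function name is hypothetical, and whether that block is relative to a slice or absolute on the disk still has to be determined separately.

```python
def hex_block_to_decimal(block_hex: str) -> int:
    """Convert a block number printed in hexadecimal to a decimal integer."""
    # int() with base 16 accepts digits 0-9 and a-f, with or without a 0x prefix.
    return int(block_hex, 16)


# Block number taken from the error message in the first post,
# assumed here to be hexadecimal.
print(hex_block_to_decimal("77e9270"))
```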