solaris 8 hangs and data access error on reboot


 
# 1  
Old 08-13-2008

Hi

I am using Solaris 8 (SunOS 5.8) on an UltraSPARC-IIi 360MHz.

Quote:
uname -a
SunOS sparc5 5.8 Generic_108528-07 sun4u sparc SUNW,Ultra-5_10

I know it is old hardware, but similar hardware is running fine. Here is the issue:

The system booted and worked fine for some time, then the display hung, so the system was rebooted. On reboot it gave a data access error. It was rebooted again and came up fine.

I have pasted the /var/adm/messages here:

pastebin - collaborative debugging tool

Pasting it here in short:

Quote:
***********
Aug 11 22:41:29 sparc5 UDBH 0x0233<UE> UDBH.ESYND 0x33 UDBL 0x0000 UDBL.ESYND 0x00
Aug 11 22:41:29 sparc5 UDBH Syndrome 0x33 Memory Module DIMM3
Aug 11 22:41:33 sparc5 SUNW,UltraSPARC-IIi: [ID 740919 kern.info] [AFT2] errID 0x00001802.ecb80126 E$tag != PA from AFAR; E$line was victimized
Aug 11 22:41:33 sparc5 dumping memory from PA 0x00000000.1a2edac0 instead
Aug 11 22:41:33 sparc5 SUNW,UltraSPARC-IIi: [ID 359263 kern.info] [AFT2] E$Data (0x00): 0x00000300.000010d0
Aug 11 22:41:33 sparc5 SUNW,UltraSPARC-IIi: [ID 359263 kern.info] [AFT2] E$Data (0x08): 0x00000000.00000000
Aug 11 22:41:33 sparc5 SUNW,UltraSPARC-IIi: [ID 359263 kern.info] [AFT2] E$Data (0x10): 0xc00001ff.e198071a
Aug 11 22:41:33 sparc5 SUNW,UltraSPARC-IIi: [ID 359263 kern.info] [AFT2] E$Data (0x18): 0xc00001ff.e198071a
Aug 11 22:41:33 sparc5 SUNW,UltraSPARC-IIi: [ID 359263 kern.info] [AFT2] E$Data (0x20): 0x00000000.00000000
Aug 11 22:41:33 sparc5 SUNW,UltraSPARC-IIi: [ID 359263 kern.info] [AFT2] E$Data (0x28): 0x00000000.00000000
Aug 11 22:41:33 sparc5 SUNW,UltraSPARC-IIi: [ID 359263 kern.info] [AFT2] E$Data (0x30): 0x00000000.10007244
Aug 11 22:41:33 sparc5 SUNW,UltraSPARC-IIi: [ID 359263 kern.info] [AFT2] E$Data (0x38): 0x000002a1.003f1ba0
Aug 11 22:41:33 sparc5 unix: [ID 836849 kern.notice]
Aug 11 22:41:33 sparc5 panic[cpu0]/thread=30000c1d000:
Aug 11 22:41:33 sparc5 unix: [ID 424568 kern.notice] [AFT1] errID 0x00001802.ecb80126 UE Error(s)
Aug 11 22:41:33 sparc5 See previous message(s) for details
Aug 11 22:41:34 sparc5 unix: [ID 100000 kern.notice]
Aug 11 22:41:34 sparc5
Aug 11 22:41:35 sparc5 genunix: [ID 723222 kern.notice] 000002a1003f

**************
test-all at the OBP did not show any errors; everything passed.

Is it a CPU problem, a hard disk problem, a memory issue, or something else?

What could the solution be? Is a kernel patch an option?


Thanks

Last edited by upengan78; 08-13-2008 at 12:50 PM..
# 2  
Old 08-13-2008
This is 99% sure to be failed memory. The 2nd line of your log is pointing to a memory error on DIMM 3, and later the CPU causes a panic to crash the box due to this unrecoverable memory error.

Later Solaris 8 kernel patches had better error correction routines, but if it truly is failing memory, that's not going to help much. I'd say replace the bad DIMM, or, if the box has enough memory, just remove the bad module and keep it running on less. Try that and see if it helps.
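
For anyone who wants to check a longer /var/adm/messages for the same pattern, here is a rough sketch in Python that counts which memory module the syndrome messages name and splits the UDBH value into its flag bits. The regular expressions are written against the two log lines quoted in the first post only. The bit layout in `decode_udb` (bit 9 = UE, bit 8 = CE, low 8 bits = syndrome) is an assumption that matches the quoted line (0x0233 shown with <UE> and ESYND 0x33), not something taken from this thread. The function names are made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as:
#   Aug 11 22:41:29 sparc5 UDBH Syndrome 0x33 Memory Module DIMM3
MODULE_RE = re.compile(r"UDB[HL] Syndrome (0x[0-9a-fA-F]+) Memory Module (\S+)")

# Matches the register dump, e.g. "UDBH 0x0233<UE> UDBH.ESYND 0x33"
UDBH_RE = re.compile(r"UDBH (0x[0-9a-fA-F]+)")


def decode_udb(value):
    """Split a UDB error register value into (ue, ce, syndrome).

    Assumed layout: bit 9 = UE (uncorrectable), bit 8 = CE (correctable),
    bits 7..0 = ECC syndrome.
    """
    return bool(value & 0x200), bool(value & 0x100), value & 0xFF


def suspect_modules(lines):
    """Count how often each memory module is named in syndrome messages."""
    counts = Counter()
    for line in lines:
        match = MODULE_RE.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts


# The two relevant lines from the log quoted in the first post.
sample = [
    "Aug 11 22:41:29 sparc5 UDBH 0x0233<UE> UDBH.ESYND 0x33 UDBL 0x0000 UDBL.ESYND 0x00",
    "Aug 11 22:41:29 sparc5 UDBH Syndrome 0x33 Memory Module DIMM3",
]

for line in sample:
    reg = UDBH_RE.search(line)
    if reg:
        ue, ce, synd = decode_udb(int(reg.group(1), 16))
        print("UE=%s CE=%s syndrome=0x%02x" % (ue, ce, synd))

for module, hits in suspect_modules(sample).most_common():
    print(module, hits)
```

To run it against the real log, pass an open file instead of `sample`, for example `suspect_modules(open("/var/adm/messages"))`. If the same module keeps turning up, that points at the DIMM rather than the CPU or the disk.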
# 3  
Old 08-13-2008
Quote:
Originally Posted by rhfrommn
This is 99% sure to be failed memory. The 2nd line of your log is pointing to a memory error on DIMM 3, and later the CPU causes a panic to crash the box due to this unrecoverable memory error.

Later Solaris 8 kernel patches had better error correction routines, but if it truly is failing memory, that's not going to help much. I'd say replace the bad DIMM, or, if the box has enough memory, just remove the bad module and keep it running on less. Try that and see if it helps.
Okay, I will try this now and keep you posted. Thanks much.
# 4  
Old 08-14-2008
It's DEFINITELY a memory module breakdown.
# 5  
Old 08-14-2008
Quote:
Originally Posted by upengan78
Okay, I will try this now and keep you posted. Thanks much.
I said I would try the memory modules, but it was a hassle to take out the memory because a floppy drive sits on top of it. Anyway, I have swapped the disk into a similar SPARC machine, and that machine has been running fine since yesterday.

Still curious to know whether the memory or the hard drive is the problem.

Let me see how long this system stays up.


Thanks all for responding!
# 6  
Old 08-15-2008
Of course it will be fine, because the problem is with the memory on the other hardware. It's the memory module, dude.
# 7  
Old 08-15-2008
Quote:
Originally Posted by incredible
Of course it will be fine, because the problem is with the memory on the other hardware. It's the memory module, dude.
That fills me with confidence.

Thanks.