The UNIX and Linux Forums  
#1  08-13-2008  upengan78
Solaris 8 hangs and data access error on reboot

Hi

I'm running Solaris 8 (SunOS 5.8) on an UltraSPARC-IIi 360MHz.

Quote:
uname -a
SunOS sparc5 5.8 Generic_108528-07 sun4u sparc SUNW,Ultra-5_10

I know it is old hardware, but similar hardware is running fine. Here is the issue:

The system booted and worked okay for some time, then the display hung, so the system was rebooted. On that reboot it gave a data access error. I rebooted again and it came up fine.

I have pasted /var/adm/messages on pastebin. Here is a short excerpt:

Quote:
***********
Aug 11 22:41:29 sparc5 UDBH 0x0233<UE> UDBH.ESYND 0x33 UDBL 0x0000 UDBL.ESYND 0x00
Aug 11 22:41:29 sparc5 UDBH Syndrome 0x33 Memory Module DIMM3
Aug 11 22:41:33 sparc5 SUNW,UltraSPARC-IIi: [ID 740919 kern.info] [AFT2] errID 0x00001802.ecb80126 E$tag != PA from AFAR; E$line was victimized
Aug 11 22:41:33 sparc5 dumping memory from PA 0x00000000.1a2edac0 instead
Aug 11 22:41:33 sparc5 SUNW,UltraSPARC-IIi: [ID 359263 kern.info] [AFT2] E$Data (0x00): 0x00000300.000010d0
Aug 11 22:41:33 sparc5 SUNW,UltraSPARC-IIi: [ID 359263 kern.info] [AFT2] E$Data (0x08): 0x00000000.00000000
Aug 11 22:41:33 sparc5 SUNW,UltraSPARC-IIi: [ID 359263 kern.info] [AFT2] E$Data (0x10): 0xc00001ff.e198071a
Aug 11 22:41:33 sparc5 SUNW,UltraSPARC-IIi: [ID 359263 kern.info] [AFT2] E$Data (0x18): 0xc00001ff.e198071a
Aug 11 22:41:33 sparc5 SUNW,UltraSPARC-IIi: [ID 359263 kern.info] [AFT2] E$Data (0x20): 0x00000000.00000000
Aug 11 22:41:33 sparc5 SUNW,UltraSPARC-IIi: [ID 359263 kern.info] [AFT2] E$Data (0x28): 0x00000000.00000000
Aug 11 22:41:33 sparc5 SUNW,UltraSPARC-IIi: [ID 359263 kern.info] [AFT2] E$Data (0x30): 0x00000000.10007244
Aug 11 22:41:33 sparc5 SUNW,UltraSPARC-IIi: [ID 359263 kern.info] [AFT2] E$Data (0x38): 0x000002a1.003f1ba0
Aug 11 22:41:33 sparc5 unix: [ID 836849 kern.notice]
Aug 11 22:41:33 sparc5 panic[cpu0]/thread=30000c1d000:
Aug 11 22:41:33 sparc5 unix: [ID 424568 kern.notice] [AFT1] errID 0x00001802.ecb80126 UE Error(s)
Aug 11 22:41:33 sparc5 See previous message(s) for details
Aug 11 22:41:34 sparc5 unix: [ID 100000 kern.notice]
Aug 11 22:41:34 sparc5
Aug 11 22:41:35 sparc5 genunix: [ID 723222 kern.notice] 000002a1003f
**************
test-all at OBP did not show any errors; everything passed.

Is it a CPU, hard disk, or memory problem? Or something else?

What could the solution be? Is a kernel patch an option?


Thanks

Last edited by upengan78; 08-13-2008 at 11:50 AM..
#2  08-13-2008  rhfrommn (Forum Advisor)
This is 99% sure to be failed memory. The 2nd line of your log points to a memory error on DIMM 3, and later the CPU panics and crashes the box because of this unrecoverable memory error.

The latest Solaris 8 kernel patches have better error correction routines, but if it truly is failing memory, that's not going to help much. I'd say replace the bad DIMM. Or, if the box has enough memory, you can just remove the bad DIMM and keep it running on less. Try that and see if it helps.
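For anyone who wants to pick these two clues out of a longer /var/adm/messages file, here is a minimal Python sketch. It assumes only the message shapes quoted in this thread ("Memory Module <name>" and "UE Error(s)"); the function name summarize and the sample list are made up for illustration, and other Solaris releases may word these messages differently.

```python
import re

# Patterns based only on the two message shapes quoted in this thread.
MODULE_RE = re.compile(r"Memory Module (\S+)")
UE_RE = re.compile(r"\bUE Error\(s\)")


def summarize(lines):
    """Return (set of memory module names, number of UE error lines)."""
    modules = set()
    ue_count = 0
    for line in lines:
        match = MODULE_RE.search(line)
        if match:
            modules.add(match.group(1))
        if UE_RE.search(line):
            ue_count += 1
    return modules, ue_count


# Two lines copied from the log excerpt in the first post.
sample = [
    "Aug 11 22:41:29 sparc5 UDBH Syndrome 0x33 Memory Module DIMM3",
    "Aug 11 22:41:33 sparc5 unix: [ID 424568 kern.notice] "
    "[AFT1] errID 0x00001802.ecb80126 UE Error(s)",
]

modules, ue_count = summarize(sample)
print(sorted(modules), ue_count)
```

To run it against a real file, pass an open file object instead of the sample list, for example summarize(open("/var/adm/messages")).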
#3  08-13-2008  upengan78
Okay, I will try this now and keep you posted. Thanks much.
#4  08-14-2008  incredible (Forum Advisor)
It's DEFINITELY a memory module breakdown.
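The first log line in the original post (UDBH 0x0233<UE> UDBH.ESYND 0x33) can be checked by hand. The sketch below assumes the usual UltraSPARC data buffer error register layout: bit 9 is the uncorrectable error (UE) flag, bit 8 is the correctable error (CE) flag, and the low 8 bits are the ECC syndrome. That layout is an assumption here, not something stated in the thread, but it agrees with the values the kernel printed. The name decode_udb is made up for illustration.

```python
# Assumed bit layout of the UDB error register (not stated in the thread):
UE_BIT = 0x200      # bit 9: uncorrectable error
CE_BIT = 0x100      # bit 8: correctable error
SYND_MASK = 0xFF    # bits 7..0: ECC syndrome


def decode_udb(value):
    """Split a raw UDBH/UDBL value into its UE flag, CE flag, and syndrome."""
    return {
        "ue": bool(value & UE_BIT),
        "ce": bool(value & CE_BIT),
        "syndrome": value & SYND_MASK,
    }


# UDBH value from the log excerpt in the first post.
decoded = decode_udb(0x0233)
print(decoded["ue"], decoded["ce"], hex(decoded["syndrome"]))
```

Under that assumption, 0x0233 decodes to UE set, CE clear, and syndrome 0x33, which lines up with the <UE> tag and the ESYND 0x33 value in the same log line. UDBL is 0x0000, so only the high half reported an error.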
#5  08-14-2008  upengan78
I said I would try the memory modules, but it was a hassle to take out the memory because a floppy drive sits on top of it. Anyway, I have swapped the disk into a similar SPARC machine, and that machine has been running fine since yesterday.

I'm still curious to know whether the memory or the hard drive is the problem.

Let me see how long this system stays up.

Thanks all for responding!
#6  08-15-2008  incredible (Forum Advisor)
Of course it will be fine, because the problem is with the memory on the other hardware. It's the memory module, dude.
#7  08-15-2008  upengan78
That fills me with confidence.

Thanks .....
All times are GMT -4.