The UNIX and Linux Forums  

#1
06-21-2005
giriplug
Registered User
Join Date: Feb 2005
Posts: 13
Solaris hardware error

Hi guys,

I need some help with this error message.
I'm running Solaris 2.6 on an E3500 and lately I have been getting this error:

lp[28679]: Warning: Received SIGPIPE; continuing
last message repeated 1 time
[AFT0] Multiple Softerrors:
2 Intermittent, 4 Persistent, and 0 Sticky Softerrors accumulated
from Memory Module Board 7 J3300
[AFT0] Enabling verbose CE messages.
[AFT0] errID 0x000054d0.90e01b51 Corrected Memory Error on Board 7 J3300 is Intermittent
[AFT0] errID 0x000054d0.90e01b51 ECC Data Bit 3 was in error and corrected
lp[29236]: Warning: Received SIGPIPE; continuing
lp[28869]: Warning: Received SIGPIPE; continuing
lp[29257]: Warning: Received SIGPIPE; continuing
lp[29506]: Warning: Received SIGPIPE; continuing
lp[29647]: Warning: Received SIGPIPE; continuing
last message repeated 1 time
[AFT0] Corrected Memory Error on CPU18, errID 0x00005869.ace8892a
AFSR 0x00000000.00100000<CE> AFAR 0x00000000.88857730
AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0x1000c4e8
UDBH Syndrome 0xc8 Memory Module Board 7 J3300
[AFT0] errID 0x00005869.ace8892a Corrected Memory Error on Board 7 J3300 is Persistent
[AFT0] errID 0x00005869.ace8892a ECC Data Bit 3 was in error and corrected
[AFT0] Corrected Memory Error on CPU19, errID 0x000058c4.581f31c0
AFSR 0x00000000.00100000<CE> AFAR 0x00000000.88857730
AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0x1000c4e8
UDBH Syndrome 0xc8 Memory Module Board 7 J3300
[AFT0] errID 0x000058c4.581f31c0 Corrected Memory Error on Board 7 J3300 is Persistent
[AFT0] errID 0x000058c4.581f31c0 ECC Data Bit 3 was in error and corrected
AFSR 0x00000000.00100000<CE> AFAR 0x00000000.88857730
AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0x1000c4e8
UDBH Syndrome 0xc8 Memory Module Board 7 J3300
[AFT0] errID 0x000058f8.deed0852 Corrected Memory Error on Board 7 J3300 is Persistent
[AFT0] errID 0x000058f8.deed0852 ECC Data Bit 3 was in error and corrected
[AFT0] Corrected Memory Error on CPU14, errID 0x0000597e.b5a4f601
AFSR 0x00000000.00100000<CE> AFAR 0x00000000.88857730
AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0x1000c4e8
UDBH Syndrome 0xc8 Memory Module Board 7 J3300
[AFT0] errID 0x0000597e.b5a4f601 Corrected Memory Error on Board 7 J3300 is Persistent
[AFT0] errID 0x0000597e.b5a4f601 ECC Data Bit 3 was in error and corrected
lp[3024]: Warning: Received SIGPIPE; continuing
lp[3185]: Warning: Received SIGPIPE; continuing
last message repeated 1 time
lp[5781]: Warning: Received SIGPIPE; continuing
lp[5885]: Warning: Received SIGPIPE; continuing
lp[5845]: Warning: Received SIGPIPE; continuing
lp[5872]: Warning: Received SIGPIPE; continuing
last message repeated 1 time
lp[7756]: Warning: Received SIGPIPE; continuing
lp[8184]: Warning: Received SIGPIPE; continuing




Can anyone tell me what's wrong with the machine? I can't go to SunSolve because I don't have a Sun contract account. It looks like a memory error.

Thanks in advance.

Last edited by giriplug; 06-21-2005 at 03:43 AM..
#2
06-21-2005
blowtorch (Forum Advisor)
Supporter
Join Date: Dec 2004
Location: Singapore
Posts: 2,350
Quote:
Quote from docs.sun.com - man signal.h
Signal     No.   Default Action   Reason
SIGPIPE    13    Exit             Broken Pipe
Now from your syslog output,
Quote:
Originally Posted by giriplug
lp[5781]: Warning: Received SIGPIPE; continuing
lp[5885]: Warning: Received SIGPIPE; continuing
lp[5845]: Warning: Received SIGPIPE; continuing
lp[5872]: Warning: Received SIGPIPE; continuing
SIGPIPE is sent when a process writes to a pipe or socket that no longer has a reader, so lp is probably writing to something that has gone away. You could try shutting down lp and starting it up again.
This definitely does not look like a hardware problem.
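For anyone unsure what a broken pipe looks like in practice, here is a minimal sketch, assuming a POSIX system and Python 3 (neither is part of the original thread, and the function name is made up for illustration). It closes the read end of a pipe and then writes to it. Python ignores SIGPIPE at startup, so the failed write shows up as a BrokenPipeError rather than ending the process; lp evidently handles the signal itself and logs "Received SIGPIPE; continuing".

```python
import os


def write_to_closed_pipe() -> str:
    """Write to a pipe whose read end is closed and report what happens.

    Hypothetical helper for illustration only; it is not how lp works
    internally, it just reproduces the same error condition.
    """
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # the reader goes away
    try:
        # With no reader left, the kernel raises SIGPIPE for this write.
        # Python ignores SIGPIPE by default, so the call fails with EPIPE,
        # which Python reports as BrokenPipeError.
        os.write(write_fd, b"print job data\n")
        return "write succeeded"
    except BrokenPipeError:
        return "broken pipe"
    finally:
        os.close(write_fd)


if __name__ == "__main__":
    print(write_to_closed_pipe())
```

Restarting lp, as suggested above, may clear a stale reader, but it does nothing about the [AFT0] memory messages in the same log.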
#3
06-21-2005
pressy (Forum Staff)
solaris cultist
Join Date: Aug 2003
Location: Vienna / Austria (Europe) [EARTH]
Posts: 726
Well, it says that your memory is dying:

UDBH Syndrome 0xc8 Memory Module Board 7 J3300
[AFT0] errID 0x000054d0.90e01b51 Corrected Memory Error on Board 7 J3300 is Intermittent
[AFT0] errID 0x000054d0.90e01b51 ECC Data Bit 3 was in error and corrected

That's a memory bank on one of your system boards:
http://www.sun.com/products-n-soluti...02-5032-15.pdf

gP
#4
06-21-2005
blowtorch (Forum Advisor)
Supporter
Join Date: Dec 2004
Location: Singapore
Posts: 2,350
Oops, Pressy, I missed the forest for the trees there!
#5
06-21-2005
rhfrommn (Forum Advisor)
Registered User
Join Date: Nov 2003
Location: Minnesota
Posts: 424
Unfortunately, due to one of my previous employers thinking that buying production servers from eBay was a good idea, I have TONS of experience with this kind of error.

This is definitely a memory error, on DIMM J3300 on system board 7. If you notice, it reports CPUs seeing errors in several places, but it is a different CPU each time. If a CPU were failing, it would always be the same one. But each time, the memory module the error came from is the same one, which tells you that module is the root cause of the error.
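The same check can be done mechanically on a longer log. The following is a rough sketch in Python (my choice of language, not something from the thread) that counts which CPUs reported corrected errors and which memory module each report points at. The sample lines are copied from the output in post #1; the regular expressions and the function name are assumptions based only on that sample, so other Solaris releases or platforms may word these messages differently.

```python
import re
from collections import Counter

# Lines copied from the syslog output in post #1.
CE_LOG = """\
[AFT0] Corrected Memory Error on CPU18, errID 0x00005869.ace8892a
UDBH Syndrome 0xc8 Memory Module Board 7 J3300
[AFT0] Corrected Memory Error on CPU19, errID 0x000058c4.581f31c0
UDBH Syndrome 0xc8 Memory Module Board 7 J3300
[AFT0] Corrected Memory Error on CPU14, errID 0x0000597e.b5a4f601
UDBH Syndrome 0xc8 Memory Module Board 7 J3300
"""

# Patterns are guesses derived from the sample above.
CPU_RE = re.compile(r"Corrected Memory Error on (CPU\d+)")
MODULE_RE = re.compile(r"Syndrome \S+ Memory Module (Board \d+ J\d+)")


def tally(log_text):
    """Count corrected-error reports per reporting CPU and per memory module."""
    cpus = Counter(CPU_RE.findall(log_text))
    modules = Counter(MODULE_RE.findall(log_text))
    return cpus, modules


if __name__ == "__main__":
    cpus, modules = tally(CE_LOG)
    print("CPUs reporting:", dict(cpus))
    print("Modules blamed:", dict(modules))
```

Several different CPUs and a single module in the second count is the pattern that points at the DIMM rather than at a processor.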

Also note that at the top of your output the error was intermittent, but by the bottom the error message says it is persistent. This isn't a good sign . . . Solaris can accommodate occasional memory errors, but if the same DIMM keeps producing them like that, you'll panic your box eventually. I would either replace that memory, or at least remove that bank and run with less memory. Better to run short than crash your box.
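To watch for that change from intermittent to persistent over time, a small Python sketch like the one below could list the classifications per module in the order they were logged. The sample lines are again taken from post #1; the pattern and the function name are assumptions made for illustration and are not a Sun-provided tool.

```python
import re

# Classification lines copied from the syslog output in post #1, in log order.
STATUS_LOG = """\
[AFT0] errID 0x000054d0.90e01b51 Corrected Memory Error on Board 7 J3300 is Intermittent
[AFT0] errID 0x00005869.ace8892a Corrected Memory Error on Board 7 J3300 is Persistent
[AFT0] errID 0x000058c4.581f31c0 Corrected Memory Error on Board 7 J3300 is Persistent
[AFT0] errID 0x000058f8.deed0852 Corrected Memory Error on Board 7 J3300 is Persistent
[AFT0] errID 0x0000597e.b5a4f601 Corrected Memory Error on Board 7 J3300 is Persistent
"""

# Pattern is a guess derived from the sample above.
STATUS_RE = re.compile(
    r"errID \S+ Corrected Memory Error on (Board \d+ J\d+) is (\w+)"
)


def classification_history(log_text):
    """Return {module: [status, ...]} with statuses in the order logged."""
    history = {}
    for module, status in STATUS_RE.findall(log_text):
        history.setdefault(module, []).append(status)
    return history


if __name__ == "__main__":
    for module, statuses in classification_history(STATUS_LOG).items():
        print(module, "->", ", ".join(statuses))
```

A module whose history ends in a run of Persistent entries is the one to pull or replace first.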
All times are GMT -4.