Solaris hardware error


 
# 1  
Old 06-21-2005
Solaris hardware error

Hi guys,

I need some help with this error message.
I'm running Solaris 2.6 on an E3500, and lately I've been getting these errors:

lp[28679]: Warning: Received SIGPIPE; continuing
last message repeated 1 time
[AFT0] Multiple Softerrors:
2 Intermittent, 4 Persistent, and 0 Sticky Softerrors accumulated
from Memory Module Board 7 J3300
[AFT0] Enabling verbose CE messages.
[AFT0] errID 0x000054d0.90e01b51 Corrected Memory Error on Board 7 J3300 is Intermittent
[AFT0] errID 0x000054d0.90e01b51 ECC Data Bit 3 was in error and corrected
lp[29236]: Warning: Received SIGPIPE; continuing
lp[28869]: Warning: Received SIGPIPE; continuing
lp[29257]: Warning: Received SIGPIPE; continuing
lp[29506]: Warning: Received SIGPIPE; continuing
lp[29647]: Warning: Received SIGPIPE; continuing
last message repeated 1 time
[AFT0] Corrected Memory Error on CPU18, errID 0x00005869.ace8892a
AFSR 0x00000000.00100000<CE> AFAR 0x00000000.88857730
AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0x1000c4e8
UDBH Syndrome 0xc8 Memory Module Board 7 J3300
[AFT0] errID 0x00005869.ace8892a Corrected Memory Error on Board 7 J3300 is Persistent
[AFT0] errID 0x00005869.ace8892a ECC Data Bit 3 was in error and corrected
[AFT0] Corrected Memory Error on CPU19, errID 0x000058c4.581f31c0
AFSR 0x00000000.00100000<CE> AFAR 0x00000000.88857730
AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0x1000c4e8
UDBH Syndrome 0xc8 Memory Module Board 7 J3300
[AFT0] errID 0x000058c4.581f31c0 Corrected Memory Error on Board 7 J3300 is Persistent
[AFT0] errID 0x000058c4.581f31c0 ECC Data Bit 3 was in error and corrected
AFSR 0x00000000.00100000<CE> AFAR 0x00000000.88857730
AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0x1000c4e8
UDBH Syndrome 0xc8 Memory Module Board 7 J3300
[AFT0] errID 0x000058f8.deed0852 Corrected Memory Error on Board 7 J3300 is Persistent
[AFT0] errID 0x000058f8.deed0852 ECC Data Bit 3 was in error and corrected
[AFT0] Corrected Memory Error on CPU14, errID 0x0000597e.b5a4f601
AFSR 0x00000000.00100000<CE> AFAR 0x00000000.88857730
AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0x1000c4e8
UDBH Syndrome 0xc8 Memory Module Board 7 J3300
[AFT0] errID 0x0000597e.b5a4f601 Corrected Memory Error on Board 7 J3300 is Persistent
[AFT0] errID 0x0000597e.b5a4f601 ECC Data Bit 3 was in error and corrected
lp[3024]: Warning: Received SIGPIPE; continuing
lp[3185]: Warning: Received SIGPIPE; continuing
last message repeated 1 time
lp[5781]: Warning: Received SIGPIPE; continuing
lp[5885]: Warning: Received SIGPIPE; continuing
lp[5845]: Warning: Received SIGPIPE; continuing
lp[5872]: Warning: Received SIGPIPE; continuing
last message repeated 1 time
lp[7756]: Warning: Received SIGPIPE; continuing
lp[8184]: Warning: Received SIGPIPE; continuing




Can anyone tell me what's wrong with the machine? I can't go to SunSolve because I don't have a Sun contract account. It looks like a memory error.

Thanks in advance.

Last edited by giriplug; 06-21-2005 at 03:43 AM..
# 2  
Old 06-21-2005
Quote:
Quote from docs.sun.com - man signal.h
Signal    No.   Default Action   Reason
SIGPIPE   13    Exit             Broken Pipe
Now from your syslog output,
Quote:
Originally Posted by giriplug
lp[5781]: Warning: Received SIGPIPE; continuing
lp[5885]: Warning: Received SIGPIPE; continuing
lp[5845]: Warning: Received SIGPIPE; continuing
lp[5872]: Warning: Received SIGPIPE; continuing
SIGPIPE is delivered when a process writes to a pipe that no longer has a reader, so lp is probably writing to something that has gone away. You could try shutting down lp and starting it up again.
This definitely does not look like a hardware problem.
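
For reference, here is a small Python sketch of how a broken pipe arises: a write to a pipe whose reading end has already been closed. Python ignores SIGPIPE by default and reports the condition as BrokenPipeError instead, so this only illustrates the mechanism and is not what lp itself does. The function name is made up for illustration.

```python
import os


def write_to_closed_pipe():
    """Write to a pipe whose read end is closed and report what happens."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # no reader is left on the pipe
    try:
        os.write(write_fd, b"print job data")
        return "write succeeded"
    except BrokenPipeError:
        # Same condition that delivers SIGPIPE to a process
        # that has not chosen to ignore the signal.
        return "broken pipe"
    finally:
        os.close(write_fd)


print(write_to_closed_pipe())
```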
# 3  
Old 06-21-2005
Well, it says that your memory is dying:

UDBH Syndrome 0xc8 Memory Module Board 7 J3300
[AFT0] errID 0x000054d0.90e01b51 Corrected Memory Error on Board 7 J3300 is Intermittent
[AFT0] errID 0x000054d0.90e01b51 ECC Data Bit 3 was in error and corrected

That's a memory bank on one of your system boards:
http://www.sun.com/products-n-soluti...02-5032-15.pdf

gP
# 4  
Old 06-21-2005
Oops, Pressy, I missed the forest for the trees there!
# 5  
Old 06-21-2005
Unfortunately, because one of my previous employers thought that buying production servers from eBay was a good idea, I have TONS of experience with this kind of error.

This is definitely a memory error, on DIMM J3300 on system board 7. Notice that several CPUs report errors, but it is a different CPU each time. If a CPU were failing, it would always be the same one. Every report, however, names the same memory module as the source of the error, which tells you that module is the root cause.
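
A rough sketch of that reasoning in Python, assuming the syslog excerpt is available as plain text. The regular expressions are guesses based only on the lines quoted in this thread, and the names (`tally`, `SAMPLE_LOG`) are made up for illustration.

```python
import re
from collections import Counter

# A few lines copied from the syslog output in the first post.
SAMPLE_LOG = """\
[AFT0] Corrected Memory Error on CPU18, errID 0x00005869.ace8892a
UDBH Syndrome 0xc8 Memory Module Board 7 J3300
[AFT0] Corrected Memory Error on CPU19, errID 0x000058c4.581f31c0
UDBH Syndrome 0xc8 Memory Module Board 7 J3300
[AFT0] Corrected Memory Error on CPU14, errID 0x0000597e.b5a4f601
UDBH Syndrome 0xc8 Memory Module Board 7 J3300
"""

# Patterns assumed from the sample lines only, not from any Sun documentation.
CPU_RE = re.compile(r"Corrected Memory Error on (CPU\d+),")
MODULE_RE = re.compile(r"Memory Module (Board \d+ J\d+)")


def tally(log_text):
    """Count how often each CPU and each memory module is named."""
    cpus = Counter(CPU_RE.findall(log_text))
    modules = Counter(MODULE_RE.findall(log_text))
    return cpus, modules


cpus, modules = tally(SAMPLE_LOG)
print("CPUs reporting errors:", dict(cpus))
print("Modules named:", dict(modules))
```

Many different CPUs against a single module points at the module rather than at any one CPU.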

Also note that at the top of your output the error was intermittent, but by the bottom it is reported as persistent. This isn't a good sign. Solaris can accommodate occasional memory errors, but if the same DIMM keeps doing it like that, you'll panic your box eventually. I would either replace that memory, or at least remove that bank and run with less memory. Better to run short than crash your box.
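
If you want to watch for that Intermittent-to-Persistent drift without reading the log by eye, something like the following Python sketch could work. It assumes the classification lines look exactly like the ones quoted above; the "needs attention" rule is only my rule of thumb, not anything Sun documents, and the function names are invented for the example.

```python
import re

# Classification lines copied from the syslog output in the first post.
SAMPLE_LOG = """\
[AFT0] errID 0x000054d0.90e01b51 Corrected Memory Error on Board 7 J3300 is Intermittent
[AFT0] errID 0x00005869.ace8892a Corrected Memory Error on Board 7 J3300 is Persistent
[AFT0] errID 0x000058c4.581f31c0 Corrected Memory Error on Board 7 J3300 is Persistent
[AFT0] errID 0x000058f8.deed0852 Corrected Memory Error on Board 7 J3300 is Persistent
[AFT0] errID 0x0000597e.b5a4f601 Corrected Memory Error on Board 7 J3300 is Persistent
"""

# Pattern assumed from the sample lines only.
CLASS_RE = re.compile(
    r"Corrected Memory Error on (Board \d+ J\d+) is "
    r"(Intermittent|Persistent|Sticky)"
)


def classification_history(log_text):
    """Return {module: [classification, ...]} in log order."""
    history = {}
    for module, kind in CLASS_RE.findall(log_text):
        history.setdefault(module, []).append(kind)
    return history


def needs_attention(kinds):
    """Rule of thumb (an assumption): anything beyond Intermittent is worth acting on."""
    return any(kind in ("Persistent", "Sticky") for kind in kinds)


for module, kinds in classification_history(SAMPLE_LOG).items():
    print(module, kinds, "-> replace or pull" if needs_attention(kinds) else "-> watch")
```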
 