Solaris hardware error
Hi guys,
I need some help with this error message. I'm running Solaris 2.6 on an E3500, and lately I've been seeing this error:
lp[28679]: Warning: Received SIGPIPE; continuing
last message repeated 1 time
[AFT0] Multiple Softerrors: 2 Intermittent, 4 Persistent, and 0 Sticky Softerrors accumulated from Memory Module Board 7 J3300
[AFT0] Enabling verbose CE messages.
[AFT0] errID 0x000054d0.90e01b51 Corrected Memory Error on Board 7 J3300 is Intermittent
[AFT0] errID 0x000054d0.90e01b51 ECC Data Bit 3 was in error and corrected
lp[29236]: Warning: Received SIGPIPE; continuing
lp[28869]: Warning: Received SIGPIPE; continuing
lp[29257]: Warning: Received SIGPIPE; continuing
lp[29506]: Warning: Received SIGPIPE; continuing
lp[29647]: Warning: Received SIGPIPE; continuing
last message repeated 1 time
[AFT0] Corrected Memory Error on CPU18, errID 0x00005869.ace8892a
    AFSR 0x00000000.00100000<CE> AFAR 0x00000000.88857730
    AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0x1000c4e8
    UDBH Syndrome 0xc8 Memory Module Board 7 J3300
[AFT0] errID 0x00005869.ace8892a Corrected Memory Error on Board 7 J3300 is Persistent
[AFT0] errID 0x00005869.ace8892a ECC Data Bit 3 was in error and corrected
[AFT0] Corrected Memory Error on CPU19, errID 0x000058c4.581f31c0
    AFSR 0x00000000.00100000<CE> AFAR 0x00000000.88857730
    AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0x1000c4e8
    UDBH Syndrome 0xc8 Memory Module Board 7 J3300
[AFT0] errID 0x000058c4.581f31c0 Corrected Memory Error on Board 7 J3300 is Persistent
[AFT0] errID 0x000058c4.581f31c0 ECC Data Bit 3 was in error and corrected
    AFSR 0x00000000.00100000<CE> AFAR 0x00000000.88857730
    AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0x1000c4e8
    UDBH Syndrome 0xc8 Memory Module Board 7 J3300
[AFT0] errID 0x000058f8.deed0852 Corrected Memory Error on Board 7 J3300 is Persistent
[AFT0] errID 0x000058f8.deed0852 ECC Data Bit 3 was in error and corrected
[AFT0] Corrected Memory Error on CPU14, errID 0x0000597e.b5a4f601
    AFSR 0x00000000.00100000<CE> AFAR 0x00000000.88857730
    AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0x1000c4e8
    UDBH Syndrome 0xc8 Memory Module Board 7 J3300
[AFT0] errID 0x0000597e.b5a4f601 Corrected Memory Error on Board 7 J3300 is Persistent
[AFT0] errID 0x0000597e.b5a4f601 ECC Data Bit 3 was in error and corrected
lp[3024]: Warning: Received SIGPIPE; continuing
lp[3185]: Warning: Received SIGPIPE; continuing
last message repeated 1 time
lp[5781]: Warning: Received SIGPIPE; continuing
lp[5885]: Warning: Received SIGPIPE; continuing
lp[5845]: Warning: Received SIGPIPE; continuing
lp[5872]: Warning: Received SIGPIPE; continuing
last message repeated 1 time
lp[7756]: Warning: Received SIGPIPE; continuing
lp[8184]: Warning: Received SIGPIPE; continuing
Is there anyone out there who can tell me what's wrong with the machine? I can't go to SunSolve because I don't have a Sun contract account. It looks like a memory error.
Thanks in advance.
Last edited by giriplug; 06-21-2005 at 03:43 AM.
|
|||||
|
Well, it says that your memory is dying:
    UDBH Syndrome 0xc8 Memory Module Board 7 J3300
[AFT0] errID 0x000054d0.90e01b51 Corrected Memory Error on Board 7 J3300 is Intermittent
[AFT0] errID 0x000054d0.90e01b51 ECC Data Bit 3 was in error and corrected
That's a memory bank on one of your system boards. See: http://www.sun.com/products-n-soluti...02-5032-15.pdf
gP
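For anyone wanting to see how the classifications in the posted log break down, here is a minimal Python sketch that counts Intermittent, Persistent, and Sticky reports per memory module. The sample lines are copied from the first post; the regular expression and the function name `classify_counts` are assumptions inferred from those lines, not anything taken from Sun documentation.

```python
import re
from collections import Counter

# Sample lines copied from the log in the first post.
LOG = """\
[AFT0] errID 0x000054d0.90e01b51 Corrected Memory Error on Board 7 J3300 is Intermittent
[AFT0] errID 0x00005869.ace8892a Corrected Memory Error on Board 7 J3300 is Persistent
[AFT0] errID 0x000058c4.581f31c0 Corrected Memory Error on Board 7 J3300 is Persistent
[AFT0] errID 0x000058f8.deed0852 Corrected Memory Error on Board 7 J3300 is Persistent
[AFT0] errID 0x0000597e.b5a4f601 Corrected Memory Error on Board 7 J3300 is Persistent
"""

# Pattern inferred from the lines above (an assumption, not an official format).
CLASSIFIED = re.compile(
    r"Corrected Memory Error on Board (\d+) (\S+) is (Intermittent|Persistent|Sticky)"
)


def classify_counts(text):
    """Count corrected-error classifications per (board, module, kind)."""
    counts = Counter()
    for board, module, kind in CLASSIFIED.findall(text):
        counts[(int(board), module, kind)] += 1
    return counts


counts = classify_counts(LOG)
for (board, module, kind), n in sorted(counts.items()):
    print(f"board {board} {module}: {n} {kind}")
```

Run against a full messages file instead of the inline sample, a tally like this makes it easier to spot a module whose reports keep piling up.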
|
||||
|
Unfortunately, due to one of my previous employers thinking that buying production servers from eBay was a good idea, I have TONS of experience with this kind of error.
This is definitely a memory error, on DIMM J3300 on system board 7. If you notice, it reports CPUs seeing errors in several places, but it is a different CPU each time. If a CPU were failing, it would always be the same one. But each time, the memory module the error came from is the same one, which tells you that the module is the root cause of the error.
Also note that at the top of your output the error was intermittent, but by the bottom the messages say it is persistent. This isn't a good sign. Solaris can accommodate occasional memory errors, but if the same DIMM keeps doing it like that, you'll panic your box eventually. I would either replace that memory, or at least remove that bank and run with less memory. Better to run short than crash your box.
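The reasoning above (different CPUs each time, same memory module every time) can be checked mechanically. The following is a rough Python sketch, not a Sun diagnostic tool: it pairs each "Corrected Memory Error on CPUnn" header with the "Memory Module Board" line that follows it and groups the reporting CPUs by module. The sample text is condensed from the log in the first post with the AFSR/AFAR lines left out, the one-message-per-line layout is assumed, and the regular expressions and the name `cpus_per_module` are illustrative assumptions.

```python
import re
from collections import defaultdict

# Condensed from the log in the first post (AFSR/AFAR lines omitted).
# The one-message-per-line layout is an assumption.
LOG = """\
[AFT0] Multiple Softerrors: 2 Intermittent, 4 Persistent, and 0 Sticky Softerrors accumulated from Memory Module Board 7 J3300
[AFT0] Corrected Memory Error on CPU18, errID 0x00005869.ace8892a
    UDBH Syndrome 0xc8 Memory Module Board 7 J3300
[AFT0] Corrected Memory Error on CPU19, errID 0x000058c4.581f31c0
    UDBH Syndrome 0xc8 Memory Module Board 7 J3300
[AFT0] Corrected Memory Error on CPU14, errID 0x0000597e.b5a4f601
    UDBH Syndrome 0xc8 Memory Module Board 7 J3300
"""

# Patterns inferred from the posted log, not from official documentation.
CPU_RE = re.compile(r"Corrected Memory Error on (CPU\d+), errID (\S+)")
MODULE_RE = re.compile(r"Memory Module Board (\d+) (J\d+)")


def cpus_per_module(text):
    """Map (board, module) to the set of CPUs that reported errors from it."""
    seen = defaultdict(set)
    cpu = None  # CPU named by the most recent header line, if any
    for line in text.splitlines():
        header = CPU_RE.search(line)
        if header:
            cpu = header.group(1)
            continue
        module = MODULE_RE.search(line)
        if module and cpu is not None:
            seen[(int(module.group(1)), module.group(2))].add(cpu)
            cpu = None  # wait for the next header before pairing again
    return seen


result = cpus_per_module(LOG)
for (board, module), cpus in sorted(result.items()):
    print(f"board {board} {module}: reported by {', '.join(sorted(cpus))}")
    if len(cpus) > 1:
        # Heuristic from the reply above: several CPUs, one module.
        print("  several CPUs point at the same module; suspect the module")
```

If the grouping instead showed one CPU reporting errors from many different modules, the same logic would point at the CPU rather than the memory.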