CPU0 error and anything else??


 
# 1  
Old 03-26-2008

Dear Expert,
My English is not very good, but we have a problem with our server, as shown below:


Mar 11 06:51:00 SUNW,UltraSPARC-II: [ID 785325 kern.info] [AFT3] errID 0x00000198.aa7aa31c Above Error is in User Mode
Mar 11 06:51:00 unix: [ID 855177 kern.warning] WARNING: [AFT1] initiating reboot due to above error in pid 1635 (sar)
Mar 11 06:51:04 unix: [ID 221039 kern.notice] NOTICE: Previously reported error on page 0x00000000.15fca000 cleared


Can anybody explain these messages and what I can do to fix the problem? Thanks for your help and attention.

warm regards

fredginting

Last edited by fredginting; 04-11-2008 at 04:33 AM..
# 2  
Old 03-26-2008
Looks like a memory error... what kind of hardware is it?

please post the output of:

Code:
# prtdiag -v
# psrinfo -v
# uname -a

# 3  
Old 03-26-2008
Thanks for your reply and attention.
Here is the output of psrinfo -v:
psrinfo -v
Status of processor 0 as of: 03/26/08 16:24:35
Processor has been on-line since 03/26/08 16:21:41.
The sparcv9 processor operates at 400 MHz,
and has a sparcv9 floating point processor.
Status of processor 1 as of: 03/26/08 16:24:35
Processor has been on-line since 03/26/08 16:21:42.
The sparcv9 processor operates at 400 MHz,
and has a sparcv9 floating point processor.

#prtdiag -v
System Configuration: Sun Microsystems sun4u Sun (TM) Enterprise 250 (2 X UltraSPARC-II 400MHz)
System clock frequency: 100 MHz
Memory size: 512 Megabytes

========================= CPUs =========================

                    Run   Ecache   CPU    CPU
Brd  CPU   Module   MHz     MB    Impl.   Mask
---  ---  -------  -----  ------  ------  ----
SYS   0      0      400    2.0    US-II   10.0
SYS   1      1      400    2.0    US-II   10.0


========================= Memory =========================

       Interlv.  Socket   Size
Bank    Group     Name    (MB)  Status
----    -----    ------   ----  ------
  0     none     U0701    128     OK
  0     none     U0801    128     OK
  0     none     U0901    128     OK
  0     none     U1001    128     OK


========================= IO Cards =========================

     Bus   Freq
Brd  Type  MHz   Slot  Name                              Model
---  ----  ----  ----  --------------------------------  ----------------------
SYS  PCI    33     0   TSI,gfxp                          GFXP

No failures found in System
===========================

========================= Environmental Status =========================

System Temperatures (Celsius):
------------------------------
CPU0 40
CPU1 40
MB0 30
MB1 26
PDB 30
SCSI 25

=================================

Front Status Panel:
-------------------
Keyswitch position is in On mode.

System LED Status:    DISK ERROR          POWER
                        [OFF]             [ ON]
                 POWER SUPPLY ERROR      ACTIVITY
                        [OFF]             [ ON]
                    GENERAL ERROR      THERMAL ERROR
                        [OFF]             [OFF]

=================================

Disk LED Status: OK = GREEN ERROR = YELLOW
DISK 5: [EMPTY] DISK 3: [EMPTY] DISK 1: [EMPTY]
DISK 4: [EMPTY] DISK 2: [EMPTY] DISK 0: [OK]

=================================

Fan Bank :
----------

Bank      Speed     Status
         (0-255)
----      -----     ------
 SYS       162        OK

=================================

Power Supplies:
---------------

Supply Status
------ ------
0 OK

========================= HW Revisions =========================

ASIC Revisions:
---------------
STP2223BGA: Rev 4
STP2003QFP: Rev 1

System PROM revisions:
----------------------
OBP 3.30.0 2003/11/11 10:37 POST 6.1.0 2003/11/11 10:49

#uname -i
SUNW,Ultra-250

Last edited by fredginting; 04-11-2008 at 04:35 AM..
# 4  
Old 03-26-2008
That doesn't look bad... maybe (if you can reboot) run an extended POST to check whether the hardware reports any errors.
# 5  
Old 03-26-2008
Quote:
Originally Posted by fredginting
kern.warning] WARNING: [AFT1] WP event on CPU0, errID 0x00000198.a9bb1655
Mar 11 06:51:00 PALEMBANG4 SUNW,UltraSPARC-II: [ID 881188 kern.warning] WARNING: [AFT1] Uncorrectable Memory Error on CPU0 Data access at TL=0, errID 0x00000198.aa7aa31c
Mar 11 06:51:00 PALEMBANG4 SUNW,UltraSPARC-II: [ID 229985 kern.warning] WARNING: [AFT1] errID 0x00000198.aa7aa31c Syndrome 0x3 indicates that this may not be a memory module problem
The lines above indicate that there is probably a hardware failure on CPU0, since the message for that errID says it may not be a memory module problem. There's a good chance you need to replace the failing CPU. I've seen this exact type of error on many old UltraSPARC-II boxes, and that was almost always the cause.

The fact that it caused your box to reboot is another hint it is probably hardware. If it was a one-time correctable memory error the kernel should have caught it and fixed the error without crashing the box.

If you want to, try upgrading to the latest kernel patch for whichever version of Solaris you're running. As the US-II platform was getting older, Sun did come up with some much better error checking and fixing routines and incorporated them into the Solaris 8 and 9 kernel patches. But if it truly is a hardware failure, that won't help.
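
If you want to see how often this happens and whether it is always the same CPU, a small script can group the [AFT] lines by errID. The following is only a rough sketch in Python, assuming the messages look like the ones quoted in this thread. The function name `summarize_aft`, the regular expressions, and the result layout are my own inventions for illustration and are not part of any Solaris tool. The first sample line is truncated exactly as it appears in the quote above.

```python
import re
from collections import defaultdict

# Sample lines copied from the messages quoted in this thread.
# The first line is truncated in the thread and is kept that way here.
SAMPLE_LOG = """\
kern.warning] WARNING: [AFT1] WP event on CPU0, errID 0x00000198.a9bb1655
Mar 11 06:51:00 PALEMBANG4 SUNW,UltraSPARC-II: [ID 881188 kern.warning] WARNING: [AFT1] Uncorrectable Memory Error on CPU0 Data access at TL=0, errID 0x00000198.aa7aa31c
Mar 11 06:51:00 PALEMBANG4 SUNW,UltraSPARC-II: [ID 229985 kern.warning] WARNING: [AFT1] errID 0x00000198.aa7aa31c Syndrome 0x3 indicates that this may not be a memory module problem
Mar 11 06:51:00 SUNW,UltraSPARC-II: [ID 785325 kern.info] [AFT3] errID 0x00000198.aa7aa31c Above Error is in User Mode
Mar 11 06:51:00 unix: [ID 855177 kern.warning] WARNING: [AFT1] initiating reboot due to above error in pid 1635 (sar)
"""

# Patterns are assumptions based only on the message text shown above.
ERRID_RE = re.compile(r"errID (0x[0-9a-fA-F]+\.[0-9a-fA-F]+)")
CPU_RE = re.compile(r"\bon CPU(\d+)")
SYNDROME_RE = re.compile(r"Syndrome (0x[0-9a-fA-F]+)")
NOT_MEMORY_HINT = "may not be a memory module problem"


def summarize_aft(log_text):
    """Group log lines by errID and collect CPU numbers, syndromes,
    and whether the kernel printed the 'not a memory module' hint."""
    events = defaultdict(
        lambda: {"cpus": set(), "syndromes": set(),
                 "not_memory_hint": False, "lines": 0}
    )
    for line in log_text.splitlines():
        errid = ERRID_RE.search(line)
        if errid is None:
            continue  # lines without an errID (e.g. the reboot notice) are skipped
        event = events[errid.group(1)]
        event["lines"] += 1
        cpu = CPU_RE.search(line)
        if cpu:
            event["cpus"].add(int(cpu.group(1)))
        syndrome = SYNDROME_RE.search(line)
        if syndrome:
            event["syndromes"].add(syndrome.group(1))
        if NOT_MEMORY_HINT in line:
            event["not_memory_hint"] = True
    return dict(events)


if __name__ == "__main__":
    for errid, event in summarize_aft(SAMPLE_LOG).items():
        print(errid, sorted(event["cpus"]), sorted(event["syndromes"]),
              "not-memory hint" if event["not_memory_hint"] else "")
```

To use it on the real box you would read the text of /var/adm/messages instead of `SAMPLE_LOG`. If the same CPU keeps showing up across different errIDs, that supports the CPU module being the culprit, but it is only a hint and does not replace proper hardware diagnostics.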