Running Sun Fire V480R on a single cpu-Urgent


 
# 1  
Old 01-25-2010
Running Sun Fire V480R on a single cpu-Urgent

I have a Sun Fire V480R server with 2 CPUs. One of the CPUs has become faulty, and I want to run the server on the remaining CPU. My service provider says it can't be done and that the motherboard has to be replaced. Can it be done? If so, do any settings need to be changed in Solaris? I have searched for documentation on the net but found nothing. I would appreciate any help.
Thanks
# 2  
Old 01-25-2010
That depends on the error. Maybe they noticed that the motherboard itself has failed?
What are you asking here without any output? How would we know that what you're saying is true, that the CPU has failed?
I think what they say is true. The board comes with the processor, if I'm not wrong. You will only need to swap the DIMMs over to the new board.
# 3  
Old 01-25-2010
A 'prtdiag -v' would help! Did you look in /var/adm/messages? What kind of errors does that list?

You can offline CPUs in Solaris; take a look at the man page for 'psradm'.
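
For reference, a minimal sketch of what taking a processor offline looks like from a running Solaris system. It assumes the suspect processor ID is 0 (a placeholder; use whatever ID `psrinfo` reports on your box). `psrinfo` and `psradm -f` are the standard Solaris tools; the `offline_cpu` wrapper and its dry-run default are my own illustration, not part of Solaris.

```shell
#!/bin/sh
# Sketch only: print (or run) the Solaris commands for taking one CPU offline.
# The CPU ID 0 used at the bottom is an assumed value; check psrinfo first.
# offline_cpu is a hypothetical helper, not a Solaris tool.

offline_cpu() {
    cpu_id=$1
    mode=${2:-dry-run}          # default: only print the commands

    # Reject empty or non-numeric processor IDs.
    case $cpu_id in
        ''|*[!0-9]*)
            echo "usage: offline_cpu <numeric-cpu-id> [run]" >&2
            return 2
            ;;
    esac

    # 1. show processor state, 2. take the CPU offline, 3. confirm the change
    for cmd in "psrinfo -v" "psradm -f $cpu_id" "psrinfo"; do
        if [ "$mode" = run ]; then
            $cmd || return 1    # psradm needs root privileges
        else
            echo "$cmd"
        fi
    done
}

# Dry run: prints the three commands without executing them.
offline_cpu 0
```

Calling `offline_cpu 0 run` as root would actually execute the commands, and `psradm -n 0` brings the processor back online. Note that this only helps if the machine still gets through POST and boots Solaris; it cannot work around a CPU that the firmware refuses to start.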
# 4  
Old 01-26-2010
Quote:
Originally Posted by mainegeek
A 'prtdiag -v' would help! Did you look in /var/adm/messages? What kind of errors does that list?

You can offline CPUs in Solaris; take a look at the man page for 'psradm'.
Exactly what I'm looking for: the errors in the messages file.
# 5  
Old 01-26-2010
Thanks for the replies. This is the diagnostic log:

Code:
<*>
Hardware Power On
FATAL: ???power-on-reset with FATAL flags set

@(#)OBP 4.13.0 2004/01/19 18:26 Sun Fire 480R
Front Panel Keyswitch is in Diagnostic position.
Online: CPU2*
Validating JTAG integrity...Done
Disabling DAR error circuitry...Done
Clearing DCS error circuitry state...Done
Initializing DTL circuitry state...Done
Initializing CDX via JTAG...Done
Enabling DAR error circuitry...Done

Probing Centerplane....part# 501-6790-02 serial# 014892
  Safari min 100MHz, cumulative 100MHz;  max 150MHz, cumulative 150MHz
  'STICK' clock 10MHz; BootBus timing 014f.99fd.a7e6.3f29
Probing I/O Riser......part# 501-5820-04 serial# 075970
Probing System RSC.....part# 501-5856-06 serial# 273318
Probing PwrDistBoard...part# 375-3006-05 serial# M79563
Probing PowerSupply0...part# 300-1480-05 serial# N62553
Probing PowerSupply1...part# 300-1480-05 serial# N62464
Probing FCAL BPlane0...part# 501-5822-04 serial# 078804
Probing GPTwo Slot A...part# 501-6164-02 serial# 071451
  Safari min 100MHz, cumulative 100MHz;  max 150MHz, cumulative 150MHz
  CPU rated speed 1200MHz; ECache 8MB 3.3ns
Probing GPTwo Slot B...No module detected

Desired Safari Bus speed 150MHz, selecting 150MHz
Configuring CPUs..........
... CPU2 Rated Speed 1200MHz, Safari 150MHz, want 8:1, got 8:1 ==> CPU 1200MHz
         Ecache 8MB 3.3ns mode=5-4-4 2-way ECCR: 0000.0000.0343.4c00 Done
Setting system speed (and resetting)...
<*>
Set Speed Reset

@(#)OBP 4.13.0 2004/01/19 18:26 Sun Fire 480R
Front Panel Keyswitch is in Diagnostic position.
Online: *CPU2 Ultra-III+ (v11.1) 8:1 1200MHz 8MB 4:1 ECache 
Executing Power On SelfTest w/%o0 = 0000.0000.0001.4042
2:0>
2:0>@(#) Sun Fire[TM] V480 POST 4.13.0 2004/02/12 19:17 
       /export/common-source/firmware_re/post/post-build-4.13.0/Camelot/cstone/integrated  (firmware_re)  
2:0>Copyright © 2004 Sun Microsystems, Inc. All rights reserved
  SUN PROPRIETARY/CONFIDENTIAL.
  Use is subject to license terms.
2:0>Jump from OBP->POST.
2:0>Keyswitch in DIAGNOSTIC POSITION.
2:0>Diag level set to MAX.
2:0>Verbosity level set to 0.
2:0>MFG scrpt mode set NORM 
2:0>I/O port set to serial TTYA.
2:0>
2:0>Start selftest...
2:0>CPUs present in system: 2:0
2:0>Test CPU(s).....
2:0>Init CPU
2:0>    UltraSparc_III_plus Version 11.1
2:0>DMMU Registers Access
2:0>DMMU TLB DATA RAM Access
2:0>DMMU TLB TAGS Access
2:0>IMMU Registers Access
2:0>IMMU TLB DATA RAM Access
2:0>IMMU TLB TAGS Access
2:0>Probe Ecache
2:0>    Size = 00000000.00800000...
2:0>Ecache Data Bitwalk
2:0>Ecache Address Bitwalk
2:0>Scrub and Setup Ecache
2:0>Setup and Enable DMMU
2:0>Setup DMMU Miss Handler
2:0>Test and Init Temp Mailbox
2:0>
2:0>ERROR: TEST = Check cpu synch var CPU 0
2:0>H/W under test = CPU0 Basic, Motherboard/Centerplane
2:0>Repair Instructions: Replace items in order listed by 'H/W under test' above.
2:0>MSG = CPU 0:0 failed BBC SRAM access, offline cpu.
2:0>END_ERROR
2:0>
2:0>ERROR: TEST = Check cpu synch var CPU 0
2:0>H/W under test = CPU0 Basic, Motherboard/Centerplane
2:0>Repair Instructions: Replace items in order listed by 'H/W under test' above.
2:0>MSG = 
     *** Test Failed!! ***
2:0>END_ERROR
2:0>
2:0>ERROR: TEST = Check cpu synch var CPU 0
2:0>H/W under test = CPU0 Basic, Motherboard/Centerplane
2:0>Repair Instructions: Replace items in order listed by 'H/W under test' above.
2:0>MSG = ERROR:    Fatal CPU error on master, rolling over to new master.
2:0>END_ERROR
2:0>Soft Reset.
2:0>
2:0>ERROR: TEST = Power on Reset Initialization
2:0>H/W under test = CPU2 Basic, Motherboard/Centerplane
2:0>Repair Instructions: Replace items in order listed by 'H/W under test' above.
2:0>MSG = Watchdog timeout, Master CPU Failure on 2:0, rollover to 2:0.
2:0>END_ERROR
2:0>ERROR:    No good CPUs left!  Calling debug menu.
2:0>    0    Peek/Poke interface
2:0>    1    Dump DAR Error Bits
2:0>    2    Dump Scan Chain
2:0>    3    Dump CPU Regs
2:0>    4    Dump BBC Regs
2:0>    5    Dump Mem Controller Regs
2:0>    6    Dump Valid DMMU entries
2:0>    7    Dump IMMU entries
2:0>    8    Dump Struct Info
2:0>    9    Dump Mailbox
2:0>    a    Dump IO-Bridge regs unit 0 
2:0>    b    Dump IO-Bridge regs unit 1 
2:0>    c    Allow other CPUs to print
2:0>    d    Do soft reset
2:0>    ?    Help
2:0>
2:0>Selection:



---------- Post updated at 07:54 AM ---------- Previous update was at 07:52 AM ----------

This is the syslog from when the system was still rebooting automatically. Now the system does not start and the maintenance light is on.
Code:
2010-01-15 10:34:54    Kernel.Notice    172.16.5.192    Jan 15 10:36:15 genunix: [ID 540533 kern.notice] <013>SunOS Release 5.8 Version Generic_108528-29 64-bit
2010-01-15 10:34:54    Kernel.Notice    172.16.5.192    Jan 15 10:36:15 genunix: [ID 913632 kern.notice] Copyright 1983-2003 Sun Microsystems, Inc.  All rights reserved.
2010-01-15 10:34:55    Kernel.Warning    172.16.5.192    Jan 15 10:36:21 unix: [ID 882636 kern.warning] WARNING: interrupt level 6 not serviced
2010-01-15 10:34:55    Kernel.Warning    172.16.5.192    Jan 15 10:36:21 unix: [ID 882636 kern.warning] WARNING: interrupt level 6 not serviced
2010-01-15 10:34:55    Kernel.Warning    172.16.5.192    Jan 15 10:36:21 unix: [ID 882636 kern.warning] WARNING: interrupt level 6 not serviced
2010-01-15 10:34:55    Kernel.Warning    172.16.5.192    Jan 15 10:36:21 unix: [ID 882636 kern.warning] WARNING: interrupt level 6 not serviced
2010-01-15 10:34:55    Kernel.Warning    172.16.5.192    Jan 15 10:36:21 unix: [ID 882636 kern.warning] WARNING: interrupt level 6 not serviced
2010-01-15 10:34:55    Kernel.Warning    172.16.5.192    Jan 15 10:36:21 unix: [ID 882636 kern.warning] WARNING: interrupt level 6 not serviced
2010-01-15 10:34:55    Kernel.Warning    172.16.5.192    Jan 15 10:36:21 unix: [ID 882636 kern.warning] WARNING: interrupt level 6 not serviced
2010-01-15 10:34:55    Kernel.Warning    172.16.5.192    Jan 15 10:36:21 unix: [ID 882636 kern.warning] WARNING: interrupt level 6 not serviced
2010-01-15 10:34:55    Kernel.Warning    172.16.5.192    Jan 15 10:36:21 unix: [ID 882636 kern.warning] WARNING: interrupt level 6 not serviced
2010-01-15 10:34:55    Kernel.Warning    172.16.5.192    Jan 15 10:36:21 unix: [ID 882636 kern.warning] WARNING: interrupt level 6 not serviced
2010-01-15 10:34:55    Kernel.Warning    172.16.5.192    Jan 15 10:36:21 unix: [ID 882636 kern.warning] WARNING: interrupt level 6 not serviced
2010-01-15 10:34:55    Kernel.Warning    172.16.5.192    Jan 15 10:36:21 unix: [ID 882636 kern.warning] WARNING: interrupt level 6 not serviced
2010-01-15 10:34:55    Kernel.Warning    172.16.5.192    Jan 15 10:36:21 unix: [ID 882636 kern.warning] WARNING: interrupt level 6 not serviced
2010-01-15 10:34:55    Kernel.Warning    172.16.5.192    Jan 15 10:36:21 unix: [ID 882636 kern.warning] WARNING: interrupt level 6 not serviced
2010-01-15 10:34:55    Kernel.Warning    172.16.5.192    Jan 15 10:36:21 unix: [ID 882636 kern.warning] WARNING: interrupt level 6 not serviced
2010-01-15 10:34:55    Kernel.Warning    172.16.5.192    Jan 15 10:36:21 unix: [ID 882636 kern.warning] WARNING: interrupt level 6 not serviced
2010-01-15 10:34:55    Kernel.Warning    172.16.5.192    Jan 15 10:36:21 unix: [ID 882636 kern.warning] WARNING: interrupt level 6 not serviced
2010-01-15 10:34:55    Kernel.Warning    172.16.5.192    Jan 15 10:36:21 unix: [ID 882636 kern.warning] WARNING: interrupt level 6 not serviced
2010-01-15 10:34:55    Kernel.Warning    172.16.5.192    Jan 15 10:36:21 unix: [ID 882636 kern.warning] WARNING: interrupt level 6 not serviced
2010-01-15 10:34:55    Kernel.Warning    172.16.5.192    Jan 15 10:36:21 unix: [ID 882636 kern.warning] WARNING: interrupt level 6 not serviced
2010-01-15 10:34:55    Kernel.Warning    172.16.5.192    Jan 15 10:36:21 pcisch: [ID 104478 kern.warning] WARNING: pcisch1: ino 0x5 has been blocked
2010-01-15 10:34:55    Kernel.Warning    172.16.5.192    Jan 15 10:36:21 pcisch: [ID 486037 kern.warning] WARNING: ce0: interrupt #1 has been blocked
2010-01-15 10:34:55    Kernel.Warning    172.16.5.192    Jan 15 10:36:21 unix: [ID 882636 kern.warning] WARNING: interrupt level 6 not serviced
2010-01-15 10:34:55    Kernel.Warning    172.16.5.192    Jan 15 10:36:21 pcisch: [ID 559693 kern.warning] WARNING: pcisch1: ib_ino_add_intr: ino 0x5 has been unblocked







---------- Post updated at 08:04 AM ---------- Previous update was at 08:00 AM ----------

Thanks for the replies. These are the errors in the diagnostic log:

Quote:
2:0>ERROR: TEST = Check cpu synch var CPU 0
2:0>H/W under test = CPU0 Basic, Motherboard/Centerplane
2:0>Repair Instructions: Replace items in order listed by 'H/W under test' above.
2:0>MSG = CPU 0:0 failed BBC SRAM access, offline cpu.
2:0>END_ERROR
2:0>
2:0>ERROR: TEST = Check cpu synch var CPU 0
2:0>H/W under test = CPU0 Basic, Motherboard/Centerplane
2:0>Repair Instructions: Replace items in order listed by 'H/W under test' above.
2:0>MSG =

*** Test Failed!! ***

2:0>END_ERROR
2:0>
2:0>ERROR: TEST = Check cpu synch var CPU 0
2:0>H/W under test = CPU0 Basic, Motherboard/Centerplane
2:0>Repair Instructions: Replace items in order listed by 'H/W under test' above.
2:0>MSG = ERROR: Fatal CPU error on master, rolling over to new master.
2:0>END_ERROR
2:0>Soft Reset.
2:0>
2:0>ERROR: TEST = Power on Reset Initialization
2:0>H/W under test = CPU2 Basic, Motherboard/Centerplane
2:0>Repair Instructions: Replace items in order listed by 'H/W under test' above.
2:0>MSG = Watchdog timeout, Master CPU Failure on 2:0, rollover to 2:0.
2:0>END_ERROR
2:0>ERROR: No good CPUs left! Calling debug
# 6  
Old 01-26-2010
When the system is powered up, its maintenance light is on. I have attached the diagnostic log and syslog.
# 7  
Old 01-26-2010
Quote:
ERROR: No good CPUs left! Calling debug
My guess is that you have lost either all the CPUs or the mainboard.
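
Tallying the 'H/W under test' lines from the POST output points the same way: CPU0 is blamed three times, CPU2 once on the rollover, and Motherboard/Centerplane is listed every time. A rough sketch of that tally, assuming the console capture was saved to a file (the name `post.log` and the `summarize_post` helper are hypothetical):

```shell
#!/bin/sh
# Tally the "H/W under test" lines from a saved POST console capture.
# post.log is a hypothetical file name; summarize_post is my own helper.

summarize_post() {
    # Strip the "2:0>"-style prefix and the field label, keep the component
    # list, then count how often each distinct list appears.
    sed -n 's/^[0-9]*:[0-9]*>H\/W under test = //p' "$1" |
        sort | uniq -c | sort -rn
}

# Only runs if the capture file is actually present.
if [ -f post.log ]; then
    summarize_post post.log
fi
```

The tally cannot tell a dead CPU module from a dead centerplane, since POST lists both on every error, which is why the repair instructions say to replace the items in the order listed.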