T2000 Sparc server fails boot


 
# 1  
Old 01-04-2015

I have a T2000 enterprise SPARC server that's no longer on contract with Oracle. It's on old firmware (6.3.x). After a power-down this weekend, it won't boot normally. Boot snapshot at the bottom of the post.

It can boot from the CD-ROM, and it will boot into failsafe mode, but it won't do a regular boot, nor will it boot to single-user mode. It's ZFS.

Nothing has changed, but it appears that the onboard battery has failed and the time reverted to 1999. ALOM works. I set the date using ALOM and rebooted, and it fails in the same way.

I thought perhaps CPU1 failed, but disabling the CPU in ALOM and rebooting just moved the problem to CPU2.

Code:
Loading: /platform/SUNW,SPARC-Enterprise-T2000/kernel/sparcv9/unix
Loading: /platform/sun4v/kernel/sparcv9/unix
SunOS Release 5.10 Version Generic_141414-02 64-bit
Copyright 1983-2009 Sun Microsystems, Inc.  All rights reserved.
Use is subject to license terms.
os-io panic: failed to stop cpu1

panic[cpu0]/thread=180e000: send_one_mondo: unexpected hypervisor error 0x2 while sending a mondo to cpuid: 0x1

000000000180b460 unix:send_one_mondo+14c (1, 10aeb58, 0, 2, 180c5e8, 1)
  %l0-3: 000000000187c000 0000000001866e18 0000000000000000 0000001d5e36ff20
  %l4-7: 000000000187c2c0 000000000187f980 0000000000000003 0000001d16b07320
000000000180b510 unix:xt_one_unchecked+c8 (1, 100ff74, 70020000, 0, 0, 1)
  %l0-3: 000000000000000b 0000000001866e18 0000000000000000 000000000180b5c0
  %l4-7: 0000000000000000 0000000000000002 0000000000000001 000000000180b5e0
000000000180b5e0 unix:setbackdq+3f0 (2a10121fca0, ffffffffffffffff, 300043cc000, 0, 1, 6d)
  %l0-3: 0000060038f02b10 0000000000000000 0000060039a6d4b8 0000000000000001
  %l4-7: 0000000000000000 0000060039a6ca80 0000000000000002 0000000000000a38
000000000180b690 unix:cpu_pause_start+9c (0, 185e800, 185f400, 1, 1847958, 1)
  %l0-3: 0000000000000001 0000000000000002 0000000001847859 000000000185f768
  %l4-7: 000000000000001b 000002a10121fca0 000000000000006d 000000000000006e
000000000180b740 unix:pause_cpus+6c (0, 1, 5, 1847858, 182ac00, 1847800)
  %l0-3: 0000000000000000 0000000001861000 000000000180c000 000000000186b000
  %l4-7: 0000000000000001 000000000185e800 0000000001064000 ffffffffffffffff
000000000180b7f0 unix:cpu_add_unit+28 (300043d0000, 1826400, a, 187bf60, 5f50, 187bc00)
  %l0-3: 0000000001831c00 0000000001861000 00000000010c9800 000000000186b000
  %l4-7: 0000000000000001 000000000185e800 0000000001064000 ffffffffffffffff
000000000180b8a0 unix:setup_cpu_common+14c (4, 1000, 0, 300043d0000, 1b, 180c000)
  %l0-3: 0000000001831c00 000000000189ec00 00000000010c9800 0000060038f02b70
  %l4-7: 0000000000000001 000000000185e800 0000000001064000 ffffffffffffffff
000000000180b960 unix:start_other_cpus+19c (190c400, 1, 0, 18631e0, 185f770, 186b3e8)
  %l0-3: 0000000000000002 0000000000000002 00000000010ac400 0000000000000000
  %l4-7: 000000000190c400 0000000000000003 000000000101b000 00000000018fe800
000000000180ba10 genunix:main+1e4 (18fe840, 18fa400, 185eb40, 18acc00, 0, 1906c00)
  %l0-3: 0000000000000000 0000000000000001 0000000001906c00 0000000000000002
  %l4-7: 0000000001907aa0 0000000001907800 00000000018fe850 00000000018fe800

syncing file systems... done
skipping system dump - no dump device configured
rebooting...
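
When the panic moves from one CPU to the next across reboots, it can help to pull the same fields out of each captured panic line and compare them side by side. The following is a minimal Python sketch, assuming the panic line has exactly the wording shown in the output above; the function name `parse_mondo_panic` and the field names are hypothetical and are not part of Solaris or ALOM.

```python
import re

# Pattern for the send_one_mondo panic line as it appears in the console
# output above. The wording is assumed to be stable across reboots.
PANIC_RE = re.compile(
    r"panic\[cpu(?P<panic_cpu>\d+)\]/thread=(?P<thread>[0-9a-f]+): "
    r"send_one_mondo: unexpected hypervisor error (?P<hv_error>0x[0-9a-f]+) "
    r"while sending a mondo to cpuid: (?P<target_cpu>0x[0-9a-f]+)"
)


def parse_mondo_panic(line):
    """Return the panicking CPU, hypervisor error code, and target CPU.

    Returns None when the line is not a send_one_mondo panic.
    """
    match = PANIC_RE.search(line)
    if match is None:
        return None
    return {
        "panic_cpu": int(match.group("panic_cpu")),
        "hv_error": int(match.group("hv_error"), 16),
        "target_cpu": int(match.group("target_cpu"), 16),
    }


# Panic line copied from the boot output above.
sample = (
    "panic[cpu0]/thread=180e000: send_one_mondo: unexpected hypervisor "
    "error 0x2 while sending a mondo to cpuid: 0x1"
)
print(parse_mondo_panic(sample))
```

Running this over the panic line from each failed boot would show whether only the target CPU changes while the reporting CPU and the hypervisor error code stay the same.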

Moderator's Comments:
Please use CODE tags when displaying input, output, and code. Without CODE tags, HTML processing coalesces spaces and tabs.

Last edited by Don Cragun; 01-04-2015 at 06:43 PM.. Reason: Add CODE tags.
# 2  
Old 01-05-2015
What is your ALOM version? Did you apply patches to the Solaris installation?
# 3  
Old 01-05-2015
Boot in verbose mode from both the CD and the normal boot disk. What's different?
# 4  
Old 01-05-2015
@dukenuke2
No patches. Because I'm not under an Oracle contract, I can't download them. I would love to get my hands on SysFW 6.7.13 and 139434-10.

ALOM version:
Code:
sc> showsc version -v
Advanced Lights Out Manager CMT v1.3.8
SC Firmware version: CMT 1.3.8
SC Bootmon version: CMT 1.3.8

VBSC 1.3.5
VBSC firmware built Apr  6 2008, 15:09:33

SC Bootmon Build Release: 01
SC bootmon checksum: 13AA267E
SC Bootmon built Apr  6 2008, 15:17:23

SC Build Release: 01
SC firmware checksum: 12914608

SC firmware built Apr  6 2008, 15:17:37
SC firmware flashupdate FRI MAY 22 23:55:22 2009

SC System Memory Size: 32 MB
SC NVRAM Version = 12
SC hardware type: 4

FPGA Version: 4.2.4.7

---------- Post updated at 08:43 AM ---------- Previous update was at 08:08 AM ----------

@achnele
The verbose output differs in that the local boot panics right after cpu2, but that doesn't seem to help (I disabled CPU1 because it failed here last time). If I disable CPU2, the panic moves to CPU3.

In case it matters, I'm still seeing the alert below, as I haven't replaced the battery yet. That's next. The date has been set manually through ALOM.
Code:
SC Alert: BATTERY at SC/BAT/V_BAT has exceeded low warning threshold.

From CD-ROM (with -v -s):
Code:
PCI-device: usb@6, ohci1
ohci1 is /pci@7c0/pci@0/pci@1/pci@0/usb@6
cpu0: UltraSPARC-T1 (cpuid 0 clock 1200 MHz)
cpu2: UltraSPARC-T1 (cpuid 2 clock 1200 MHz)
cpu3: UltraSPARC-T1 (cpuid 3 clock 1200 MHz)
cpu4: UltraSPARC-T1 (cpuid 4 clock 1200 MHz)
cpu5: UltraSPARC-T1 (cpuid 5 clock 1200 MHz)
PCI-device: pci@8, pxb_plx8
pxb_plx8 is /pci@7c0/pci@0/pci@8
cpu6: UltraSPARC-T1 (cpuid 6 clock 1200 MHz)
USB 1.10 device (usb3eb,3301) operating at full speed (USB 1.x) on USB 1.10 root hub: hub@1, hubd1 at bus address 2
hubd1 is /pci@7c0/pci@0/pci@1/pci@0/usb@6/hub@1
/pci@7c0/pci@0/pci@1/pci@0/usb@6/hub@1 (hubd1) online
cpu7: UltraSPARC-T1 (cpuid 7 clock 1200 MHz)
cpu8: UltraSPARC-T1 (cpuid 8 clock 1200 MHz)
cpu9: UltraSPARC-T1 (cpuid 9 clock 1200 MHz)
cpu10: UltraSPARC-T1 (cpuid 10 clock 1200 MHz)
cpu11: UltraSPARC-T1 (cpuid 11 clock 1200 MHz)
cpu12: UltraSPARC-T1 (cpuid 12 clock 1200 MHz)
cpu13: UltraSPARC-T1 (cpuid 13 clock 1200 MHz)
cpu14: UltraSPARC-T1 (cpuid 14 clock 1200 MHz)
cpu15: UltraSPARC-T1 (cpuid 15 clock 1200 MHz)
cpu16: UltraSPARC-T1 (cpuid 16 clock 1200 MHz)
cpu17: UltraSPARC-T1 (cpuid 17 clock 1200 MHz)
cpu18: UltraSPARC-T1 (cpuid 18 clock 1200 MHz)
cpu19: UltraSPARC-T1 (cpuid 19 clock 1200 MHz)
cpu20: UltraSPARC-T1 (cpuid 20 clock 1200 MHz)
cpu21: UltraSPARC-T1 (cpuid 21 clock 1200 MHz)
cpu22: UltraSPARC-T1 (cpuid 22 clock 1200 MHz)
cpu23: UltraSPARC-T1 (cpuid 23 clock 1200 MHz)
cpu24: UltraSPARC-T1 (cpuid 24 clock 1200 MHz)
cpu25: UltraSPARC-T1 (cpuid 25 clock 1200 MHz)
cpu26: UltraSPARC-T1 (cpuid 26 clock 1200 MHz)
cpu27: UltraSPARC-T1 (cpuid 27 clock 1200 MHz)
PCI-device: SUNW,qlc@0, qlc0
qlc0 is /pci@7c0/pci@0/pci@8/SUNW,qlc@0
cpu28: UltraSPARC-T1 (cpuid 28 clock 1200 MHz)
cpu29: UltraSPARC-T1 (cpuid 29 clock 1200 MHz)
PCI-device: pci@9, pxb_plx9
pxb_plx9 is /pci@7c0/pci@0/pci@9
cpu30: UltraSPARC-T1 (cpuid 30 clock 1200 MHz)
cpu31: UltraSPARC-T1 (cpuid 31 clock 1200 MHz)
Booting to milestone "milestone/single-user:default".

From local disk (with -v -s):
Code:
PCI-device: usb@5, ohci0
ohci0 is /pci@7c0/pci@0/pci@1/pci@0/usb@5
PCI-device: usb@6, ohci1
ohci1 is /pci@7c0/pci@0/pci@1/pci@0/usb@6
cpu0: UltraSPARC-T1 (chipid 0, clock 1200 MHz)
cpu2: UltraSPARC-T1 (chipid 0, clock 1200 MHz)
panic: failed to stop cpu2

panic[cpu0]/thread=180e000: send_one_mondo: unexpected hypervisor error 0x2 while sending a mondo to cpuid: 0x2

000000000180b460 unix:send_one_mondo+14c (2, 10aeb58, 0, 2, 180c5e8, 1)
  %l0-3: 000000000187c000 0000000001866e18 0000000000000000 0000002316b37abc
  %l4-7: 000000000187c2c0 000000000187f980 0000000000000003 00000022cf2ceebc
000000000180b510 unix:xt_one_unchecked+c8 (2, 100ff74, 70020000, 0, 0, 1)
  %l0-3: 000000000000000b 0000000001866e18 0000000000000000 000000000180b5c0
  %l4-7: 0000000000000000 0000000000000004 0000000000000002 000000000180b5e0
000000000180b5e0 unix:setbackdq+3f0 (2a10121fca0, ffffffffffffffff, 30004504000, 0, 1, 6d)
  %l0-3: 0000060039444b10 0000000000000000 0000060039bc34b8 0000000000000001
  %l4-7: 0000000000000000 0000060039bc2a80 0000000000000002 0000000000000a38
000000000180b690 unix:cpu_pause_start+9c (0, 185e800, 185f400, 1, 1847958, 1)
  %l0-3: 0000000000000002 0000000000000002 000000000184785a 000000000185f770
  %l4-7: 000000000000001b 000002a10121fca0 000000000000006d 000000000000006e
000000000180b740 unix:pause_cpus+6c (0, 1, 5, 1847858, 182ac00, 1847800)
  %l0-3: 0000000000000000 0000000001861000 000000000180c000 000000000186b000
  %l4-7: 0000000000000001 000000000185e800 0000000001064000 ffffffffffffffff
000000000180b7f0 unix:cpu_add_unit+28 (30004508000, 1826400, a, 187bf60, 5f50, 187bc00)
  %l0-3: 0000000001831c00 0000000001861000 00000000010c9800 000000000186b000
  %l4-7: 0000000000000001 000000000185e800 0000000001064000 ffffffffffffffff
000000000180b8a0 unix:setup_cpu_common+14c (4, 1000, 0, 30004508000, 1b, 180c000)
  %l0-3: 0000000001831c00 000000000189ec00 00000000010c9800 0000060039444b70
  %l4-7: 0000000000000001 000000000185e800 0000000001064000 ffffffffffffffff
000000000180b960 unix:start_other_cpus+19c (190c400, 1, 0, 18631e0, 185f778, 186b490)
  %l0-3: 0000000000000003 0000000000000003 00000000010ac400 0000000000000000
  %l4-7: 000000000190c400 0000000000000004 000000000101b000 00000000018fe800
000000000180ba10 genunix:main+1e4 (18fe840, 18fa400, 185eb40, 18acc00, 0, 1906c00)
  %l0-3: 0000000000000000 0000000000000001 0000000001906c00 0000000000000002
  %l4-7: 0000000001907aa0 0000000001907800 00000000018fe850 00000000018fe800

syncing file systems... done
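
Comparing the two verbose boots by eye is error-prone, because the CPU lines are interleaved with device messages. Here is a rough Python sketch that lists which CPU IDs each log reports and which are absent. It assumes the `cpuN: UltraSPARC-T1` line format seen in both logs above; the helper names `cpus_online` and `missing_cpus` and the `expected_count` parameter are made up for illustration. The default of 32 is only inferred from the CD-ROM log, where the IDs run up to cpu31.

```python
import re

# Matches lines such as "cpu2: UltraSPARC-T1 (cpuid 2 clock 1200 MHz)"
# and "cpu2: UltraSPARC-T1 (chipid 0, clock 1200 MHz)".
CPU_LINE_RE = re.compile(r"^cpu(\d+): UltraSPARC-T1\b")


def cpus_online(log_text):
    """Return the set of CPU IDs announced in a verbose boot log."""
    ids = set()
    for line in log_text.splitlines():
        match = CPU_LINE_RE.match(line.strip())
        if match:
            ids.add(int(match.group(1)))
    return ids


def missing_cpus(log_text, expected_count=32):
    """Return the sorted CPU IDs below expected_count that never appear."""
    return sorted(set(range(expected_count)) - cpus_online(log_text))


# Short excerpts copied from the two logs above.
cd_boot = """\
cpu0: UltraSPARC-T1 (cpuid 0 clock 1200 MHz)
cpu2: UltraSPARC-T1 (cpuid 2 clock 1200 MHz)
cpu3: UltraSPARC-T1 (cpuid 3 clock 1200 MHz)
"""
local_boot = """\
cpu0: UltraSPARC-T1 (chipid 0, clock 1200 MHz)
cpu2: UltraSPARC-T1 (chipid 0, clock 1200 MHz)
panic: failed to stop cpu2
"""

print(sorted(cpus_online(local_boot)))
print(missing_cpus(cd_boot, expected_count=4))
```

Fed the full logs, the CD-ROM boot would report only the disabled cpu1 as missing, while the local boot would show every ID after the point of the panic as missing.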

# 5  
Old 01-05-2015
There is something wrong with your hypervisor firmware.
Maybe it is too old, not supported by the current Solaris?
Try to update all firmware:
Firmware Downloads and Release History for Sun Systems
# 6  
Old 01-05-2015
@madeingermany
Firmware updated.

Code:
sc> showhost
SPARC-Enterprise-T2000 System Firmware 6.7.12  2011/07/06 20:03
Host flash versions:
   OBP 4.30.4.d 2011/07/06 14:29
   Hypervisor 1.7.3.c 2010/07/09 15:14
   POST 4.30.4.b 2010/07/09 14:24

Still no difference.

Last edited by val riverwalk; 01-05-2015 at 08:01 PM..
# 7  
Old 01-05-2015
See also Known Issues - Oracle VM Server for SPARC 2.0 Release Notes
Maybe you lost your ldm settings during the power-cycle?
See the chapter "Logical Domains Variable Persistence"