HPUX Boot failure.


 
# 8  
Old 11-29-2013
Quote:
Originally Posted by vbe
What is in the logs?
You seem to have memory issue...
In the console logs I only have the same message (the one I posted earlier).

In the System Event Log there are a lot of entries. The latest ones are below.

Code:
Log Entry 75: 29 Nov 2013 12:16:12
Alert Level 2: Informational
Keyword: MC_BR_TO_OS_HPMC_FAILED
MC_BR_TO_OS_HPMC_FAILED
Logged by: System Firmware  2
Data: Implementation dependent data field
0x5680106402E008D0 FFFFFFF0F0438E70


Log Entry 74: 29 Nov 2013 12:16:12
Alert Level 2: Informational
Keyword: MC_OS_HPMC_MISSING
MC_OS_HPMC_MISSING
Logged by: System Firmware  2
Data: Implementation dependent data field
0x5680104A02E008B0 000000F0F0D09800

Log Entry 73: 29 Nov 2013 12:16:12
Alert Level 2: Informational
Keyword: MEM_PDT_DUP_ENTRY
PDT entry to be added to PDT already exists
Logged by: System Firmware  2
Data: Event detail
0x4E8000D502E00890 0000000000D60000


Log Entry 72: 29 Nov 2013 12:16:12
Alert Level 5: Critical
Keyword: MEM_MBE_IN_RANK
Uncorrectable (multiple-bit) ECC error in DIMM
Logged by: System Firmware  2
Data: Location - Memory (SIMM or DIMM): DIMM Slot 0x3B, Extender 0
0x448000CC02E00870 FFFFFFFF003BFF74

Log Entry 71: 29 Nov 2013 12:16:11
Alert Level 5: Critical
Keyword: MEM_MBE_IN_RANK
Uncorrectable (multiple-bit) ECC error in DIMM
Logged by: System Firmware  2
Data: Location - Memory (SIMM or DIMM): DIMM Slot 0x3A, Extender 0
0x448000CC02E00850 FFFFFFFF003AFF74


Log Entry 70: 29 Nov 2013 12:16:11
Alert Level 5: Critical
Keyword: MEM_MBE_IN_RANK
Uncorrectable (multiple-bit) ECC error in DIMM
Logged by: System Firmware  2
Data: Location - Memory (SIMM or DIMM): DIMM Slot 0x2B, Extender 0
0x448000CC02E00830 FFFFFFFF002BFF74

Log Entry 61: 29 Nov 2013 12:16:10
Alert Level 7: Fatal
Keyword: ERR_CHECK_HPMC
An HPMC has been encountered.
Logged by: System Firmware  0
Data: Code address
0xE880035C00E00710 0000000000024344


Log Entry 60: 29 Nov 2013 12:16:10
Alert Level 7: Fatal
Keyword: MC_HPMC_MONARCH_SELECTED
MC_HPMC_MONARCH_SELECTED
Logged by: System Firmware  2
Data: Implementation dependent data field
0xF680105E02E006F0 FFFFFFF0F0C00000
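
To pull the failing slots out of a long event log dump like the one above, a small parser can help. This is only a sketch: it assumes the log has been saved as plain text in exactly the layout shown here, and the function names (`parse_sel`, `slots_with_uncorrectable_errors`) are made up for illustration, not part of any HP tool.

```python
import re

# Two entries copied from the log above; the other entries have the same shape.
SAMPLE_LOG = """\
Log Entry 72: 29 Nov 2013 12:16:12
Alert Level 5: Critical
Keyword: MEM_MBE_IN_RANK
Uncorrectable (multiple-bit) ECC error in DIMM
Logged by: System Firmware  2
Data: Location - Memory (SIMM or DIMM): DIMM Slot 0x3B, Extender 0
0x448000CC02E00870 FFFFFFFF003BFF74

Log Entry 61: 29 Nov 2013 12:16:10
Alert Level 7: Fatal
Keyword: ERR_CHECK_HPMC
An HPMC has been encountered.
Logged by: System Firmware  0
Data: Code address
0xE880035C00E00710 0000000000024344
"""

ENTRY_RE = re.compile(r"^Log Entry (\d+): (.+)$")
LEVEL_RE = re.compile(r"^Alert Level (\d+): (\w+)")
KEYWORD_RE = re.compile(r"^Keyword: (\S+)")
SLOT_RE = re.compile(r"DIMM Slot (0x[0-9A-Fa-f]+)")


def parse_sel(text):
    """Split pasted event log text into one dict per 'Log Entry' block."""
    entries = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        m = ENTRY_RE.match(line)
        if m:
            current = {"entry": int(m.group(1)), "time": m.group(2),
                       "level": None, "keyword": None, "slot": None}
            entries.append(current)
            continue
        if current is None:
            continue  # ignore anything before the first entry header
        m = LEVEL_RE.match(line)
        if m:
            current["level"] = int(m.group(1))
            continue
        m = KEYWORD_RE.match(line)
        if m:
            current["keyword"] = m.group(1)
            continue
        m = SLOT_RE.search(line)
        if m:
            current["slot"] = m.group(1)
    return entries


def slots_with_uncorrectable_errors(entries):
    """Return the distinct slot codes reported with MEM_MBE_IN_RANK."""
    return sorted({e["slot"] for e in entries
                   if e["keyword"] == "MEM_MBE_IN_RANK" and e["slot"]})


if __name__ == "__main__":
    print(slots_with_uncorrectable_errors(parse_sel(SAMPLE_LOG)))
```

Run against the full dump, this would list every slot that logged an uncorrectable ECC error, which is easier to read than scrolling through the entries by hand.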

# 9  
Old 11-29-2013
You have 2 bad DIMMs; if they are 4 GB each, that means you have no memory left...
In the MP, have you tried XD? (Diag and reset...)
Sorry off again...
# 10  
Old 11-29-2013
Quote:
Originally Posted by vbe
You have 2 bad DIMMs; if they are 4 GB each, that means you have no memory left...
In the MP, have you tried XD? (Diag and reset...)
Sorry off again...

I have tried all the tests in XD, and all were successful except the Modem selftests.
Code:
Diagnostics Menu:
Non destructive tests:
     P - Parameter checksum
     I - I2C access (get BMC Device ID record)
     L - LAN access (PING)
     M - Modem selftests
Destructive tests:
     R - Restart MP

Enter menu item or [Q] to Quit: M
M

   Confirm? (Y/[N]): Y
Y

   Please wait .................


   -> Test result: FAIL

<CR> to continue...

Is this causing the problem? I thought it was a memory issue (after a lot of googling and your input).

And I think we are using eight 1 GB RAM modules.
Code:
 MEMORY STATUS TABLE (MB) (Current Boot Status)

Slot 0a  1024M   Active
Slot 0b  1024M   Active

Slot 1a  1024M   Active
Slot 1b  1024M   Active

Slot 2a  1024M   Active
Slot 2b  1024M   Active

Slot 3a  1024M   Active
Slot 3b  1024M   Active

Slot 4a  0
Slot 4b  0

Slot 5a  0
Slot 5b  0

Subtotal 8192M

   TOTAL =  8192 MB
           ---------
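
As a quick cross-check of the table above, the rows can be summed programmatically. This is a rough sketch that assumes the table text is pasted in exactly this layout; `parse_memory_table` is a hypothetical helper name.

```python
import re

# The slot rows exactly as shown in the memory status table above.
MEMORY_TABLE = """\
Slot 0a  1024M   Active
Slot 0b  1024M   Active
Slot 1a  1024M   Active
Slot 1b  1024M   Active
Slot 2a  1024M   Active
Slot 2b  1024M   Active
Slot 3a  1024M   Active
Slot 3b  1024M   Active
Slot 4a  0
Slot 4b  0
Slot 5a  0
Slot 5b  0
"""

# "Slot <name> <size>[M] [<status>]"; empty slots show a bare 0 with no status.
ROW_RE = re.compile(r"^Slot\s+(\w+)\s+(\d+)M?(?:\s+(\w+))?\s*$")


def parse_memory_table(text):
    """Map slot name -> (size in MB, status or None)."""
    slots = {}
    for line in text.splitlines():
        m = ROW_RE.match(line.strip())
        if m:
            slots[m.group(1)] = (int(m.group(2)), m.group(3))
    return slots


if __name__ == "__main__":
    slots = parse_memory_table(MEMORY_TABLE)
    populated = {name: size for name, (size, _) in slots.items() if size > 0}
    print(len(populated), "DIMMs,", sum(populated.values()), "MB total")
```

The sum should agree with the TOTAL line the firmware prints; a mismatch would suggest a row was lost when copying the console output.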

And I think four of them are damaged (slots 0x2A, 0x2B, 0x3A, and 0x3B in the log below).
Code:

Log Entry 352: 29 Nov 2013 12:54:13
Alert Level 5: Critical
Keyword: MEM_MBE_IN_RANK
Uncorrectable (multiple-bit) ECC error in DIMM
Logged by: System Firmware  0
Data: Location - Memory (SIMM or DIMM): DIMM Slot 0x3B, Extender 0
0x448000CC00E02A30 FFFFFFFF003BFF74


Log Entry 351: 29 Nov 2013 12:54:13
Alert Level 5: Critical
Keyword: MEM_MBE_IN_RANK
Uncorrectable (multiple-bit) ECC error in DIMM
Logged by: System Firmware  0
Data: Location - Memory (SIMM or DIMM): DIMM Slot 0x3A, Extender 0
0x448000CC00E02A10 FFFFFFFF003AFF74


MP:SL (+,-,<CR>,D, F, L, J, H, K, T, A, U, ? for Help, Q or Ctrl-B to Quit) >



Log Entry 350: 29 Nov 2013 12:54:13
Alert Level 5: Critical
Keyword: MEM_MBE_IN_RANK
Uncorrectable (multiple-bit) ECC error in DIMM
Logged by: System Firmware  0
Data: Location - Memory (SIMM or DIMM): DIMM Slot 0x2B, Extender 0
0x448000CC00E029F0 FFFFFFFF002BFF74


Log Entry 349: 29 Nov 2013 12:54:13
Alert Level 5: Critical
Keyword: MEM_MBE_IN_RANK
Uncorrectable (multiple-bit) ECC error in DIMM
Logged by: System Firmware  0
Data: Location - Memory (SIMM or DIMM): DIMM Slot 0x2A, Extender 0
0x448000CC00E029D0 FFFFFFFF002AFF74


MP:SL (+,-,<CR>,D, F, L, J, H, K, T, A, U, ? for Help, Q or Ctrl-B to Quit) >



Log Entry 346: 29 Nov 2013 12:54:12
Alert Level 7: Fatal
Keyword: MC_HPMC_MONARCH_SELECTED
MC_HPMC_MONARCH_SELECTED
Logged by: System Firmware  0
Data: Implementation dependent data field
0xF680105E00E02970 FFFFFFF0F0C00000


Log Entry 340: 29 Nov 2013 12:54:11
Alert Level 7: Fatal
Keyword: ERR_CHECK_HPMC
An HPMC has been encountered.
Logged by: System Firmware  3
Data: Code address
0xE880035C03E028B0 000000F0F0D08068


MP:SL (+,-,<CR>,D, F, L, J, H, K, T, A, U, ? for Help, Q or Ctrl-B to Quit) >



Log Entry 339: 29 Nov 2013 12:54:11
Alert Level 7: Fatal
Keyword: ERR_CHECK_HPMC
An HPMC has been encountered.
Logged by: System Firmware  2
Data: Code address
0xE880035C02E02890 000000F0F0D08068


Log Entry 338: 29 Nov 2013 12:54:11
Alert Level 7: Fatal
Keyword: ERR_CHECK_HPMC
An HPMC has been encountered.
Logged by: System Firmware  0
Data: Code address
0xE880035C00E02870 0000000000024344


MP:SL (+,-,<CR>,D, F, L, J, H, K, T, A, U, ? for Help, Q or Ctrl-B to Quit) >



Log Entry 336: 29 Nov 2013 12:53:42
Alert Level 5: Critical
Keyword: BOOT_NOT_ENOUGH_ERROR_FREE_MEMORY
There was not enough error free memory in the system to run the late selftests
Logged by: System Firmware  0
Data: Data field unused
0xA080132000E02830 0000000000000000

So do I have any option other than replacing the RAM?

Will skipping the selftests help?
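
For reference, here is the arithmetic behind the "four damaged DIMMs" estimate as a short sketch. It rests on one loud assumption that nothing in the logs confirms: that the slot code `0x2A` in the event log refers to the slot labelled `2a` in the memory status table (and likewise for the others). The helper names are hypothetical.

```python
# Slot codes reported with MEM_MBE_IN_RANK in the log above.
FLAGGED_CODES = ["0x3B", "0x3A", "0x2B", "0x2A"]

# Populated slots and sizes (MB) from the memory status table above.
INSTALLED_MB = {
    "0a": 1024, "0b": 1024,
    "1a": 1024, "1b": 1024,
    "2a": 1024, "2b": 1024,
    "3a": 1024, "3b": 1024,
}


def slot_label(code):
    """ASSUMPTION: event log code '0x2A' corresponds to table slot '2a'."""
    return code[2:].lower()


def remaining_mb(installed, flagged_codes):
    """Memory left (MB) if every flagged DIMM is pulled."""
    bad = {slot_label(code) for code in flagged_codes}
    return sum(size for slot, size in installed.items() if slot not in bad)


if __name__ == "__main__":
    print(remaining_mb(INSTALLED_MB, FLAGGED_CODES), "MB would remain")
```

If the mapping holds, half of the installed memory would still be usable after removing the flagged modules.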

Last edited by chacko193; 11-29-2013 at 09:19 AM.. Reason: Adding more info
# 11  
Old 11-29-2013
Well, I don't know what the MP uses for memory, but one thing I do know (because I had to do it once) is that memory has to be installed in the slots in a specific order. My idea is that the system can work with half the memory if you know that order: remove the bad DIMMs and install what is left as if you only ever had half. It should then recognize the memory correctly from then on (cross your fingers...).
Your issue looks as if the firmware deallocated the bad memory, but what is left isn't installed in the right slots...
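
The idea can be sketched as a small planning helper. Everything specific here is an assumption: the load order and the group size below are placeholders, not the RP3440's documented rules, so take the real values from the server's service guide before moving any hardware.

```python
# HYPOTHETICAL load order: replace with the order from the service guide.
HYPOTHETICAL_LOAD_ORDER = [
    "0a", "0b", "1a", "1b", "2a", "2b",
    "3a", "3b", "4a", "4b", "5a", "5b",
]


def plan_reseat(good_dimm_count, load_order=HYPOTHETICAL_LOAD_ORDER, group_size=2):
    """Return the slots to populate with the surviving DIMMs.

    group_size is also an assumption: many servers require DIMMs to be
    installed in matched groups, so leftovers that do not fill a whole
    group are left out of the plan.
    """
    usable = good_dimm_count - (good_dimm_count % group_size)
    if usable > len(load_order):
        raise ValueError("more DIMMs than slots in the load order")
    return load_order[:usable]


if __name__ == "__main__":
    # Four good 1 GB DIMMs left after pulling the four flagged ones.
    print(plan_reseat(4))
```

The point is only that the survivors go into the first slots of the load order, rather than staying where they happened to be.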
# 12  
Old 11-29-2013
The latest firmware for the RP3440 can be downloaded here:

http://h20565.www2.hp.com/portal/sit...tte.cachetoken

SUPERSEDES HISTORY:

Enhancements
PDC 46.34

Added BOOT support for the following PCI I/O cards:
AD331A PCI-X, 1-port, GigE 1000Base-T Adapter
AD332A PCI-X, 1-port, GigE 1000Base-SX Adapter
Added BOOT, SWAP, and DUMP support for the following PCI I/O cards:
AB378B PCI-X, 1-port, 4GB Fibre Channel Host Bus Adapter
AB379B PCI-X, 2-port, 4GB Fibre Channel Host Bus Adapter

PDC 45.11

Enabled support for PA-8900 processors.
Added support for a total of 8, 4GB DIMMs increasing the maximum system memory size from 24GB to 32GB.
Added 30 second delay after a "ser pdt clear" command at the BCH Main Menu to allow manual power down of the system and replacement of a bad DIMM without requiring rotation of all remaining DIMMs.
Added support for future 4GB Fiber Channel Host Bus Adapters.


So I would go for a firmware update; look at the fixes here:
PDC 45.44

Fixed an issue where the system may HPMC during every other boot just after memory self-test with a MEM_UNEXPECTED_HPMC event when FastBoot was enabled.
In previous versions after a successful reboot following a system fault the System LED may not automatically change from flashing red to flashing yellow.

PDC 45.11

In previous revisions with a four port lan card installed, a "ser scsi default" command at the BCH Main Menu may incorrectly display the following error: "ERROR: failed IODC write for path: 0x1000400".
In previous revisions a "deconfigured:stopped" processor may incorrectly be displayed as being in an "unknown" state when using the BCH Main Menu "in pr" command to view processor status.
In previous revisions when performing a "sea ipl" from the BCH Main Menu, any device that does not have a bootable lif may display a "bad lif magic" message while logging a "BOOT_BAD_LIF_MAGIC_OTHER" event.
Any updates to the system clock at the BCH Main Menu or the Operating System will now always be reflected in the iLO Management Processor without requiring a system reset.
Autoboot will no longer be halted if the System Event Log is full.
In previous versions multiple Single Bit Errors with MEM_CORR_ERR and MEM_MULTIPLE_ERRORS_DETECTED events may cause the system to hang during power on self test.

PDC 44.24

Resolved an issue where the system would fail to dump following a TOC.
ErrorHandler will now send chassis codes to indicate the type of error encountered.
In the case of a DMT entry not being found the system will halt and send out a DMT_ENTRY_NOT_FOUND chassis code.
Added two chassis codes to send out the entire part number of the memory extender.
Resolved an issue which caused HPUX to incorrectly report installed physical memory.
Low Priority Machine Checks are turned off until HPUX boot is complete to avoid improperly registering a Low Priority Machine Check.
Corrected a parity error chassis code from returning incorrect data and triggering attention LED on every BCH boot during a memory ECC test.
Resolved an issue which caused random memory rank deallocations due to a missing resistor on the memory extender.
Corrected an issue that resulted in a stopped processor after running an MP memory test.
Corrected an issue where AUTOSTART flag was incorrectly read from stable store and when set, prevented autoboot and autosearch.
Resolved an issue when booting add-on PCI LAN cards that allowed the first LAN server to respond to boot the machine.
In previous versions, a PDC_ALLOC request to allocate space would return SUCCESS even when there was insufficient storage to do so.
Resolved an issue that resulted in a HPMC when attempting to boot on single core systems.
Corrected an issue that resulted in no PIM for an L1 cache error.
Resolved an issue that caused a memory hang when multiple memory errors are detected.
Corrected an issue that cleared the PDT on hard reset resulting in changes to memory configuration to be lost between DC power cycles.
Added logging of inbound correctable and uncorrectable errors.
Initialized a variable that when uninitialized, caused a CC_MEM_EXTENDER_SPD_ERROR event indicating the memory extender SPD couldn't be read.
Corrected an issue that filled the SEL log with PDCE_CALL_TAKE_TOO_LONG events which would require clearing the log before using autoboot.
Resolved an issue where an ACC card (Z7340A) placed in a PCI slot could not be mapped due to insufficient memory failure.
Corrected an issue where the Tachlite FibreChannel IODC driver (A6795A) failed to come online with the B-Series and M-Series switches at port F set at 2Gbps fixed speed, resulting in an FibreChannel boot failure at the BCH prompt: "IODC ENTRY_INIT failed. Error Status: -4".
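
To see which of the releases listed above are newer than what a box is running, the version labels can be compared numerically. A minimal sketch, assuming the labels order as `major.minor` integers; the installed version passed in is a placeholder, since the thread does not say what this system runs.

```python
import re

# PDC releases named in the notes above.
RELEASES = ["PDC 46.34", "PDC 45.44", "PDC 45.11", "PDC 44.24"]


def parse_pdc(label):
    """Turn 'PDC 45.11' into (45, 11) so releases compare numerically."""
    m = re.fullmatch(r"PDC (\d+)\.(\d+)", label.strip())
    if m is None:
        raise ValueError("unrecognized PDC label: %r" % label)
    return int(m.group(1)), int(m.group(2))


def releases_newer_than(installed, releases=RELEASES):
    """List releases strictly newer than the installed one, oldest first."""
    base = parse_pdc(installed)
    newer = [r for r in releases if parse_pdc(r) > base]
    return sorted(newer, key=parse_pdc)


if __name__ == "__main__":
    # Placeholder installed version; substitute the one your system reports.
    print(releases_newer_than("PDC 44.24"))
```

Reading the fix lists for every release returned gives the full set of changes an update would bring in.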

Last edited by vbe; 11-29-2013 at 10:14 AM.. Reason: More on firmware update