I am a bit puzzled by this thread, and I am at home with no AIX box to check. If I had to deal with a suspicious machine reboot, I would start by looking at the error report.
Then, after finding the cause or a possible cause, and the diagnosis in the report (permanent, software, etc.), I would start scratching my head and searching for more clues if needed. Working from the core dump alone is truly hard work if you have no idea what you are after.
So what did you find in the error report?
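For anyone following along, here is a rough sketch of how I would pull that information out of the error log. It assumes the standard AIX `errpt` command; the function name `show_recent_errors` is just something I made up for illustration, and on a non-AIX box it only prints a notice.

```shell
#!/bin/sh
# Sketch: summarise the AIX error log after an unexpected reboot.
# show_recent_errors is a hypothetical helper name, not an AIX command.

show_recent_errors() {
    # Bail out politely when errpt is not available (for example on Linux).
    if ! command -v errpt >/dev/null 2>&1; then
        echo "errpt not found (not an AIX host?)"
        return 1
    fi

    # One line per entry, newest first: identifier, timestamp, type, class, resource.
    errpt | head -n 20

    # Full detail for permanent (PERM) hardware-class (H) errors only.
    errpt -a -d H -T PERM
}
```

Run `show_recent_errors` around the time of the reboot, then use `errpt -a -j` with an identifier taken from the first column to read a single entry in full.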
There's a known issue with the 9117-MMCs that can affect system stability. I believe it was ECA337.
I do not believe the kernel has any issues. I have a few AIX 6.1 systems running some older TLs and they have been rock solid. In fact, looking at your lslpp -l output, you are using the same TL as I am (6.1 TL7 SP4).
I'd start by looking at the firmware levels of your 9117s, and the adapter firmware levels as well. Is this LPAR part of an HA cluster? There's a known memory leak that causes a system crash every so many months if you're on some versions of HACMP. IBM does offer extended service for AIX 6.1, but there will be no new fixes.
Looking over what has been posted, there was a memory exception. Was this running an in-house program?
I'm not very good at looking at dumps, and the few times I had to, I relied on IBM or the software vendor to address the problem.
I'd start by looking at errpt, your firmware, and whether you need ECA337.
A 9117 isn't a low-end frame either, so I'd hope you have IBM support for assistance?
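To collect the OS and firmware levels mentioned above in one go, something like this should do. It is a sketch assuming the stock AIX `oslevel` and `lsmcode` commands; the wrapper name `collect_levels` is hypothetical, and what the output looks like depends on your hardware.

```shell
#!/bin/sh
# Sketch: gather AIX OS level plus system and adapter firmware levels.
# collect_levels is a hypothetical helper name, not an AIX command.

collect_levels() {
    # Check that the AIX tools exist before calling them.
    for cmd in oslevel lsmcode; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd not found (not an AIX host?)"
            return 1
        fi
    done

    oslevel -s    # AIX level including technology level and service pack
    lsmcode -c    # system firmware level, without the interactive menus
    lsmcode -A    # microcode levels of the supported adapters and devices
}
```

Keep the output of `collect_levels` handy when you open a call with IBM, since support will usually ask for these levels first.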
ECA337 was a notification sent out to owners of P770s and P780s regarding a potential fault in the I/O backplane. It's an engineering change, and would have taken about 1.5 hours for a CE to address. This was announced back in March of 2017, I think. I had a pair of 9117-MMCs that were affected.
I'd give your IBM rep a call and see if it's still being offered. I haven't looked up the service life on those units yet. I know my small 740s are coming up on their EOS dates.
We have just enabled core dumps on our RHEL 5.7 OS. The Java process is terminating very often, so we enabled core dumps to analyse the issue and found the following in the core dump file.
Core was generated by `/usr/java/jdk1.6.0_06//bin/java -server -Xms1536m -Xmx1536m -Xmn576m -XX:+Aggre'.
Program... (0 Replies)
In a Solaris 8 environment, OS panics happen frequently and someone advised me to check the vmcore. :(
For the crash dump, can we use the Sun Explorer data collector package, including an analysis of the vmcore?
It may provide the panic message, including the program counter address, perhaps
... (3 Replies)
I'm new to the group and this is my first post. I'm hoping someone can help me out. I have a core dump that I need to analyze from a Unix box and I've never done this sort of thing before. I was told to run a pmap and pstack on the core file which provided two different output files. ... (3 Replies)