Hi,
I am experiencing frequent system hangs and hard kernel panics, almost three times a day. The system becomes totally unresponsive, and the only way to recover is a hard power cycle (unplug the power cable and plug it back in after 30 seconds). I enabled kdump, but the dump files are as large as 16 GB and I have been unable to analyze them. The error that repeats in /var/log/messages is:
Quote:
Aug 24 18:05:35 blr-cos-mdb01 kernel: BUG: soft lockup - CPU#0 stuck for 10s! [mysqld:5365]
Aug 24 18:05:35 blr-cos-mdb01 kernel: CPU 0:
Aug 24 18:05:45 blr-cos-mdb01 kernel: Modules linked in: ipv6 xfrm_nalgo crypto_api hidp l2cap bluetooth lockd sunrpc cpufreq_ondemand acpi_cpufreq freq_table dm_multipath scsi_dh video backlight sbs power_meter hwmon i2c_ec i2c_core dell_wmi wmi button battery asus_acpi acpi_memhotplug ac parport_pc lp parport floppy snd_hda_intel sr_mod tpm_infineon snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq cdrom tpm snd_seq_device snd_pcm_oss snd_mixer_oss tpm_bios snd_pcm e1000e snd_timer shpchp serio_raw snd_page_alloc snd_hwdep pcspkr sg snd soundcore dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod ahci libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Aug 24 18:05:45 blr-cos-mdb01 kernel: Pid: 5365, comm: mysqld Not tainted 2.6.18-194.3.1.el5 #1
Aug 24 18:05:45 blr-cos-mdb01 kernel: RIP: 0010:[] [] __d_lookup+0xe2/0xff
Aug 24 18:05:45 blr-cos-mdb01 kernel: RSP: 0018:ffff8101a415fc88 EFLAGS: 00000282
Aug 24 18:05:45 blr-cos-mdb01 kernel: RAX: ffff8103c6c864c8 RBX: ffff8103c6c864c8 RCX: 0000000000000015
Aug 24 18:05:45 blr-cos-mdb01 kernel: RDX: 00000000000db1d6 RSI: ffff8101a415fd28 RDI: ffff8103c83274b0
Aug 24 18:05:45 blr-cos-mdb01 kernel: RBP: ffff810417bbf800 R08: 0000000000008001 R09: ffff81041747e5c0
Aug 24 18:05:45 blr-cos-mdb01 kernel: R10: ffff810188c8c580 R11: ffffffff8002c3e0 R12: ffff81040c0840c0
Aug 24 18:05:45 blr-cos-mdb01 kernel: R13: 0000000000000000 R14: ffff8101a394e348 R15: ffff8101a394e348
Aug 24 18:05:45 blr-cos-mdb01 kernel: FS: 00000000405c8940(0063) GS:ffffffff803ca000(0000) knlGS:0000000000000000
Aug 24 18:05:45 blr-cos-mdb01 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Aug 24 18:05:45 blr-cos-mdb01 kernel: CR2: 00002aace6224000 CR3: 0000000401db3000 CR4: 00000000000006e0
Aug 24 18:05:45 blr-cos-mdb01 kernel:
Aug 24 18:05:45 blr-cos-mdb01 kernel: Call Trace:
Aug 24 18:05:45 blr-cos-mdb01 kernel: [] __d_lookup+0xb0/0xff
Aug 24 18:05:45 blr-cos-mdb01 kernel: [] do_lookup+0x2c/0x1e6
Aug 24 18:05:45 blr-cos-mdb01 kernel: [] __link_path_walk+0xa01/0xf42
Aug 24 18:05:45 blr-cos-mdb01 kernel: [] link_path_walk+0x42/0xb2
Aug 24 18:05:45 blr-cos-mdb01 kernel: [] do_path_lookup+0x275/0x2f1
Aug 24 18:05:45 blr-cos-mdb01 kernel: [] __path_lookup_intent_open+0x56/0x97
Aug 24 18:05:45 blr-cos-mdb01 kernel: [] open_namei+0x73/0x6d5
Aug 24 18:05:45 blr-cos-mdb01 kernel: [] do_page_fault+0x4fe/0x874
Aug 24 18:05:45 blr-cos-mdb01 kernel: [] do_filp_open+0x1c/0x38
Aug 24 18:05:45 blr-cos-mdb01 kernel: [] _atomic_dec_and_lock+0x39/0x57
Aug 24 18:05:45 blr-cos-mdb01 kernel: [] do_sys_open+0x44/0xbe
Aug 24 18:05:45 blr-cos-mdb01 kernel: [] tracesys+0xd5/0xe0
Aug 24 18:05:45 blr-cos-mdb01 kernel:
Aug 24 18:22:23 blr-cos-mdb01 syslogd 1.4.1: restart.
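To tally how often these lockups occur and which process is named each time, a small script along the following lines could help. This is only a rough sketch: the regular expression is based solely on the "BUG: soft lockup" line format quoted above, and the function names (find_soft_lockups, count_by_command, summarize) are made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as:
#   kernel: BUG: soft lockup - CPU#0 stuck for 10s! [mysqld:5365]
# The pattern is an assumption derived from the log excerpt above.
SOFT_LOCKUP = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)


def find_soft_lockups(lines):
    """Return a list of (cpu, seconds, command, pid) tuples, one per match."""
    events = []
    for line in lines:
        m = SOFT_LOCKUP.search(line)
        if m:
            events.append(
                (int(m.group("cpu")), int(m.group("secs")),
                 m.group("comm"), int(m.group("pid")))
            )
    return events


def count_by_command(events):
    """Count soft lockup events per command name."""
    return Counter(comm for _cpu, _secs, comm, _pid in events)


def summarize(path="/var/log/messages"):
    """Print a per-command tally of soft lockups found in the given log file."""
    with open(path, errors="replace") as fh:
        events = find_soft_lockups(fh)
    for comm, n in count_by_command(events).most_common():
        print(f"{comm}: {n} soft lockup(s)")
```

Calling summarize() as root (or summarize("/path/to/copy") on a copy of the log) prints one line per command name. It only counts events and does not diagnose the cause.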
I am running CentOS release 5.5 (Final) with kernel-2.6.18-194.3.1.el5. The hardware is an HP dc7900 with 16 GB RAM, Intel Core 2 Duo E8400/3 GHz/4GB RAM, 160 GB HDD. I have installed the following MySQL builds from Percona:
Percona-XtraDB-1.0.6-10.2-5.1.45-10.2.rhel5
Percona-Server-server-51-5.1.47-rel11.1.51.rhel5
Percona-XtraDB-1.0.3-5-5.1.34-5.rhel5
Percona-Server-shared-compat-5.1.43-3
Percona-Server-client-51-5.1.47-rel11.1.51.rhel5
Percona-Server-test-51-5.1.47-rel11.1.51.rhel5
Percona-Server-devel-51-5.1.47-rel11.1.51.rhel5
Percona-Server-shared-51-5.1.47-rel11.1.51.rhel5
The output of uname -a is:
Linux blr-cos-mdb01.digi.com 2.6.18-194.3.1.el5 #1 SMP Thu Sep 3 03:28:30 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
What could be the issue, and how can I resolve it?
Regards
Prashant