KDUMP(7) User Manuals KDUMP(7)
NAME
kdump - Saving kernel dumps in SUSE
SYNOPSIS
(not applicable)
DESCRIPTION
This manual page gives an overview of kdump configuration on SUSE. This version applies to SUSE LINUX Enterprise 11 and openSUSE 11.1.
Kdump is a technology that saves the memory contents of a crashed system to disk or network so that the cause of the crash can be analysed
later. When the system crashes, the mechanism uses kexec to boot a normal Linux kernel (that has been loaded into the system
previously) which then has access to the old memory contents via the /proc/vmcore interface and can save them away.
After the memory has been saved, the system reboots (without kexec).
As mentioned above, that panic kernel has to be loaded into the system. That is accomplished via kexec(8) in an init script at system
bootup. To have memory for that panic kernel and also have RAM for the execution of that panic kernel, one has to reserve kernel memory
with a special boot parameter (crashkernel).
While it's possible in theory to boot the full system by that panic kernel, on SUSE we use the approach of having a special initramfs that
saves the dump to disk or network and then reboots. That has the advantage that less memory is necessary and that the root file system does
not need to be intact if you save to another file system or to network.
AUTOMATIC CONFIGURATION WITH YAST
A simple method to use kdump on SUSE is to use the YaST kdump module. Just install the package yast2-kdump, for example with
# zypper install yast2-kdump
and start the YaST2 module with
# yast2 kdump
Everything should be self-explanatory there.
MANUAL SETUP
The following steps are needed to set up kdump manually (each step is described below):
1. Install the required software packages,
2. add the crashkernel parameter to bootloader configuration,
3. enable the kdump service,
4. configure kdump (/etc/sysconfig/kdump) and
5. load the kdump kernel.
Required software
The following software packages are required for kdump:
o kexec-tools
o kdump
o makedumpfile
There is no special kernel-kdump required like in earlier versions of SUSE LINUX Enterprise. The technical reason is that the normal kernel
is relocatable now and can be used as kdump kernel, i.e. it's possible to load and execute the normal kernel at any address, not only the
compiled-in address as before.
Bootloader configuration
It's necessary to reserve a certain amount of memory in the normal system at boot time which will be used by kexec(8) to load the panic
kernel. To achieve that, you have to add a parameter called crashkernel in bootloader configuration. The syntax is:
crashkernel=size@offset
The offset is the load offset, i.e. the physical base address on which the memory reservation area starts. Starting with version 2.6.27,
it's not necessary to specify that offset manually since the kernel chooses a suitable base address automatically.
For the size, the following values are recommended:
+-------------+---------------------------------------+
|Architecture | Size |
+-------------+---------------------------------------+
|i386 | 64M |
+-------------+---------------------------------------+
|x86_64 | 64M or 128M on large machines |
+-------------+---------------------------------------+
|ppc64 | 128M or 256M on large machines |
+-------------+---------------------------------------+
|ia64 | 512M (or more on very large machines) |
+-------------+---------------------------------------+
Example: crashkernel=64M (on a normal PC system)
Note
There's also a more advanced syntax that makes the amount of memory dependent on system RAM. See the section called "Extended
crashkernel commandline".
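After rebooting with the new bootloader entry, it can be useful to confirm that the parameter really reached the kernel. The following is a minimal sketch, not part of the kdump package: the function name crashkernel_value is made up for this example, and it only looks at the command line string it is given.

```shell
# Print the value of the crashkernel= parameter found in a kernel command
# line string, or return non-zero if the parameter is absent.
# (crashkernel_value is a hypothetical helper name.)
crashkernel_value() (
    set -f                      # no filename expansion while splitting words
    for word in $1; do
        case "$word" in
            crashkernel=*)
                printf '%s\n' "${word#crashkernel=}"
                exit 0
                ;;
        esac
    done
    exit 1
)

# Usage on a running system:
#   crashkernel_value "$(cat /proc/cmdline)" || echo "no crashkernel parameter"
```

The function runs in a subshell, so the set -f and the exit calls do not affect the calling shell.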
Enable kdump service
The kdump runlevel script just loads the kdump kernel at boot. To enable it, use the YaST runlevel editor or simply
# chkconfig boot.kdump on
in a shell. You can also execute it manually with rckdump start.
Configure kdump
The default configuration should work out of the box. You can tweak several configuration options in the /etc/sysconfig/kdump configuration
file.
Important
If you make changes in that configuration file, you always have to execute rckdump restart manually to apply those changes. If you
don't, the changes will only apply after the next system reboot.
See the section "CONFIGURATION" later and/or kdump(5) for a description of the configuration options.
Load the kdump kernel
As mentioned above, the init script /etc/init.d/boot.kdump is responsible for loading the kdump kernel. The normal system kernel is used
as kdump kernel, so no special kernel image is required.
The initramfs, however, is a special one built by mkdumprd(8). Normally, you don't have to take care of that step since the init
script checks if the initramfs is up to date (by reading the configuration file modification time) and rebuilds it if necessary.
To manually load the kdump kernel (i.e., without the SUSE init script), you have to use the kexec(8) tool with the -p (panic kernel)
parameter like:
# kexec -p /boot/vmlinuz-version --initrd=/boot/initrd-version-kdump --reuse-cmdline
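To confirm that a panic kernel is actually loaded, the kernel exposes a flag in /sys/kernel/kexec_crash_loaded that reads 1 when one is present. The helper below is a small sketch around that file; the function name panic_kernel_loaded is an assumption of this example, and the optional argument exists only so the check can be tried against an ordinary file.

```shell
# Return 0 if the given status file contains "1".
# The default path is the kernel's sysfs flag for a loaded crash kernel.
# (panic_kernel_loaded is a hypothetical helper name.)
panic_kernel_loaded() (
    flag_file=${1:-/sys/kernel/kexec_crash_loaded}
    [ -r "$flag_file" ] && [ "$(cat "$flag_file")" = "1" ]
)

# Usage:
#   panic_kernel_loaded && echo "panic kernel is loaded"
```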
TESTING
It makes perfect sense to test the kdump configuration in a sane system state, i.e. not to wait until the system really crashes but to
trigger the dump manually. To do that, use the SysRq mechanism, i.e. just execute
# echo s > /proc/sysrq-trigger
# echo u > /proc/sysrq-trigger
# echo c > /proc/sysrq-trigger
After that, the panic kernel should boot and the dump should be saved.
CONFIGURATION
The configuration file is /etc/sysconfig/kdump. Just edit this file with a plain text editor to adjust the settings. You can also use the
YaST2 sysconfig editor. All variables are described in kdump(5). Here's a brief overview of some variables that are worth tweaking.
Save Directory
The most important setting is where the dump should be saved. The following methods are available:
o local file,
o FTP,
o SFTP (SSH),
o NFS,
o CIFS.
The recommendation is to use FTP or SFTP for network dumping or the local file dump. The configuration variable KDUMP_SAVEDIR has to be
filled with a URL to where the dump should be saved. The syntax is described in kdump(5).
If the directory does not exist, it will be created. Since the dump is taken in initrd, the network and mount configuration is a bit
different from the normal system. However, the mkdumprd(8) script is designed to do everything automatically for you. If you would like to
use a special network interface, see the KDUMP_NETCONFIG setting.
Example:
o file:///var/log/dump
o ftp://user@host:server/incoming/dumps
Note
If you want to use SFTP with public key authentication, make sure to read the "Secure File Transfer Protocol" section in kdump(5).
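A typo in KDUMP_SAVEDIR is easy to miss until a real crash happens, so a quick sanity check of the URL scheme may help. The sketch below assumes that the five targets listed above map to the schemes file, ftp, sftp, nfs and cifs; only file and ftp appear in the examples here, so treat the other three names as assumptions and consult kdump(5) for the authoritative syntax. The function name savedir_scheme is made up for this example.

```shell
# Print the URL scheme of a KDUMP_SAVEDIR value, or return non-zero if the
# value has no scheme or the scheme is not one of the five expected ones.
# (savedir_scheme is a hypothetical helper; the scheme list is an assumption.)
savedir_scheme() (
    case "$1" in
        *://*) scheme=${1%%://*} ;;
        *)     exit 1 ;;
    esac
    case "$scheme" in
        file|ftp|sftp|nfs|cifs) printf '%s\n' "$scheme" ;;
        *)                      exit 1 ;;
    esac
)

# Usage:
#   . /etc/sysconfig/kdump
#   savedir_scheme "$KDUMP_SAVEDIR" || echo "unexpected KDUMP_SAVEDIR: $KDUMP_SAVEDIR"
```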
Deletion of old dumps
If you save the dumps to your local file system, you may want kdump to delete old dumps automatically. Set KDUMP_KEEP_OLD_DUMPS to the
number of old dumps that should be preserved. To disable deletion of old dumps, set it to 0, and to delete all old dumps, set it to -1.
If the partition has less than KDUMP_FREE_DISK_SIZE megabytes free disk space after saving the dump, the dump is not copied at all.
Important
These two options don't apply to network dump targets.
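The three cases of KDUMP_KEEP_OLD_DUMPS (0, -1 and a positive count) can be illustrated with a small dry-run function. This is only a sketch of the rule as described above, not the code kdump uses: the function name dumps_to_delete and the directory names are hypothetical, and it assumes the old dumps are passed newest first and that the freshly saved dump is not in the list.

```shell
# Print which old dumps would be removed for a given KDUMP_KEEP_OLD_DUMPS
# value. Arguments: the keep value, then the old dumps ordered newest first.
# (dumps_to_delete is a hypothetical helper; it deletes nothing.)
dumps_to_delete() (
    keep=$1
    shift
    [ "$keep" -eq 0 ] && exit 0          # 0: deletion is disabled
    index=0
    for dump in "$@"; do
        index=$((index + 1))
        # -1: every old dump goes; N > 0: everything after the newest N goes
        if [ "$keep" -lt 0 ] || [ "$index" -gt "$keep" ]; then
            printf '%s\n' "$dump"
        fi
    done
)

# Usage (directory names are placeholders):
#   dumps_to_delete 2 dump-d dump-c dump-b dump-a
```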
Dump Filtering and Compression
An uncompressed and unfiltered kernel dump is as large as the RAM of your system. To get smaller files (for example, to send them to
support), you can compress the whole dump file afterwards. However, the drawback is that the dump has to be uncompressed again before
opening, so the disk space needs to be there in any case.
To use page compression which compresses every page and allows dynamic uncompression with the crash(8) debugging tool, set KDUMP_DUMPFORMAT
to compressed (which is actually the default).
To filter the dump, you have to set the KDUMP_DUMPLEVEL. Then not all memory is saved to disk but only memory that does not fulfil some
criteria. For example, you may want to leave out pages that are completely filled with zeroes as they don't contain any useful information.
The following table lists for each KDUMP_DUMPLEVEL the pages that are skipped, i.e. 0 produces a full dump and 31 is the smallest dump.
+-----------+-----------+------------+---------------+-----------+-----------+
|dump level | zero page | cache page | cache private | user data | free page |
+-----------+-----------+------------+---------------+-----------+-----------+
| 0 | | | | | |
+-----------+-----------+------------+---------------+-----------+-----------+
| 1 | X | | | | |
+-----------+-----------+------------+---------------+-----------+-----------+
| 2 | | X | | | |
+-----------+-----------+------------+---------------+-----------+-----------+
| 3 | X | X | | | |
+-----------+-----------+------------+---------------+-----------+-----------+
| 4 | | X | X | | |
+-----------+-----------+------------+---------------+-----------+-----------+
| 5 | X | X | X | | |
+-----------+-----------+------------+---------------+-----------+-----------+
| 6 | | X | X | | |
+-----------+-----------+------------+---------------+-----------+-----------+
| 7 | X | X | X | | |
+-----------+-----------+------------+---------------+-----------+-----------+
| 8 | | | | X | |
+-----------+-----------+------------+---------------+-----------+-----------+
| 9 | X | | | X | |
+-----------+-----------+------------+---------------+-----------+-----------+
| 10 | | X | | X | |
+-----------+-----------+------------+---------------+-----------+-----------+
| 11 | X | X | | X | |
+-----------+-----------+------------+---------------+-----------+-----------+
| 12 | | X | X | X | |
+-----------+-----------+------------+---------------+-----------+-----------+
| 13 | X | X | X | X | |
+-----------+-----------+------------+---------------+-----------+-----------+
| 14 | | X | X | X | |
+-----------+-----------+------------+---------------+-----------+-----------+
| 15 | X | X | X | X | |
+-----------+-----------+------------+---------------+-----------+-----------+
| 16 | | | | | X |
+-----------+-----------+------------+---------------+-----------+-----------+
| 17 | X | | | | X |
+-----------+-----------+------------+---------------+-----------+-----------+
| 18 | | X | | | X |
+-----------+-----------+------------+---------------+-----------+-----------+
| 19 | X | X | | | X |
+-----------+-----------+------------+---------------+-----------+-----------+
| 20 | | X | X | | X |
+-----------+-----------+------------+---------------+-----------+-----------+
| 21 | X | X | X | | X |
+-----------+-----------+------------+---------------+-----------+-----------+
| 22 | | X | X | | X |
+-----------+-----------+------------+---------------+-----------+-----------+
| 23 | X | X | X | | X |
+-----------+-----------+------------+---------------+-----------+-----------+
| 24 | | | | X | X |
+-----------+-----------+------------+---------------+-----------+-----------+
| 25 | X | | | X | X |
+-----------+-----------+------------+---------------+-----------+-----------+
| 26 | | X | | X | X |
+-----------+-----------+------------+---------------+-----------+-----------+
| 27 | X | X | | X | X |
+-----------+-----------+------------+---------------+-----------+-----------+
| 28 | | X | X | X | X |
+-----------+-----------+------------+---------------+-----------+-----------+
| 29 | X | X | X | X | X |
+-----------+-----------+------------+---------------+-----------+-----------+
| 30 | | X | X | X | X |
+-----------+-----------+------------+---------------+-----------+-----------+
| 31 | X | X | X | X | X |
+-----------+-----------+------------+---------------+-----------+-----------+
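The table follows a bit pattern: 1 skips zero pages, 2 cache pages, 4 private cache pages, 8 user data and 16 free pages, with the peculiarity that 4 always skips plain cache pages as well (which is why levels 4 and 6 have identical rows). The sketch below reproduces the table rows from the number; the function name skipped_pages and the output labels are choices of this example only.

```shell
# Print the page types that are skipped for a KDUMP_DUMPLEVEL, following the
# table above. (skipped_pages is a hypothetical helper name.)
skipped_pages() (
    level=$1
    if [ "$level" -lt 0 ] || [ "$level" -gt 31 ]; then
        echo "dump level must be between 0 and 31" >&2
        exit 1
    fi
    out=""
    [ $((level & 1)) -ne 0 ] && out="$out zero"
    # Per the table, the cache-private bit (4) also skips plain cache pages.
    [ $((level & 6)) -ne 0 ] && out="$out cache"
    [ $((level & 4)) -ne 0 ] && out="$out cache-private"
    [ $((level & 8)) -ne 0 ] && out="$out user"
    [ $((level & 16)) -ne 0 ] && out="$out free"
    printf '%s\n' "${out# }"
)

# Usage:
#   skipped_pages 31
```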
Automatic Reboot and Error handling
If you want to have your machine rebooted automatically after the dump has been copied, set KDUMP_IMMEDIATE_REBOOT to yes. The variable
KDUMP_CONTINUE_ON_ERROR controls whether a shell is opened if something goes wrong while saving the dump, so that you can fix the problem
manually. This is mostly for debugging.
In production, use that only if you have a serial console since the VGA console and keyboard are not reliable in the kdump environment.
Notification
If you enable notification support, then you get an email after the dump has been copied (and before the KDUMP_IMMEDIATE_REBOOT takes
place). Because no mail server is running in initrd to accept the mail, you have to configure an SMTP server:
o KDUMP_SMTP_SERVER must hold the hostname (and an optional port, separated by a colon) of an SMTP server.
o KDUMP_SMTP_USER and KDUMP_SMTP_PASSWORD must be set to username and password if SMTP AUTH should be used, or empty otherwise (plain
SMTP without authentication will be used).
Then an email will be sent to the address in KDUMP_NOTIFICATION_TO (only one address possible) and KDUMP_NOTIFICATION_CC (multiple
addresses possible).
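Put together, the notification part of /etc/sysconfig/kdump might look like the following excerpt. All values are placeholders (mail.example.com and admin@example.com are not real defaults), and the user variable is spelled KDUMP_SMTP_USER here, consistent with the other KDUMP_SMTP_* names. The separator for multiple CC addresses is described in kdump(5), so the CC variable is simply left empty in this sketch.

```shell
# Excerpt of /etc/sysconfig/kdump; every value below is a placeholder.
KDUMP_SMTP_SERVER="mail.example.com:25"   # hostname, optional :port
KDUMP_SMTP_USER=""                        # empty: plain SMTP without AUTH
KDUMP_SMTP_PASSWORD=""
KDUMP_NOTIFICATION_TO="admin@example.com" # exactly one address
KDUMP_NOTIFICATION_CC=""                  # optional further recipients
```

Remember to run rckdump restart after changing these values.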
Debugging options
If something goes wrong and you have opened a bug report, you may be asked to increase verbosity to report what's going wrong. This is
also useful if you would like to find the cause yourself.
First, you can increase KDUMP_VERBOSE. The maximum log level is 15. That gives information both when loading the dump kernel (i.e. the
rckdump start command) and when copying the dump in initrd.
Warning
If you use a VGA console and trigger the dump when X11 is running (i.e. your graphical desktop), you might not see any output. Use a
serial console in that case, or try to trigger the dump from Linux console (i.e. press Ctrl-Alt-F1 in your graphical desktop and log in
there).
If the problem is the makedumpfile(8) filtering tool, then set MAKEDUMPFILE_OPTIONS to -D to get debugging output of makedumpfile.
ADVANCED CONFIGURATION
Trigger Kdump on NMI (i386/x86_64 only)
Some systems (mostly "Enterprise" servers) have a so-called NMI button (physically or via the remote management consoles) that triggers an
NMI manually if the system hangs completely and even SysRQ does not work any more.
If you want to trigger a kdump in that case, you have to execute
# sysctl kernel.panic_on_unrecovered_nmi=1
manually or (if you want to make that a permanent setting) add
kernel.panic_on_unrecovered_nmi=1
in /etc/sysctl.conf.
Extended crashkernel commandline
While the "crashkernel=size[@offset]" syntax is sufficient for most configurations, sometimes it's handy to have the reserved memory
dependent on the value of System RAM -- that's mostly for distributors that pre-setup the kernel command line to avoid an unbootable system
after some memory has been removed from the machine.
The syntax is:
crashkernel=<range1>:<size1>[,<range2>:<size2>,...][@offset]
range=start-[end]
where start is inclusive and end is exclusive.
For example:
crashkernel=512M-2G:64M,2G-:128M
This would mean:
1. If the RAM size is smaller than 512M, then don't reserve anything (this is the "rescue" case).
2. If the RAM size is at least 512M but smaller than 2G, then reserve 64M.
3. If the RAM size is 2G or larger, then reserve 128M.
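The range rule (start inclusive, end exclusive, open end allowed) can be tried out with a small evaluator. This is an illustrative sketch and not the kernel's parser: the function names to_mib and crashkernel_for_ram are made up, only the M and G suffixes are handled, and the RAM size is passed in by hand rather than detected.

```shell
# Convert a size such as 512M or 2G to MiB (only M and G are handled here).
to_mib() (
    case "$1" in
        *M) echo "${1%M}" ;;
        *G) echo $(( ${1%G} * 1024 )) ;;
        *)  exit 1 ;;
    esac
)

# Print the size that a range-based crashkernel value selects for a given
# RAM size, or "none" if no range matches.
# (crashkernel_for_ram is a hypothetical helper name.)
crashkernel_for_ram() (
    ram=$(to_mib "$1") || exit 1
    spec=${2%%@*}                 # drop an optional @offset
    set -f
    IFS=,                         # entries are separated by commas
    for entry in $spec; do
        range=${entry%%:*}
        size=${entry#*:}
        start=$(to_mib "${range%%-*}") || exit 1
        end=${range#*-}           # empty for an open-ended range
        if [ "$ram" -ge "$start" ]; then
            if [ -z "$end" ] || [ "$ram" -lt "$(to_mib "$end")" ]; then
                printf '%s\n' "$size"
                exit 0
            fi
        fi
    done
    echo "none"
)

# Usage:
#   crashkernel_for_ram 1G "512M-2G:64M,2G-:128M"
```

Both functions run in subshells, so the changed IFS does not leak into the calling shell.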
BUGS
Known Problems
There are known problems when using Kdump on Xen.
1. Dump filtering does not work. Use a KDUMP_DUMPLEVEL of 0 and set KDUMP_DUMPFORMAT to ELF. These are not the default settings, so you
have to change them if you're using Xen.
New Bugs
Please report bugs and enhancement requests at https://bugzilla.novell.com.
COPYING
Copyright (c) 2008 Bernhard Walle <bwalle@suse.de>. Free use of this software is granted under the terms of the GNU General Public License
(GPL), version 2 or later.
SEE ALSO
kexec(8), kdump(5), makedumpfile(8), crash(8) http://en.opensuse.org/Kdump
kdump 0.8.1 07/05/2010 KDUMP(7)