Crash is a tool used to analyse the core dump file created by a tool like kdump. Crash depends on the kexec utilities to obtain its input file. A standard Linux kernel, when booted with the crashkernel argument, reserves a small amount of memory for a standby dump-capture kernel. Upon a kernel panic, kexec triggers a warm reboot into the dump kernel, where the memory contents of the panicked kernel are backed up. A warm reboot does not erase the contents of memory, so they remain accessible across the reboot. Once the memory contents are dumped to a preconfigured location, the system cold reboots into the standard kernel. The dump can later be used to analyse the panic.
Installing and configuring Crash
To install the Crash tool, you can either install a distribution-specific RPM/deb package, or compile it from source with the following steps (as root):
wget -c http://people.redhat.com/anderson/crash-5.1.1.tar.gz   ## the current version as of this article
tar -zxvf crash-5.1.1.tar.gz
cd crash-5.1.1
make && make install
Apart from this, you need to prepare your target machine for dump capture. Make sure that the kernel running on this machine is compiled with the kdump-related options, including CONFIG_PROC_VMCORE. You also need to install the kexec-tools package, which can be downloaded from here.
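As a rough way to check the kernel options before going further, here is a minimal sketch. Only CONFIG_PROC_VMCORE is named in this article; the other option names are taken from the kernel's kdump documentation, and the config file location is an assumption that varies between distributions and kernel versions.

```python
# Sketch: report which kdump-related options are not enabled in a kernel config.
# Assumptions: the option list comes from the kernel's kdump documentation
# (only CONFIG_PROC_VMCORE is named in the article), and the config file lives
# at /boot/config-<release>, which is common but not universal.
import os
import platform

REQUIRED = ["CONFIG_KEXEC", "CONFIG_SYSFS", "CONFIG_DEBUG_INFO",
            "CONFIG_CRASH_DUMP", "CONFIG_PROC_VMCORE"]


def missing_options(config_text, required=REQUIRED):
    """Return the required options that are not set to 'y' in config_text."""
    enabled = set()
    for line in config_text.splitlines():
        name, sep, value = line.strip().partition("=")
        if sep and value == "y":
            enabled.add(name)
    return [opt for opt in required if opt not in enabled]


if __name__ == "__main__":
    path = "/boot/config-" + platform.release()  # assumed location
    if os.path.exists(path):
        with open(path) as handle:
            print("missing:", missing_options(handle.read()) or "none")
    else:
        print("no config file at", path)
```

An empty result only means the options are present in that file; it does not prove that the running kernel was built from it.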
Once you compile and install kexec-tools, you are provided with binaries such as makedumprd, which are used during the various phases of panic and dump capture. For the machine to be able to boot into the dump kernel, the following arguments must be appended to the bootloader’s kernel line. On my Ubuntu system, I see the following arguments appended to my kernel line:
linux /boot/vmlinuz-2.6.35-24-generic crashkernel=384M-2G:64M,2G-:128M
crashkernel is the required keyword. The memory settings read as follows: 384M-2G:64M means that if the installed RAM is between 384 MB and 2 GB, 64 MB is reserved; 2G-:128M means that if it is 2 GB or more, 128 MB is reserved. If RAM is less than 384 MB, no memory is reserved. So, depending on your system’s configuration, you can reserve an appropriate amount of memory for the dump kernel.
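The range rule above can be made concrete with a small sketch. The function names and unit handling below are my own illustration, not the kernel's parser; each range is treated as including its lower bound and excluding its upper bound.

```python
# Illustrative sketch only: mimics the crashkernel=<range>:<size> rule
# described in the text. It is not the kernel's own parser.

UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def to_bytes(text):
    """Convert a size such as '64M' or '2G' to bytes."""
    text = text.strip().upper()
    if text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


def reserved_for(spec, ram_bytes):
    """Return bytes reserved for the dump kernel, or 0 if no range matches."""
    for entry in spec.split(","):
        rng, size = entry.split(":")
        start, _, end = rng.partition("-")
        low = to_bytes(start)
        high = to_bytes(end) if end else None  # open-ended range, e.g. '2G-'
        if ram_bytes >= low and (high is None or ram_bytes < high):
            return to_bytes(size)
    return 0


spec = "384M-2G:64M,2G-:128M"
for ram in ("256M", "1G", "4G"):
    reserved = reserved_for(spec, to_bytes(ram))
    print(ram, "->", reserved // (1 << 20), "MB reserved")
```

With the sample line from my Ubuntu system, a 1 GB machine falls in the first range and a 4 GB machine in the second, while a 256 MB machine matches neither.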
On some Fedora and Red Hat-based distributions, you see syntax like crashkernel=128M@16M. This means: reserve 128 MB of memory, starting at the 16 MB physical address. Once these arguments are appended to the bootloader’s kernel line and saved, reboot the system; it is then ready to capture a panic and dump it.
Once a panic happens, the following files are fed to the crash utility to perform a dump analysis:
- Kernel (namelist): This is the uncompressed kernel binary (vmlinux), and not the vmlinuz file that you have in the /boot directory. vmlinux can be obtained easily from the compilation directory of the kernel. If you are running a stock kernel, you need to obtain vmlinux from your vendor.
- Dump image (dumpfile): This is the vmcore file captured by the dump kernel.
- Map file: This is typically the System.map file, which is found in the kernel source directory after compilation. This file is passed to the Crash tool with the -S option.
Once the above files are obtained from the panicked system, we are ready to perform dump analysis.
Exploring Crash with a sample dump
Let’s trigger a crash, and use the resulting dump to understand the Crash utility. Trigger a crash by running the following command as root:
echo c > /proc/sysrq-trigger
This triggers a panic; the system boots into the crash kernel and dumps system memory into the directory /var/crash/<date-time>/, in a file named vmcore. Once done, it boots back into the normal kernel.
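After a few test panics, /var/crash tends to hold several dated directories. The sketch below picks the most recent dump; the helper name is hypothetical, and it assumes the /var/crash/<date-time>/vmcore layout described above.

```python
# Sketch: find the newest vmcore, assuming the /var/crash/<date-time>/vmcore
# layout described in the text. latest_vmcore is a hypothetical helper name.
from pathlib import Path


def latest_vmcore(crash_dir="/var/crash"):
    """Return the most recently modified vmcore under crash_dir, or None."""
    candidates = list(Path(crash_dir).glob("*/vmcore"))
    if not candidates:
        return None
    return max(candidates, key=lambda path: path.stat().st_mtime)


print("latest dump:", latest_vmcore())
```

The selection is by modification time rather than by directory name, since the date-time format of the directory can differ between distributions.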
With the help of the vmlinux and System.map files, we will invoke the Crash tool, and view the sample output from it:
[root@DELL-RnD-India linux-2.6]# crash -S System.map vmlinux /var/crash/2011-01-10-12\:23/vmcore
crash 5.1.1
---snip---
crash: overriding /boot/System.map with System.map
GNU gdb (GDB) 7.0
This GDB was configured as "x86_64-unknown-linux-gnu"...
---snip------
  SYSTEM MAP: System.map
DEBUG KERNEL: vmlinux (2.6.36-rc6-ftrace+)
    DUMPFILE: /var/crash/2011-01-10-12:23/vmcore
        CPUS: 4
        DATE: Mon Jan 10 12:21:33 2011
      UPTIME: 00:06:56
LOAD AVERAGE: 0.80, 0.65, 0.31
       TASKS: 278
    NODENAME: DELL-RnD-India
     RELEASE: 2.6.36-rc6-ftrace+
     VERSION: #2 SMP Wed Sep 29 16:43:59 IST 2010
     MACHINE: x86_64 (2666 Mhz)
      MEMORY: 2 GB
       PANIC: "Oops: 0002 [#1] SMP " (check log for details)
         PID: 7203
     COMMAND: "bash"
        TASK: ffff88007b0d0000 [THREAD_INFO: ffff88007a6ba000]
         CPU: 0
       STATE: TASK_RUNNING (PANIC)
crash>
The above output shows you details about the kernel, the number of processors on the target machine, the command which caused the panic, etc.
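If you analyse dumps often, it helps to assemble the invocation in a script. This is a minimal sketch: the function names are hypothetical, and it only uses the -S option and the argument order from the session above.

```python
# Sketch: assemble the crash invocation from the three inputs named in the text.
# crash_command and missing_inputs are hypothetical helper names.
import os
import shlex


def crash_command(vmlinux, vmcore, system_map=None):
    """Return the argument list: crash [-S System.map] vmlinux vmcore."""
    args = ["crash"]
    if system_map:
        args += ["-S", system_map]
    return args + [vmlinux, vmcore]


def missing_inputs(*paths):
    """Return the given paths that do not exist as regular files."""
    return [path for path in paths if not os.path.isfile(path)]


# Paths below are the ones from the sample session; adjust them to your system.
inputs = ("vmlinux", "/var/crash/2011-01-10-12:23/vmcore", "System.map")
absent = missing_inputs(*inputs)
if absent:
    print("missing input files:", ", ".join(absent))
else:
    print(shlex.join(crash_command(*inputs)))
```

Checking for the files first saves a confusing start-up error when the vmlinux or the dump has been moved.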
Crash can also be used on a live system, by reading /dev/mem instead of the vmcore file. For this to work, you need to disable the CONFIG_STRICT_DEVMEM option while compiling the kernel. Stock kernels come with this option enabled, and will not let you use /dev/mem this way.
The help command
The most useful command is help, which lists all the commands available from within the crash tool:
bt      gdb     p       sig     waitq
btop    help    ps      struct  whatis
dev     irq     pte     swap    wr
dis     kmem    ptob    sym     q
eval    list    ptov    sys
exit    log     rd      task
extend  mach    repeat  timer

crash version: 5.1.1    gdb version: 7.0
To obtain help on any command, run help followed by the command name; for example, help bt.
The bt command
The bt (backtrace) command gives you the stack trace of the current context, and bt -a gives you the stack traces of the active tasks on all CPUs. When the crash tool starts, it sets the initial context to the panicked process. Here is sample output of the command:
crash> bt
PID: 7203  TASK: ffff88007b0d0000  CPU: 0  COMMAND: "bash"
 #0 [ffff88007a6bbb00] machine_kexec at ffffffff81027ac7
 #1 [ffff88007a6bbb80] crash_kexec at ffffffff810888c9
 #2 [ffff88007a6bbc50] oops_end at ffffffff814570c4
 #3 [ffff88007a6bbc80] no_context at ffffffff81032ee7
<snipped>
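Each frame line carries a frame number, a stack address in brackets, a symbol and a text address. If you save the output to a file, a small sketch like the one below pulls the frames apart; the regular expression is my assumption based on the sample above, not an official format description.

```python
# Sketch: split saved bt output into frames. The pattern is inferred from the
# sample output in this article and may need adjusting for other versions.
import re

# One frame looks like: "#1 [ffff88007a6bbb80] crash_kexec at ffffffff810888c9"
FRAME = re.compile(r"#(\d+)\s+\[([0-9a-f]+)\]\s+(\S+)\s+at\s+([0-9a-f]+)")


def parse_frames(bt_text):
    """Return (frame number, stack address, symbol, text address) tuples."""
    return [(int(num), stack, symbol, addr)
            for num, stack, symbol, addr in FRAME.findall(bt_text)]


sample = """\
PID: 7203  TASK: ffff88007b0d0000  CPU: 0  COMMAND: "bash"
 #0 [ffff88007a6bbb00] machine_kexec at ffffffff81027ac7
 #1 [ffff88007a6bbb80] crash_kexec at ffffffff810888c9
 #2 [ffff88007a6bbc50] oops_end at ffffffff814570c4
 #3 [ffff88007a6bbc80] no_context at ffffffff81032ee7
"""

for num, stack, symbol, addr in parse_frames(sample):
    print(f"frame {num}: {symbol} at {addr}")
```

In this trace the topmost frames (machine_kexec, crash_kexec) belong to the dump-capture path itself, so the frames of interest for the panic lie further down the trace.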
The ps command
This command displays the status of all the processes, or of selected ones. It has a large number of options that provide a lot of information during dump analysis; refer to help ps for details. Here is a sample output:
crash> ps -a 5390
PID: 5390  TASK: ffff8800799ac650  CPU: 2  COMMAND: "httpd"
ARG: /usr/sbin/httpd
ENV: TERM=linux
     PATH=/sbin:/usr/sbin:/bin:/usr/bin
     runlevel=5
<snipped....>
The set command
You can change the current context using the set command, which takes the PID of the process (which can be obtained from the ps command) or its task address. It takes various other arguments as well, which you can learn about by running help set. If set is used without arguments, it shows information about the current context. For example, to switch context by task address:
crash> set ffff88007d7c0000
    PID: 1
COMMAND: "init"
   TASK: ffff88007d7c0000 [THREAD_INFO: ffff88007d7ba000]
    CPU: 0
  STATE: TASK_INTERRUPTIBLE
Here, the address is the task pointer of the init process.
The files command
This can be used to get all the open files in the current context; it is a context-sensitive command:
crash> set 1
    PID: 1
COMMAND: "init"
   TASK: ffff88007d7c0000 [THREAD_INFO: ffff88007d7ba000]
    CPU: 0
  STATE: TASK_INTERRUPTIBLE
crash> files
PID: 1  TASK: ffff88007d7c0000  CPU: 0  COMMAND: "init"
ROOT: /    CWD: /
 FD       FILE            DENTRY           INODE       TYPE PATH
  0 ffff880037a58f00 ffff88007cd5be40 ffff88007d090c90 CHR  /dev/null
  1 ffff880037a58f00 ffff88007cd5be40 ffff88007d090c90 CHR  /dev/null
  2 ffff880037a58f00 ffff88007cd5be40 ffff88007d090c90 CHR  /dev/null
  3 ffff880037a58a80 ffff88003747b000 ffff88003750d540 FIFO
  4 ffff880037a586c0 ffff88003747b000 ffff88003750d540 FIFO
  5 ffff880037a58c00 ffff880037493240 ffff88007cdc2ca0 UNKN anon_inode:/inotify
  6 ffff880037a58180 ffff8800374936c0 ffff88007cdc2ca0 UNKN anon_inode:/inotify
  7 ffff880076087a80 ffff8800376d8540 ffff88007ceb87b0 SOCK
  8 ffff880079a25d80 ffff88007a205e40 ffff880079eabc30 SOCK
  9 ffff88007688b6c0 ffff88007a8f0480 ffff88003752e830 SOCK
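For a process with many descriptors, a quick summary by type is easier to read than the raw table. The sketch below works on saved output of the files command; the column pattern is my assumption from the sample above (FD, three addresses, TYPE, optional PATH).

```python
# Sketch: summarise saved output of the files command by descriptor type.
# The row pattern is inferred from the sample output in this article.
import re
from collections import Counter

# FD, FILE, DENTRY and INODE columns, then TYPE and an optional PATH.
ROW = re.compile(
    r"^\s*(\d+)\s+([0-9a-f]+)\s+([0-9a-f]+)\s+([0-9a-f]+)\s+(\S+)\s*(.*)$")


def parse_files(output):
    """Return (fd, type, path) tuples; path is '' when none is shown."""
    rows = []
    for line in output.splitlines():
        match = ROW.match(line)
        if match:
            fd, _file, _dentry, _inode, ftype, path = match.groups()
            rows.append((int(fd), ftype, path.strip()))
    return rows


sample = """\
 FD       FILE            DENTRY           INODE       TYPE PATH
  0 ffff880037a58f00 ffff88007cd5be40 ffff88007d090c90 CHR  /dev/null
  3 ffff880037a58a80 ffff88003747b000 ffff88003750d540 FIFO
  5 ffff880037a58c00 ffff880037493240 ffff88007cdc2ca0 UNKN anon_inode:/inotify
  7 ffff880076087a80 ffff8800376d8540 ffff88007ceb87b0 SOCK
"""

rows = parse_files(sample)
print(Counter(ftype for _fd, ftype, _path in rows))
```

Header and prompt lines are skipped because they do not start with a descriptor number.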
We have looked into some regularly used commands. For other commands, kindly refer to the help section.
I referred to the Documentation/kdump/kdump.txt file in the kernel source tree while writing this article. Apart from that, I also occasionally referred to numerous other articles available on the Web.