CentOS Kdump
Installing kdump
yum install kexec-tools
Configuring the Memory Usage
Changing Memory Options in GRUB2
Open the /etc/default/grub
configuration file as root
using a plain text editor such as vim or Gedit.
In this file, locate the line beginning with GRUB_CMDLINE_LINUX
. The line will look similar to the following:
GRUB_CMDLINE_LINUX="rd.lvm.lv=rhel/swap crashkernel=auto rd.lvm.lv=rhel/root rhgb quiet"
Note the highlighted crashkernel=
option; this is where the reserved memory is configured.
Change the value of the crashkernel=
option to the amount of memory you want to reserve. For example, to reserve 128 MB of memory, use the following:
crashkernel=128M
Minimum amount of reserved memory required for kdump: 2 GB and more should reserved 160 MB + 2 bits for every 4 KB of RAM. For a system with 1 TB of memory, 224 MB is the minimum (160 + 64 MB).
Configuring kdump on the Command Line
Configuring the Core Collector
To reduce the size of the vmcore
dump file, kdump
allows you to specify an external application (a core collector) to compress the data, and optionally leave out all irrelevant information. Currently, the only fully supported core collector is makedumpfile
.
To enable the core collector, as root
, open the /etc/kdump.conf
configuration file in a text editor, remove the hash sign (“#”) from the beginning of the #core_collector makedumpfile -l --message-level 1 -d 31
line, and edit the command line options as described below.
To enable the dump file compression, add the -c parameter. For example:
core_collector makedumpfile -c
Configuring the Default Action
default reboot
Enabling the Service
systemctl enable kdump.service
systemctl start kdump.service
Testing the kdump Configuration
~]# systemctl is-active kdump
active
Then type the following commands at a shell prompt:
echo 1 > /proc/sys/kernel/sysrq
echo c > /proc/sysrq-trigger
This will force the Linux kernel to crash, and the address-YYYY-MM-DD-HH:MM:SS/vmcore
file will be copied to the location you have selected in the configuration (that is, to /var/crash/
by default).
分析 Core Dump
将上面产生的 vmcore 文件放在一个具有相同内核的服务器中进行调试,不要在生产环境中调试,同时确保 /etc/yum.conf
中 exclude
不包含 kernel
,/etc/yum.repos.d/CentOS-Debuginfo.repo
中 base-debuginfo
部分 enabled
为 1。
yum install crash
yum install yum-utils
wget -O /etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-Debug-7 http://vault.centos.org/RPM-GPG-KEY-CentOS-Debug-7
debuginfo-install kernel
注意: kernel-debuginfo 在国内无镜像站,可以提前从 http://debuginfo.centos.org/7/x86_64/ 下载 rpm 包,使用以下替代方式安装:
yum install kernel-debuginfo-3.10.0-327.el7.x86_64.rpm kernel-debuginfo-common-x86_64-3.10.0-123.el7.x86_64.rpm
使用 crash
分析 kernel crash 的原因:
crash /usr/lib/debug/lib/modules/3.10.0-327.el7.x86_64/vmlinux vmcore
crash 7.1.2-3.el7_2.1
Copyright (C) 2002-2014 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...
KERNEL: /usr/lib/debug/lib/modules/3.10.0-327.el7.x86_64/vmlinux
DUMPFILE: vmcore [PARTIAL DUMP]
CPUS: 40
DATE: Mon Aug 8 15:02:49 2016
UPTIME: 45 days, 03:42:44
LOAD AVERAGE: 0.63, 0.43, 0.25
TASKS: 934
NODENAME: server-64
RELEASE: 3.10.0-327.el7.x86_64
VERSION: #1 SMP Thu Nov 19 22:10:57 UTC 2015
MACHINE: x86_64 (2299 Mhz)
MEMORY: 63.8 GB
PANIC: "SysRq : Trigger a crash"
PID: 122956
COMMAND: "bash"
TASK: ffff88084e667300 [THREAD_INFO: ffff88078f2a0000]
CPU: 14
STATE: TASK_RUNNING (SYSRQ)
crash>