[[PageOutline]]

= Linux Troubleshooting =

See also:
 * [wiki:linux/profiling linux/profiling]

== Kernel Debugging ==

[=#dmesg]
=== ring buffer ===
The kernel ring buffer is where various messages are sent by the kernel and its modules (ie debug/info/warn) as long as {{{CONFIG_PRINTK}}} is enabled. You can also enable {{{CONFIG_PRINTK_TIME}}} to enable timestamps on these messages.

Typically an application such as {{{syslog}}} will pull messages from here and write them to a log, however the {{{dmesg}}} command will display them directly from the ring buffer:
{{{#!bash
dmesg # display all messages
dmesg -c  # display and clear all messages
}}}

[=#printk]
=== printk and log levels ===
According to Linux Device Drivers, the most common debugging technique is monitoring which is done by calling {{{printf}}} at suitable points. In the kernel you accomplish the same with {{{printk}}}.

The kernel defines various log levels or severity levels which are strings that you can pre-pend to debug statements defined in ​{{{linux/include/linux/kern_levels.h}}}:
* KERN_EMERG (0) - system is unusable
* KERN_ALERT (1) - actino must be taken immediately
* KERN_CRIT (2) - critical conditions
* KERN_ERR (3) - error conditions
* KERN_WARNING (4) - warning conditions
* KERN_NOTICE (5) - normal but significant condition
* KERN_INFO (6) - informational
* KERN_DEBUG (7) - debug-level messages
* KERN_DEFAULT (d) - the default kernel loglevel

To enable various severity levels you can:
1. set {{{DEFAULT_MESSAGE_LOGLEVEL}}} in kernel at build time
2. use {{{/proc/sys/kernel/printk}}}:
{{{#!bash
$ cat /proc/sys/kernel/printk
	7       4       1       7
	current	default	minimum	boot-time-default
$ echo 8 > /proc/sys/kernel/printk  # enable highest level
}}}
3. pass the {{{debug}}} kernel commandline option

References:
* ​[http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/printk-formats.txt Documentation/printk-formats.txt]
* [http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/linux/include/linux/kern_levels.h linux/include/linux/kern_levels.h]
* [http://elinux.org/Debugging_by_printing#Log_Levels Debugging by printing]


[=#pr-debug]
=== pr-debug() and dev_dbg() ===
Instead of using {{{printk()}}} directly it is recommended to use the {{{pr_debug()}}} and {{{dev_dbg()}}} macros for the following reasons:
* they use dynamic debugging (if enabled in the kernel, otherwise those are just printk's) which avoids performance hits caused by rendering the printk's regardless of loglevel tuning
* they can display device-specific details and/or can have easily formatted prefixes

[=#dynamic]
=== dynamic debug ===
Regardless of the system log level each printk() must still being rendered in the kernel (formatted with locking buffers) which is very expensive performance-wise. Dynamic Debug (if enabled via CONFIG_DYNAMIC_DEBUG) will instead conditionally skip debugs before they are rendered. Dynamic Debug allows you to enable/disable various debug messages at runtime based on various levels of scope:
* per source file
* per function
* per module
* format string
* line number

The {{{CONFIG_DYNAMIC_DEBUG}}} option implicitly compiles in all {{{pr_debug()}}} and {{{dev_dbg()}}} calls which can enlarge the kernel text size by about 2%. Note that it does not work with {{{printk()}}} and if not enabled in the kernel all {{{pr_debug()}}} {{{dev_dbg()}}} calls simply become static printk's.

If a source file is compiled with the DEBUG flag set (which can be turned on by many CONFIG_*DEBUG* options), any pr_debug() calls in it are enabled by default, but can be disabled at runtime.

Dynamic debugging is controlled by the 'dynamic_debug/control' file in the debugfs filesystem. The following examples assume that debugfs is enabled and mounted to /sys/kernel/debug.

The format of each line of the file is: {{{filename:lineno [module]function flags format}}}:
* filename : source file of the debug statement
* lineno : line number of the debug statement
* module : module that contains the debug statement
* function : function that contains the debug statement
* flags : '+p' means the line is turned 'on' for printing
* format : the format used for the debug statement

For details and examples please see the kernel ​Documentation/dynamic-debug-howto.txt

References:
* [http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/dynamic-debug-howto.txt Documentation/dynamic-debug-howto.txt]
* [https://www.kernel.org/doc/ols/2009/ols2009-pages-39-46.pdf Dynamic Debug by Jason Baron of Red Hat (2009)]

== Early Serial Debug ==
Typically the serial drivers don't get initialized until after some of the more critical components. This can result in any kernel crash/hang not reporting anything from the serial console after {{{'Starting kernel ....'}}}

For platforms that early serial support you can enable this in the kernel config. For example on a Ventana board (all ventana boards use ttymxc1 (UART2) for serial console):
 * make menuconfig
  - Kernel hacking -> Kernel low-level debugging functions -> Kernel low-level debugging port (i.MX6Q/DL Debug UART)
  - Kernel hacking -> Kernel low-level debugging functions -> i.MX Debug UART Port Selection : 2
 * results in the following additions to .config:
{{{
CONFIG_DEBUG_LL=y
CONFIG_DEBUG_IMX6Q_UART=y
CONFIG_DEBUG_IMX_UART_PORT=2
CONFIG_DEBUG_LL_INCLUDE="debug/imx.S"
CONFIG_DEBUG_UNCOMPRESS=y
CONFIG_EARLY_PRINTK=y
CONFIG_EARLY_PRINTK_DIRECT=y
}}}
 * Finally, some kernels require a special parameter to be passed in through the bootloader.  See below for instructions:
{{{#!bash
setenv extra ${extra} earlyprintk=serial,${console},${baudrate}
saveenv
}}}

For newport, Early console is compiled into the kernel and to enable it make sure you boot with:
{{{#!bash
console=ttyAMA0,115200n8 earlycon=pl011,0x87e028000000 cma=64M
}}}

For venice, Early console is compiled into the kernel and to enable it make sure you boot with:
{{{#!bash
console=ttymxc1,115200 earlycon=ec_imx6q,0x30890000,115200
}}}

[=#sysrq]
== Linux Magic System Request Key Hacks ==
There exists a magic key combo you can hit which the kernel will respond to regardless of whatever else it is doing (unless its completely locked up).

To enable this build with CONFIG_MAGIC_SYSRQ when configuring the kernel.

This will enable a /proc/sysrq-trigger file which you can use to send various commands to. Some very useful commands for debugging are:
* 'b' - reboot immediately (without sync)
* 'c' - perform a system crash via a NULL pointer dereference (which will show a crashdump if enabled)
* 'd' - show all locks that are held
* 'e' - send a SIGTERM to all processes except for PID1 (init)
* 'f' - call oom_kill to kill a memory hog process
* 'h' - display help
* 'i' - send a SIGKILL to all processes except for PID1 (init)
* 'l' - show a stack backtrace for all active CPU's

References:
* [https://www.kernel.org/doc/Documentation/sysrq.txt sysrq.txt]