224 lines
7.4 KiB
ReStructuredText
224 lines
7.4 KiB
ReStructuredText
|
.. SPDX-License-Identifier: GPL-2.0
|
||
|
|
||
|
========================================
|
||
|
Debugging advice for driver development
|
||
|
========================================
|
||
|
|
||
|
This document serves as a general starting point and lookup for debugging
|
||
|
device drivers.
|
||
|
While this guide focuses on debugging that requires re-compiling the
|
||
|
module/kernel, the :doc:`userspace debugging guide
|
||
|
</process/debugging/userspace_debugging_guide>` will guide
|
||
|
you through tools like dynamic debug, ftrace and other tools useful for
|
||
|
debugging issues and behavior.
|
||
|
For general debugging advice, see the :doc:`general advice document
|
||
|
</process/debugging/index>`.
|
||
|
|
||
|
.. contents::
|
||
|
:depth: 3
|
||
|
|
||
|
The following sections show you the available tools.
|
||
|
|
||
|
printk() & friends
|
||
|
------------------
|
||
|
|
||
|
These are derivatives of printf() with varying destinations and support for
|
||
|
being dynamically turned on or off, or lack thereof.
|
||
|
|
||
|
Simple printk()
|
||
|
~~~~~~~~~~~~~~~
|
||
|
|
||
|
The classic, can be used to great effect for quick and dirty development
|
||
|
of new modules or to extract arbitrary necessary data for troubleshooting.
|
||
|
|
||
|
Prerequisite: ``CONFIG_PRINTK`` (usually enabled by default)
|
||
|
|
||
|
**Pros**:
|
||
|
|
||
|
- No need to learn anything, simple to use
|
||
|
- Easy to modify exactly to your needs (formatting of the data (See:
|
||
|
:doc:`/core-api/printk-formats`), visibility in the log)
|
||
|
- Can cause delays in the execution of the code (beneficial to confirm whether
|
||
|
timing is a factor)
|
||
|
|
||
|
**Cons**:
|
||
|
|
||
|
- Requires rebuilding the kernel/module
|
||
|
- Can cause delays in the execution of the code (which can cause issues to be
|
||
|
not reproducible)
|
||
|
|
||
|
For the full documentation see :doc:`/core-api/printk-basics`
|
||
|
|
||
|
Trace_printk
|
||
|
~~~~~~~~~~~~
|
||
|
|
||
|
Prerequisite: ``CONFIG_DYNAMIC_FTRACE`` & ``#include <linux/ftrace.h>``
|
||
|
|
||
|
It is a tiny bit less comfortable to use than printk(), because you will have
|
||
|
to read the messages from the trace file (See: :ref:`read_ftrace_log`
|
||
|
instead of from the kernel log, but very useful when printk() adds unwanted
|
||
|
delays into the code execution, causing issues to be flaky or hidden.)
|
||
|
|
||
|
If the processing of this still causes timing issues then you can try
|
||
|
trace_puts().
|
||
|
|
||
|
For the full Documentation see trace_printk()
|
||
|
|
||
|
dev_dbg
|
||
|
~~~~~~~
|
||
|
|
||
|
Print statement, which can be targeted by
|
||
|
:ref:`process/debugging/userspace_debugging_guide:dynamic debug` that contains
|
||
|
additional information about the device used within the context.
|
||
|
|
||
|
**When is it appropriate to leave a debug print in the code?**
|
||
|
|
||
|
Permanent debug statements have to be useful for a developer to troubleshoot
|
||
|
driver misbehavior. Judging that is a bit more of an art than a science, but
|
||
|
some guidelines are in the :ref:`Coding style guidelines
|
||
|
<process/coding-style:13) printing kernel messages>`. In almost all cases the
|
||
|
debug statements shouldn't be upstreamed, as a working driver is supposed to be
|
||
|
silent.
|
||
|
|
||
|
Custom printk
|
||
|
~~~~~~~~~~~~~
|
||
|
|
||
|
Example::
|
||
|
|
||
|
#define core_dbg(fmt, arg...) do { \
|
||
|
if (core_debug) \
|
||
|
printk(KERN_DEBUG pr_fmt("core: " fmt), ## arg); \
|
||
|
} while (0)
|
||
|
|
||
|
**When should you do this?**
|
||
|
|
||
|
It is better to just use a pr_debug(), which can later be turned on/off with
|
||
|
dynamic debug. Additionally, a lot of drivers activate these prints via a
|
||
|
variable like ``core_debug`` set by a module parameter. However, Module
|
||
|
parameters `are not recommended anymore
|
||
|
<https://lore.kernel.org/all/2024032757-surcharge-grime-d3dd@gregkh>`_.
|
||
|
|
||
|
Ftrace
|
||
|
------
|
||
|
|
||
|
Creating a custom Ftrace tracepoint
|
||
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||
|
|
||
|
A tracepoint adds a hook into your code that will be called and logged when the
|
||
|
tracepoint is enabled. This can be used, for example, to trace hitting a
|
||
|
conditional branch or to dump the internal state at specific points of the code
|
||
|
flow during a debugging session.
|
||
|
|
||
|
Here is a basic description of :ref:`how to implement new tracepoints
|
||
|
<trace/tracepoints:usage>`.
|
||
|
|
||
|
For the full event tracing documentation see :doc:`/trace/events`
|
||
|
|
||
|
For the full Ftrace documentation see :doc:`/trace/ftrace`
|
||
|
|
||
|
DebugFS
|
||
|
-------
|
||
|
|
||
|
Prerequisite: ``CONFIG_DEBUG_FS` & `#include <linux/debugfs.h>``
|
||
|
|
||
|
DebugFS differs from the other approaches of debugging, as it doesn't write
|
||
|
messages to the kernel log nor add traces to the code. Instead it allows the
|
||
|
developer to handle a set of files.
|
||
|
With these files you can either store values of variables or make
|
||
|
register/memory dumps or you can make these files writable and modify
|
||
|
values/settings in the driver.
|
||
|
|
||
|
Possible use-cases among others:
|
||
|
|
||
|
- Store register values
|
||
|
- Keep track of variables
|
||
|
- Store errors
|
||
|
- Store settings
|
||
|
- Toggle a setting like debug on/off
|
||
|
- Error injection
|
||
|
|
||
|
This is especially useful, when the size of a data dump would be hard to digest
|
||
|
as part of the general kernel log (for example when dumping raw bitstream data)
|
||
|
or when you are not interested in all the values all the time, but with the
|
||
|
possibility to inspect them.
|
||
|
|
||
|
The general idea is:
|
||
|
|
||
|
- Create a directory during probe (``struct dentry *parent =
|
||
|
debugfs_create_dir("my_driver", NULL);``)
|
||
|
- Create a file (``debugfs_create_u32("my_value", 444, parent, &my_variable);``)
|
||
|
|
||
|
- In this example the file is found in
|
||
|
``/sys/kernel/debug/my_driver/my_value`` (with read permissions for
|
||
|
user/group/all)
|
||
|
- any read of the file will return the current contents of the variable
|
||
|
``my_variable``
|
||
|
|
||
|
- Clean up the directory when removing the device
|
||
|
(``debugfs_remove_recursive(parent);``)
|
||
|
|
||
|
For the full documentation see :doc:`/filesystems/debugfs`.
|
||
|
|
||
|
KASAN, UBSAN, lockdep and other error checkers
|
||
|
----------------------------------------------
|
||
|
|
||
|
KASAN (Kernel Address Sanitizer)
|
||
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||
|
|
||
|
Prerequisite: ``CONFIG_KASAN``
|
||
|
|
||
|
KASAN is a dynamic memory error detector that helps to find use-after-free and
|
||
|
out-of-bounds bugs. It uses compile-time instrumentation to check every memory
|
||
|
access.
|
||
|
|
||
|
For the full documentation see :doc:`/dev-tools/kasan`.
|
||
|
|
||
|
UBSAN (Undefined Behavior Sanitizer)
|
||
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||
|
|
||
|
Prerequisite: ``CONFIG_UBSAN``
|
||
|
|
||
|
UBSAN relies on compiler instrumentation and runtime checks to detect undefined
|
||
|
behavior. It is designed to find a variety of issues, including signed integer
|
||
|
overflow, array index out of bounds, and more.
|
||
|
|
||
|
For the full documentation see :doc:`/dev-tools/ubsan`
|
||
|
|
||
|
lockdep (Lock Dependency Validator)
|
||
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||
|
|
||
|
Prerequisite: ``CONFIG_DEBUG_LOCKDEP``
|
||
|
|
||
|
lockdep is a runtime lock dependency validator that detects potential deadlocks
|
||
|
and other locking-related issues in the kernel.
|
||
|
It tracks lock acquisitions and releases, building a dependency graph that is
|
||
|
analyzed for potential deadlocks.
|
||
|
lockdep is especially useful for validating the correctness of lock ordering in
|
||
|
the kernel.
|
||
|
|
||
|
PSI (Pressure stall information tracking)
|
||
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||
|
|
||
|
Prerequisite: ``CONFIG_PSI``
|
||
|
|
||
|
PSI is a measurement tool to identify excessive overcommits on hardware
|
||
|
resources, that can cause performance disruptions or even OOM kills.
|
||
|
|
||
|
device coredump
|
||
|
---------------
|
||
|
|
||
|
Prerequisite: ``#include <linux/devcoredump.h>``
|
||
|
|
||
|
Provides the infrastructure for a driver to provide arbitrary data to userland.
|
||
|
It is most often used in conjunction with udev or similar userland application
|
||
|
to listen for kernel uevents, which indicate that the dump is ready. Udev has
|
||
|
rules to copy that file somewhere for long-term storage and analysis, as by
|
||
|
default, the data for the dump is automatically cleaned up after 5 minutes.
|
||
|
That data is analyzed with driver-specific tools or GDB.
|
||
|
|
||
|
You can find an example implementation at:
|
||
|
`drivers/media/platform/qcom/venus/core.c
|
||
|
<https://elixir.bootlin.com/linux/v6.11.6/source/drivers/media/platform/qcom/venus/core.c#L30>`__
|
||
|
|
||
|
**Copyright** ©2024 : Collabora
|