Documentation/process/debugging/userspace_debugging_guide.rst

   1 .. SPDX-License-Identifier: GPL-2.0
   2
   3 ==========================
   4 Userspace debugging advice
   5 ==========================
   6
   7 This document provides a brief overview of common tools to debug the Linux
   8 Kernel from userspace.
   9 For debugging advice aimed at driver developers go :doc:`here
  10 </process/debugging/driver_development_debugging_guide>`.
  11 For general debugging advice, see :doc:`general advice document
  12 </process/debugging/index>`.
  13
  14 .. contents::
  15     :depth: 3
  16
  17 The following sections show you the available tools.
  18
  19 Dynamic debug
  20 -------------
  21
  22 Mechanism to filter what ends up in the kernel log by dis-/en-abling log
  23 messages.
  24
  25 Prerequisite: ``CONFIG_DYNAMIC_DEBUG``
  26
  27 Dynamic debug is only able to target:
  28
  29 - pr_debug()
  30 - dev_dbg()
  31 - print_hex_dump_debug()
  32 - print_hex_dump_bytes()
  33
  34 Therefore the usability of this tool is, as of now, quite limited as there is
  35 no uniform rule for adding debug prints to the codebase, resulting in a variety
  36 of ways these prints are implemented.
  37
  38 Also, note that most debug statements are implemented as a variation of
  39 dprintk(), which have to be activated via a parameter in respective module,
  40 dynamic debug is unable to do that step for you.
  41
  42 Here is one example, that enables all available pr_debug()'s within the file::
  43
  44   $ alias ddcmd='echo $* > /proc/dynamic_debug/control'
  45   $ ddcmd '-p; file v4l2-h264.c +p'
  46   $ grep =p /proc/dynamic_debug/control
  47    drivers/media/v4l2-core/v4l2-h264.c:372 [v4l2_h264]print_ref_list_b =p
  48    "ref_pic_list_b%u (cur_poc %u%c) %s"
  49    drivers/media/v4l2-core/v4l2-h264.c:333 [v4l2_h264]print_ref_list_p =p
  50    "ref_pic_list_p (cur_poc %u%c) %s\n"
  51
  52 **When should you use this over Ftrace ?**
  53
  54 - When the code contains one of the valid print statements (see above) or when
  55   you have added multiple pr_debug() statements during development
  56 - When timing is not an issue, meaning if multiple pr_debug() statements in
  57   the code won't cause delays
  58 - When you care more about receiving specific log messages than tracing the
  59   pattern of how a function is called
  60
  61 For the full documentation see :doc:`/admin-guide/dynamic-debug-howto`
  62
  63 Ftrace
  64 ------
  65
  66 Prerequisite: ``CONFIG_DYNAMIC_FTRACE``
  67
  68 This tool uses the tracefs file system for the control files and output files.
  69 That file system will be mounted as a ``tracing`` directory, which can be found
  70 in either ``/sys/kernel/`` or ``/sys/debug/kernel/``.
  71
  72 Some of the most important operations for debugging are:
  73
  74 - You can perform a function trace by adding a function name to the
  75   ``set_ftrace_filter`` file (which accepts any function name found within the
  76   ``available_filter_functions`` file) or you can specifically disable certain
  77   functions by adding their names to the ``set_ftrace_notrace`` file (more info
  78   at: :ref:`trace/ftrace:dynamic ftrace`).
  79 - In order to find out where calls originate from you can activate the
  80   ``func_stack_trace`` option under ``options/func_stack_trace``.
  81 - Tracing the children of a function call and showing the return values are
  82   possible by adding the desired function in the ``set_graph_function`` file
  83   (requires config ``FUNCTION_GRAPH_RETVAL``); more info at
  84   :ref:`trace/ftrace:dynamic ftrace with the function graph tracer`.
  85
  86 For the full Ftrace documentation see :doc:`/trace/ftrace`
  87
  88 Or you could also trace for specific events by :ref:`using event tracing
  89 <trace/events:2. using event tracing>`, which can be defined as described here:
  90 :ref:`Creating a custom Ftrace tracepoint
  91 <process/debugging/driver_development_debugging_guide:ftrace>`.
  92
  93 For the full Ftrace event tracing documentation see :doc:`/trace/events`
  94
  95 .. _read_ftrace_log:
  96
  97 Reading the ftrace log
  98 ~~~~~~~~~~~~~~~~~~~~~~
  99
 100 The ``trace`` file can be read just like any other file (``cat``, ``tail``,
 101 ``head``, ``vim``, etc.), the size of the file is limited by the
 102 ``buffer_size_kb`` (``echo 1000 > buffer_size_kb``). The
 103 :ref:`trace/ftrace:trace_pipe` will behave similarly to the ``trace`` file, but
 104 whenever you read from the file the content is consumed.
 105
 106 Kernelshark
 107 ~~~~~~~~~~~
 108
 109 A GUI interface to visualize the traces as a graph and list view from the
 110 output of the `trace-cmd
 111 <https://git.kernel.org/pub/scm/utils/trace-cmd/trace-cmd.git/>`__ application.
 112
 113 For the full documentation see `<https://kernelshark.org/Documentation.html>`__
 114
 115 Perf & alternatives
 116 -------------------
 117
 118 The tools mentioned above provide ways to inspect kernel code, results,
 119 variable values, etc. Sometimes you have to find out first where to look and
 120 for those cases, a box of performance tracking tools can help you to frame the
 121 issue.
 122
 123 Why should you do a performance analysis?
 124 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 125
 126 A performance analysis is a good first step when among other reasons:
 127
 128 - you cannot define the issue
 129 - you do not know where it occurs
 130 - the running system should not be interrupted or it is a remote system, where
 131   you cannot install a new module/kernel
 132
 133 How to do a simple analysis with linux tools?
 134 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 135
 136 For the start of a performance analysis, you can start with the usual tools
 137 like:
 138
 139 - ``top`` / ``htop`` / ``atop`` (*get an overview of the system load, see
 140   spikes on specific processes*)
 141 - ``mpstat -P ALL`` (*look at the load distribution among CPUs*)
 142 - ``iostat -x`` (*observe input and output devices utilization and performance*)
 143 - ``vmstat`` (*overview of memory usage on the system*)
 144 - ``pidstat`` (*similar to* ``vmstat`` *but per process, to dial it down to the
 145   target*)
 146 - ``strace -tp $PID`` (*once you know the process, you can figure out how it
 147   communicates with the Kernel*)
 148
 149 These should help to narrow down the areas to look at sufficiently.
 150
 151 Diving deeper with perf
 152 ~~~~~~~~~~~~~~~~~~~~~~~
 153
 154 The **perf** tool provides a series of metrics and events to further dial down
 155 on issues.
 156
 157 Prerequisite: build or install perf on your system
 158
 159 Gather statistics data for finding all files starting with ``gcc`` in ``/usr``::
 160
 161   # perf stat -d find /usr -name 'gcc*' | wc -l
 162
 163    Performance counter stats for 'find /usr -name gcc*':
 164
 165      1277.81 msec    task-clock             #    0.997 CPUs utilized
 166      9               context-switches       #    7.043 /sec
 167      1               cpu-migrations         #    0.783 /sec
 168      704             page-faults            #  550.943 /sec
 169      766548897       cycles                 #    0.600 GHz                         (97.15%)
 170      798285467       instructions           #    1.04  insn per cycle              (97.15%)
 171      57582731        branches               #   45.064 M/sec                       (2.85%)
 172      3842573         branch-misses          #    6.67% of all branches             (97.15%)
 173      281616097       L1-dcache-loads        #  220.390 M/sec                       (97.15%)
 174      4220975         L1-dcache-load-misses  #    1.50% of all L1-dcache accesses   (97.15%)
 175      <not supported> LLC-loads
 176      <not supported> LLC-load-misses
 177
 178    1.281746009 seconds time elapsed
 179
 180    0.508796000 seconds user
 181    0.773209000 seconds sys
 182
 183
 184   52
 185
 186 The availability of events and metrics depends on the system you are running.
 187
 188 For the full documentation see
 189 `<https://perf.wiki.kernel.org/index.php/Main_Page>`__
 190
 191 Perfetto
 192 ~~~~~~~~
 193
 194 A set of tools to measure and analyze how well applications and systems perform.
 195 You can use it to:
 196
 197 * identify bottlenecks
 198 * optimize code
 199 * make software run faster and more efficiently.
 200
 201 **What is the difference between perfetto and perf?**
 202
 203 * perf is tool as part of and specialized for the Linux Kernel and has CLI user
 204   interface.
 205 * perfetto cross-platform performance analysis stack, has extended
 206   functionality into userspace and provides a WEB user interface.
 207
 208 For the full documentation see `<https://perfetto.dev/docs/>`__
 209
 210 Kernel panic analysis tools
 211 ---------------------------
 212
 213   To capture the crash dump please use ``Kdump`` & ``Kexec``. Below you can find
 214   some advice for analysing the data.
 215
 216   For the full documentation see the :doc:`/admin-guide/kdump/kdump`
 217
 218   In order to find the corresponding line in the code you can use `faddr2line
 219   <https://elixir.bootlin.com/linux/v6.11.6/source/scripts/faddr2line>`__; note
 220   that you need to enable ``CONFIG_DEBUG_INFO`` for that to work.
 221
 222   An alternative to using ``faddr2line`` is the use of ``objdump`` (and its
 223   derivatives for the different platforms like ``aarch64-linux-gnu-objdump``).
 224   Take this line as an example:
 225
 226   ``[  +0.000240]  rkvdec_device_run+0x50/0x138 [rockchip_vdec]``.
 227
 228   We can find the corresponding line of code by executing::
 229
 230     aarch64-linux-gnu-objdump -dS drivers/staging/media/rkvdec/rockchip-vdec.ko | grep rkvdec_device_run\>: -A 40
 231     0000000000000ac8 <rkvdec_device_run>:
 232      ac8:       d503201f        nop
 233      acc:       d503201f        nop
 234     {
 235      ad0:       d503233f        paciasp
 236      ad4:       a9bd7bfd        stp     x29, x30, [sp, #-48]!
 237      ad8:       910003fd        mov     x29, sp
 238      adc:       a90153f3        stp     x19, x20, [sp, #16]
 239      ae0:       a9025bf5        stp     x21, x22, [sp, #32]
 240         const struct rkvdec_coded_fmt_desc *desc = ctx->coded_fmt_desc;
 241      ae4:       f9411814        ldr     x20, [x0, #560]
 242         struct rkvdec_dev *rkvdec = ctx->dev;
 243      ae8:       f9418015        ldr     x21, [x0, #768]
 244         if (WARN_ON(!desc))
 245      aec:       b4000654        cbz     x20, bb4 <rkvdec_device_run+0xec>
 246         ret = pm_runtime_resume_and_get(rkvdec->dev);
 247      af0:       f943d2b6        ldr     x22, [x21, #1952]
 248         ret = __pm_runtime_resume(dev, RPM_GET_PUT);
 249      af4:       aa0003f3        mov     x19, x0
 250      af8:       52800081        mov     w1, #0x4                        // #4
 251      afc:       aa1603e0        mov     x0, x22
 252      b00:       94000000        bl      0 <__pm_runtime_resume>
 253         if (ret < 0) {
 254      b04:       37f80340        tbnz    w0, #31, b6c <rkvdec_device_run+0xa4>
 255         dev_warn(rkvdec->dev, "Not good\n");
 256      b08:       f943d2a0        ldr     x0, [x21, #1952]
 257      b0c:       90000001        adrp    x1, 0 <rkvdec_try_ctrl-0x8>
 258      b10:       91000021        add     x1, x1, #0x0
 259      b14:       94000000        bl      0 <_dev_warn>
 260         *bad = 1;
 261      b18:       d2800001        mov     x1, #0x0                        // #0
 262      ...
 263
 264   Meaning, in this line from the crash dump::
 265
 266     [  +0.000240]  rkvdec_device_run+0x50/0x138 [rockchip_vdec]
 267
 268   I can take the ``0x50`` as offset, which I have to add to the base address
 269   of the corresponding function, which I find in this line::
 270
 271     0000000000000ac8 <rkvdec_device_run>:
 272
 273   The result of ``0xac8 + 0x50 = 0xb18``
 274   And when I search for that address within the function I get the
 275   following line::
 276
 277     *bad = 1;
 278     b18:      d2800001        mov     x1, #0x0
 279
 280 **Copyright** ©2024 : Collabora