1. 12 Feb, 2021 2 commits
    • tracing/tools: Add the latency-collector to tools directory · e23db805
      Viktor Rosendahl authored
      This is a tool that is intended to work around the fact that the
      preemptoff, irqsoff, and preemptirqsoff tracers only work in
      overwrite mode. The idea is to act randomly in such a way that we
      do not systematically lose any latencies, so that if enough testing
      is done, all latencies will be captured. If the same burst of
      latencies is repeated, then sooner or later we will have captured all
      the latencies.
      
      It also works with the wakeup_dl, wakeup_rt, and wakeup tracers.
      However, in that case it is probably not useful to use the random
      sleep functionality.
      
      The reason why it may be desirable to catch all latencies with a long
      test campaign is that for some organizations, it's necessary to test
      the kernel in the field and not practical for developers to work
      iteratively with field testers. Because of cost and project schedules
      it is not possible to start a new test campaign every time a latency
      problem has been fixed.
      
      It uses inotify to detect changes to /sys/kernel/tracing/trace.
      When a latency is detected, it will either sleep or print
      immediately, depending on a function that acts as an unfair coin
      toss.
      
      If immediate print is chosen, it means that we open
      /sys/kernel/tracing/trace and thereby cause a blackout period
      that will hide any subsequent latencies.
      
      If sleep is chosen, it means that we wait before opening
      /sys/kernel/tracing/trace, by default for 1000 ms, to see if
      there is another latency during this period. If there is, then we will
      lose the previous latency. The coin will be tossed again with a
      different probability, and we will either print the new latency, or
      possibly a subsequent one.
      
      The probability for the unfair coin toss is chosen so that there
      is equal probability to obtain any of the latencies in a burst.
      However, this requires an assumption about how many latencies there
      can be in a burst. By default, the program assumes that there are no
      more than 2 latencies in a burst, so the probabilities of immediate
      printout will be:
      
      1/2 and 1
      
      Thus, the probability of getting each of the two latencies will be 1/2.
      
      If we ever find that there is more than one latency in a series,
      meaning that we reach the probability of 1, then the table will be
      expanded to:
      
      1/3, 1/2, and 1
      
      Thus, we assume that there are no more than three latencies and that
      each has a probability of 1/3 of being captured. If the probability
      of 1 is reached in the new table, that is, if we see more than two
      closely occurring latencies, then the table will again be extended,
      and so on.
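
      The table scheme described above can be checked with a small model.
      This is only a sketch in Python, not the tool's actual code: the
      function names are hypothetical, and it assumes that a burst contains
      exactly as many latencies as the table has entries and that every
      "sleep" outcome is followed by the next latency of the burst.

```python
from fractions import Fraction


def print_probabilities(table_size):
    """Probability of printing immediately at each step of a burst.

    For table_size == 3 this gives 1/3, 1/2 and 1, as in the text above.
    """
    return [Fraction(1, table_size - k) for k in range(table_size)]


def capture_probabilities(table_size):
    """Chance that each latency of the burst is the one that gets printed."""
    result = []
    still_sleeping = Fraction(1)  # probability that no earlier toss printed
    for p in print_probabilities(table_size):
        result.append(still_sleeping * p)
        still_sleeping *= 1 - p
    return result


if __name__ == "__main__":
    for size in (2, 3, 4):
        table = ", ".join(str(p) for p in print_probabilities(size))
        capture = ", ".join(str(p) for p in capture_probabilities(size))
        print(f"size {size}: table = {table}; capture = {capture}")
```

      Under these assumptions every latency in the burst ends up with the
      same capture probability, 1 divided by the table size, which is the
      property the commit message claims.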
      
      On my systems, it seems like this scheme works fairly well, as
      long as the latencies we trace are long enough; 300 us seems to be
      enough. This userspace program receives the inotify event at the end
      of a latency, and it has until the end of the next latency
      to react, that is, to open /sys/kernel/tracing/trace. Thus,
      if we trace latencies that are >300 us, then we have at least 300 us
      to react.
      
      The minimum latency will of course not be 300 us on all systems; it
      will depend on the hardware, kernel version, workload, and
      configuration.
      
      Example usage:
      
      In one shell, give the following command:
      sudo latency-collector -rvv -t preemptirqsoff -s 2000 -a 3
      
      This will trace latencies > 2000us with the preemptirqsoff tracer,
      using random sleep with maximum verbosity, with a probability
      table initialized to a size of 3.
      
      In another shell, generate a few bursts of latencies:
      
      root@host:~# modprobe preemptirq_delay_test delay=3000 test_mode=alternate burst_size=3
      root@host:~# echo 1 > /sys/kernel/preemptirq_delay_test/trigger
      root@host:~# echo 1 > /sys/kernel/preemptirq_delay_test/trigger
      root@host:~# echo 1 > /sys/kernel/preemptirq_delay_test/trigger
      root@host:~# echo 1 > /sys/kernel/preemptirq_delay_test/trigger
      
      If all goes well, you should be getting stack traces that show
      all the different latencies, i.e. you should see all three
      functions preemptirqtest_0, preemptirqtest_1, preemptirqtest_2 in the
      stack traces.
      
      Link: https://lkml.kernel.org/r/20210212134421.172750-2-Viktor.Rosendahl@bmw.de
      Signed-off-by: Viktor Rosendahl <Viktor.Rosendahl@bmw.de>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • tracing: Make hash-ptr option default · 99e22ce7
      Steven Rostedt (VMware) authored
      Since the original behavior of the trace events is to hash the %p pointers,
      make that the default, and require developers to enable the option in
      order to have them unhashed.
      
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
  2. 11 Feb, 2021 5 commits
  3. 09 Feb, 2021 9 commits
  4. 05 Feb, 2021 1 commit
  5. 02 Feb, 2021 16 commits
  6. 29 Jan, 2021 4 commits
    • kretprobe: Avoid re-registration of the same kretprobe earlier · 0188b878
      Wang ShaoBo authored
      Our system encountered a re-init error when re-registering the same
      kretprobe, where the kretprobe_instance in rp->free_instances is
      illegally accessed after re-init.
      
      An implementation to avoid re-registration was introduced for kprobe
      earlier, but is still lacking for register_kretprobe(). We must check
      whether the kprobe has been re-registered before re-initializing the
      kretprobe; otherwise it will destroy the data structure of the
      registered kretprobe, which can lead to memory leaks, system crashes,
      and other unexpected behavior.
      
      We use check_kprobe_rereg() to check whether the kprobe has been
      re-registered before running register_kretprobe()'s body, in order to
      give a warning message and terminate the registration process.
      
      Link: https://lkml.kernel.org/r/20210128124427.2031088-1-bobo.shaobowang@huawei.com
      
      Cc: stable@vger.kernel.org
      Fixes: 1f0ab409 ("kprobes: Prevent re-registration of the same kprobe")
      [ The above commit should have been done for kretprobes too ]
      Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Acked-by: Ananth N Mavinakayanahalli <ananth@linux.ibm.com>
      Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
      Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • tracing/kprobe: Fix to support kretprobe events on unloaded modules · 97c753e6
      Masami Hiramatsu authored
      Fix kprobe_on_func_entry() to return an error code instead of false so
      that register_kretprobe() can return an appropriate error code.
      
      append_trace_kprobe() expects the kprobe registration to return -ENOENT
      when the target symbol is not found, and it then checks whether the
      target module is unloaded or not. If the target module doesn't exist, it
      defers probing the target symbol until the module is loaded.
      
      However, since register_kretprobe() returns -EINVAL instead of -ENOENT
      in that case, it always fails to put the kretprobe event on unloaded
      modules. e.g.
      
      Kprobe event:
      /sys/kernel/debug/tracing # echo p xfs:xfs_end_io >> kprobe_events
      [   16.515574] trace_kprobe: This probe might be able to register after target module is loaded. Continue.
      
      Kretprobe event: (p -> r)
      /sys/kernel/debug/tracing # echo r xfs:xfs_end_io >> kprobe_events
      sh: write error: Invalid argument
      /sys/kernel/debug/tracing # cat error_log
      [   41.122514] trace_kprobe: error: Failed to register probe event
        Command: r xfs:xfs_end_io
                   ^
      
      To fix this bug, change kprobe_on_func_entry() to detect symbol lookup
      failure and return -ENOENT in that case. Otherwise it returns -EINVAL
      or 0 (succeeded, given address is on the entry).
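
      The return-code contract described above can be illustrated with a
      small model. This is a hedged sketch in Python, not kernel code: the
      boolean parameters and the place_probe_event() helper are hypothetical
      stand-ins, and only the -ENOENT / -EINVAL / 0 distinction is taken from
      the text.

```python
import errno


def kprobe_on_func_entry(symbol_found, at_function_entry):
    """Model of the fixed return values: -ENOENT, -EINVAL or 0."""
    if not symbol_found:
        return -errno.ENOENT  # symbol lookup failed
    if not at_function_entry:
        return -errno.EINVAL  # symbol exists, but address is not the entry
    return 0


def place_probe_event(symbol_found, at_function_entry, module_loaded):
    """Hypothetical caller: defer only when the symbol is missing because
    its module has not been loaded yet."""
    ret = kprobe_on_func_entry(symbol_found, at_function_entry)
    if ret == -errno.ENOENT and not module_loaded:
        return "deferred"
    return "registered" if ret == 0 else "failed"


if __name__ == "__main__":
    print(place_probe_event(False, False, module_loaded=False))
    print(place_probe_event(True, True, module_loaded=True))
    print(place_probe_event(True, False, module_loaded=True))
```

      In this model, a missing symbol in an unloaded module leads to a
      deferred probe, while a collapsed -EINVAL for every failure would leave
      the caller unable to tell the two cases apart.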
      
      Link: https://lkml.kernel.org/r/161176187132.1067016.8118042342894378981.stgit@devnote2
      
      Cc: stable@vger.kernel.org
      Fixes: 59158ec4 ("tracing/kprobes: Check the probe on unloaded module correctly")
      Reported-by: Jianlin Lv <Jianlin.Lv@arm.com>
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • tracing: Use pause-on-trace with the latency tracers · da7f84cd
      Viktor Rosendahl authored
      Earlier, tracing was disabled when reading the trace file. This behavior
      was changed with:
      
      commit 06e0a548 ("tracing: Do not disable tracing when reading the
      trace file").
      
      This doesn't seem to work with the latency tracers.
      
      The above-mentioned commit did not only change the behavior but also added
      an option to emulate the old behavior. The idea with this patch is to
      enable this pause-on-trace option when the latency tracers are used.
      
      Link: https://lkml.kernel.org/r/20210119164344.37500-2-Viktor.Rosendahl@bmw.de
      
      Cc: stable@vger.kernel.org
      Fixes: 06e0a548 ("tracing: Do not disable tracing when reading the trace file")
      Signed-off-by: Viktor Rosendahl <Viktor.Rosendahl@bmw.de>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • fgraph: Initialize tracing_graph_pause at task creation · 7e0a9220
      Steven Rostedt (VMware) authored
      On some archs, the idle task can call into cpu_suspend(). The cpu_suspend()
      will disable or pause function graph tracing, as there are some paths in
      bringing down the CPU that can have issues with its return address being
      modified. The task_struct structure has a "tracing_graph_pause" atomic
      counter; when it is set to something other than zero, the function graph
      tracer will not modify the return address.
      
      The problem is that the tracing_graph_pause counter is initialized when the
      function graph tracer is enabled. This can corrupt the counter for the idle
      task if it is suspended in these architectures.
      
         CPU 1				CPU 2
         -----				-----
        do_idle()
          cpu_suspend()
            pause_graph_tracing()
                task_struct->tracing_graph_pause++ (0 -> 1)
      
      				start_graph_tracing()
      				  for_each_online_cpu(cpu) {
      				    ftrace_graph_init_idle_task(cpu)
      				      task_struct->tracing_graph_pause = 0 (1 -> 0)
      
            unpause_graph_tracing()
                task_struct->tracing_graph_pause-- (0 -> -1)
      
      The above should have gone from 1 to zero, and enabled function graph
      tracing again. But instead, it is set to -1, which keeps it disabled.
      
      There's no reason that the field tracing_graph_pause on the task_struct
      cannot be initialized at boot up.
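
      The interleaving in the diagram above can be replayed as plain
      arithmetic. This is a minimal sketch in Python, not kernel code: the
      function name and its flag are hypothetical, and the counter is
      modelled as an ordinary integer rather than an atomic.

```python
def replay_idle_suspend(reset_when_tracer_starts):
    """Replay the sequence from the diagram on a plain integer counter."""
    tracing_graph_pause = 0  # value set when the task is created

    tracing_graph_pause += 1  # CPU 1: pause_graph_tracing()

    if reset_when_tracer_starts:
        # CPU 2: ftrace_graph_init_idle_task() re-initializes the counter
        # while the idle task is still inside cpu_suspend().
        tracing_graph_pause = 0

    tracing_graph_pause -= 1  # CPU 1: unpause_graph_tracing()
    return tracing_graph_pause


if __name__ == "__main__":
    print("reset at tracer start:", replay_idle_suspend(True))
    print("initialized once only:", replay_idle_suspend(False))
```

      With the late reset the counter ends at -1, so tracing stays paused for
      that task; without it the counter returns to 0, which is the outcome
      the change is aiming for.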
      
      Cc: stable@vger.kernel.org
      Fixes: 380c4b14 ("tracing/function-graph-tracer: append the tracing_graph_flag")
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=211339
      Reported-by: pierre.gondois@arm.com
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
  7. 25 Jan, 2021 1 commit
  8. 24 Jan, 2021 2 commits
    • Merge tag 'sh-for-5.11' of git://git.libc.org/linux-sh · 228a65d4
      Linus Torvalds authored
      Pull arch/sh updates from Rich Felker:
       "Cleanup and warning fixes"
      
      * tag 'sh-for-5.11' of git://git.libc.org/linux-sh:
        sh/intc: Restore devm_ioremap() alignment
        sh: mach-sh03: remove duplicate include
        arch: sh: remove duplicate include
        sh: Drop ARCH_NR_GPIOS definition
        sh: Remove unused HAVE_COPY_THREAD_TLS macro
        sh: remove CONFIG_IDE from most defconfig
        sh: mm: Convert to DEFINE_SHOW_ATTRIBUTE
        sh: intc: Convert to DEFINE_SHOW_ATTRIBUTE
        arch/sh: hyphenate Non-Uniform in Kconfig prompt
        sh: dma: fix kconfig dependency for G2_DMA
    • Merge tag 'io_uring-5.11-2021-01-24' of git://git.kernel.dk/linux-block · ef7b1a0e
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
       "Still need a final cancelation fix that isn't quite done done,
        expected in the next day or two. That said, this contains:
      
         - Wakeup fix for IOPOLL requests
      
         - SQPOLL split close op handling fix
      
         - Ensure that any use of io_uring fd itself is marked as inflight
      
         - Short non-regular file read fix (Pavel)
      
         - Fix up bad false positive warning (Pavel)
      
         - SQPOLL fixes (Pavel)
      
         - In-flight removal fix (Pavel)"
      
      * tag 'io_uring-5.11-2021-01-24' of git://git.kernel.dk/linux-block:
        io_uring: account io_uring internal files as REQ_F_INFLIGHT
        io_uring: fix sleeping under spin in __io_clean_op
        io_uring: fix short read retries for non-reg files
        io_uring: fix SQPOLL IORING_OP_CLOSE cancelation state
        io_uring: fix skipping disabling sqo on exec
        io_uring: fix uring_flush in exit_files() warning
        io_uring: fix false positive sqo warning on flush
        io_uring: iopoll requests should also wake task ->in_idle state