• Douglas Anderson's avatar
    kgdb: Fix kgdb_roundup_cpus() for arches who used smp_call_function() · 3cd99ac3
    Douglas Anderson authored
    When I had lockdep turned on and dropped into kgdb I got a nice splat
    on my system.  Specifically it hit:
      DEBUG_LOCKS_WARN_ON(current->hardirq_context)
    
    Specifically it looked like this:
      sysrq: SysRq : DEBUG
      ------------[ cut here ]------------
      DEBUG_LOCKS_WARN_ON(current->hardirq_context)
      WARNING: CPU: 0 PID: 0 at .../kernel/locking/lockdep.c:2875 lockdep_hardirqs_on+0xf0/0x160
      CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.0 #27
      pstate: 604003c9 (nZCv DAIF +PAN -UAO)
      pc : lockdep_hardirqs_on+0xf0/0x160
      ...
      Call trace:
       lockdep_hardirqs_on+0xf0/0x160
       trace_hardirqs_on+0x188/0x1ac
       kgdb_roundup_cpus+0x14/0x3c
       kgdb_cpu_enter+0x53c/0x5cc
       kgdb_handle_exception+0x180/0x1d4
       kgdb_compiled_brk_fn+0x30/0x3c
       brk_handler+0x134/0x178
       do_debug_exception+0xfc/0x178
       el1_dbg+0x18/0x78
       kgdb_breakpoint+0x34/0x58
       sysrq_handle_dbg+0x54/0x5c
       __handle_sysrq+0x114/0x21c
       handle_sysrq+0x30/0x3c
       qcom_geni_serial_isr+0x2dc/0x30c
      ...
      ...
      irq event stamp: ...45
      hardirqs last  enabled at (...44): [...] __do_softirq+0xd8/0x4e4
      hardirqs last disabled at (...45): [...] el1_irq+0x74/0x130
      softirqs last  enabled at (...42): [...] _local_bh_enable+0x2c/0x34
      softirqs last disabled at (...43): [...] irq_exit+0xa8/0x100
      ---[ end trace adf21f830c46e638 ]---
    
    Looking closely at it, it seems like a really bad idea to be calling
    local_irq_enable() in kgdb_roundup_cpus().  If nothing else that seems
    like it could violate spinlock semantics and cause a deadlock.
    
    Instead, let's use a private csd alongside
    smp_call_function_single_async() to round up the other CPUs.  Using
    smp_call_function_single_async() doesn't require interrupts to be
    enabled so we can remove the offending bit of code.
    
    In order to avoid duplicating this across all the architectures that
    use the default kgdb_roundup_cpus(), we'll add a "weak" implementation
    to debug_core.c.
    
    Looking at all the people who previously had copies of this code,
    there were a few variants.  I've attempted to keep the variants
    working like they used to.  Specifically:
    * For arch/arc we passed NULL to kgdb_nmicallback() instead of
      get_irq_regs().
    * For arch/mips there was a bit of extra code around
      kgdb_nmicallback()
    
    NOTE: In this patch we will still get into trouble if we try to round
    up a CPU that failed to round up before.  We'll try to round it up
    again and potentially hang when we try to grab the csd lock.  That's
    not new behavior but we'll still try to do better in a future patch.
    Suggested-by: default avatarDaniel Thompson <daniel.thompson@linaro.org>
    Signed-off-by: default avatarDouglas Anderson <dianders@chromium.org>
    Cc: Vineet Gupta <vgupta@synopsys.com>
    Cc: Russell King <linux@armlinux.org.uk>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Will Deacon <will.deacon@arm.com>
    Cc: Richard Kuo <rkuo@codeaurora.org>
    Cc: Ralf Baechle <ralf@linux-mips.org>
    Cc: Paul Burton <paul.burton@mips.com>
    Cc: James Hogan <jhogan@kernel.org>
    Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    Cc: Paul Mackerras <paulus@samba.org>
    Cc: Michael Ellerman <mpe@ellerman.id.au>
    Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
    Cc: Rich Felker <dalias@libc.org>
    Cc: "David S. Miller" <davem@davemloft.net>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Borislav Petkov <bp@alien8.de>
    Cc: "H. Peter Anvin" <hpa@zytor.com>
    Acked-by: default avatarWill Deacon <will.deacon@arm.com>
    Signed-off-by: default avatarDaniel Thompson <daniel.thompson@linaro.org>
    3cd99ac3
debug_core.c 26.3 KB