    mm: add Kernel Electric-Fence infrastructure · 0ce20dd8
    Alexander Potapenko authored
    Patch series "KFENCE: A low-overhead sampling-based memory safety error detector", v7.
    
    This adds the Kernel Electric-Fence (KFENCE) infrastructure. KFENCE is a
    low-overhead sampling-based memory safety error detector of heap
    use-after-free, invalid-free, and out-of-bounds access errors.  This
    series enables KFENCE for the x86 and arm64 architectures, and adds
    KFENCE hooks to the SLAB and SLUB allocators.
    
    KFENCE is designed to be enabled in production kernels, and has near
    zero performance overhead. Compared to KASAN, KFENCE trades precision
    for performance. The main motivation behind KFENCE's design is that,
    with enough total uptime, KFENCE will detect bugs in code paths not
    typically exercised by non-production test workloads. One way to
    quickly achieve a large enough total uptime is to deploy the tool
    across a large fleet of machines.
    
    KFENCE objects each reside on a dedicated page, at either the left or
    right page boundaries. The pages to the left and right of the object
    page are "guard pages", whose attributes are changed to a protected
    state, and cause page faults on any attempted access to them. Such page
    faults are then intercepted by KFENCE, which handles the fault
    gracefully by reporting a memory access error.
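    
    The guard-page layout can be tried out in userspace. The following is
    a minimal sketch, assuming a POSIX system with mmap() and mprotect();
    the names guarded_alloc(), guarded_free() and struct guarded_obj are
    hypothetical and are not part of this series. The kernel changes page
    attributes directly rather than calling mprotect(), and the sketch
    ignores object alignment.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical userspace analogue of one guarded object; not a KFENCE API. */
struct guarded_obj {
	void *region; /* start of the 3-page mapping */
	size_t page;  /* page size used for the layout */
	void *obj;    /* object address inside the middle page */
};

/*
 * Map [guard][object page][guard] and place an object of `size` bytes at the
 * left (right == 0) or right (right != 0) edge of the middle page.
 * Returns 0 on success, -1 on error.
 */
static int guarded_alloc(struct guarded_obj *g, size_t size, int right)
{
	long ps = sysconf(_SC_PAGESIZE);
	char *base;

	if (ps <= 0 || size == 0 || size > (size_t)ps)
		return -1;
	g->page = (size_t)ps;
	g->region = mmap(NULL, 3 * g->page, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (g->region == MAP_FAILED)
		return -1;
	base = g->region;
	/* Revoke all access to the two outer pages: any touch now faults. */
	if (mprotect(base, g->page, PROT_NONE) ||
	    mprotect(base + 2 * g->page, g->page, PROT_NONE)) {
		munmap(g->region, 3 * g->page);
		return -1;
	}
	g->obj = right ? base + 2 * g->page - size : base + g->page;
	return 0;
}

/* Release the whole mapping, guard pages included. */
static void guarded_free(struct guarded_obj *g)
{
	munmap(g->region, 3 * g->page);
}
```

    With right != 0, the byte just past the object is the first byte of
    the trailing guard page, so an overflow by a single byte faults
    immediately. With right == 0, the same holds for an underflow.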
    
    Guarded allocations are set up based on a sample interval (can be set
    via kfence.sample_interval). After expiration of the sample interval,
    the next allocation through the main allocator (SLAB or SLUB) returns a
    guarded allocation from the KFENCE object pool. At this point, the timer
    is reset, and the next allocation is set up after the expiration of the
    interval.
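    
    The sampling behaviour described above can be modelled as a gate that
    opens once per interval. This is only a sketch: struct sample_gate,
    gate_init() and gate_should_sample() are hypothetical names, and the
    sketch compares timestamps on every call for simplicity, which is not
    how the series keeps the allocator fast path cheap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical state; names are illustrative, not taken from the patch. */
struct sample_gate {
	uint64_t interval_ms;  /* analogue of kfence.sample_interval */
	uint64_t next_open_ms; /* time at which the gate opens again */
};

/* Arm the gate so that it first opens one interval after `now_ms`. */
static void gate_init(struct sample_gate *g, uint64_t interval_ms,
		      uint64_t now_ms)
{
	g->interval_ms = interval_ms;
	g->next_open_ms = now_ms + interval_ms;
}

/*
 * Called on every allocation. Returns true when this allocation should be
 * served from the guarded pool; in that case the timer is re-armed.
 */
static bool gate_should_sample(struct sample_gate *g, uint64_t now_ms)
{
	if (now_ms < g->next_open_ms)
		return false;
	g->next_open_ms = now_ms + g->interval_ms;
	return true;
}
```

    With a 100 ms interval armed at time 0, an allocation at 50 ms is not
    sampled, the first allocation at or after 100 ms is, and the gate then
    stays closed until a further 100 ms have passed.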
    
    To enable/disable a KFENCE allocation through the main allocator's
    fast-path without overhead, KFENCE relies on static branches via the
    static keys infrastructure. The static branch is toggled to redirect the
    allocation to KFENCE.
    
    The KFENCE memory pool is of fixed size, and if the pool is exhausted no
    further KFENCE allocations occur. The default config is conservative
    with only 255 objects, resulting in a pool size of 2 MiB (with 4 KiB
    pages).
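    
    The 2 MiB figure can be checked with a little arithmetic. The sketch
    below assumes that every object occupies one object page plus one guard
    page, and that one additional page pair is reserved in the pool; this
    assumption is consistent with the numbers quoted above, and
    pool_size_bytes() is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Pool size in bytes for `num_objects` guarded objects, assuming two pages
 * per object (object page + guard page) and one extra page pair.
 */
static size_t pool_size_bytes(size_t num_objects, size_t page_size)
{
	return (num_objects + 1) * 2 * page_size;
}
```

    For 255 objects and 4 KiB pages this gives (255 + 1) * 2 * 4096 =
    2097152 bytes, that is, 2 MiB.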
    
    We have verified by running synthetic benchmarks (sysbench I/O,
    hackbench) and production server-workload benchmarks that a kernel with
    KFENCE (using sample intervals 100-500ms) is performance-neutral
    compared to a non-KFENCE baseline kernel.
    
    KFENCE is inspired by GWP-ASan [1], a userspace tool with similar
    properties. The name "KFENCE" is a homage to the Electric Fence Malloc
    Debugger [2].
    
    For more details, see Documentation/dev-tools/kfence.rst added in the
    series -- also viewable here:
    
    	https://raw.githubusercontent.com/google/kasan/kfence/Documentation/dev-tools/kfence.rst
    
    [1] http://llvm.org/docs/GwpAsan.html
    [2] https://linux.die.net/man/3/efence
    
    This patch (of 9):
    
    This adds the Kernel Electric-Fence (KFENCE) infrastructure. KFENCE is a
    low-overhead sampling-based memory safety error detector of heap
    use-after-free, invalid-free, and out-of-bounds access errors.
    
    KFENCE is designed to be enabled in production kernels, and has near
    zero performance overhead. Compared to KASAN, KFENCE trades precision
    for performance. The main motivation behind KFENCE's design is that,
    with enough total uptime, KFENCE will detect bugs in code paths not
    typically exercised by non-production test workloads. One way to
    quickly achieve a large enough total uptime is to deploy the tool
    across a large fleet of machines.
    
    KFENCE objects each reside on a dedicated page, at either the left or
    right page boundaries. The pages to the left and right of the object
    page are "guard pages", whose attributes are changed to a protected
    state, and cause page faults on any attempted access to them. Such page
    faults are then intercepted by KFENCE, which handles the fault
    gracefully by reporting a memory access error. To detect out-of-bounds
    writes to memory within the object's page itself, KFENCE also uses
    pattern-based redzones. The following figure illustrates the page
    layout:
    
      ---+-----------+-----------+-----------+-----------+-----------+---
         | xxxxxxxxx | O :       | xxxxxxxxx |       : O | xxxxxxxxx |
         | xxxxxxxxx | B :       | xxxxxxxxx |       : B | xxxxxxxxx |
         | x GUARD x | J : RED-  | x GUARD x | RED-  : J | x GUARD x |
         | xxxxxxxxx | E :  ZONE | xxxxxxxxx |  ZONE : E | xxxxxxxxx |
         | xxxxxxxxx | C :       | xxxxxxxxx |       : C | xxxxxxxxx |
         | xxxxxxxxx | T :       | xxxxxxxxx |       : T | xxxxxxxxx |
      ---+-----------+-----------+-----------+-----------+-----------+---
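    
    The pattern-based redzone in the figure can be sketched as follows.
    This is an illustration under stated assumptions, not the code of this
    patch: the 4 KiB page size, the canary constant and the helper names
    canary_for(), redzone_fill() and redzone_check() are all chosen for
    the example. A check like this would typically run when the object is
    freed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size */

/* Offset-dependent canary byte; constant and mask are illustrative. */
static uint8_t canary_for(size_t offset)
{
	return (uint8_t)(0xaa ^ (offset & 0x7));
}

/* Fill every byte outside [obj_off, obj_off + size) with canaries. */
static void redzone_fill(uint8_t *page, size_t obj_off, size_t size)
{
	for (size_t i = 0; i < SKETCH_PAGE_SIZE; i++)
		if (i < obj_off || i >= obj_off + size)
			page[i] = canary_for(i);
}

/* Return the offset of the first corrupted redzone byte, or -1 if intact. */
static long redzone_check(const uint8_t *page, size_t obj_off, size_t size)
{
	for (size_t i = 0; i < SKETCH_PAGE_SIZE; i++)
		if ((i < obj_off || i >= obj_off + size) &&
		    page[i] != canary_for(i))
			return (long)i;
	return -1;
}
```

    Writes inside the object leave the redzone intact, while a write one
    byte past the object changes a canary and is reported at that offset.
    Unlike a guard-page fault, such corruption is only noticed when the
    check runs, not at the time of the write.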
    
    Guarded allocations are set up based on a sample interval (can be set
    via kfence.sample_interval). After expiration of the sample interval, a
    guarded allocation from the KFENCE object pool is returned to the main
    allocator (SLAB or SLUB). At this point, the timer is reset, and the
    next allocation is set up after the expiration of the interval.
    
    To enable/disable a KFENCE allocation through the main allocator's
    fast-path without overhead, KFENCE relies on static branches via the
    static keys infrastructure. The static branch is toggled to redirect the
    allocation to KFENCE. To date, we have verified by running synthetic
    benchmarks (sysbench I/O, hackbench) that a kernel compiled with KFENCE
    is performance-neutral compared to the non-KFENCE baseline.
    
    For more details, see Documentation/dev-tools/kfence.rst (added later in
    the series).
    
    [elver@google.com: fix parameter description for kfence_object_start()]
      Link: https://lkml.kernel.org/r/20201106092149.GA2851373@elver.google.com
    [elver@google.com: avoid stalling work queue task without allocations]
      Link: https://lkml.kernel.org/r/CADYN=9J0DQhizAGB0-jz4HOBBh+05kMBXb4c0cXMS7Qi5NAJiw@mail.gmail.com
      Link: https://lkml.kernel.org/r/20201110135320.3309507-1-elver@google.com
    [elver@google.com: fix potential deadlock due to wake_up()]
      Link: https://lkml.kernel.org/r/000000000000c0645805b7f982e4@google.com
      Link: https://lkml.kernel.org/r/20210104130749.1768991-1-elver@google.com
    [elver@google.com: add option to use KFENCE without static keys]
      Link: https://lkml.kernel.org/r/20210111091544.3287013-1-elver@google.com
    [elver@google.com: add missing copyright and description headers]
      Link: https://lkml.kernel.org/r/20210118092159.145934-1-elver@google.com
    
    Link: https://lkml.kernel.org/r/20201103175841.3495947-2-elver@google.com
    Signed-off-by: Marco Elver <elver@google.com>
    Signed-off-by: Alexander Potapenko <glider@google.com>
    Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
    Reviewed-by: SeongJae Park <sjpark@amazon.de>
    Co-developed-by: Marco Elver <elver@google.com>
    Reviewed-by: Jann Horn <jannh@google.com>
    Cc: "H. Peter Anvin" <hpa@zytor.com>
    Cc: Paul E. McKenney <paulmck@kernel.org>
    Cc: Andrey Konovalov <andreyknvl@google.com>
    Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
    Cc: Andy Lutomirski <luto@kernel.org>
    Cc: Borislav Petkov <bp@alien8.de>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Christopher Lameter <cl@linux.com>
    Cc: Dave Hansen <dave.hansen@linux.intel.com>
    Cc: David Rientjes <rientjes@google.com>
    Cc: Eric Dumazet <edumazet@google.com>
    Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Cc: Hillf Danton <hdanton@sina.com>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Jonathan Corbet <corbet@lwn.net>
    Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
    Cc: Joern Engel <joern@purestorage.com>
    Cc: Kees Cook <keescook@chromium.org>
    Cc: Mark Rutland <mark.rutland@arm.com>
    Cc: Pekka Enberg <penberg@kernel.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Cc: Will Deacon <will@kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>