1. 12 Jan, 2018 7 commits
  2. 11 Jan, 2018 4 commits
    • Chris Wilson's avatar
      drm/i915: Don't adjust priority on an already signaled fence · 5005c851
      Chris Wilson authored
      When we retire a signaled fence, we free the dependency tree. However,
      we skip clearing the list so that if we then try to adjust the priority
      of the signaled fence, we may walk the list of freed dependencies.
      
      [ 3083.156757] ==================================================================
      [ 3083.156806] BUG: KASAN: use-after-free in execlists_schedule+0x199/0x660 [i915]
      [ 3083.156810] Read of size 8 at addr ffff8806bf20f400 by task Xorg/831
      
      [ 3083.156815] CPU: 0 PID: 831 Comm: Xorg Not tainted 4.15.0-rc6-no-psn+ #1
      [ 3083.156817] Hardware name: Notebook                         N24_25BU/N24_25BU, BIOS 5.12 02/17/2017
      [ 3083.156818] Call Trace:
      [ 3083.156823]  dump_stack+0x5c/0x7a
      [ 3083.156827]  print_address_description+0x6b/0x290
      [ 3083.156830]  kasan_report+0x28f/0x380
      [ 3083.156872]  ? execlists_schedule+0x199/0x660 [i915]
      [ 3083.156914]  execlists_schedule+0x199/0x660 [i915]
      [ 3083.156956]  ? intel_crtc_atomic_check+0x146/0x4e0 [i915]
      [ 3083.156997]  ? execlists_submit_request+0xe0/0xe0 [i915]
      [ 3083.157038]  ? i915_vma_misplaced.part.4+0x25/0xb0 [i915]
      [ 3083.157079]  ? __i915_vma_do_pin+0x7c8/0xc80 [i915]
      [ 3083.157121]  ? intel_atomic_state_alloc+0x44/0x60 [i915]
      [ 3083.157130]  ? drm_atomic_helper_page_flip+0x3e/0xb0 [drm_kms_helper]
      [ 3083.157145]  ? drm_mode_page_flip_ioctl+0x7d2/0x850 [drm]
      [ 3083.157159]  ? drm_ioctl_kernel+0xa7/0xf0 [drm]
      [ 3083.157172]  ? drm_ioctl+0x45b/0x560 [drm]
      [ 3083.157211]  i915_gem_object_wait_priority+0x14c/0x2c0 [i915]
      [ 3083.157251]  ? i915_gem_get_aperture_ioctl+0x150/0x150 [i915]
      [ 3083.157290]  ? i915_vma_pin_fence+0x1d8/0x320 [i915]
      [ 3083.157331]  ? intel_pin_and_fence_fb_obj+0x175/0x250 [i915]
      [ 3083.157372]  ? intel_rotation_info_size+0x60/0x60 [i915]
      [ 3083.157413]  ? intel_link_compute_m_n+0x80/0x80 [i915]
      [ 3083.157428]  ? drm_dev_printk+0x1b0/0x1b0 [drm]
      [ 3083.157443]  ? drm_dev_printk+0x1b0/0x1b0 [drm]
      [ 3083.157485]  intel_prepare_plane_fb+0x2f8/0x5a0 [i915]
      [ 3083.157527]  ? intel_crtc_get_vblank_counter+0x80/0x80 [i915]
      [ 3083.157536]  drm_atomic_helper_prepare_planes+0xa0/0x1c0 [drm_kms_helper]
      [ 3083.157587]  intel_atomic_commit+0x12e/0x4e0 [i915]
      [ 3083.157605]  drm_atomic_helper_page_flip+0xa2/0xb0 [drm_kms_helper]
      [ 3083.157621]  drm_mode_page_flip_ioctl+0x7d2/0x850 [drm]
      [ 3083.157638]  ? drm_mode_cursor2_ioctl+0x10/0x10 [drm]
      [ 3083.157652]  ? drm_lease_owner+0x1a/0x30 [drm]
      [ 3083.157668]  ? drm_mode_cursor2_ioctl+0x10/0x10 [drm]
      [ 3083.157681]  drm_ioctl_kernel+0xa7/0xf0 [drm]
      [ 3083.157696]  drm_ioctl+0x45b/0x560 [drm]
      [ 3083.157711]  ? drm_mode_cursor2_ioctl+0x10/0x10 [drm]
      [ 3083.157725]  ? drm_getstats+0x20/0x20 [drm]
      [ 3083.157729]  ? timerqueue_del+0x49/0x80
      [ 3083.157732]  ? __remove_hrtimer+0x62/0xb0
      [ 3083.157735]  ? hrtimer_try_to_cancel+0x173/0x210
      [ 3083.157738]  do_vfs_ioctl+0x13b/0x880
      [ 3083.157741]  ? ioctl_preallocate+0x140/0x140
      [ 3083.157744]  ? _raw_spin_unlock_irq+0xe/0x30
      [ 3083.157746]  ? do_setitimer+0x234/0x370
      [ 3083.157750]  ? SyS_setitimer+0x19e/0x1b0
      [ 3083.157752]  ? SyS_alarm+0x140/0x140
      [ 3083.157755]  ? __rcu_read_unlock+0x66/0x80
      [ 3083.157757]  ? __fget+0xc4/0x100
      [ 3083.157760]  SyS_ioctl+0x74/0x80
      [ 3083.157763]  entry_SYSCALL_64_fastpath+0x1a/0x7d
      [ 3083.157765] RIP: 0033:0x7f6135d0c6a7
      [ 3083.157767] RSP: 002b:00007fff01451888 EFLAGS: 00003246 ORIG_RAX: 0000000000000010
      [ 3083.157769] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007f6135d0c6a7
      [ 3083.157771] RDX: 00007fff01451950 RSI: 00000000c01864b0 RDI: 000000000000000c
      [ 3083.157772] RBP: 00007f613076f600 R08: 0000000000000001 R09: 0000000000000000
      [ 3083.157773] R10: 0000000000000060 R11: 0000000000003246 R12: 0000000000000000
      [ 3083.157774] R13: 0000000000000060 R14: 000000000000001b R15: 0000000000000060
      
      [ 3083.157779] Allocated by task 831:
      [ 3083.157783]  kmem_cache_alloc+0xc0/0x200
      [ 3083.157822]  i915_gem_request_await_dma_fence+0x2c4/0x5d0 [i915]
      [ 3083.157861]  i915_gem_request_await_object+0x321/0x370 [i915]
      [ 3083.157900]  i915_gem_do_execbuffer+0x1165/0x19c0 [i915]
      [ 3083.157937]  i915_gem_execbuffer2+0x1ad/0x550 [i915]
      [ 3083.157950]  drm_ioctl_kernel+0xa7/0xf0 [drm]
      [ 3083.157962]  drm_ioctl+0x45b/0x560 [drm]
      [ 3083.157964]  do_vfs_ioctl+0x13b/0x880
      [ 3083.157966]  SyS_ioctl+0x74/0x80
      [ 3083.157968]  entry_SYSCALL_64_fastpath+0x1a/0x7d
      
      [ 3083.157971] Freed by task 831:
      [ 3083.157973]  kmem_cache_free+0x77/0x220
      [ 3083.158012]  i915_gem_request_retire+0x72c/0xa70 [i915]
      [ 3083.158051]  i915_gem_request_alloc+0x1e9/0x8b0 [i915]
      [ 3083.158089]  i915_gem_do_execbuffer+0xa96/0x19c0 [i915]
      [ 3083.158127]  i915_gem_execbuffer2+0x1ad/0x550 [i915]
      [ 3083.158140]  drm_ioctl_kernel+0xa7/0xf0 [drm]
      [ 3083.158153]  drm_ioctl+0x45b/0x560 [drm]
      [ 3083.158155]  do_vfs_ioctl+0x13b/0x880
      [ 3083.158156]  SyS_ioctl+0x74/0x80
      [ 3083.158158]  entry_SYSCALL_64_fastpath+0x1a/0x7d
      
      [ 3083.158162] The buggy address belongs to the object at ffff8806bf20f400
                      which belongs to the cache i915_dependency of size 64
      [ 3083.158166] The buggy address is located 0 bytes inside of
                      64-byte region [ffff8806bf20f400, ffff8806bf20f440)
      [ 3083.158168] The buggy address belongs to the page:
      [ 3083.158171] page:00000000d43decc4 count:1 mapcount:0 mapping:          (null) index:0x0
      [ 3083.158174] flags: 0x17ffe0000000100(slab)
      [ 3083.158179] raw: 017ffe0000000100 0000000000000000 0000000000000000 0000000180200020
      [ 3083.158182] raw: ffffea001afc16c0 0000000500000005 ffff880731b881c0 0000000000000000
      [ 3083.158184] page dumped because: kasan: bad access detected
      
      [ 3083.158187] Memory state around the buggy address:
      [ 3083.158190]  ffff8806bf20f300: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
      [ 3083.158192]  ffff8806bf20f380: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
      [ 3083.158195] >ffff8806bf20f400: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
      [ 3083.158196]                    ^
      [ 3083.158199]  ffff8806bf20f480: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
      [ 3083.158201]  ffff8806bf20f500: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
      [ 3083.158203] ==================================================================
      Reported-by: default avatarAlexandru Chirvasitu <achirvasub@gmail.com>
      Reported-by: default avatarMike Keehan <mike@keehan.net>
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104436
      Fixes: 1f181225 ("drm/i915/execlists: Keep request->priority for its lifetime")
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Alexandru Chirvasitu <achirvasub@gmail.com>
      Cc: Michał Winiarski <michal.winiarski@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Tested-by: default avatarAlexandru Chirvasitu <achirvasub@gmail.com>
      Reviewed-by: default avatarMichał Winiarski <michal.winiarski@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180106105618.13532-1-chris@chris-wilson.co.uk
      (cherry picked from commit c218ee03)
      Signed-off-by: default avatarJani Nikula <jani.nikula@intel.com>
      5005c851
    • Kenneth Graunke's avatar
      drm/i915: Whitelist SLICE_COMMON_ECO_CHICKEN1 on Geminilake. · 4636bda8
      Kenneth Graunke authored
      Geminilake requires the 3D driver to select whether barriers are
      intended for compute shaders, or tessellation control shaders, by
      whacking a "Barrier Mode" bit in SLICE_COMMON_ECO_CHICKEN1 when
      switching pipelines.  Failure to do this properly can result in GPU
      hangs.
      
      Unfortunately, this means it needs to switch mid-batch, so only
      userspace can properly set it.  To facilitate this, the kernel needs
      to whitelist the register.
      
      The workarounds page currently tags this as applying to Broxton only,
      but that doesn't make sense.  The documentation for the register it
      references says the bit userspace is supposed to toggle only exists on
      Geminilake.  Empirically, the Mesa patch to toggle this bit appears to
      fix intermittent GPU hangs in tessellation control shader barrier tests
      on Geminilake; we haven't seen those hangs on Broxton.
      
      v2: Mention WA #0862 in the comment (it doesn't have a name).
      Signed-off-by: default avatarKenneth Graunke <kenneth@whitecape.org>
      Acked-by: default avatarRodrigo Vivi <rodrigo.vivi@intel.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarRodrigo Vivi <rodrigo.vivi@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180105085905.9298-1-kenneth@whitecape.org
      (cherry picked from commit ab062639)
      Signed-off-by: default avatarJani Nikula <jani.nikula@intel.com>
      4636bda8
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · cbd0a6a2
      Linus Torvalds authored
      Pull vfs regression fix from Al Viro/
      
      Fix a leak in socket() introduced by commit 8e1611e2 ("make
      sock_alloc_file() do sock_release() on failures").
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        Fix a leak in socket(2) when we fail to allocate a file descriptor.
      cbd0a6a2
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 64fce444
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) BPF speculation prevention and BPF_JIT_ALWAYS_ON, from Alexei
          Starovoitov.
      
       2) Revert dev_get_random_name() changes as adjust the error code
          returns seen by userspace definitely breaks stuff.
      
       3) Fix TX DMA map/unmap on older iwlwifi devices, from Emmanuel
          Grumbach.
      
       4) From wrong AF family when requesting sock diag modules, from Andrii
          Vladyka.
      
       5) Don't add new ipv6 routes attached to the null_entry, from Wei Wang.
      
       6) Some SCTP sockopt length fixes from Marcelo Ricardo Leitner.
      
       7) Don't leak when removing VLAN ID 0, from Cong Wang.
      
       8) Hey there's a potential leak in ipv6_make_skb() too, from Eric
          Dumazet.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (27 commits)
        ipv6: sr: fix TLVs not being copied using setsockopt
        ipv6: fix possible mem leaks in ipv6_make_skb()
        mlxsw: spectrum_qdisc: Don't use variable array in mlxsw_sp_tclass_congestion_enable
        mlxsw: pci: Wait after reset before accessing HW
        nfp: always unmask aux interrupts at init
        8021q: fix a memory leak for VLAN 0 device
        of_mdio: avoid MDIO bus removal when a PHY is missing
        caif_usb: use strlcpy() instead of strncpy()
        doc: clarification about setting SO_ZEROCOPY
        net: gianfar_ptp: move set_fipers() to spinlock protecting area
        sctp: make use of pre-calculated len
        sctp: add a ceiling to optlen in some sockopts
        sctp: GFP_ATOMIC is not needed in sctp_setsockopt_events
        bpf: introduce BPF_JIT_ALWAYS_ON config
        bpf: avoid false sharing of map refcount with max_entries
        ipv6: remove null_entry before adding default route
        SolutionEngine771x: add Ether TSU resource
        SolutionEngine771x: fix Ether platform data
        docs-rst: networking: wire up msg_zerocopy
        net: ipv4: emulate READ_ONCE() on ->hdrincl bit-field in raw_sendmsg()
        ...
      64fce444
  3. 10 Jan, 2018 22 commits
  4. 09 Jan, 2018 7 commits
    • Linus Torvalds's avatar
      Merge tag 'riscv-for-linus-4.15-rc8_cleanups' of... · cf1fb158
      Linus Torvalds authored
      Merge tag 'riscv-for-linus-4.15-rc8_cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux
      
      Pull RISC-V updates from Palmer Dabbelt:
       "This contains what I hope are the last RISC-V changes to go into 4.15.
        I know it's a bit last minute, but I think they're all fairly small
        changes:
      
         - SR_* constants have been renamed to match the latest ISA
           specification.
      
         - Some CONFIG_MMU #ifdef cruft has been removed. We've never
           supported !CONFIG_MMU.
      
         - __NR_riscv_flush_icache is now visible to userspace. We were hoping
           to avoid making this public in order to force userspace to call the
           vDSO entry, but it looks like QEMU's user-mode emulation doesn't
           want to emulate a vDSO. In order to allow glibc to fall back to a
           system call when the vDSO entry doesn't exist we're just
      
         - Our defconfig is no long empty. This is another one that just
           slipped through the cracks. The defconfig isn't perfect, but it's
           at least close to what users will want for the first RISC-V
           development board. Getting closer is kind of splitting hairs here:
           none of the RISC-V specific drivers are in yet, so it's not like
           things will boot out of the box.
      
        The only one that's strictly necessary is the __NR_riscv_flush_icache
        change, as I want that to be part of the public API starting from our
        first kernel so nobody has to worry about it. The others are nice to
        haves, but they seem sane for 4.15 to me"
      
      * tag 'riscv-for-linus-4.15-rc8_cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux:
        riscv: rename SR_* constants to match the spec
        riscv: remove CONFIG_MMU ifdefs
        RISC-V: Make __NR_riscv_flush_icache visible to userspace
        RISC-V: Add a basic defconfig
      cf1fb158
    • Linus Torvalds's avatar
      Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus · 44cae9b2
      Linus Torvalds authored
      Pull MIPS fixes from Ralf Baechle:
       "Another round of MIPS fixes for 4.15.
      
         - Maciej Rozycki found another series of FP issues which requires a
           seven part series to restructure and fix.
      
         - James fixes a warning about .set mt which gas doesn't like when
           building for R1 processors"
      
      * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
        MIPS: Validate PR_SET_FP_MODE prctl(2) requests against the ABI of the task
        MIPS: Disallow outsized PTRACE_SETREGSET NT_PRFPREG regset accesses
        MIPS: Also verify sizeof `elf_fpreg_t' with PTRACE_SETREGSET
        MIPS: Fix an FCSR access API regression with NT_PRFPREG and MSA
        MIPS: Consistently handle buffer counter with PTRACE_SETREGSET
        MIPS: Guard against any partial write attempt with PTRACE_SETREGSET
        MIPS: Factor out NT_PRFPREG regset access helpers
        MIPS: CPS: Fix r1 .set mt assembler warning
      44cae9b2
    • Alexei Starovoitov's avatar
      bpf: introduce BPF_JIT_ALWAYS_ON config · 290af866
      Alexei Starovoitov authored
      The BPF interpreter has been used as part of the spectre 2 attack CVE-2017-5715.
      
      A quote from goolge project zero blog:
      "At this point, it would normally be necessary to locate gadgets in
      the host kernel code that can be used to actually leak data by reading
      from an attacker-controlled location, shifting and masking the result
      appropriately and then using the result of that as offset to an
      attacker-controlled address for a load. But piecing gadgets together
      and figuring out which ones work in a speculation context seems annoying.
      So instead, we decided to use the eBPF interpreter, which is built into
      the host kernel - while there is no legitimate way to invoke it from inside
      a VM, the presence of the code in the host kernel's text section is sufficient
      to make it usable for the attack, just like with ordinary ROP gadgets."
      
      To make attacker job harder introduce BPF_JIT_ALWAYS_ON config
      option that removes interpreter from the kernel in favor of JIT-only mode.
      So far eBPF JIT is supported by:
      x64, arm64, arm32, sparc64, s390, powerpc64, mips64
      
      The start of JITed program is randomized and code page is marked as read-only.
      In addition "constant blinding" can be turned on with net.core.bpf_jit_harden
      
      v2->v3:
      - move __bpf_prog_ret0 under ifdef (Daniel)
      
      v1->v2:
      - fix init order, test_bpf and cBPF (Daniel's feedback)
      - fix offloaded bpf (Jakub's feedback)
      - add 'return 0' dummy in case something can invoke prog->bpf_func
      - retarget bpf tree. For bpf-next the patch would need one extra hunk.
        It will be sent when the trees are merged back to net-next
      
      Considered doing:
        int bpf_jit_enable __read_mostly = BPF_EBPF_JIT_DEFAULT;
      but it seems better to land the patch as-is and in bpf-next remove
      bpf_jit_enable global variable from all JITs, consolidate in one place
      and remove this jit_init() function.
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      290af866
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.dk/linux-block · d476c533
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "A set of fixes that should go into this release. This contains:
      
         - An NVMe pull request from Christoph, with a few critical fixes for
           NVMe.
      
         - A block drain queue fix from Ming.
      
         - The concurrent lo_open/release fix for loop"
      
      * 'for-linus' of git://git.kernel.dk/linux-block:
        loop: fix concurrent lo_open/lo_release
        block: drain queue before waiting for q_usage_counter becoming zero
        nvme-fcloop: avoid possible uninitialized variable warning
        nvme-mpath: fix last path removal during traffic
        nvme-rdma: fix concurrent reset and reconnect
        nvme: fix sector units when going between formats
        nvme-pci: move use_sgl initialization to nvme_init_iod()
      d476c533
    • Daniel Borkmann's avatar
      bpf: avoid false sharing of map refcount with max_entries · be95a845
      Daniel Borkmann authored
      In addition to commit b2157399 ("bpf: prevent out-of-bounds
      speculation") also change the layout of struct bpf_map such that
      false sharing of fast-path members like max_entries is avoided
      when the maps reference counter is altered. Therefore enforce
      them to be placed into separate cachelines.
      
      pahole dump after change:
      
        struct bpf_map {
              const struct bpf_map_ops  * ops;                 /*     0     8 */
              struct bpf_map *           inner_map_meta;       /*     8     8 */
              void *                     security;             /*    16     8 */
              enum bpf_map_type          map_type;             /*    24     4 */
              u32                        key_size;             /*    28     4 */
              u32                        value_size;           /*    32     4 */
              u32                        max_entries;          /*    36     4 */
              u32                        map_flags;            /*    40     4 */
              u32                        pages;                /*    44     4 */
              u32                        id;                   /*    48     4 */
              int                        numa_node;            /*    52     4 */
              bool                       unpriv_array;         /*    56     1 */
      
              /* XXX 7 bytes hole, try to pack */
      
              /* --- cacheline 1 boundary (64 bytes) --- */
              struct user_struct *       user;                 /*    64     8 */
              atomic_t                   refcnt;               /*    72     4 */
              atomic_t                   usercnt;              /*    76     4 */
              struct work_struct         work;                 /*    80    32 */
              char                       name[16];             /*   112    16 */
              /* --- cacheline 2 boundary (128 bytes) --- */
      
              /* size: 128, cachelines: 2, members: 17 */
              /* sum members: 121, holes: 1, sum holes: 7 */
        };
      
      Now all entries in the first cacheline are read only throughout
      the life time of the map, set up once during map creation. Overall
      struct size and number of cachelines doesn't change from the
      reordering. struct bpf_map is usually first member and embedded
      in map structs in specific map implementations, so also avoid those
      members to sit at the end where it could potentially share the
      cacheline with first map values e.g. in the array since remote
      CPUs could trigger map updates just as well for those (easily
      dirtying members like max_entries intentionally as well) while
      having subsequent values in cache.
      
      Quoting from Google's Project Zero blog [1]:
      
        Additionally, at least on the Intel machine on which this was
        tested, bouncing modified cache lines between cores is slow,
        apparently because the MESI protocol is used for cache coherence
        [8]. Changing the reference counter of an eBPF array on one
        physical CPU core causes the cache line containing the reference
        counter to be bounced over to that CPU core, making reads of the
        reference counter on all other CPU cores slow until the changed
        reference counter has been written back to memory. Because the
        length and the reference counter of an eBPF array are stored in
        the same cache line, this also means that changing the reference
        counter on one physical CPU core causes reads of the eBPF array's
        length to be slow on other physical CPU cores (intentional false
        sharing).
      
      While this doesn't 'control' the out-of-bounds speculation through
      masking the index as in commit b2157399, triggering a manipulation
      of the map's reference counter is really trivial, so lets not allow
      to easily affect max_entries from it.
      
      Splitting to separate cachelines also generally makes sense from
      a performance perspective anyway in that fast-path won't have a
      cache miss if the map gets pinned, reused in other progs, etc out
      of control path, thus also avoids unintentional false sharing.
      
        [1] https://googleprojectzero.blogspot.ch/2018/01/reading-privileged-memory-with-side.htmlSigned-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      be95a845
    • Wei Wang's avatar
      ipv6: remove null_entry before adding default route · 4512c43e
      Wei Wang authored
      In the current code, when creating a new fib6 table, tb6_root.leaf gets
      initialized to net->ipv6.ip6_null_entry.
      If a default route is being added with rt->rt6i_metric = 0xffffffff,
      fib6_add() will add this route after net->ipv6.ip6_null_entry. As
      null_entry is shared, it could cause problem.
      
      In order to fix it, set fn->leaf to NULL before calling
      fib6_add_rt2node() when trying to add the first default route.
      And reset fn->leaf to null_entry when adding fails or when deleting the
      last default route.
      
      syzkaller reported the following issue which is fixed by this commit:
      
      WARNING: suspicious RCU usage
      4.15.0-rc5+ #171 Not tainted
      -----------------------------
      net/ipv6/ip6_fib.c:1702 suspicious rcu_dereference_protected() usage!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 2, debug_locks = 1
      4 locks held by swapper/0/0:
       #0:  ((&net->ipv6.ip6_fib_timer)){+.-.}, at: [<00000000d43f631b>] lockdep_copy_map include/linux/lockdep.h:178 [inline]
       #0:  ((&net->ipv6.ip6_fib_timer)){+.-.}, at: [<00000000d43f631b>] call_timer_fn+0x1c6/0x820 kernel/time/timer.c:1310
       #1:  (&(&net->ipv6.fib6_gc_lock)->rlock){+.-.}, at: [<000000002ff9d65c>] spin_lock_bh include/linux/spinlock.h:315 [inline]
       #1:  (&(&net->ipv6.fib6_gc_lock)->rlock){+.-.}, at: [<000000002ff9d65c>] fib6_run_gc+0x9d/0x3c0 net/ipv6/ip6_fib.c:2007
       #2:  (rcu_read_lock){....}, at: [<0000000091db762d>] __fib6_clean_all+0x0/0x3a0 net/ipv6/ip6_fib.c:1560
       #3:  (&(&tb->tb6_lock)->rlock){+.-.}, at: [<000000009e503581>] spin_lock_bh include/linux/spinlock.h:315 [inline]
       #3:  (&(&tb->tb6_lock)->rlock){+.-.}, at: [<000000009e503581>] __fib6_clean_all+0x1d0/0x3a0 net/ipv6/ip6_fib.c:1948
      
      stack backtrace:
      CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.15.0-rc5+ #171
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       <IRQ>
       __dump_stack lib/dump_stack.c:17 [inline]
       dump_stack+0x194/0x257 lib/dump_stack.c:53
       lockdep_rcu_suspicious+0x123/0x170 kernel/locking/lockdep.c:4585
       fib6_del+0xcaa/0x11b0 net/ipv6/ip6_fib.c:1701
       fib6_clean_node+0x3aa/0x4f0 net/ipv6/ip6_fib.c:1892
       fib6_walk_continue+0x46c/0x8a0 net/ipv6/ip6_fib.c:1815
       fib6_walk+0x91/0xf0 net/ipv6/ip6_fib.c:1863
       fib6_clean_tree+0x1e6/0x340 net/ipv6/ip6_fib.c:1933
       __fib6_clean_all+0x1f4/0x3a0 net/ipv6/ip6_fib.c:1949
       fib6_clean_all net/ipv6/ip6_fib.c:1960 [inline]
       fib6_run_gc+0x16b/0x3c0 net/ipv6/ip6_fib.c:2016
       fib6_gc_timer_cb+0x20/0x30 net/ipv6/ip6_fib.c:2033
       call_timer_fn+0x228/0x820 kernel/time/timer.c:1320
       expire_timers kernel/time/timer.c:1357 [inline]
       __run_timers+0x7ee/0xb70 kernel/time/timer.c:1660
       run_timer_softirq+0x4c/0xb0 kernel/time/timer.c:1686
       __do_softirq+0x2d7/0xb85 kernel/softirq.c:285
       invoke_softirq kernel/softirq.c:365 [inline]
       irq_exit+0x1cc/0x200 kernel/softirq.c:405
       exiting_irq arch/x86/include/asm/apic.h:540 [inline]
       smp_apic_timer_interrupt+0x16b/0x700 arch/x86/kernel/apic/apic.c:1052
       apic_timer_interrupt+0xa9/0xb0 arch/x86/entry/entry_64.S:904
       </IRQ>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Fixes: 66f5d6ce ("ipv6: replace rwlock with rcu and spinlock in fib6_table")
      Signed-off-by: default avatarWei Wang <weiwan@google.com>
      Acked-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4512c43e
    • David S. Miller's avatar
      Merge branch 'Ether-fixes-for-the-SolutionEngine771x-boards' · 22dd8e6b
      David S. Miller authored
      Sergei Shtylyov says:
      
      ====================
      Ether fixes for the SolutionEngine771x boards
      
      Here's the series of 2 patches against Linus' repo. This series should
      (hoplefully) fix the Ether support on the SolutionEngine771x boards...
      
      [1/2] SolutionEngine771x: fix Ether platform data
      [2/2] SolutionEngine771x: add Ether TSU resource
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      22dd8e6b