1. 22 Aug, 2017 13 commits
  2. 21 Aug, 2017 27 commits
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc · 6470812e
      Linus Torvalds authored
      Pull sparc fixes from David Miller:
       "Just a couple small fixes, two of which have to do with gcc-7:
      
         1) Don't clobber kernel fixed registers in __multi4 libgcc helper.
      
         2) Fix a new uninitialized variable warning on sparc32 with gcc-7,
            from Thomas Petazzoni.
      
         3) Adjust pmd_t initializer on sparc32 to make gcc happy.
      
         4) If ATU isn't available, don't bark in the logs. From Tushar Dave"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
        sparc: kernel/pcic: silence gcc 7.x warning in pcibios_fixup_bus()
        sparc64: remove unnecessary log message
        sparc64: Don't clibber fixed registers in __multi4.
        mm: add pmd_t initializer __pmd() to work around a GCC bug.
      6470812e
    • Thomas Petazzoni's avatar
      sparc: kernel/pcic: silence gcc 7.x warning in pcibios_fixup_bus() · 2dc77533
      Thomas Petazzoni authored
      When building the kernel for Sparc using gcc 7.x, the build fails
      with:
      
      arch/sparc/kernel/pcic.c: In function ‘pcibios_fixup_bus’:
      arch/sparc/kernel/pcic.c:647:8: error: ‘cmd’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
          cmd |= PCI_COMMAND_IO;
              ^~
      
      The simplified code looks like this:
      
      unsigned int cmd;
      [...]
      pcic_read_config(dev->bus, dev->devfn, PCI_COMMAND, 2, &cmd);
      [...]
      cmd |= PCI_COMMAND_IO;
      
      I.e., the code assumes that pcic_read_config() will always initialize
      cmd, but that is not the case. Looking at pcic_read_config(), if
      bus->number is != 0 or if the size is not one of 1, 2 or 4, *val will
      not be initialized.
      
      As a simple fix, we initialize cmd to zero at the beginning of
      pcibios_fixup_bus.
      Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2dc77533
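The pattern behind this warning can be reproduced outside the kernel. The following is a minimal user-space sketch, not the actual pcic.c code: `read_config_sketch()`, `fixup_command_sketch()`, `PCI_COMMAND_IO_SKETCH` and the values written are hypothetical stand-ins. It only illustrates why zero-initializing `cmd` makes the later `|=` well defined when the read helper returns without writing its output.

```c
#include <stdint.h>

#define PCI_COMMAND_IO_SKETCH 0x1u   /* hypothetical stand-in for PCI_COMMAND_IO */

/* Hypothetical stand-in for pcic_read_config(): *val is written only for
 * bus 0 and sizes 1, 2 or 4, as described in the commit message. */
static int read_config_sketch(unsigned int bus_number, int size, uint32_t *val)
{
    if (bus_number != 0)
        return -1;                   /* *val is left untouched */

    switch (size) {
    case 1: *val = 0xabu;       return 0;
    case 2: *val = 0xabcdu;     return 0;
    case 4: *val = 0xabcdef01u; return 0;
    default:
        return -1;                   /* *val is left untouched */
    }
}

/* Mirrors the shape of the fix: cmd starts at zero, so the OR below never
 * reads an indeterminate value even if the read fails. */
static uint32_t fixup_command_sketch(unsigned int bus_number, int size)
{
    uint32_t cmd = 0;

    read_config_sketch(bus_number, size, &cmd);
    cmd |= PCI_COMMAND_IO_SKETCH;
    return cmd;
}
```

Checking the return value of the read would be an alternative design; the commit describes the zero initialization as the simple fix.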
    • Gao Feng's avatar
      net: sched: Add the invalid handle check in qdisc_class_find · 7d3f0cd4
      Gao Feng authored
      Add the invalid handle "0" check to avoid unnecessary search, because
      the qdisc uses the skb->priority as the handle value to look up, and
      it is "0" usually.
      Signed-off-by: Gao Feng <gfree.wind@vip.163.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7d3f0cd4
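The early-return idea can be sketched with a small user-space hash table. This is only an illustration under assumed names (`class_table`, `class_node`, `class_find()` and the `searches` counter are hypothetical and not the kernel's qdisc_class_find() implementation): a handle of 0 is rejected before any bucket is walked.

```c
#include <stddef.h>
#include <stdint.h>

#define BUCKETS 16u

struct class_node {
    uint32_t classid;
    struct class_node *next;
};

struct class_table {
    struct class_node *bucket[BUCKETS];
    unsigned long searches;          /* counts bucket walks, for illustration */
};

static void class_insert(struct class_table *t, struct class_node *n)
{
    unsigned int h = n->classid % BUCKETS;

    n->next = t->bucket[h];
    t->bucket[h] = n;
}

static struct class_node *class_find(struct class_table *t, uint32_t id)
{
    struct class_node *n;

    if (id == 0)                     /* invalid handle: skip the search */
        return NULL;

    t->searches++;
    for (n = t->bucket[id % BUCKETS]; n != NULL; n = n->next)
        if (n->classid == id)
            return n;
    return NULL;
}
```

Since the commit message notes that the lookup key is usually 0, the check avoids a hash walk in the common case.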
    • Jon Paul Maloy's avatar
      tipc: don't reset stale broadcast send link · 40501f90
      Jon Paul Maloy authored
      When the broadcast send link after 100 attempts has failed to
      transfer a packet to all peers, we consider it stale, and reset
      it. Thereafter it needs to re-synchronize with the peers, something
      currently done by just resetting and re-establishing all links to
      all peers. This has turned out to be overkill, with potentially
      unwanted consequences for the remaining cluster.
      
      A closer analysis reveals that this can be done much more simply. When
      this kind of failure happens, for reasons that may lie outside the
      TIPC protocol, it is typically only one peer which is failing to
      receive and acknowledge packets. It is hence sufficient to identify
      and reset the links only to that peer to resolve the situation, without
      having to reset the broadcast link at all. This solution entails a much
      lower risk of negative consequences for the node itself as well as for
      the overall cluster.
      
      We implement this change in this commit.
      Reviewed-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      40501f90
    • Linus Torvalds's avatar
      Merge tag 'arc-4.13-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc · 05ab303b
      Linus Torvalds authored
      Pull ARC fixes from Vineet Gupta:
      
       - PAE40 related updates
      
       - SLC errata for region ops
      
       - intc line masking by default
      
      * tag 'arc-4.13-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
        arc: Mask individual IRQ lines during core INTC init
        ARCv2: PAE40: set MSB even if !CONFIG_ARC_HAS_PAE40 but PAE exists in SoC
        ARCv2: PAE40: Explicitly set MSB counterpart of SLC region ops addresses
        ARC: dma: implement dma_unmap_page and sg variant
        ARCv2: SLC: Make sure busy bit is set properly for region ops
        ARC: [plat-sim] Include this platform unconditionally
        ARC: [plat-axs10x]: prepare dts files for enabling PAE40 on axs103
        ARC: defconfig: Cleanup from old Kconfig options
      05ab303b
    • Linus Torvalds's avatar
      Merge tag 'rtc-4.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux · 0b3baec8
      Linus Torvalds authored
      Pull RTC fix from Alexandre Belloni:
       "Fix regmap configuration for ds1307"
      
      * tag 'rtc-4.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
        rtc: ds1307: fix regmap config
      0b3baec8
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · e3181f2c
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Fix IGMP handling wrt VRF, from David Ahern.
      
       2) Fix timer access to freed object in dccp, from Eric Dumazet.
      
       3) Use kmalloc_array() in ptr_ring to avoid overflow cases which are
          triggerable by userspace. Also from Eric Dumazet.
      
       4) Fix infinite loop in unmapping cleanup of nfp driver, from Colin Ian
          King.
      
       5) Correct datagram peek handling of empty SKBs, from Matthew Dawson.
      
       6) Fix use after free in TIPC, from Eric Dumazet.
      
       7) When replacing a route in ipv6 we need to reset the round robin
          pointer, from Wei Wang.
      
       8) Fix bug in pci_find_pcie_root_port() which was unearthed by the
          relaxed ordering changes, from Thierry Reding. I made sure to get
          an explicit ACK from Bjorn this time around :-)
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (27 commits)
        ipv6: repair fib6 tree in failure case
        net_sched: fix order of queue length updates in qdisc_replace()
        tools lib bpf: improve warning
        switchdev: documentation: minor typo fixes
        bpf, doc: also add s390x as arch to sysctl description
        net: sched: fix NULL pointer dereference when action calls some targets
        rxrpc: Fix oops when discarding a preallocated service call
        irda: do not leak initialized list.dev to userspace
        net/mlx4_core: Enable 4K UAR if SRIOV module parameter is not enabled
        PCI: Allow PCI express root ports to find themselves
        tcp: when rearming RTO, if RTO time is in past then fire RTO ASAP
        net: check and errout if res->fi is NULL when RTM_F_FIB_MATCH is set
        ipv6: reset fn->rr_ptr when replacing route
        sctp: fully initialize the IPv6 address in sctp_v6_to_addr()
        tipc: fix use-after-free
        tun: handle register_netdevice() failures properly
        datagram: When peeking datagrams with offset < 0 don't skip empty skbs
        bpf, doc: improve sysctl knob description
        netxen: fix incorrect loop counter decrement
        nfp: fix infinite loop on umapping cleanup
        ...
      e3181f2c
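Item 3 of the pull request above refers to kmalloc_array(), which refuses an allocation when the element count multiplied by the element size would overflow. A rough user-space analogue, assuming a hypothetical helper named `alloc_array_sketch()` built on malloc(), looks like this:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical user-space analogue of an overflow-checked array allocation:
 * return NULL instead of allocating a too-small buffer when n * size would
 * wrap around SIZE_MAX. */
static void *alloc_array_sketch(size_t n, size_t size)
{
    if (size != 0 && n > SIZE_MAX / size)
        return NULL;                 /* multiplication would overflow */
    return malloc(n * size);
}
```

Without the check, a caller-controlled count could wrap the product and yield a buffer much smaller than the caller expects.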
    • Oleg Nesterov's avatar
      pids: make task_tgid_nr_ns() safe · dd1c1f2f
      Oleg Nesterov authored
      This was reported many times, and this was even mentioned in commit
      52ee2dfd ("pids: refactor vnr/nr_ns helpers to make them safe") but
      somehow nobody bothered to fix the obvious problem: task_tgid_nr_ns() is
      not safe because task->group_leader points to nowhere after the exiting
      task passes exit_notify(), and rcu_read_lock() cannot help.
      
      We really need to change __unhash_process() to nullify group_leader,
      parent, and real_parent, but this needs some cleanups.  Until then we
      can turn task_tgid_nr_ns() into another user of __task_pid_nr_ns() and
      fix the problem.
      Reported-by: Troy Kensinger <tkensinger@google.com>
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      dd1c1f2f
    • David Lamparter's avatar
      net: check type when freeing metadata dst · e65a4955
      David Lamparter authored
      Commit 3fcece12 ("net: store port/representator id in metadata_dst")
      added a new type field to metadata_dst, but metadata_dst_free() wasn't
      updated to check it before freeing the METADATA_IP_TUNNEL specific dst
      cache entry.
      
      This is not currently causing problems since it's far enough back in the
      struct to be zeroed for the only other type currently in existence
      (METADATA_HW_PORT_MUX), but nevertheless it's not correct.
      
      Fixes: 3fcece12 ("net: store port/representator id in metadata_dst")
      Signed-off-by: David Lamparter <equinox@diac24.net>
      Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
      Cc: Sridhar Samudrala <sridhar.samudrala@intel.com>
      Cc: Simon Horman <horms@verge.net.au>
      Cc: David S. Miller <davem@davemloft.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e65a4955
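The rule this commit enforces, checking the type tag before touching type-specific members, can be shown with a simplified tagged union. All names below (`md_dst`, `md_type`, `md_dst_alloc()`, `md_dst_free()`, `cache_frees`) are hypothetical and the layout is deliberately much smaller than the kernel's metadata_dst; it is a sketch of the idea, not the actual patch.

```c
#include <stdlib.h>

enum md_type { MD_IP_TUNNEL, MD_HW_PORT_MUX };

struct md_dst {
    enum md_type type;
    union {
        struct { void *dst_cache; } tun;      /* valid only for MD_IP_TUNNEL */
        struct { unsigned int port_id; } port; /* valid only for MD_HW_PORT_MUX */
    } u;
};

static int cache_frees;              /* counts cache releases, for illustration */

static struct md_dst *md_dst_alloc(enum md_type type)
{
    struct md_dst *md = calloc(1, sizeof(*md));

    if (md == NULL)
        return NULL;
    md->type = type;
    if (type == MD_IP_TUNNEL)
        md->u.tun.dst_cache = malloc(16);
    else
        md->u.port.port_id = 7;
    return md;
}

static void md_dst_free(struct md_dst *md)
{
    /* Only the tunnel variant owns a cache; other variants store unrelated
     * data in the same union and must not be treated as a pointer. */
    if (md->type == MD_IP_TUNNEL) {
        free(md->u.tun.dst_cache);
        cache_frees++;
    }
    free(md);
}
```

In this simplified layout the port variant overlaps the cache pointer directly, so an unconditional free would act on garbage; the commit message explains that the real structure happened to be zeroed at that offset, which is why the bug was latent.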
    • David Ahern's avatar
      net: ipv6: put host and anycast routes on device with address · 4832c30d
      David Ahern authored
      One nagging difference between ipv4 and ipv6 is host routes for ipv6
      addresses are installed using the loopback device or VRF / L3 Master
      device. e.g.,
      
          2001:db8:1::/120 dev veth0 proto kernel metric 256 pref medium
          local 2001:db8:1::1 dev lo table local proto kernel metric 0 pref medium
      
      Using the loopback device is convenient (and necessary for local tx), but
      it has some nasty side effects: most notably, setting the 'lo' device down
      causes all host routes for all local IPv6 addresses to be removed from the
      FIB, which completely breaks IPv6 networking across all interfaces.
      
      This patch puts FIB entries for IPv6 routes against the device. This
      simplifies the routes in the FIB, for example by making dst->dev and
      rt6i_idev->dev the same (a future patch can look at removing the device
      reference taken for rt6i_idev for FIB entries).
      
      When copies are made on FIB lookups, the cloned route has dst->dev
      set to loopback (or the L3 master device). This is needed for the
      local Tx of packets to local addresses.
      
      With fib entries allocated against the real network device, the addrconf
      code that reinserts host routes on admin up of 'lo' is no longer needed.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4832c30d
    • Florian Westphal's avatar
      dsa: remove unused net_device arg from handlers · 89e49506
      Florian Westphal authored
      compile tested only, but saw no warnings/errors with
      allmodconfig build.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      89e49506
    • David S. Miller's avatar
      Merge branch 'bpf-mips-jit-improvements' · d1ef551a
      David S. Miller authored
      David Daney says:
      
      ====================
      MIPS,bpf: Improvements for MIPS eBPF JIT
      
      Here are several improvements and bug fixes for the MIPS eBPF JIT.
      
      The main change is the addition of support for JLT, JLE, JSLT and JSLE
      ops, that were recently added.
      
      Also fix WARN output when used with preemptable kernel, and a small
      cleanup/optimization in the use of BPF_OP(insn->code).
      
      I suggest that the whole thing go via the BPF/net-next path as there
      are dependencies on code that is not yet merged to Linus' tree.
      
      Still pending are changes to reduce stack usage when the verifier can
      determine the maximum stack size.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d1ef551a
    • David Daney's avatar
      MIPS,bpf: Cache value of BPF_OP(insn->code) in eBPF JIT. · 6035b3fa
      David Daney authored
      The code looks a little cleaner if we replace BPF_OP(insn->code) with
      the local variable bpf_op.  Caching the value this way also saves 300
      bytes (about 1%) in the code size of the JIT.
      Signed-off-by: David Daney <david.daney@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6035b3fa
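The refactoring described here amounts to extracting the operation bits once and reusing the local. The sketch below is a toy interpreter, not the MIPS JIT: `insn_sketch` and `apply_sketch()` are hypothetical, while the `BPF_OP()` mask and the `BPF_ADD`, `BPF_SUB` and `BPF_MUL` values follow the standard BPF opcode encoding.

```c
#include <stdint.h>

#define BPF_OP(code) ((code) & 0xf0)  /* operation bits of the opcode byte */
#define BPF_ADD 0x00
#define BPF_SUB 0x10
#define BPF_MUL 0x20

struct insn_sketch {                  /* hypothetical, reduced instruction */
    uint8_t code;
    int32_t imm;
};

static int64_t apply_sketch(const struct insn_sketch *insn, int64_t acc)
{
    const int bpf_op = BPF_OP(insn->code);  /* extracted once, reused below */

    if (bpf_op == BPF_ADD)
        return acc + insn->imm;
    if (bpf_op == BPF_SUB)
        return acc - insn->imm;
    if (bpf_op == BPF_MUL)
        return acc * insn->imm;
    return acc;                       /* unknown op: leave unchanged */
}
```

Whether a compiler would already fold the repeated masking depends on the surrounding code; the commit reports a measurable size reduction from doing it by hand.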
    • David Daney's avatar
      MIPS,bpf: Fix using smp_processor_id() in preemptible splat. · 8d8d18c3
      David Daney authored
      If the kernel is configured with preemption enabled we were getting
      warning stack traces for use of current_cpu_type().
      
      Fix by moving the test between preempt_disable()/preempt_enable() and
      caching the results of the CPU type tests for use during code
      generation.
      Signed-off-by: David Daney <david.daney@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8d8d18c3
    • Bhumika Goyal's avatar
      qlogic: make device_attribute const · da6817eb
      Bhumika Goyal authored
      Make these const as they are only passed as arguments to the
      functions device_create_file and device_remove_file, and the corresponding
      arguments are of type const.
      Done using Coccinelle.
      Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      da6817eb
    • David S. Miller's avatar
      Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next · a43dce93
      David S. Miller authored
      Steffen Klassert says:
      
      ====================
      pull request (net-next): ipsec-next 2017-08-21
      
      1) Support RX checksum with IPsec crypto offload for esp4/esp6.
         From Ilan Tayari.
      
      2) Fixup IPv6 checksums when doing IPsec crypto offload.
         From Yossi Kuperman.
      
      3) Auto load the xfrm offload modules if a user installs
         a SA that requests IPsec offload. From Ilan Tayari.
      
      4) Clear RX offload information in xfrm_input to not
         confuse the TX path with stale offload information.
         From Ilan Tayari.
      
      5) Allow IPsec GSO for local sockets if the crypto operation
         will be offloaded.
      
      6) Support setting of an output mark to the xfrm_state.
         This mark can be used to do the tunnel route lookup.
         From Lorenzo Colitti.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a43dce93
    • Heiner Kallweit's avatar
      rtc: ds1307: fix regmap config · 03619844
      Heiner Kallweit authored
      Current max_register setting breaks reading nvram on certain chips and
      also reading the standard registers on RX8130 where register map starts
      at 0x10.
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Fixes: 11e5890b "rtc: ds1307: convert driver to regmap"
      Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
      03619844
    • Rick Farrington's avatar
      liquidio: fix use of pf in pass-through mode in a virtual machine · 0c45d7fe
      Rick Farrington authored
      Fix problem when PF is used in pass-through mode in a VM (w/embedded f/w).
      
      If the host fails to read the PF number from the CN23XX_PCIE_SRIOV_FDL register,
      try to retrieve the PF number from SLI_PKT(0)_INPUT_CONTROL (initialized by f/w).
      Signed-off-by: Rick Farrington <ricardo.farrington@cavium.com>
      Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0c45d7fe
    • Wei Wang's avatar
      ipv6: repair fib6 tree in failure case · 348a4002
      Wei Wang authored
      In fib6_add(), it is possible that fib6_add_1() picks an intermediate
      node and sets the node's fn->leaf to NULL in order to add this new
      route. However, if fib6_add_rt2node() fails to add the new
      route for some reason, fn->leaf will be left as NULL and could
      potentially cause a crash when fn->leaf is accessed in fib6_locate().
      This patch makes sure fib6_repair_tree() is called to properly repair
      fn->leaf in the above failure case.
      
      Here is the syzkaller reported general protection fault in fib6_locate:
      kasan: CONFIG_KASAN_INLINE enabled
      kasan: GPF could be caused by NULL-ptr deref or user memory access
      general protection fault: 0000 [#1] SMP KASAN
      Modules linked in:
      CPU: 0 PID: 40937 Comm: syz-executor3 Not tainted
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      task: ffff8801d7d64100 ti: ffff8801d01a0000 task.ti: ffff8801d01a0000
      RIP: 0010:[<ffffffff82a3e0e1>]  [<ffffffff82a3e0e1>] __ipv6_prefix_equal64_half include/net/ipv6.h:475 [inline]
      RIP: 0010:[<ffffffff82a3e0e1>]  [<ffffffff82a3e0e1>] ipv6_prefix_equal include/net/ipv6.h:492 [inline]
      RIP: 0010:[<ffffffff82a3e0e1>]  [<ffffffff82a3e0e1>] fib6_locate_1 net/ipv6/ip6_fib.c:1210 [inline]
      RIP: 0010:[<ffffffff82a3e0e1>]  [<ffffffff82a3e0e1>] fib6_locate+0x281/0x3c0 net/ipv6/ip6_fib.c:1233
      RSP: 0018:ffff8801d01a36a8  EFLAGS: 00010202
      RAX: 0000000000000020 RBX: ffff8801bc790e00 RCX: ffffc90002983000
      RDX: 0000000000001219 RSI: ffff8801d01a37a0 RDI: 0000000000000100
      RBP: ffff8801d01a36f0 R08: 00000000000000ff R09: 0000000000000000
      R10: 0000000000000003 R11: 0000000000000000 R12: 0000000000000001
      R13: dffffc0000000000 R14: ffff8801d01a37a0 R15: 0000000000000000
      FS:  00007f6afd68c700(0000) GS:ffff8801db400000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00000000004c6340 CR3: 00000000ba41f000 CR4: 00000000001426f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Stack:
       ffff8801d01a37a8 ffff8801d01a3780 ffffed003a0346f5 0000000c82a23ea0
       ffff8800b7bd7700 ffff8801d01a3780 ffff8800b6a1c940 ffffffff82a23ea0
       ffff8801d01a3920 ffff8801d01a3748 ffffffff82a223d6 ffff8801d7d64988
      Call Trace:
       [<ffffffff82a223d6>] ip6_route_del+0x106/0x570 net/ipv6/route.c:2109
       [<ffffffff82a23f9d>] inet6_rtm_delroute+0xfd/0x100 net/ipv6/route.c:3075
       [<ffffffff82621359>] rtnetlink_rcv_msg+0x549/0x7a0 net/core/rtnetlink.c:3450
       [<ffffffff8274c1d1>] netlink_rcv_skb+0x141/0x370 net/netlink/af_netlink.c:2281
       [<ffffffff82613ddf>] rtnetlink_rcv+0x2f/0x40 net/core/rtnetlink.c:3456
       [<ffffffff8274ad38>] netlink_unicast_kernel net/netlink/af_netlink.c:1206 [inline]
       [<ffffffff8274ad38>] netlink_unicast+0x518/0x750 net/netlink/af_netlink.c:1232
       [<ffffffff8274b83e>] netlink_sendmsg+0x8ce/0xc30 net/netlink/af_netlink.c:1778
       [<ffffffff82564aff>] sock_sendmsg_nosec net/socket.c:609 [inline]
       [<ffffffff82564aff>] sock_sendmsg+0xcf/0x110 net/socket.c:619
       [<ffffffff82564d62>] sock_write_iter+0x222/0x3a0 net/socket.c:834
       [<ffffffff8178523d>] new_sync_write+0x1dd/0x2b0 fs/read_write.c:478
       [<ffffffff817853f4>] __vfs_write+0xe4/0x110 fs/read_write.c:491
       [<ffffffff81786c38>] vfs_write+0x178/0x4b0 fs/read_write.c:538
       [<ffffffff817892a9>] SYSC_write fs/read_write.c:585 [inline]
       [<ffffffff817892a9>] SyS_write+0xd9/0x1b0 fs/read_write.c:577
       [<ffffffff82c71e32>] entry_SYSCALL_64_fastpath+0x12/0x17
      
      Note: there is no "Fixes" tag as this seems to be a bug introduced
      very early.
      Signed-off-by: Wei Wang <weiwan@google.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      348a4002
    • Bhumika Goyal's avatar
      net: dsa: mv88e6xxx: make irq_chip const · 6eb15e21
      Bhumika Goyal authored
      Make this const as it is only used in a copy operation.
      Done using Coccinelle.
      Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6eb15e21
    • Konstantin Khlebnikov's avatar
      net_sched: fix order of queue length updates in qdisc_replace() · 68a66d14
      Konstantin Khlebnikov authored
      It is important to call qdisc_tree_reduce_backlog() after changing the queue
      length. The parent qdisc should deactivate the class in ->qlen_notify(), called
      from qdisc_tree_reduce_backlog(), but this happens only if qdisc->q.qlen is zero.
      
      Missed class deactivations lead to crashes/warnings when picking packets
      from an empty qdisc, and to corrupted state when reactivating this class later.
      Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Fixes: 86a7996c ("net_sched: introduce qdisc_replace() helper")
      Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      68a66d14
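The ordering constraint can be illustrated with a toy model. The types and helpers here (`child_sketch`, `parent_sketch`, `reduce_backlog_sketch()`, `replace_sketch()`) are hypothetical simplifications and not the kernel's qdisc structures: the notification deactivates the class only when it observes an empty child queue, so the child must be reset before the notification runs.

```c
struct child_sketch {
    unsigned int qlen;
};

struct parent_sketch {
    unsigned int qlen;
    int class_active;
};

/* Hypothetical stand-in for the backlog reduction plus ->qlen_notify():
 * the class is deactivated only if the child is already empty. */
static void reduce_backlog_sketch(struct parent_sketch *p,
                                  const struct child_sketch *c,
                                  unsigned int n)
{
    p->qlen -= n;
    if (c->qlen == 0)
        p->class_active = 0;
}

/* Order described by the commit: remember the old length, reset the
 * child, and only then propagate the reduction to the parent. */
static void replace_sketch(struct parent_sketch *p, struct child_sketch *old)
{
    unsigned int qlen = old->qlen;   /* saved before the reset */

    old->qlen = 0;                   /* queue length changes first */
    reduce_backlog_sketch(p, old, qlen);
}
```

If the two steps in `replace_sketch()` were swapped, the notification would see a non-empty child and leave `class_active` set, which is the missed deactivation the commit message describes.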
    • Christophe Jaillet's avatar
      net: ibm: emac: Fix some error handling path in 'emac_probe()' · 138b57f0
      Christophe Jaillet authored
      If 'irq_of_parse_and_map()' or 'of_address_to_resource()' fail, 'err' is
      known to be 0 at this point.
      So return -ENODEV instead in the first case and use 'of_iomap()' instead of
      the equivalent 'of_address_to_resource()/ioremap()' combination in the 2nd
      case.
      
      Doing so, the 'rsrc_regs' field of the 'emac_instance struct' becomes
      redundant and is removed.
      
      While at it, turn a 'err != 0' test into an equivalent 'err' to be more
      consistent.
      Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      138b57f0
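The bug class fixed here, jumping to an error label while the error variable still holds 0, is easy to show in isolation. In this sketch `probe_sketch()` and its two boolean parameters are hypothetical, and the -ENOMEM code for the mapping failure is an assumption; only the -ENODEV choice for the first case comes from the commit message.

```c
#include <errno.h>

/* Hypothetical probe routine: each failure path assigns a negative errno
 * before jumping to the shared exit, so the caller never sees 0 on failure. */
static int probe_sketch(int irq_ok, int map_ok)
{
    int err = 0;

    if (!irq_ok) {
        err = -ENODEV;               /* was left at 0 before the fix */
        goto err_out;
    }
    if (!map_ok) {
        err = -ENOMEM;               /* assumed code for a failed mapping */
        goto err_out;
    }
    return 0;

err_out:
    return err;
}
```

A caller that treats 0 as success would otherwise continue with a half-initialized device.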
    • Ganesh Goudar's avatar
      cxgb4/cxgbvf: Handle 32-bit fw port capabilities · c3168cab
      Ganesh Goudar authored
      Implement new 32-bit Firmware Port Capabilities in order to
      handle new speeds which couldn't be represented in the old 16-bit
      Firmware Port Capabilities values.
      
      Based on the original work of Casey Leedom <leedom@chelsio.com>
      Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c3168cab
    • Chris Packham's avatar
      switchdev: documentation: minor typo fixes · 5a784498
      Chris Packham authored
      Two typos in switchdev.txt
      Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5a784498
    • Daniel Borkmann's avatar
      bpf: fix double free from dev_map_notification() · 274043c6
      Daniel Borkmann authored
      In the current code, dev_map_free() can still race with dev_map_notification().
      In dev_map_free(), we remove dtab from the list of dtabs after we purged
      all entries from it. However, we don't do xchg() with NULL or the like,
      so the entry at that point is still pointing to the device. If an unregister
      notification comes in at the same time, we therefore risk a double-free,
      since the pointer is still present in the map, and then pushed again to
      __dev_map_entry_free().
      
      All this is completely unnecessary. Just remove the dtab from the list
      right before the synchronize_rcu(), so all outstanding readers from the
      notifier list have finished by then, thus we don't need to deal with this
      corner case anymore and also wouldn't need to nullify dev entries. This is
      fine because we iterate over the map releasing all entries and therefore
      dev references anyway.
      
      Fixes: 4cc7b954 ("bpf: devmap fix mutex in rcu critical section")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      274043c6