1. 03 Dec, 2015 24 commits
    • Michael Chan's avatar
      bnxt_en: Setup uc_list mac filters after resetting the chip. · b664f008
      Michael Chan authored
      Call bnxt_cfg_rx_mode() in bnxt_init_chip() to setup uc_list and
      mc_list mac address filters.  Before the patch, uc_list is not
      setup again after chip reset (such as ethtool ring size change)
      and macvlans don't work any more after that.
      
      Modify bnxt_cfg_rx_mode() to return error codes appropriately so
      that the init chip sequence can detect any failures.
      Signed-off-by: default avatarMichael Chan <mchan@broadcom.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b664f008
    • Jeffrey Huang's avatar
      bnxt_en: enforce proper storing of MAC address · bdd4347b
      Jeffrey Huang authored
      For PF, the bp->pf.mac_addr always holds the permanent MAC
      addr assigned by the HW.  For VF, the bp->vf.mac_addr always
      holds the administrator assigned VF MAC addr. The random
      generated VF MAC addr should never get stored to bp->vf.mac_addr.
      This way, when the VF wants to change the MAC address, we can tell
      if the adminstrator has already set it and disallow the VF from
      changing it.
      
      v2: Fix compile error if CONFIG_BNXT_SRIOV is not set.
      Signed-off-by: default avatarJeffrey Huang <huangjw@broadcom.com>
      Signed-off-by: default avatarMichael Chan <mchan@broadcom.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      bdd4347b
    • Jeffrey Huang's avatar
      bnxt_en: Fixed incorrect implementation of ndo_set_mac_address · 1fc2cfd0
      Jeffrey Huang authored
      The existing ndo_set_mac_address only copies the new MAC addr
      and didn't set the new MAC addr to the HW. The correct way is
      to delete the existing default MAC filter from HW and add
      the new one. Because of RFS filters are also dependent on the
      default mac filter l2 context, the driver must go thru
      close_nic() to delete the default MAC and RFS filters, then
      open_nic() to set the default MAC address to HW.
      Signed-off-by: default avatarJeffrey Huang <huangjw@broadcom.com>
      Signed-off-by: default avatarMichael Chan <mchan@broadcom.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      1fc2cfd0
    • Vladimir Zapolskiy's avatar
      net: lpc_eth: remove irq > NR_IRQS check from probe() · 39198ec9
      Vladimir Zapolskiy authored
      If the driver is used on an ARM platform with SPARSE_IRQ defined,
      semantics of NR_IRQS is different (minimal value of virtual irqs) and
      by default it is set to 16, see arch/arm/include/asm/irq.h.
      
      This value may be less than the actual number of virtual irqs, which
      may break the driver initialization. The check removal allows to use
      the driver on such a platform, and, if irq controller driver works
      correctly, the check is not needed on legacy platforms.
      
      Fixes a runtime problem:
      
          lpc-eth 31060000.ethernet: error getting resources.
          lpc_eth: lpc-eth: not found (-6).
      Signed-off-by: default avatarVladimir Zapolskiy <vz@mleia.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      39198ec9
    • Eric Dumazet's avatar
      net_sched: fix qdisc_tree_decrease_qlen() races · 4eaf3b84
      Eric Dumazet authored
      qdisc_tree_decrease_qlen() suffers from two problems on multiqueue
      devices.
      
      One problem is that it updates sch->q.qlen and sch->qstats.drops
      on the mq/mqprio root qdisc, while it should not : Daniele
      reported underflows errors :
      [  681.774821] PAX: sch->q.qlen: 0 n: 1
      [  681.774825] PAX: size overflow detected in function qdisc_tree_decrease_qlen net/sched/sch_api.c:769 cicus.693_49 min, count: 72, decl: qlen; num: 0; context: sk_buff_head;
      [  681.774954] CPU: 2 PID: 19 Comm: ksoftirqd/2 Tainted: G           O    4.2.6.201511282239-1-grsec #1
      [  681.774955] Hardware name: ASUSTeK COMPUTER INC. X302LJ/X302LJ, BIOS X302LJ.202 03/05/2015
      [  681.774956]  ffffffffa9a04863 0000000000000000 0000000000000000 ffffffffa990ff7c
      [  681.774959]  ffffc90000d3bc38 ffffffffa95d2810 0000000000000007 ffffffffa991002b
      [  681.774960]  ffffc90000d3bc68 ffffffffa91a44f4 0000000000000001 0000000000000001
      [  681.774962] Call Trace:
      [  681.774967]  [<ffffffffa95d2810>] dump_stack+0x4c/0x7f
      [  681.774970]  [<ffffffffa91a44f4>] report_size_overflow+0x34/0x50
      [  681.774972]  [<ffffffffa94d17e2>] qdisc_tree_decrease_qlen+0x152/0x160
      [  681.774976]  [<ffffffffc02694b1>] fq_codel_dequeue+0x7b1/0x820 [sch_fq_codel]
      [  681.774978]  [<ffffffffc02680a0>] ? qdisc_peek_dequeued+0xa0/0xa0 [sch_fq_codel]
      [  681.774980]  [<ffffffffa94cd92d>] __qdisc_run+0x4d/0x1d0
      [  681.774983]  [<ffffffffa949b2b2>] net_tx_action+0xc2/0x160
      [  681.774985]  [<ffffffffa90664c1>] __do_softirq+0xf1/0x200
      [  681.774987]  [<ffffffffa90665ee>] run_ksoftirqd+0x1e/0x30
      [  681.774989]  [<ffffffffa90896b0>] smpboot_thread_fn+0x150/0x260
      [  681.774991]  [<ffffffffa9089560>] ? sort_range+0x40/0x40
      [  681.774992]  [<ffffffffa9085fe4>] kthread+0xe4/0x100
      [  681.774994]  [<ffffffffa9085f00>] ? kthread_worker_fn+0x170/0x170
      [  681.774995]  [<ffffffffa95d8d1e>] ret_from_fork+0x3e/0x70
      
      mq/mqprio have their own ways to report qlen/drops by folding stats on
      all their queues, with appropriate locking.
      
      A second problem is that qdisc_tree_decrease_qlen() calls qdisc_lookup()
      without proper locking : concurrent qdisc updates could corrupt the list
      that qdisc_match_from_root() parses to find a qdisc given its handle.
      
      Fix first problem adding a TCQ_F_NOPARENT qdisc flag that
      qdisc_tree_decrease_qlen() can use to abort its tree traversal,
      as soon as it meets a mq/mqprio qdisc children.
      
      Second problem can be fixed by RCU protection.
      Qdisc are already freed after RCU grace period, so qdisc_list_add() and
      qdisc_list_del() simply have to use appropriate rcu list variants.
      
      A future patch will add a per struct netdev_queue list anchor, so that
      qdisc_tree_decrease_qlen() can have more efficient lookups.
      Reported-by: default avatarDaniele Fucini <dfucini@gmail.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Cong Wang <cwang@twopensource.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4eaf3b84
    • Paolo Abeni's avatar
      openvswitch: fix hangup on vxlan/gre/geneve device deletion · 13175303
      Paolo Abeni authored
      Each openvswitch tunnel vport (vxlan,gre,geneve) holds a reference
      to the underlying tunnel device, but never released it when such
      device is deleted.
      Deleting the underlying device via the ip tool cause the kernel to
      hangup in the netdev_wait_allrefs() loop.
      This commit ensure that on device unregistration dp_detach_port_notify()
      is called for all vports that hold the device reference, properly
      releasing it.
      
      Fixes: 614732ea ("openvswitch: Use regular VXLAN net_device device")
      Fixes: b2acd1dc ("openvswitch: Use regular GRE net_device instead of vport")
      Fixes: 6b001e68 ("openvswitch: Use Geneve device.")
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Acked-by: default avatarFlavio Leitner <fbl@sysclose.org>
      Acked-by: default avatarPravin B Shelar <pshelar@nicira.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      13175303
    • Andrew Lunn's avatar
      ipv4: igmp: Allow removing groups from a removed interface · 4eba7bb1
      Andrew Lunn authored
      When a multicast group is joined on a socket, a struct ip_mc_socklist
      is appended to the sockets mc_list containing information about the
      joined group.
      
      If the interface is hot unplugged, this entry becomes stale. Prior to
      commit 52ad353a ("igmp: fix the problem when mc leave group") it
      was possible to remove the stale entry by performing a
      IP_DROP_MEMBERSHIP, passing either the old ifindex or ip address on
      the interface. However, this fix enforces that the interface must
      still exist. Thus with time, the number of stale entries grows, until
      sysctl_igmp_max_memberships is reached and then it is not possible to
      join and more groups.
      
      The previous patch fixes an issue where a IP_DROP_MEMBERSHIP is
      performed without specifying the interface, either by ifindex or ip
      address. However here we do supply one of these. So loosen the
      restriction on device existence to only apply when the interface has
      not been specified. This then restores the ability to clean up the
      stale entries.
      Signed-off-by: default avatarAndrew Lunn <andrew@lunn.ch>
      Fixes: 52ad353a "(igmp: fix the problem when mc leave group")
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4eba7bb1
    • Eric Dumazet's avatar
      ipv6: sctp: implement sctp_v6_destroy_sock() · 602dd62d
      Eric Dumazet authored
      Dmitry Vyukov reported a memory leak using IPV6 SCTP sockets.
      
      We need to call inet6_destroy_sock() to properly release
      inet6 specific fields.
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      602dd62d
    • David S. Miller's avatar
      Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth · 79aecc72
      David S. Miller authored
      Johan Hedberg says:
      
      ====================
      pull request: bluetooth 2015-12-01
      
      Here's a Bluetooth fix for the 4.4-rc series that fixes a memory leak of
      the Security Manager L2CAP channel that'll happen for every LE
      connection.
      
      Please let me know if there are any issues pulling. Thanks.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      79aecc72
    • Yang Shi's avatar
      arm64: bpf: add 'store immediate' instruction · df849ba3
      Yang Shi authored
      aarch64 doesn't have native store immediate instruction, such operation
      has to be implemented by the below instruction sequence:
      
      Load immediate to register
      Store register
      Signed-off-by: default avatarYang Shi <yang.shi@linaro.org>
      CC: Zi Shen Lim <zlim.lnx@gmail.com>
      CC: Xi Wang <xi.wang@gmail.com>
      Reviewed-by: default avatarZi Shen Lim <zlim.lnx@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      df849ba3
    • Eric Dumazet's avatar
      ipv6: kill sk_dst_lock · 6bd4f355
      Eric Dumazet authored
      While testing the np->opt RCU conversion, I found that UDP/IPv6 was
      using a mixture of xchg() and sk_dst_lock to protect concurrent changes
      to sk->sk_dst_cache, leading to possible corruptions and crashes.
      
      ip6_sk_dst_lookup_flow() uses sk_dst_check() anyway, so the simplest
      way to fix the mess is to remove sk_dst_lock completely, as we did for
      IPv4.
      
      __ip6_dst_store() and ip6_dst_store() share same implementation.
      
      sk_setup_caps() being called with socket lock being held or not,
      we have to use sk_dst_set() instead of __sk_dst_set()
      
      Note that I had to move the "np->dst_cookie = rt6_get_cookie(rt);"
      in ip6_dst_store() before the sk_setup_caps(sk, dst) call.
      
      This is because ip6_dst_store() can be called from process context,
      without any lock held.
      
      As soon as the dst is installed in sk->sk_dst_cache, dst can be freed
      from another cpu doing a concurrent ip6_dst_store()
      
      Doing the dst dereference before doing the install is needed to make
      sure no use after free would trigger.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6bd4f355
    • Eric Dumazet's avatar
      ipv6: sctp: add rcu protection around np->opt · c836a8ba
      Eric Dumazet authored
      This patch completes the work I did in commit 45f6fad8
      ("ipv6: add complete rcu protection around np->opt"), as I missed
      sctp part.
      
      This simply makes sure np->opt is used with proper RCU locking
      and accessors.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c836a8ba
    • Konstantin Khlebnikov's avatar
      net/neighbour: fix crash at dumping device-agnostic proxy entries · 6adc5fd6
      Konstantin Khlebnikov authored
      Proxy entries could have null pointer to net-device.
      Signed-off-by: default avatarKonstantin Khlebnikov <koct9i@gmail.com>
      Fixes: 84920c14 ("net: Allow ipv6 proxies and arp proxies be shown with iproute2")
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6adc5fd6
    • Marcelo Ricardo Leitner's avatar
      sctp: use GFP_USER for user-controlled kmalloc · cacc0621
      Marcelo Ricardo Leitner authored
      Dmitry Vyukov reported that the user could trigger a kernel warning by
      using a large len value for getsockopt SCTP_GET_LOCAL_ADDRS, as that
      value directly affects the value used as a kmalloc() parameter.
      
      This patch thus switches the allocation flags from all user-controllable
      kmalloc size to GFP_USER to put some more restrictions on it and also
      disables the warn, as they are not necessary.
      Signed-off-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Acked-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      cacc0621
    • Marcelo Ricardo Leitner's avatar
      sctp: convert sack_needed and sack_generation to bits · 38ee8fb6
      Marcelo Ricardo Leitner authored
      They don't need to be any bigger than that and with this we start a new
      bitfield for tracking association runtime stuff, like zero window
      situation.
      Signed-off-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Acked-by: default avatarVlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      38ee8fb6
    • Eric Dumazet's avatar
      ipv6: add complete rcu protection around np->opt · 45f6fad8
      Eric Dumazet authored
      This patch addresses multiple problems :
      
      UDP/RAW sendmsg() need to get a stable struct ipv6_txoptions
      while socket is not locked : Other threads can change np->opt
      concurrently. Dmitry posted a syzkaller
      (http://github.com/google/syzkaller) program desmonstrating
      use-after-free.
      
      Starting with TCP/DCCP lockless listeners, tcp_v6_syn_recv_sock()
      and dccp_v6_request_recv_sock() also need to use RCU protection
      to dereference np->opt once (before calling ipv6_dup_options())
      
      This patch adds full RCU protection to np->opt
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      45f6fad8
    • Alexei Starovoitov's avatar
      bpf: fix allocation warnings in bpf maps and integer overflow · 01b3f521
      Alexei Starovoitov authored
      For large map->value_size the user space can trigger memory allocation warnings like:
      WARNING: CPU: 2 PID: 11122 at mm/page_alloc.c:2989
      __alloc_pages_nodemask+0x695/0x14e0()
      Call Trace:
       [<     inline     >] __dump_stack lib/dump_stack.c:15
       [<ffffffff82743b56>] dump_stack+0x68/0x92 lib/dump_stack.c:50
       [<ffffffff81244ec9>] warn_slowpath_common+0xd9/0x140 kernel/panic.c:460
       [<ffffffff812450f9>] warn_slowpath_null+0x29/0x30 kernel/panic.c:493
       [<     inline     >] __alloc_pages_slowpath mm/page_alloc.c:2989
       [<ffffffff81554e95>] __alloc_pages_nodemask+0x695/0x14e0 mm/page_alloc.c:3235
       [<ffffffff816188fe>] alloc_pages_current+0xee/0x340 mm/mempolicy.c:2055
       [<     inline     >] alloc_pages include/linux/gfp.h:451
       [<ffffffff81550706>] alloc_kmem_pages+0x16/0xf0 mm/page_alloc.c:3414
       [<ffffffff815a1c89>] kmalloc_order+0x19/0x60 mm/slab_common.c:1007
       [<ffffffff815a1cef>] kmalloc_order_trace+0x1f/0xa0 mm/slab_common.c:1018
       [<     inline     >] kmalloc_large include/linux/slab.h:390
       [<ffffffff81627784>] __kmalloc+0x234/0x250 mm/slub.c:3525
       [<     inline     >] kmalloc include/linux/slab.h:463
       [<     inline     >] map_update_elem kernel/bpf/syscall.c:288
       [<     inline     >] SYSC_bpf kernel/bpf/syscall.c:744
      
      To avoid never succeeding kmalloc with order >= MAX_ORDER check that
      elem->value_size and computed elem_size are within limits for both hash and
      array type maps.
      Also add __GFP_NOWARN to kmalloc(value_size | elem_size) to avoid OOM warnings.
      Note kmalloc(key_size) is highly unlikely to trigger OOM, since key_size <= 512,
      so keep those kmalloc-s as-is.
      
      Large value_size can cause integer overflows in elem_size and map.pages
      formulas, so check for that as well.
      
      Fixes: aaac3ba9 ("bpf: charge user for creation of BPF maps and programs")
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      01b3f521
    • David S. Miller's avatar
      Merge branch 'mvneta-fixes' · a64f3f83
      David S. Miller authored
      Marcin Wojtas says:
      
      ====================
      Marvell Armada XP/370/38X Neta fixes
      
      I'm sending v4 with corrected commit log of the last patch, in order to
      avoid possible conflicts between the branches as suggested by Gregory
      Clement.
      
      Best regards,
      Marcin Wojtas
      
      Changes from v4:
      * Correct commit log of patch 6/6
      
      Changes from v2:
      * Style fixes in patch updating mbus protection
      * Remove redundant stable notifications except for patch 4/6
      
      Changes from v1:
      * update MBUS windows access protection register once, after whole loop
      * add fixing value of MVNETA_RXQ_INTR_ENABLE_ALL_MASK
      * add fixing error path for skb_build()
      * add possibility of setting custom TX IP checksum limit in DT property
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a64f3f83
    • Marcin Wojtas's avatar
      mvebu: dts: enable IP checksum with jumbo frames for Armada 38x on Port0 · c4a25007
      Marcin Wojtas authored
      The Ethernet controller found in the Armada 38x SoC's family support
      TCP/IP checksumming with frame sizes larger than 1600 bytes, however
      only on port 0.
      
      This commit enables it by setting 'tx-csum-limit' to 9800B in
      'ethernet@70000' node.
      Signed-off-by: default avatarMarcin Wojtas <mw@semihalf.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c4a25007
    • Marcin Wojtas's avatar
      net: mvneta: enable setting custom TX IP checksum limit · 9110ee07
      Marcin Wojtas authored
      Since Armada 38x SoC can support IP checksum for jumbo frames only on
      a single port, it means that this feature should be enabled per-port,
      rather than for the whole SoC.
      
      This patch enables setting custom TX IP checksum limit by adding new
      optional property to the mvneta device tree node. If not used, by
      default 1600B is set for "marvell,armada-370-neta" and 9800B for other
      strings, which ensures backward compatibility. Binding documentation
      is updated accordingly.
      Signed-off-by: default avatarMarcin Wojtas <mw@semihalf.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9110ee07
    • Marcin Wojtas's avatar
      net: mvneta: fix error path for building skb · 26c17a17
      Marcin Wojtas authored
      In the actual RX processing, there is same error path for both descriptor
      ring refilling and building skb fails. This is not correct, because after
      successful refill, the ring is already updated with newly allocated
      buffer. Then, in case of build_skb() fail, hitherto code left the original
      buffer unmapped.
      
      This patch fixes above situation by swapping error check of skb build with
      DMA-unmap of original buffer.
      Signed-off-by: default avatarMarcin Wojtas <mw@semihalf.com>
      Acked-by: default avatarSimon Guinot <simon.guinot@sequanux.org>
      Cc: <stable@vger.kernel.org> # v4.2+
      Fixes a84e3289 ("net: mvneta: fix refilling for Rx DMA buffers")
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      26c17a17
    • Marcin Wojtas's avatar
      net: mvneta: fix bit assignment for RX packet irq enable · dc1aadf6
      Marcin Wojtas authored
      A value originally defined in the driver was inappropriate. Even though
      the ingress was somehow working, writing MVNETA_RXQ_INTR_ENABLE_ALL_MASK
      to MVNETA_INTR_ENABLE didn't make any effect, because the bits [31:16]
      are reserved and read-only.
      
      This commit updates MVNETA_RXQ_INTR_ENABLE_ALL_MASK to be compliant with
      the controller's documentation.
      Signed-off-by: default avatarMarcin Wojtas <mw@semihalf.com>
      
      Fixes: c5aff182 ("net: mvneta: driver for Marvell Armada 370/XP network
      unit")
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      dc1aadf6
    • Marcin Wojtas's avatar
      net: mvneta: fix bit assignment in MVNETA_RXQ_CONFIG_REG · e5bdf689
      Marcin Wojtas authored
      MVNETA_RXQ_HW_BUF_ALLOC bit which controls enabling hardware buffer
      allocation was mistakenly set as BIT(1). This commit fixes the assignment.
      Signed-off-by: default avatarMarcin Wojtas <mw@semihalf.com>
      Reviewed-by: default avatarGregory CLEMENT <gregory.clement@free-electrons.com>
      
      Fixes: c5aff182 ("net: mvneta: driver for Marvell Armada 370/XP network
      unit")
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e5bdf689
    • Marcin Wojtas's avatar
      net: mvneta: add configuration for MBUS windows access protection · db6ba9a5
      Marcin Wojtas authored
      This commit adds missing configuration of MBUS windows access protection
      in mvneta_conf_mbus_windows function - a dedicated variable for that
      purpose remained there unused since v3.8 initial mvneta support. Because
      of that the register contents were inherited from the bootloader.
      Signed-off-by: default avatarMarcin Wojtas <mw@semihalf.com>
      Reviewed-by: default avatarGregory CLEMENT <gregory.clement@free-electrons.com>
      
      Fixes: c5aff182 ("net: mvneta: driver for Marvell Armada 370/XP network
      unit")
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      db6ba9a5
  2. 02 Dec, 2015 8 commits
  3. 01 Dec, 2015 5 commits
    • Eric Dumazet's avatar
      net: fix sock_wake_async() rcu protection · ceb5d58b
      Eric Dumazet authored
      Dmitry provided a syzkaller (http://github.com/google/syzkaller)
      triggering a fault in sock_wake_async() when async IO is requested.
      
      Said program stressed af_unix sockets, but the issue is generic
      and should be addressed in core networking stack.
      
      The problem is that by the time sock_wake_async() is called,
      we should not access the @flags field of 'struct socket',
      as the inode containing this socket might be freed without
      further notice, and without RCU grace period.
      
      We already maintain an RCU protected structure, "struct socket_wq"
      so moving SOCKWQ_ASYNC_NOSPACE & SOCKWQ_ASYNC_WAITDATA into it
      is the safe route.
      
      It also reduces number of cache lines needing dirtying, so might
      provide a performance improvement anyway.
      
      In followup patches, we might move remaining flags (SOCK_NOSPACE,
      SOCK_PASSCRED, SOCK_PASSSEC) to save 8 bytes and let 'struct socket'
      being mostly read and let it being shared between cpus.
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ceb5d58b
    • Eric Dumazet's avatar
      net: rename SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA · 9cd3e072
      Eric Dumazet authored
      This patch is a cleanup to make following patch easier to
      review.
      
      Goal is to move SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA
      from (struct socket)->flags to a (struct socket_wq)->flags
      to benefit from RCU protection in sock_wake_async()
      
      To ease backports, we rename both constants.
      
      Two new helpers, sk_set_bit(int nr, struct sock *sk)
      and sk_clear_bit(int net, struct sock *sk) are added so that
      following patch can change their implementation.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9cd3e072
    • Alexey Khoroshilov's avatar
      vmxnet3: fix checks for dma mapping errors · 5738a09d
      Alexey Khoroshilov authored
      vmxnet3_drv does not check dma_addr with dma_mapping_error()
      after mapping dma memory. The patch adds the checks and
      tries to handle failures.
      
      Found by Linux Driver Verification project (linuxtesting.org).
      Signed-off-by: default avatarAlexey Khoroshilov <khoroshilov@ispras.ru>
      Acked-by: default avatarShrikrishna Khare <skhare@vmware.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      5738a09d
    • Peter Hurley's avatar
      wan/x25: Fix use-after-free in x25_asy_open_tty() · ee9159dd
      Peter Hurley authored
      The N_X25 line discipline may access the previous line discipline's closed
      and already-freed private data on open [1].
      
      The tty->disc_data field _never_ refers to valid data on entry to the
      line discipline's open() method. Rather, the ldisc is expected to
      initialize that field for its own use for the lifetime of the instance
      (ie. from open() to close() only).
      
      [1]
          [  634.336761] ==================================================================
          [  634.338226] BUG: KASAN: use-after-free in x25_asy_open_tty+0x13d/0x490 at addr ffff8800a743efd0
          [  634.339558] Read of size 4 by task syzkaller_execu/8981
          [  634.340359] =============================================================================
          [  634.341598] BUG kmalloc-512 (Not tainted): kasan: bad access detected
          ...
          [  634.405018] Call Trace:
          [  634.405277] dump_stack (lib/dump_stack.c:52)
          [  634.405775] print_trailer (mm/slub.c:655)
          [  634.406361] object_err (mm/slub.c:662)
          [  634.406824] kasan_report_error (mm/kasan/report.c:138 mm/kasan/report.c:236)
          [  634.409581] __asan_report_load4_noabort (mm/kasan/report.c:279)
          [  634.411355] x25_asy_open_tty (drivers/net/wan/x25_asy.c:559 (discriminator 1))
          [  634.413997] tty_ldisc_open.isra.2 (drivers/tty/tty_ldisc.c:447)
          [  634.414549] tty_set_ldisc (drivers/tty/tty_ldisc.c:567)
          [  634.415057] tty_ioctl (drivers/tty/tty_io.c:2646 drivers/tty/tty_io.c:2879)
          [  634.423524] do_vfs_ioctl (fs/ioctl.c:43 fs/ioctl.c:607)
          [  634.427491] SyS_ioctl (fs/ioctl.c:622 fs/ioctl.c:613)
          [  634.427945] entry_SYSCALL_64_fastpath (arch/x86/entry/entry_64.S:188)
      Reported-and-tested-by: default avatarSasha Levin <sasha.levin@oracle.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarPeter Hurley <peter@hurleysoftware.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ee9159dd
    • Nicolas Dichtel's avatar
      Revert "ipv6: ndisc: inherit metadata dst when creating ndisc requests" · 304d888b
      Nicolas Dichtel authored
      This reverts commit ab450605.
      
      In IPv6, we cannot inherit the dst of the original dst. ndisc packets
      are IPv6 packets and may take another route than the original packet.
      
      This patch breaks the following scenario: a packet comes from eth0 and
      is forwarded through vxlan1. The encapsulated packet triggers an NS
      which cannot be sent because of the wrong route.
      
      CC: Jiri Benc <jbenc@redhat.com>
      CC: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: default avatarNicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      304d888b
  4. 30 Nov, 2015 3 commits