1. 14 Apr, 2016 19 commits
    • David S. Miller's avatar
      Merge branch 'ipv6-dgram-dst-cache' · 97cc931f
      David S. Miller authored
      Martin KaFai Lau says:
      
      ====================
      ipv6: datagram: Update dst cache of a connected udp sk during pmtu update
      
      v2:
      ~ Protect __sk_dst_get() operations with rcu_read_lock in
        release_cb() because another thread may do ip6_dst_store()
        for a udp sk without taking the sk lock (e.g. in sendmsg).
      ~ Do a ipv6_addr_v4mapped(&sk->sk_v6_daddr) check before
        calling ip6_datagram_dst_update() in patch 3 and 4.  It is
        similar to how __ip6_datagram_connect handles it.
      ~ One fix in ip6_datagram_dst_update() in patch 2.  It needs
        to check (np->flow_label & IPV6_FLOWLABEL_MASK) before
        doing fl6_sock_lookup.  I was confused with the naming
        of IPV6_FLOWLABEL_MASK and IPV6_FLOWINFO_MASK.
      ~ Check dst->obsolete just on the safe side, although I think it
        should at least have DST_OBSOLETE_FORCE_CHK by now.
      ~ Add Fixes tag to patch 3 and 4
      ~ Add some points from the previous discussion about holding
        sk lock to the commit message in patch 3.
      
      v1:
      There is a case in connected UDP socket such that
      getsockopt(IPV6_MTU) will return a stale MTU value. The reproducible
      sequence could be the following:
      1. Create a connected UDP socket
      2. Send some datagrams out
      3. Receive a ICMPV6_PKT_TOOBIG
      4. No new outgoing datagrams to trigger the sk_dst_check()
         logic to update the sk->sk_dst_cache.
      5. getsockopt(IPV6_MTU) returns the mtu from the invalid
         sk->sk_dst_cache instead of the newly created RTF_CACHE clone.
      
      Patch 1 and 2 are the prep work.
      Patch 3 and 4 are the fixes.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      97cc931f
    • Martin KaFai Lau's avatar
      ipv6: udp: Do a route lookup and update during release_cb · e646b657
      Martin KaFai Lau authored
      This patch adds a release_cb for UDPv6.  It does a route lookup
      and updates sk->sk_dst_cache if it is needed.  It picks up the
      left-over job from ip6_sk_update_pmtu() if the sk was owned
      by user during the pmtu update.
      
      It takes a rcu_read_lock to protect the __sk_dst_get() operations
      because another thread may do ip6_dst_store() without taking the
      sk lock (e.g. sendmsg).
      
      Fixes: 45e4fd26 ("ipv6: Only create RTF_CACHE routes after encountering pmtu exception")
      Signed-off-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Reported-by: default avatarWei Wang <weiwan@google.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Wei Wang <weiwan@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e646b657
    • Martin KaFai Lau's avatar
      ipv6: datagram: Update dst cache of a connected datagram sk during pmtu update · 33c162a9
      Martin KaFai Lau authored
      There is a case in connected UDP socket such that
      getsockopt(IPV6_MTU) will return a stale MTU value. The reproducible
      sequence could be the following:
      1. Create a connected UDP socket
      2. Send some datagrams out
      3. Receive a ICMPV6_PKT_TOOBIG
      4. No new outgoing datagrams to trigger the sk_dst_check()
         logic to update the sk->sk_dst_cache.
      5. getsockopt(IPV6_MTU) returns the mtu from the invalid
         sk->sk_dst_cache instead of the newly created RTF_CACHE clone.
      
      This patch updates the sk->sk_dst_cache for a connected datagram sk
      during pmtu-update code path.
      
      Note that the sk->sk_v6_daddr is used to do the route lookup
      instead of skb->data (i.e. iph).  It is because a UDP socket can become
      connected after sending out some datagrams in un-connected state.  or
      It can be connected multiple times to different destinations.  Hence,
      iph may not be related to where sk is currently connected to.
      
      It is done under '!sock_owned_by_user(sk)' condition because
      the user may make another ip6_datagram_connect()  (i.e changing
      the sk->sk_v6_daddr) while dst lookup is happening in the pmtu-update
      code path.
      
      For the sock_owned_by_user(sk) == true case, the next patch will
      introduce a release_cb() which will update the sk->sk_dst_cache.
      
      Test:
      
      Server (Connected UDP Socket):
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      Route Details:
      [root@arch-fb-vm1 ~]# ip -6 r show | egrep '2fac'
      2fac::/64 dev eth0  proto kernel  metric 256  pref medium
      2fac:face::/64 via 2fac::face dev eth0  metric 1024  pref medium
      
      A simple python code to create a connected UDP socket:
      
      import socket
      import errno
      
      HOST = '2fac::1'
      PORT = 8080
      
      s = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
      s.bind((HOST, PORT))
      s.connect(('2fac:face::face', 53))
      print("connected")
      while True:
          try:
      	data = s.recv(1024)
          except socket.error as se:
      	if se.errno == errno.EMSGSIZE:
      		pmtu = s.getsockopt(41, 24)
      		print("PMTU:%d" % pmtu)
      		break
      s.close()
      
      Python program output after getting a ICMPV6_PKT_TOOBIG:
      [root@arch-fb-vm1 ~]# python2 ~/devshare/kernel/tasks/fib6/udp-connect-53-8080.py
      connected
      PMTU:1300
      
      Cache routes after recieving TOOBIG:
      [root@arch-fb-vm1 ~]# ip -6 r show table cache
      2fac:face::face via 2fac::face dev eth0  metric 0
          cache  expires 463sec mtu 1300 pref medium
      
      Client (Send the ICMPV6_PKT_TOOBIG):
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      scapy is used to generate the TOOBIG message.  Here is the scapy script I have
      used:
      
      >>> p=Ether(src='da:75:4d:36:ac:32', dst='52:54:00:12:34:66', type=0x86dd)/IPv6(src='2fac::face', dst='2fac::1')/ICMPv6PacketTooBig(mtu=1300)/IPv6(src='2fac::
      1',dst='2fac:face::face', nh='UDP')/UDP(sport=8080,dport=53)
      >>> sendp(p, iface='qemubr0')
      
      Fixes: 45e4fd26 ("ipv6: Only create RTF_CACHE routes after encountering pmtu exception")
      Signed-off-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Reported-by: default avatarWei Wang <weiwan@google.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Wei Wang <weiwan@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      33c162a9
    • Martin KaFai Lau's avatar
      ipv6: datagram: Refactor dst lookup and update codes to a new function · 7e2040db
      Martin KaFai Lau authored
      This patch moves the route lookup and update codes for connected
      datagram sk to a newly created function ip6_datagram_dst_update()
      
      It will be reused during the pmtu update in the later patch.
      Signed-off-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Wei Wang <weiwan@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7e2040db
    • Martin KaFai Lau's avatar
      ipv6: datagram: Refactor flowi6 init codes to a new function · 80fbdb20
      Martin KaFai Lau authored
      Move flowi6 init codes for connected datagram sk to a newly created
      function ip6_datagram_flow_key_init().
      
      Notes:
      1. fl6_flowlabel is used instead of fl6.flowlabel in __ip6_datagram_connect
      2. ipv6_addr_is_multicast(&fl6->daddr) is used instead of
         (addr_type & IPV6_ADDR_MULTICAST) in ip6_datagram_flow_key_init()
      
      This new function will be reused during pmtu update in the later patch.
      Signed-off-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Wei Wang <weiwan@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      80fbdb20
    • John Crispin's avatar
      net: mediatek: update the IRQ part of the binding document · f1d0540d
      John Crispin authored
      The current binding document only describes a single interrupt. Update the
      document by adding the 2 other interrupts.
      
      The driver currently only uses a single interrupt. The HW is however able
      to using IRQ grouping to split TX and RX onto separate GIC irqs.
      Signed-off-by: default avatarJohn Crispin <blogic@openwrt.org>
      Cc: devicetree@vger.kernel.org
      Acked-by: default avatarRob Herring <robh@kernel.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f1d0540d
    • David S. Miller's avatar
      Merge tag 'mac80211-for-davem-2016-04-14' of... · 5e265029
      David S. Miller authored
      Merge tag 'mac80211-for-davem-2016-04-14' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
      
      Johannes Berg says:
      
      ====================
      This has just the single fix from Dmitry Ivanov, adding the missing
      netlink notifier family check to avoid the socket close DoS problem.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      5e265029
    • Alexei Starovoitov's avatar
      bpf/verifier: reject invalid LD_ABS | BPF_DW instruction · d82bccc6
      Alexei Starovoitov authored
      verifier must check for reserved size bits in instruction opcode and
      reject BPF_LD | BPF_ABS | BPF_DW and BPF_LD | BPF_IND | BPF_DW instructions,
      otherwise interpreter will WARN_RATELIMIT on them during execution.
      
      Fixes: ddd872bc ("bpf: verifier: add checks for BPF_ABS | BPF_IND instructions")
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Acked-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d82bccc6
    • Lars Persson's avatar
      net: sched: do not requeue a NULL skb · 3dcd493f
      Lars Persson authored
      A failure in validate_xmit_skb_list() triggered an unconditional call
      to dev_requeue_skb with skb=NULL. This slowly grows the queue
      discipline's qlen count until all traffic through the queue stops.
      
      We take the optimistic approach and continue running the queue after a
      failure since it is unknown if later packets also will fail in the
      validate path.
      
      Fixes: 55a93b3e ("qdisc: validate skb without holding lock")
      Signed-off-by: default avatarLars Persson <larper@axis.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3dcd493f
    • Mathias Krause's avatar
      packet: fix heap info leak in PACKET_DIAG_MCLIST sock_diag interface · 309cf37f
      Mathias Krause authored
      Because we miss to wipe the remainder of i->addr[] in packet_mc_add(),
      pdiag_put_mclist() leaks uninitialized heap bytes via the
      PACKET_DIAG_MCLIST netlink attribute.
      
      Fix this by explicitly memset(0)ing the remaining bytes in i->addr[].
      
      Fixes: eea68e2f ("packet: Report socket mclist info via diag module")
      Signed-off-by: default avatarMathias Krause <minipli@googlemail.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Acked-by: default avatarPavel Emelyanov <xemul@virtuozzo.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      309cf37f
    • David S. Miller's avatar
      Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue · 41015e89
      David S. Miller authored
      Jeff Kirsher says:
      
      ====================
      Intel Wired LAN Driver Updates 2016-04-13
      
      This series contains updates to i40e, i40evf and fm10k.
      
      Alex fixes a bug introduced earlier based on his interpretation of the
      XL710 datasheet.  The actual limit for fragments with TSO and a skbuff
      that has payload data in the header portion of the buffer is actually
      only 7 fragments and the skb-data portion counts as 2 buffers, one for
      the TSO header, and the one for a segment payload buffer.
      
      Jacob fixes a bug where in a previous refactor of the code broke
      multi-bit updates for VFs.  The problem occurs because a multi-bit
      request has a non-zero length, and the PF would simply drop any
      request with the upper 16 bits set.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      41015e89
    • Chris Friesen's avatar
      route: do not cache fib route info on local routes with oif · d6d5e999
      Chris Friesen authored
      For local routes that require a particular output interface we do not want
      to cache the result.  Caching the result causes incorrect behaviour when
      there are multiple source addresses on the interface.  The end result
      being that if the intended recipient is waiting on that interface for the
      packet he won't receive it because it will be delivered on the loopback
      interface and the IP_PKTINFO ipi_ifindex will be set to the loopback
      interface as well.
      
      This can be tested by running a program such as "dhcp_release" which
      attempts to inject a packet on a particular interface so that it is
      received by another program on the same board.  The receiving process
      should see an IP_PKTINFO ipi_ifndex value of the source interface
      (e.g., eth1) instead of the loopback interface (e.g., lo).  The packet
      will still appear on the loopback interface in tcpdump but the important
      aspect is that the CMSG info is correct.
      
      Sample dhcp_release command line:
      
         dhcp_release eth1 192.168.204.222 02:11:33:22:44:66
      Signed-off-by: default avatarAllain Legacy <allain.legacy@windriver.com>
      Signed off-by: Chris Friesen <chris.friesen@windriver.com>
      Reviewed-by: default avatarJulian Anastasov <ja@ssi.bg>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d6d5e999
    • Jacob Keller's avatar
      fm10k: fix multi-bit VLAN update requests from VF · f808c5db
      Jacob Keller authored
      The VF uses a multi-bit update request to clear unused VLANs whenever it
      resets. However, an accident in a previous refector broke multi-bit
      updates for VFs, due to misreading a comment in fm10k_vf.c and
      attempting to reduce code duplication. The problem occurs because
      a multi-bit request has a non-zero length, and the PF would simply drop
      any request with the upper 16 bits set.
      
      We can't simply remove the check of the upper 16 bits and the call to
      fm10k_iov_select vid, because this would remove the checks for default
      VID and for ensuring no other VLANs can be enabled except pf_vid when it
      has been set. To resolve that issue, this revision uses the
      iov_select_vid when we have a single-bit update, and denies any
      multi-bit update when the VLAN was administratively set by the PF. This
      should be ok since the PF properly updates VLAN_TABLE when it assigns
      the PF vid. This ensures that requests to add or remove the PF vid work
      as expected, but a rogue VF could not use the multi-bit update as
      a loophole to attempt receiving traffic on other VLANs.
      Reported-by: default avatarNgai-Mint Kwan <ngai-mint.kwan@intel.com>
      Signed-off-by: default avatarJacob Keller <jacob.e.keller@intel.com>
      Tested-by: default avatarKrishneil Singh <Krishneil.k.singh@intel.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      f808c5db
    • David Daney's avatar
      net: thunderx: Fix broken of_node_put() code. · 65c66af6
      David Daney authored
      commit b7d3e3d3 ("net: thunderx: Don't leak phy device references
      on -EPROBE_DEFER condition.") incorrectly moved the call to
      of_node_put() outside of the loop.  Under normal loop exit, the node
      has already had of_node_put() called, so the extra call results in:
      
      [    8.228020] ERROR: Bad of_node_put() on /soc@0/pci@848000000000/mrml-bridge0@1,0/bgx0/xlaui00
      [    8.239433] CPU: 16 PID: 608 Comm: systemd-udevd Not tainted 4.6.0-rc1-numa+ #157
      [    8.247380] Hardware name: www.cavium.com EBB8800/EBB8800, BIOS 0.3 Mar  2 2016
      [    8.273541] Call trace:
      [    8.273550] [<fffffc0008097364>] dump_backtrace+0x0/0x210
      [    8.273557] [<fffffc0008097598>] show_stack+0x24/0x2c
      [    8.273560] [<fffffc0008399ed0>] dump_stack+0x8c/0xb4
      [    8.273566] [<fffffc00085aa828>] of_node_release+0xa8/0xac
      [    8.273570] [<fffffc000839cad8>] kobject_cleanup+0x8c/0x194
      [    8.273573] [<fffffc000839c97c>] kobject_put+0x44/0x6c
      [    8.273576] [<fffffc00085a9ab0>] of_node_put+0x24/0x30
      [    8.273587] [<fffffc0000bd0f74>] bgx_probe+0x17c/0xcd8 [thunder_bgx]
      [    8.273591] [<fffffc00083ed220>] pci_device_probe+0xa0/0x114
      [    8.273596] [<fffffc0008473fbc>] driver_probe_device+0x178/0x418
      [    8.273599] [<fffffc000847435c>] __driver_attach+0x100/0x118
      [    8.273602] [<fffffc0008471b58>] bus_for_each_dev+0x6c/0xac
      [    8.273605] [<fffffc0008473884>] driver_attach+0x30/0x38
      [    8.273608] [<fffffc00084732f4>] bus_add_driver+0x1f8/0x29c
      [    8.273611] [<fffffc0008475028>] driver_register+0x70/0x110
      [    8.273617] [<fffffc00083ebf08>] __pci_register_driver+0x60/0x6c
      [    8.273623] [<fffffc0000bf0040>] bgx_init_module+0x40/0x48 [thunder_bgx]
      [    8.273626] [<fffffc0008090d04>] do_one_initcall+0xcc/0x1c0
      [    8.273631] [<fffffc0008198abc>] do_init_module+0x68/0x1c8
      [    8.273635] [<fffffc0008125668>] load_module+0xf44/0x11f4
      [    8.273638] [<fffffc0008125b64>] SyS_finit_module+0xb8/0xe0
      [    8.273641] [<fffffc0008093b30>] el0_svc_naked+0x24/0x28
      
      Go back to the previous (correct) code that only did the extra
      of_node_put() call on early exit from the loop.
      Signed-off-by: default avatarDavid Daney <david.daney@cavium.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      65c66af6
    • Emrah Demir's avatar
      mISDN: Fixing missing validation in base_sock_bind() · b8216468
      Emrah Demir authored
      Add validation code into mISDN/socket.c
      Signed-off-by: default avatarEmrah Demir <ed@abdsec.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b8216468
    • Alexander Duyck's avatar
      i40e/i40evf: Limit TSO to 7 descriptors for payload instead of 8 per packet · 3f3f7cb8
      Alexander Duyck authored
      This patch addresses a bug introduced based on my interpretation of the
      XL710 datasheet.  Specifically section 8.4.1 states that "A single transmit
      packet may span up to 8 buffers (up to 8 data descriptors per packet
      including both the header and payload buffers)."  It then later goes on to
      say that each segment for a TSO obeys the previous rule, however it then
      refers to TSO header and the segment payload buffers.
      
      I believe the actual limit for fragments with TSO and a skbuff that has
      payload data in the header portion of the buffer is actually only 7
      fragments as the skb->data portion counts as 2 buffers, one for the TSO
      header, and one for a segment payload buffer.
      
      Fixes: 2d37490b ("i40e/i40evf: Rewrite logic for 8 descriptor per packet check")
      Reported-by: default avatarSowmini Varadhan <sowmini.varadhan@oracle.com>
      Signed-off-by: default avatarAlexander Duyck <aduyck@mirantis.com>
      Acked-by: default avatarJesse Brandeburg <jesse.brandeburg@intel.com>
      Tested-by: default avatarSowmini Varadhan <sowmini.varadhan@oracle.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      3f3f7cb8
    • David Ahern's avatar
      net: ipv6: Do not keep linklocal and loopback addresses · 70af921d
      David Ahern authored
      f1705ec1 added the option to retain user configured addresses on an
      admin down. A comment to one of the later revisions suggested using the
      IFA_F_PERMANENT flag rather than adding a user_managed boolean to the
      ifaddr struct. A side effect of this change is that link local and
      loopback addresses are also retained which is not part of the objective
      of f1705ec1. Add check to drop those addresses.
      
      Fixes: f1705ec1 ("net: ipv6: Make address flushing on ifdown optional")
      Signed-off-by: default avatarDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      70af921d
    • Wolfram Sang's avatar
      net: ethernet: renesas: ravb_main: test clock rate to avoid division by 0 · a6d37131
      Wolfram Sang authored
      The clk API may return 0 on clk_get_rate, so we should check the result before
      using it as a divisor.
      Signed-off-by: default avatarWolfram Sang <wsa+renesas@sang-engineering.com>
      Acked-by: default avatarSergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a6d37131
    • David S. Miller's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · 60e19518
      David S. Miller authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      The following patchset contains Netfilter fixes for your net tree. More
      specifically, they are:
      
      1) Fix missing filter table per-netns registration in arptables, from
         Florian Westphal.
      
      2) Resolve out of bound access when parsing TCP options in
         nf_conntrack_tcp, patch from Jozsef Kadlecsik.
      
      3) Prefer NFPROTO_BRIDGE extensions over NFPROTO_UNSPEC in ebtables,
         this resolves conflict between xt_limit and ebt_limit, from Phil Sutter.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      60e19518
  2. 13 Apr, 2016 1 commit
  3. 12 Apr, 2016 3 commits
  4. 11 Apr, 2016 10 commits
    • David Ahern's avatar
      net: vrf: Fix dev refcnt leak due to IPv6 prefix route · 4f7f34ea
      David Ahern authored
      ifupdown2 found a kernel bug with IPv6 routes and movement from the main
      table to the VRF table. Sequence of events:
      
      Create the interface and add addresses:
          ip link add dev eth4.105 link eth4 type vlan id 105
          ip addr add dev eth4.105 8.105.105.10/24
          ip -6 addr add dev eth4.105 2008:105:105::10/64
      
      At this point IPv6 has inserted a prefix route in the main table even
      though the interface is 'down'. From there the VRF device is created:
          ip link add dev vrf105 type vrf table 105
          ip addr add dev vrf105 9.9.105.10/32
          ip -6 addr add dev vrf105 2000:9:105::10/128
          ip link set vrf105 up
      
      Then the interface is enslaved, while still in the 'down' state:
          ip link set dev eth4.105 master vrf105
      
      Since the device is down the VRF driver cycling the device does not
      send the NETDEV_UP and NETDEV_DOWN but rather the NETDEV_CHANGE event
      which does not flush the routes inserted prior.
      
      When the link is brought up
          ip link set dev eth4.105 up
      
      the prefix route is added in the VRF table, but does not remove
      the route from the main table.
      
      Fix by handling the NETDEV_CHANGEUPPER event similar what was implemented
      for IPv4 in 7f49e7a3 ("net: Flush local routes when device changes vrf
      association")
      
      Fixes: 35402e31 ("net: Add IPv6 support to VRF device")
      Signed-off-by: default avatarDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4f7f34ea
    • David Ahern's avatar
      net: vrf: Fix dst reference counting · 9ab179d8
      David Ahern authored
      Vivek reported a kernel exception deleting a VRF with an active
      connection through it. The root cause is that the socket has a cached
      reference to a dst that is destroyed. Converting the dst_destroy to
      dst_release and letting proper reference counting kick in does not
      work as the dst has a reference to the device which needs to be released
      as well.
      
      I talked to Hannes about this at netdev and he pointed out the ipv4 and
      ipv6 dst handling has dst_ifdown for just this scenario. Rather than
      continuing with the reinvented dst wheel in VRF just remove it and
      leverage the ipv4 and ipv6 versions.
      
      Fixes: 193125db ("net: Introduce VRF device driver")
      Fixes: 35402e31 ("net: Add IPv6 support to VRF device")
      Signed-off-by: default avatarDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9ab179d8
    • David S. Miller's avatar
      Merge branch 'tipc-fixes' · 10c3c022
      David S. Miller authored
      Jon Maloy says:
      
      ====================
      tipc: name distributor pernet queue
      
      Commit #1 fixes a potential issue with deferred binding table
      updates being pushed to the wrong namespace.
      
      Commit #2 solves a problem with deferred binding table updates
      remaining in the the defer queue after the issuing node has gone
      down.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      10c3c022
    • Erik Hugne's avatar
      tipc: purge deferred updates from dead nodes · ddb1d339
      Erik Hugne authored
      If a peer node becomes unavailable, in addition to removing the
      nametable entries from this node we also need to purge all deferred
      updates associated with this node.
      Signed-off-by: default avatarErik Hugne <erik.hugne@gmail.com>
      Signed-off-by: default avatarJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ddb1d339
    • Erik Hugne's avatar
      tipc: make dist queue pernet · 541726ab
      Erik Hugne authored
      Nametable updates received from the network that cannot be applied
      immediately are placed on a defer queue. This queue is global to the
      TIPC module, which might cause problems when using TIPC in containers.
      To prevent nametable updates from escaping into the wrong namespace,
      we make the queue pernet instead.
      Signed-off-by: default avatarErik Hugne <erik.hugne@gmail.com>
      Signed-off-by: default avatarJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      541726ab
    • Hariprasad Shenai's avatar
      cxgb4: Stop Rx Queues before freeing it up · ebf4dc2b
      Hariprasad Shenai authored
      Stop all Ethernet RX Queues before freeing up various Ingress/Egress
      Queues, etc. We were seeing cases of Ingress Queues not getting serviced
      during the shutdown process leading to Ingress Paths jamming up through
      the chip and blocking the shutdown effort itself.
      
      One such case involved the Firmware sending a "Flush Token" through the
      ULP-TX -> ULP-RX path for an Ethernet TX Queue being freed in order to
      make sure there weren't any remaining TX Work Requests in the pipeline.
      But the return path was stalled by Ingress Data unable to be delivered to
      the Host because those Ingress Queues were no longer being serviced.
      
      Based on original work by Casey Leedom <leedom@chelsio.com>
      Signed-off-by: default avatarHariprasad Shenai <hariprasad@chelsio.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ebf4dc2b
    • Phil Reid's avatar
      net: stmmac: socfgpa: Ensure emac bit set in System Manger for PTP · 734e00fa
      Phil Reid authored
      When using the PTP fpga to hps clock source for the stmmac module
      the appropriate bit in the System Manager FPGA Interface Group register
      needs to be set. This is not set by the bootloader setup  when the
      HPS emac pins are being for this emac module.
      
      This allows the PTP clock to be sourced from the FPGA and also connects
      the PTP pps and ext trig signals to the stmmac PTP hardware.
      
      Patch proposed by Phil Collins.
      Signed-off-by: default avatarPhil Reid <preid@electromag.com.au>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      734e00fa
    • Dmitry Ivanov's avatar
      netlink: don't send NETLINK_URELEASE for unbound sockets · e2726020
      Dmitry Ivanov authored
      All existing users of NETLINK_URELEASE use it to clean up resources that
      were previously allocated to a socket via some command. As a result, no
      users require getting this notification for unbound sockets.
      
      Sending it for unbound sockets, however, is a problem because any user
      (including unprivileged users) can create a socket that uses the same ID
      as an existing socket. Binding this new socket will fail, but if the
      NETLINK_URELEASE notification is generated for such sockets, the users
      thereof will be tricked into thinking the socket that they allocated the
      resources for is closed.
      
      In the nl80211 case, this will cause destruction of virtual interfaces
      that still belong to an existing hostapd process; this is the case that
      Dmitry noticed. In the NFC case, it will cause a poll abort. In the case
      of netlink log/queue it will cause them to stop reporting events, as if
      NFULNL_CFG_CMD_UNBIND/NFQNL_CFG_CMD_UNBIND had been called.
      
      Fix this problem by checking that the socket is bound before generating
      the NETLINK_URELEASE notification.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarDmitry Ivanov <dima@ubnt.com>
      Signed-off-by: default avatarJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e2726020
    • David S. Miller's avatar
      decnet: Do not build routes to devices without decnet private data. · a36a0d40
      David S. Miller authored
      In particular, make sure we check for decnet private presence
      for loopback devices.
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a36a0d40
    • Marcelo Ricardo Leitner's avatar
      sctp: avoid refreshing heartbeat timer too often · ba6f5e33
      Marcelo Ricardo Leitner authored
      Currently on high rate SCTP streams the heartbeat timer refresh can
      consume quite a lot of resources as timer updates are costly and it
      contains a random factor, which a) is also costly and b) invalidates
      mod_timer() optimization for not editing a timer to the same value.
      It may even cause the timer to be slightly advanced, for no good reason.
      
      As suggested by David Laight this patch now removes this timer update
      from hot path by leaving the timer on and re-evaluating upon its
      expiration if the heartbeat is still needed or not, similarly to what is
      done for TCP. If it's not needed anymore the timer is re-scheduled to
      the new timeout, considering the time already elapsed.
      
      For this, we now record the last tx timestamp per transport, updated in
      the same spots as hb timer was restarted on tx. Also split up
      sctp_transport_reset_timers into sctp_transport_reset_t3_rtx and
      sctp_transport_reset_hb_timer, so we can re-arm T3 without re-arming the
      heartbeat one.
      
      On loopback with MTU of 65535 and data chunks with 1636, so that we
      have a considerable amount of chunks without stressing system calls,
      netperf -t SCTP_STREAM -l 30, perf looked like this before:
      
      Samples: 103K of event 'cpu-clock', Event count (approx.): 25833000000
        Overhead  Command  Shared Object      Symbol
      +    6,15%  netperf  [kernel.vmlinux]   [k] copy_user_enhanced_fast_string
      -    5,43%  netperf  [kernel.vmlinux]   [k] _raw_write_unlock_irqrestore
         - _raw_write_unlock_irqrestore
            - 96,54% _raw_spin_unlock_irqrestore
               - 36,14% mod_timer
                  + 97,24% sctp_transport_reset_timers
                  + 2,76% sctp_do_sm
               + 33,65% __wake_up_sync_key
               + 28,77% sctp_ulpq_tail_event
               + 1,40% del_timer
            - 1,84% mod_timer
               + 99,03% sctp_transport_reset_timers
               + 0,97% sctp_do_sm
            + 1,50% sctp_ulpq_tail_event
      
      And after this patch, now with netperf -l 60:
      
      Samples: 230K of event 'cpu-clock', Event count (approx.): 57707250000
        Overhead  Command  Shared Object      Symbol
      +    5,65%  netperf  [kernel.vmlinux]   [k] memcpy_erms
      +    5,59%  netperf  [kernel.vmlinux]   [k] copy_user_enhanced_fast_string
      -    5,05%  netperf  [kernel.vmlinux]   [k] _raw_spin_unlock_irqrestore
         - _raw_spin_unlock_irqrestore
            + 49,89% __wake_up_sync_key
            + 45,68% sctp_ulpq_tail_event
            - 2,85% mod_timer
               + 76,51% sctp_transport_reset_t3_rtx
               + 23,49% sctp_do_sm
            + 1,55% del_timer
      +    2,50%  netperf  [sctp]             [k] sctp_datamsg_from_user
      +    2,26%  netperf  [sctp]             [k] sctp_sendmsg
      
      Throughput-wise, from 6800mbps without the patch to 7050mbps with it,
      ~3.7%.
      Signed-off-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ba6f5e33
  5. 10 Apr, 2016 1 commit
  6. 09 Apr, 2016 6 commits
    • Linus Torvalds's avatar
      Merge tag 'tty-4.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · 183c948a
      Linus Torvalds authored
      Pull tty fixes from Greg KH:
       "Here are two tty fixes for issues found.
      
        One was due to a merge error in 4.6-rc1, and the other a regression
        fix for UML consoles that broke in 4.6-rc1.
      
        Both have been in linux-next for a while"
      
      * tag 'tty-4.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
        tty: Fix merge of "tty: Refactor tty_open()"
        tty: Fix UML console breakage
      183c948a
    • Linus Torvalds's avatar
      Merge tag 'usb-4.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · ffb927d1
      Linus Torvalds authored
      Pull USB fixes from Greg KH:
       "Here are some USB fixes and new device ids for 4.6-rc3.
      
        Nothing major, the normal USB gadget fixes and usb-serial driver ids,
        along with some other fixes mixed in.  All except the USB serial ids
        have been tested in linux-next, the id additions should be fine as
        they are 'trivial'"
      
      * tag 'usb-4.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (25 commits)
        USB: option: add "D-Link DWM-221 B1" device id
        USB: serial: cp210x: Adding GE Healthcare Device ID
        USB: serial: ftdi_sio: Add support for ICP DAS I-756xU devices
        usb: dwc3: keystone: drop dma_mask configuration
        usb: gadget: udc-core: remove manual dma configuration
        usb: dwc3: pci: add ID for one more Intel Broxton platform
        usb: renesas_usbhs: fix to avoid using a disabled ep in usbhsg_queue_done()
        usb: dwc2: do not override forced dr_mode in gadget setup
        usb: gadget: f_midi: unlock on error
        USB: digi_acceleport: do sanity checking for the number of ports
        USB: cypress_m8: add endpoint sanity check
        USB: mct_u232: add sanity checking in probe
        usb: fix regression in SuperSpeed endpoint descriptor parsing
        USB: usbip: fix potential out-of-bounds write
        usb: renesas_usbhs: disable TX IRQ before starting TX DMAC transfer
        usb: renesas_usbhs: avoid NULL pointer derefernce in usbhsf_pkt_handler()
        usb: gadget: f_midi: Fixed a bug when buflen was smaller than wMaxPacketSize
        usb: phy: qcom-8x16: fix regulator API abuse
        usb: ch9: Fix SSP Device Cap wFunctionalitySupport type
        usb: gadget: composite: Access SSP Dev Cap fields properly
        ...
      ffb927d1
    • Linus Torvalds's avatar
      Merge tag 'staging-4.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging · c6e6e58c
      Linus Torvalds authored
      Pull staging and IIO driver fixes from Greg KH:
       "Here are some IIO driver fixes, along with two staging driver fixes
        for 4.6-rc3.
      
        One staging driver patch reverts the deletion of a driver that
        happened in 4.6-rc1.  We thought that laptop.org was dead, but it's
        still alive and kicking, and has users that were mad we broke their
        hardware by deleting a driver for their machines.  So that driver is
        added back and everyone is happy again.
      
        All of these have been in linux-next for a while with no reported
        issues"
      
      * tag 'staging-4.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
        Revert "Staging: olpc_dcon: Remove obsolete driver"
        staging/rdma/hfi1: select CRC32
        iio: gyro: bmg160: fix buffer read values
        iio: gyro: bmg160: fix endianness when reading axes
        iio: accel: bmc150: fix endianness when reading axes
        iio: st_magn: always define ST_MAGN_TRIGGER_SET_STATE
        iio: fix config watermark initial value
        iio: health: max30100: correct FIFO check condition
        iio: imu: Fix inv_mpu6050 dependencies
        iio: adc: Fix build error of missing devm_ioremap_resource on UM
        iio: light: apds9960: correct FIFO check condition
        iio: adc: max1363: correct reference voltage
        iio: adc: max1363: add missing adc to max1363_id
      c6e6e58c
    • Linus Torvalds's avatar
      Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · fb41b4be
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "This is a set of eight fixes.
      
        Two are trivial gcc-6 updates (brace additions and unused variable
        removal).  There's a couple of cxlflash regressions, a correction for
        sd being overly chatty on revalidation (causing excess log increases).
        A VPD issue which could crash USB devices because they seem very
        intolerant to VPD inquiries, an ALUA deadlock fix and a mpt3sas buffer
        overrun fix"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: Do not attach VPD to devices that don't support it
        sd: Fix excessive capacity printing on devices with blocks bigger than 512 bytes
        scsi_dh_alua: Fix a recently introduced deadlock
        scsi: Declare local symbols static
        cxlflash: Move to exponential back-off when cmd_room is not available
        cxlflash: Fix regression issue with re-ordering patch
        mpt3sas: Don't overreach ioc->reply_post[] during initialization
        aacraid: add missing curly braces
      fb41b4be
    • Linus Torvalds's avatar
      Merge tag 'md/4.6-rc2-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md · 63b106a8
      Linus Torvalds authored
      Pull MD fixes from Shaohua Li:
       "This update mainly fixes bugs:
      
         - fix error handling (Guoqing)
         - fix a crash when a disk is hotremoved (me)
         - fix a dead loop (Wei Fang)"
      
      * tag 'md/4.6-rc2-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md:
        md/bitmap: clear bitmap if bitmap_create failed
        MD: add rdev reference for super write
        md: fix a trivial typo in comments
        md:raid1: fix a dead loop when read from a WriteMostly disk
      63b106a8
    • Linus Torvalds's avatar
      Merge tag 'pm+acpi-4.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 40bca9db
      Linus Torvalds authored
      Pull power management and ACPI fixes from Rafael Wysocki:
       "Fixes for some issues discovered after recent changes and for some
        that have just been found lately regardless of those changes
        (intel_pstate, intel_idle, PM core, mailbox/pcc, turbostat) plus
        support for some new CPU models (intel_idle, Intel RAPL driver,
        turbostat) and documentation updates (intel_pstate, PM core).
      
        Specifics:
      
         - intel_pstate fixes for two issues exposed by the recent switch over
           from using timers and for one issue introduced during the 4.4 cycle
           plus new comments describing data structures used by the driver
           (Rafael Wysocki, Srinivas Pandruvada).
      
         - intel_idle fixes related to CPU offline/online (Richard Cochran).
      
         - intel_idle support (new CPU IDs and state definitions mostly) for
           Skylake-X and Kabylake processors (Len Brown).
      
         - PCC mailbox driver fix for an out-of-bounds memory access that may
           cause the kernel to panic() (Shanker Donthineni).
      
         - New (missing) CPU ID for one apparently overlooked Haswell model in
           the Intel RAPL power capping driver (Srinivas Pandruvada).
      
         - Fix for the PM core's wakeup IRQs framework to make it work after
           wakeup settings reconfiguration from sysfs (Grygorii Strashko).
      
         - Runtime PM documentation update to make it describe what needs to
           be done during device removal more precisely (Krzysztof Kozlowski).
      
         - Stale comment removal cleanup in the cpufreq-dt driver (Viresh
           Kumar).
      
         - turbostat utility fixes and support for Broxton, Skylake-X and
           Kabylake processors (Len Brown)"
      
      * tag 'pm+acpi-4.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (28 commits)
        PM / wakeirq: fix wakeirq setting after wakup re-configuration from sysfs
        tools/power turbostat: work around RC6 counter wrap
        tools/power turbostat: initial KBL support
        tools/power turbostat: initial SKX support
        tools/power turbostat: decode BXT TSC frequency via CPUID
        tools/power turbostat: initial BXT support
        tools/power turbostat: print IRTL MSRs
        tools/power turbostat: SGX state should print only if --debug
        intel_idle: Add KBL support
        intel_idle: Add SKX support
        intel_idle: Clean up all registered devices on exit.
        intel_idle: Propagate hot plug errors.
        intel_idle: Don't overreact to a cpuidle registration failure.
        intel_idle: Setup the timer broadcast only on successful driver load.
        intel_idle: Avoid a double free of the per-CPU data.
        intel_idle: Fix dangling registration on error path.
        intel_idle: Fix deallocation order on the driver exit path.
        intel_idle: Remove redundant initialization calls.
        intel_idle: Fix a helper function's return value.
        intel_idle: remove useless return from void function.
        ...
      40bca9db