1. 24 Jan, 2017 11 commits
  2. 23 Jan, 2017 3 commits
    • net: dsa: Check return value of phy_connect_direct() · 4078b76c
      Florian Fainelli authored
      We need to check the return value of phy_connect_direct() in
      dsa_slave_phy_connect(), otherwise we may continue initializing a
      slave network device with a PHY that is already attached somewhere
      else; that slave device will soon fail because the PHY device is in
      an error state.
      
      The conditions for such an error to occur are that we have a port of our
      switch that is not disabled, and has the same port number as a PHY
      address (say both 5) that can be probed using the DSA slave MII bus. We
      end up with this slave network device finding a PHY at the same address
      as our port number, and we try to attach to it.
      
      A slave network device (e.g. port 0) has already attached to our PHY
      device, and we try to re-attach it with a different network device, but
      since we ignore the error we would end up initializing incorrect device
      references by the time the slave network interface is opened.
      
      The code has been (re)organized several times, making it hard to provide
      an exact Fixes tag; this is a bugfix nonetheless.
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
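The error-propagation pattern described in this commit can be sketched in plain C. This is a minimal user-space model, not the DSA code: `toy_phy`, `toy_port`, `toy_phy_connect()` and `toy_port_phy_setup()` are hypothetical names, and the sketch assumes that attaching to an already-owned PHY fails with `-EBUSY`.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for a PHY device and a switch port's net device. */
struct toy_phy {
    const void *attached_dev;   /* owner of the PHY, NULL when free */
};

struct toy_port {
    struct toy_phy *phy;        /* PHY reference used once the port is opened */
};

/* Models an attach that fails when the PHY already has an owner.
 * The -EBUSY return code is an assumption of this sketch. */
static int toy_phy_connect(struct toy_port *port, struct toy_phy *phy)
{
    if (phy->attached_dev != NULL)
        return -EBUSY;
    phy->attached_dev = port;
    return 0;
}

/* Propagates the attach error instead of ignoring it, so a failed port
 * never ends up holding a reference to somebody else's PHY. */
static int toy_port_phy_setup(struct toy_port *port, struct toy_phy *phy)
{
    int ret = toy_phy_connect(port, phy);

    if (ret < 0)
        return ret;             /* port->phy is left untouched on failure */

    port->phy = phy;
    return 0;
}
```

With one PHY shared by two ports, the first setup call succeeds and the second returns the error, leaving the second port without a stale PHY pointer.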
    • net: phy: Avoid deadlock during phy_error() · eab12771
      Florian Fainelli authored
      phy_error() is called in the PHY state machine workqueue context, and
      calls phy_trigger_machine() which does a cancel_delayed_work_sync() of
      the workqueue we execute from, causing a deadlock situation.
      
      Augment phy_trigger_machine() with a sync boolean indicating
      whether we should use cancel_*_sync() or just cancel_*_work().
      
      Fixes: 3c293f4e ("net: phy: Trigger state machine on state change and not polling.")
      Reported-by: Russell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mpls: Fix multipath selection for LSR use case · 9f427a0e
      David Ahern authored
      MPLS multipath for LSR is broken -- always selecting the first nexthop
      in the one label case. For example:
      
          $ ip -f mpls ro ls
          100
                  nexthop as to 200 via inet 172.16.2.2  dev virt12
                  nexthop as to 300 via inet 172.16.3.2  dev virt13
          101
                  nexthop as to 201 via inet6 2000:2::2  dev virt12
                  nexthop as to 301 via inet6 2000:3::2  dev virt13
      
      In this example incoming packets have a single MPLS label, which means
      the BOS bit is set. The BOS bit is passed from mpls_forward down to
      mpls_multipath_hash which never processes the hash loop because BOS is 1.
      
      Update mpls_multipath_hash to process the entire label stack. mpls_hdr_len
      tracks the total mpls header length on each pass (on pass N mpls_hdr_len
      is N * sizeof(mpls_shim_hdr)). When the label with the BOS bit set is
      found, it verifies that the skb has sufficient header for ipv4 or ipv6,
      and finds the IPv4 or IPv6 header by taking the last mpls_hdr pointer
      and adding 1 to advance past it.
      
      With these changes I have verified that the code correctly sees the label,
      BOS, IPv4 and IPv6 addresses in the network header, and that icmp/tcp/udp
      traffic for ipv4 and ipv6 is distributed across the nexthops.
      
      Fixes: 1c78efa8 ("mpls: flow-based multipath selection")
      Acked-by: Robert Shearman <rshearma@brocade.com>
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
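The label-stack walk described in this commit can be sketched as follows. This is an illustrative user-space model rather than the kernel's mpls_multipath_hash: the `toy_*` names are hypothetical, the entries are assumed to be already converted to host byte order, and the mixing function is a simple FNV-style step chosen for the example, not the hash the kernel uses. The point it shows is that the entry carrying the BOS bit still contributes to the hash, and that the header length grows by one shim per pass.

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit label stack entry (RFC 3032), host byte order assumed:
 * label (20 bits) | traffic class (3 bits) | bottom of stack (1 bit) | TTL (8 bits) */
#define TOY_LSE_LABEL(e) ((uint32_t)(e) >> 12)
#define TOY_LSE_BOS(e)   (((uint32_t)(e) >> 8) & 1u)

/* Illustrative mixing step only; not the kernel's hash function. */
static uint32_t toy_mix(uint32_t hash, uint32_t value)
{
    hash ^= value;
    hash *= 16777619u;
    return hash;
}

/* Walks the whole label stack, hashing every label including the one with
 * BOS set. On return *hdr_len holds the number of header bytes consumed,
 * i.e. N * sizeof(entry) after pass N. */
static uint32_t toy_label_stack_hash(const uint32_t *stack, size_t max_entries,
                                     size_t *hdr_len)
{
    uint32_t hash = 2166136261u;
    size_t i;

    *hdr_len = 0;
    for (i = 0; i < max_entries; i++) {
        *hdr_len = (i + 1) * sizeof(uint32_t);
        hash = toy_mix(hash, TOY_LSE_LABEL(stack[i]));
        if (TOY_LSE_BOS(stack[i]))
            break;              /* payload (IPv4/IPv6) starts after this entry */
    }
    return hash;
}
```

In this model two single-label packets with labels 100 and 101 hash differently, which is the property the broken code lacked in the one-label case.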
  3. 22 Jan, 2017 3 commits
  4. 20 Jan, 2017 11 commits
  5. 19 Jan, 2017 5 commits
  6. 18 Jan, 2017 7 commits
    • bpf: don't trigger OOM killer under pressure with map alloc · d407bd25
      Daniel Borkmann authored
      This patch adds two helpers, bpf_map_area_alloc() and bpf_map_area_free(),
      that are to be used for map allocations. Using kmalloc() for very large
      allocations can cause excessive work within the page allocator, so i) fall
      back earlier to vmalloc() when the attempt is considered costly anyway,
      and even more importantly ii) don't trigger OOM killer with any of the
      allocators.
      
      Since this is based on a user space request, for example, when creating
      maps with element pre-allocation, we really want such requests to fail
      instead of killing other user space processes.
      
      Also, don't spam the kernel log with warnings should any of the allocations
      fail under pressure. Given that, we can make backend selection in
      bpf_map_area_alloc() generic, and convert all maps over to use this API
      for spots with potentially large allocation requests.
      
      Note, replacing the one kmalloc_array() is fine as overflow checks happen
      earlier in htab_map_alloc(), since it must also protect the multiplication
      for vmalloc() should kmalloc_array() fail.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
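The backend selection described in this commit can be modelled in user space. The sketch below is only an analogy: `toy_map_area_alloc()`, `toy_map_area_free()`, the two backend functions and the `TOY_COSTLY_BYTES` threshold are all hypothetical, and both backends are plain `calloc()` so the example runs anywhere. It shows the control flow: skip the contiguous allocator when the request is considered costly, fall back to the second allocator otherwise, and report failure to the caller instead of escalating.

```c
#include <stddef.h>
#include <stdlib.h>

/* Assumed cut-off above which the contiguous allocator is not even tried. */
#define TOY_COSTLY_BYTES ((size_t)4096 << 3)

enum toy_backend { TOY_NONE, TOY_CONTIGUOUS, TOY_VIRTUAL };

struct toy_area {
    void *ptr;
    enum toy_backend backend;
};

/* Stand-ins for the two allocators; both are calloc() in this model. */
static void *toy_contiguous_alloc(size_t size) { return calloc(1, size); }
static void *toy_virtual_alloc(size_t size)    { return calloc(1, size); }

/* Tries the contiguous backend only for small requests, then falls back.
 * A failed request returns a NULL pointer; nothing is killed or logged. */
static struct toy_area toy_map_area_alloc(size_t size)
{
    struct toy_area area = { NULL, TOY_NONE };

    if (size <= TOY_COSTLY_BYTES) {
        area.ptr = toy_contiguous_alloc(size);
        if (area.ptr != NULL) {
            area.backend = TOY_CONTIGUOUS;
            return area;
        }
    }

    area.ptr = toy_virtual_alloc(size);
    if (area.ptr != NULL)
        area.backend = TOY_VIRTUAL;
    return area;
}

/* Single free helper, so callers never need to know which backend was used. */
static void toy_map_area_free(struct toy_area *area)
{
    free(area->ptr);
    area->ptr = NULL;
    area->backend = TOY_NONE;
}
```

Keeping the choice of backend inside one pair of helpers is what lets every map type share the same allocation policy.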
    • lwtunnel: fix autoload of lwt modules · 9ed59592
      David Ahern authored
      Trying to add an mpls encap route when the MPLS modules are not loaded
      hangs. For example:
      
          CONFIG_MPLS=y
          CONFIG_NET_MPLS_GSO=m
          CONFIG_MPLS_ROUTING=m
          CONFIG_MPLS_IPTUNNEL=m
      
          $ ip route add 10.10.10.10/32 encap mpls 100 via inet 10.100.1.2
      
      The ip command hangs:

          root       880   826  0 21:25 pts/0    00:00:00 ip route add 10.10.10.10/32 encap mpls 100 via inet 10.100.1.2
      
          $ cat /proc/880/stack
          [<ffffffff81065a9b>] call_usermodehelper_exec+0xd6/0x134
          [<ffffffff81065efc>] __request_module+0x27b/0x30a
          [<ffffffff814542f6>] lwtunnel_build_state+0xe4/0x178
          [<ffffffff814aa1e4>] fib_create_info+0x47f/0xdd4
          [<ffffffff814ae451>] fib_table_insert+0x90/0x41f
          [<ffffffff814a8010>] inet_rtm_newroute+0x4b/0x52
          ...
      
      modprobe is trying to load rtnl-lwt-MPLS:
      
          root       881     5  0 21:25 ?        00:00:00 /sbin/modprobe -q -- rtnl-lwt-MPLS
      
      and it hangs after loading mpls_router:
      
          $ cat /proc/881/stack
          [<ffffffff81441537>] rtnl_lock+0x12/0x14
          [<ffffffff8142ca2a>] register_netdevice_notifier+0x16/0x179
          [<ffffffffa0033025>] mpls_init+0x25/0x1000 [mpls_router]
          [<ffffffff81000471>] do_one_initcall+0x8e/0x13f
          [<ffffffff81119961>] do_init_module+0x5a/0x1e5
          [<ffffffff810bd070>] load_module+0x13bd/0x17d6
          ...
      
      The problem is that lwtunnel_build_state is called with rtnl lock
      held preventing mpls_init from registering.
      
      Given the potential references held by the time lwtunnel_build_state is
      called, it cannot drop the rtnl lock to load the module. So, extract the module
      loading code from lwtunnel_build_state into a new function to validate
      the encap type. The new function is called while converting the user
      request into a fib_config which is well before any table, device or
      fib entries are examined.
      
      Fixes: 745041e2 ("lwtunnel: autoload of lwt modules")
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnxt_en: Fix "uninitialized variable" bug in TPA code path. · 719ca811
      Michael Chan authored
      In the TPA GRO code path, initialize the tcp_opt_len variable to 0 so
      that it will be correct for packets without TCP timestamps.  The bug
      caused the SKB fields to be incorrectly set up for packets without
      TCP timestamps, leading to these packets being rejected by the stack.
      Reported-by: Andy Gospodarek <andrew.gospodarek@broadocm.com>
      Acked-by: Andy Gospodarek <andrew.gospodarek@broadocm.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
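The class of bug fixed here is easy to reproduce in miniature. The sketch below is not the bnxt_en code: `toy_tcp_hdr_len()` and the two constants are hypothetical, and it assumes a 20-byte base TCP header plus a timestamp option padded to 12 bytes. It shows why the option length must start at 0, so the no-timestamp path yields a well-defined header length.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_TCP_BASE_HDR_LEN 20u  /* TCP header without options */
#define TOY_TCP_TS_OPT_LEN   12u  /* timestamp option, padded to 12 bytes */

/* Returns the TCP header length for a packet with or without timestamps. */
static size_t toy_tcp_hdr_len(bool has_timestamps)
{
    size_t tcp_opt_len = 0;     /* explicit initialization, as in the fix */

    if (has_timestamps)
        tcp_opt_len = TOY_TCP_TS_OPT_LEN;

    /* Without the initializer above, this read would be undefined
     * whenever has_timestamps is false. */
    return TOY_TCP_BASE_HDR_LEN + tcp_opt_len;
}
```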
    • net: phy: bcm63xx: Utilize correct config_intr function · cd33b3e0
      Daniel Gonzalez Cabanelas authored
      Commit a1cba561 ("net: phy: Add Broadcom phy library for common
      interfaces") made the BCM63xx PHY driver utilize bcm_phy_config_intr()
      which would appear to do the right thing, except that it does not write
      to the MII_BCM63XX_IR register but to MII_BCM54XX_ECR, which is a
      different register.
      
      This would cause invalid link parameters and events to be generated
      by the PHY interrupt.
      
      Fixes: a1cba561 ("net: phy: Add Broadcom phy library for common interfaces")
      Signed-off-by: Daniel Gonzalez Cabanelas <dgcbueu@gmail.com>
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: fix harmonize_features() vs NETIF_F_HIGHDMA · 7be2c82c
      Eric Dumazet authored
      Ashizuka reported a highmem oddity and sent a patch for the freescale
      fec driver.
      
      But the root cause of the problem is that the core networking stack
      must ensure that no skb with a highmem fragment is ever sent through
      a device that does not assert NETIF_F_HIGHDMA in its features.
      
      We need to call illegal_highdma() from harmonize_features()
      regardless of CSUM checks.
      
      Fixes: ec5f0615 ("net: Kill link between CSUM and SG features.")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Pravin Shelar <pshelar@ovn.org>
      Reported-by: "Ashizuka, Yuusuke" <ashiduka@jp.fujitsu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
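The requirement that the highmem check run regardless of the checksum check can be sketched with toy feature flags. Everything here is a hypothetical model, not the kernel's harmonize_features(): the `TOY_F_*` bits, `struct toy_skb` and both functions are invented for illustration, and the sketch assumes the remedy for an illegal highmem fragment is to clear the scatter-gather bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits for this model. */
#define TOY_F_SG      (1u << 0)
#define TOY_F_HIGHDMA (1u << 1)
#define TOY_F_CSUM    (1u << 2)

struct toy_skb {
    bool has_highmem_frag;  /* at least one fragment lives in high memory */
    bool needs_csum;        /* packet asks for checksum offload */
    bool csum_supported;    /* device can checksum this protocol */
};

/* True when the packet has a highmem fragment the device cannot reach. */
static bool toy_illegal_highdma(uint32_t features, const struct toy_skb *skb)
{
    return !(features & TOY_F_HIGHDMA) && skb->has_highmem_frag;
}

static uint32_t toy_harmonize_features(uint32_t features,
                                       const struct toy_skb *skb)
{
    if (skb->needs_csum && !skb->csum_supported)
        features &= ~TOY_F_CSUM;

    /* Independent check: it must run whether or not the checksum branch
     * above was taken, rather than hanging off it as an else-branch. */
    if (toy_illegal_highdma(features, skb))
        features &= ~TOY_F_SG;

    return features;
}
```

In this model a packet that trips both conditions loses both bits, which an else-if arrangement would not guarantee.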
    • Merge branch 'xen-netback-leaks' · d89ede6d
      David S. Miller authored
      Igor Druzhinin says:
      
      ====================
      xen-netback: fix memory leaks on XenBus disconnect
      
      Just split the initial patch in two as proposed by Wei.
      
      Since the approach for locking netdev statistics is inconsistent (tends not
      to have any locking at all) across the kernel, we'd better rely on our
      internal lock for this purpose.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • xen-netback: protect resource cleaning on XenBus disconnect · f16f1df6
      Igor Druzhinin authored
      vif->lock is used to protect statistics gathering agents from using the
      queue structure during cleaning.
      Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
      Acked-by: Wei Liu <wei.liu2@citrix.com>
      Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>