1. 03 Dec, 2016 2 commits
    • stmmac: cleanup documenation, make it match reality · c6c60dae
      Pavel Machek authored
      Fix english in documentation, make documentation match reality, remove
      options that were removed from code.
      Signed-off-by: Pavel Machek <pavel@denx.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 2745529a
      David S. Miller authored
      Couple conflicts resolved here:
      
      1) In the MACB driver, a bug fix to properly initialize the
         RX tail pointer overlapped with some changes
         to support variable sized rings.
      
      2) In XGBE we had a "CONFIG_PM" --> "CONFIG_PM_SLEEP" fix
         overlapping with a reorganization of the driver to support
         ACPI, OF, as well as PCI variants of the chip.
      
      3) In 'net' we had several probe error path bug fixes to the
         stmmac driver, meanwhile a lot of this code was cleaned up
         and reorganized in 'net-next'.
      
      4) The cls_flower classifier obtained a helper function in
         'net-next' called __fl_delete() and this overlapped with
         Daniel Borkmann's bug fix to use RCU for object destruction
         in 'net'.  It also overlapped with Jiri's change to guard
         the rhashtable_remove_fast() call with a check against
         tc_skip_sw().
      
      5) In mlx4, a revert bug fix in 'net' overlapped with some
         unrelated changes in 'net-next'.
      
      6) In geneve, a stale header pointer after pskb_expand_head()
         bug fix in 'net' overlapped with a large reorganization of
         the same code in 'net-next'.  Since the 'net-next' code no
         longer had the bug in question, there was nothing to do
         other than to simply take the 'net-next' hunks.
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 02 Dec, 2016 38 commits
    • Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · 8dc0f265
      Linus Torvalds authored
      Pull ARM SoC fixes from Arnd Bergmann:
       "This should be the last set of bugfixes for arm-soc in v4.9. None of
        these are critical regressions, but it would be nice to still get them
        merged.
      
         - On the Juno platform, the idle latency was described wrong, leading
           to suboptimal cpuidle tuning.
      
         - Also on the same platform, PCI I/O space was set up incorrectly and
           could not work.
      
         - On the sti platform, a syntactically incorrect DT entry caused
           warnings.
      
         - The newly added 'gr8' platform has somewhat confusing file names,
           which we rename for consistency"
      
      * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
        arm64: dts: juno: fix cluster sleep state entry latency on all SoC versions
        arm64: dts: juno: Correct PCI IO window
        ARM: dts: STiH407-family: fix i2c nodes
        ARM: gr8: Rename the DTSI and relevant DTS
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 8bca927f
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Lots more phydev and probe error path leaks in various drivers by
          Johan Hovold.
      
       2) Fix race in packet_set_ring(), from Philip Pettersson.
      
       3) Use after free in dccp_invalid_packet(), from Eric Dumazet.
      
       4) Signedness overflow in SO_{SND,RCV}BUFFORCE, also from Eric
          Dumazet.
      
       5) When tunneling between ipv4 and ipv6 we can be left with the wrong
          skb->protocol value as we enter the IPSEC engine and this causes all
          kinds of problems. Set it before the output path does any
          dst_output() calls, from Eli Cooper.
      
       6) bcmgenet uses wrong device struct pointer in DMA API calls, fix from
          Florian Fainelli.
      
       7) Various netfilter nat bug fixes from Florian Westphal.
      
       8) Fix memory leak in ipvlan_link_new(), from Gao Feng.
      
       9) Locking fixes, particularly wrt. socket lookups, in l2tp from
          Guillaume Nault.
      
      10) Avoid invoking rhash teardowns in atomic context by moving netlink
          cb->done() dump completion to a worker thread. Fix from Herbert
          Xu.
      
      11) Buffer refcount problems in tun and macvtap on errors, from Jason
          Wang.
      
      12) We don't set Kconfig symbol DEFAULT_TCP_CONG properly when the user
          selects BBR. Fix from Julian Wollrath.
      
      13) Fix deadlock in transmit path on altera TSE driver, from Lino
          Sanfilippo.
      
      14) Fix unbalanced reference counting in dsa_switch_tree, from Nikita
          Yushchenko.
      
      15) tc_tunnel_key needs to be properly exported to userspace via uapi,
          fix from Roi Dayan.
      
      16) rds_tcp_init_net() doesn't unregister notifier in error path, fix
          from Sowmini Varadhan.
      
      17) Stale packet header pointer access after pskb_expand_head() in
          geneve driver, fix from Sabrina Dubroca.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (103 commits)
        net: avoid signed overflows for SO_{SND|RCV}BUFFORCE
        geneve: avoid use-after-free of skb->data
        tipc: check minimum bearer MTU
        net: renesas: ravb: unintialized return value
        sh_eth: remove unchecked interrupts for RZ/A1
        net: bcmgenet: Utilize correct struct device for all DMA operations
        NET: usb: qmi_wwan: add support for Telit LE922A PID 0x1040
        cdc_ether: Fix handling connection notification
        ip6_offload: check segs for NULL in ipv6_gso_segment.
        RDS: TCP: unregister_netdevice_notifier() in error path of rds_tcp_init_net
        Revert: "ip6_tunnel: Update skb->protocol to ETH_P_IPV6 in ip6_tnl_xmit()"
        ipv6: Set skb->protocol properly for local output
        ipv4: Set skb->protocol properly for local output
        packet: fix race condition in packet_set_ring
        net: ethernet: altera: TSE: do not use tx queue lock in tx completion handler
        net: ethernet: altera: TSE: Remove unneeded dma sync for tx buffers
        net: ethernet: stmmac: fix of-node and fixed-link-phydev leaks
        net: ethernet: stmmac: platform: fix outdated function header
        net: ethernet: stmmac: dwmac-meson8b: fix probe error path
        net: ethernet: stmmac: dwmac-generic: fix probe error path
        ...
    • net: avoid signed overflows for SO_{SND|RCV}BUFFORCE · b98b0bc8
      Eric Dumazet authored
      CAP_NET_ADMIN users should not be allowed to set negative
      sk_sndbuf or sk_rcvbuf values, as it can lead to various memory
      corruptions, crashes, OOM...
      
      Note that before commit 82981930 ("net: cleanups in
      sock_setsockopt()"), the bug was even more serious, since SO_SNDBUF
      and SO_RCVBUF were vulnerable.
      
      This needs to be backported to all known linux kernels.
      
      Again, many thanks to syzkaller team for discovering this gem.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Andrey Konovalov <andreyknvl@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
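The signedness problem in the entry above can be illustrated in user space. This is a minimal sketch, not the kernel code: `DEMO_MIN_SNDBUF` and both function names are hypothetical, and the floor value is only illustrative. It contrasts clamping a doubled buffer size with an unsigned comparison, where a negative request looks like a huge value and survives, against a signed comparison, where the floor wins.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Illustrative floor only; the kernel's real minimum differs. */
#define DEMO_MIN_SNDBUF 4608

/* Buggy pattern: the maximum is taken on unsigned values, so a negative
 * request is seen as a very large number and is kept. Converting the
 * result back to int is implementation-defined; common compilers wrap. */
int demo_clamp_unsigned(int val)
{
    uint32_t doubled = (uint32_t)val * 2u;
    uint32_t floor = DEMO_MIN_SNDBUF;

    return (int)(doubled > floor ? doubled : floor);
}

/* Fixed pattern: the maximum is taken on signed values, so any negative
 * request falls back to the floor. */
int demo_clamp_signed(int val)
{
    /* Sketch-only precondition to keep val * 2 free of signed overflow. */
    assert(val >= INT_MIN / 2 && val <= INT_MAX / 2);

    int doubled = val * 2;

    return doubled > DEMO_MIN_SNDBUF ? doubled : DEMO_MIN_SNDBUF;
}
```

With a request of -1, the unsigned variant ends up storing a negative size, while the signed variant returns the floor.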
    • geneve: avoid use-after-free of skb->data · 5b010147
      Sabrina Dubroca authored
      geneve{,6}_build_skb can end up doing a pskb_expand_head(), which
      makes the ip_hdr(skb) reference we stashed earlier stale. Since it's
      only needed as an argument to ip_tunnel_ecn_encap(), move this
      directly in the function call.
      
      Fixes: 08399efc ("geneve: ensure ECN info is handled properly in all tx/rx paths")
      Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
      Reviewed-by: John W. Linville <linville@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
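The stale-pointer hazard in the entry above is easy to reproduce outside the kernel. The following is a rough user-space analogy, assuming a hypothetical `demo_pkt` buffer in place of an skb and `demo_expand_head()` in place of pskb_expand_head(); none of these names exist in the kernel. The point it shows is that the header pointer is looked up only after the buffer may have moved, rather than being stashed beforehand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an skb: the packet starts at head + headroom. */
struct demo_pkt {
    unsigned char *head;
    size_t headroom;
    size_t len;
};

unsigned char *demo_pkt_hdr(const struct demo_pkt *p)
{
    return p->head + p->headroom;
}

/* Grow the headroom by `extra` bytes. The buffer moves, so every pointer
 * previously derived from p->head is stale afterwards. */
int demo_expand_head(struct demo_pkt *p, size_t extra)
{
    unsigned char *n = malloc(p->headroom + extra + p->len);

    if (!n)
        return -1;
    memcpy(n + p->headroom + extra, demo_pkt_hdr(p), p->len);
    free(p->head);
    p->head = n;
    p->headroom += extra;
    return 0;
}

/* Prepend hdr_len zeroed bytes and report byte 1 of the inner header
 * (the TOS byte for IPv4). The inner header is re-derived after the
 * possible expansion instead of being cached before it. */
int demo_encap(struct demo_pkt *p, size_t hdr_len, unsigned char *tos)
{
    assert(p != NULL && tos != NULL && p->len >= 2);

    if (p->headroom < hdr_len &&
        demo_expand_head(p, hdr_len - p->headroom) != 0)
        return -1;

    *tos = demo_pkt_hdr(p)[1];      /* fresh lookup, never stale */
    p->headroom -= hdr_len;
    p->len += hdr_len;
    memset(demo_pkt_hdr(p), 0, hdr_len);
    return 0;
}
```

Caching `demo_pkt_hdr(p)` before the call to `demo_expand_head()` and dereferencing it afterwards would read freed memory, which is the class of bug the commit describes.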
    • tipc: check minimum bearer MTU · 3de81b75
      Michal Kubeček authored
      Qian Zhang (张谦) reported a potential socket buffer overflow in
      tipc_msg_build() which is also known as CVE-2016-8632: due to
      insufficient checks, a buffer overflow can occur if MTU is too short for
      even tipc headers. As anyone can set device MTU in a user/net namespace,
      this issue can be abused by a regular user.
      
      As agreed in the discussion on Ben Hutchings' original patch, we should
      check the MTU at the moment a bearer is attached rather than for each
      processed packet. We also need to repeat the check when bearer MTU is
      adjusted to new device MTU. UDP case also needs a check to avoid
      overflow when calculating bearer MTU.
      
      Fixes: b97bf3fd ("[TIPC] Initial merge")
      Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
      Reported-by: Qian Zhang (张谦) <zhangqian-c@360.cn>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
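The attach-time check described in the entry above could look roughly like the sketch below. The constants and function names are assumptions made for illustration (the header sizes are placeholders, not values taken from this log). The `reserve` argument models encapsulation overhead, such as the IP and UDP headers in the UDP case, that must be subtracted before the bearer MTU is computed.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder header sizes, assumed for this sketch only. */
#define DEMO_MAX_H_SIZE 60u
#define DEMO_INT_H_SIZE 40u
#define DEMO_MIN_BEARER_MTU (DEMO_MAX_H_SIZE + DEMO_INT_H_SIZE)

/* True when the device MTU cannot hold the protocol headers plus the
 * bytes reserved for encapsulation. */
bool demo_mtu_bad(unsigned int dev_mtu, unsigned int reserve)
{
    return dev_mtu < DEMO_MIN_BEARER_MTU + reserve;
}

/* Bearer MTU after encapsulation overhead. Only valid once the check has
 * passed, otherwise the unsigned subtraction could wrap around. */
unsigned int demo_bearer_mtu(unsigned int dev_mtu, unsigned int reserve)
{
    assert(!demo_mtu_bad(dev_mtu, reserve));
    return dev_mtu - reserve;
}
```

Running the check once when the bearer is attached, and again whenever the device MTU changes, avoids paying for it on every packet.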
    • Merge tag 'linux-can-fixes-for-4.9-20161201' of... · f0d21e89
      David S. Miller authored
      Merge tag 'linux-can-fixes-for-4.9-20161201' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
      
      Marc Kleine-Budde says:
      
      ====================
      pull-request: can 2016-12-02
      
      this is a pull request for net/master.
      
      There are two patches by Stephane Grosjean, who adds support for the new
      PCAN-USB X6 USB interface to the pcan_usb driver.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: renesas: ravb: unintialized return value · 50d5aa4c
      Dan Carpenter authored
      We want to set the other "err" variable here so that we can return it
      later.  My version of GCC misses this issue but I caught it with a
      static checker.
      
      Fixes: 9f70eb33 ("net: ethernet: renesas: ravb: fix fixed-link phydev leaks")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Reviewed-by: Johan Hovold <johan@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'wireless-drivers-next-for-davem-2016-12-01' of... · ab17cb1f
      David S. Miller authored
      Merge tag 'wireless-drivers-next-for-davem-2016-12-01' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next
      
      Kalle Valo says:
      
      ====================
      wireless-drivers-next patches for 4.10
      
      Major changes:
      
      rsi
      
      * filter rx frames
      * configure tx power
      * make it possible to select antenna
      * support 802.11d
      
      brcmfmac
      
      * cleanup of scheduled scan code
      * support for bcm43341 chipset with different chip id
      * support rev6 of PCIe device interface
      
      ath10k
      
      * add spectral scan support for QCA6174 and QCA9377 families
      * show used tx bitrate with 10.4 firmware
      
      wil6210
      
      * add power save mode support
      * add abort scan functionality
      * add support settings retry limit for short frames
      
      bcma
      
      * add Dell Inspiron 3148
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sh_eth: remove unchecked interrupts for RZ/A1 · 33d446db
      Chris Brandt authored
      When streaming a lot of data and the RZ/A1 can't keep up, some status bits
      will get set that are not being checked or cleared which cause the
      following messages and the Ethernet driver to stop working. This
      patch fixes that issue.
      
      irq 21: nobody cared (try booting with the "irqpoll" option)
      handlers:
      [<c036b71c>] sh_eth_interrupt
      Disabling IRQ #21
      
      Fixes: db893473 ("sh_eth: Add support for r7s72100")
      Signed-off-by: Chris Brandt <chris.brandt@renesas.com>
      Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bcmgenet: Utilize correct struct device for all DMA operations · 8c4799ac
      Florian Fainelli authored
      __bcmgenet_tx_reclaim() and bcmgenet_free_rx_buffers() are not using the
      same struct device during unmap that was used for the map operation,
      which makes DMA-API debugging warn about it. Fix this by always using
      &priv->pdev->dev throughout the driver, using an identical device
      reference for all map/unmap calls.
      
      Fixes: 1c1008c7 ("net: bcmgenet: add main driver file")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'mvneta-64bit' · 4f4f907a
      David S. Miller authored
      Gregory CLEMENT says:
      
      ====================
      Support Armada 37xx SoC (ARMv8 64-bits) in mvneta driver
      
      The Armada 37xx is a new ARMv8 SoC from Marvell using same network
      controller as the older Armada 370/38x/XP SoCs. This series adapts the
      driver in order to be able to use it on this new SoC. The main changes
      are:
      
      - 64-bits support: the first patches allow using the driver on a 64-bit
        architecture.
      
      - MBUS support: the mbus configuration is different on Armada 37xx
        from the older SoCs.
      
      - per cpu interrupt: Armada 37xx does not support per cpu interrupts for
        the NETA IP, so the non-per-CPU behavior was added back.
      
      The first patch is an optimization in the rx path in swbm mode.
      The second patch removes an unnecessary allocation for HWBM.
      The first item is solved by patches 4 and 5.
      The 2 last items are solved by patch 6.
      In patch 7 the dt support is added.
      
      Besides Armada 37xx, this series has again been tested on Armada XP
      and Armada 38x (with Hardware Buffer Management and with Software
      Buffer Management).
      
      This is the 6th version of the series:
      - 1st version:
      http://lists.infradead.org/pipermail/linux-arm-kernel/2016-November/469588.html
      
      - 2nd version:
      http://lists.infradead.org/pipermail/linux-arm-kernel/2016-November/470476.html
      
      - 3rd version:
      http://lists.infradead.org/pipermail/linux-arm-kernel/2016-November/470901.html
      
      - 4th version:
      http://lists.infradead.org/pipermail/linux-arm-kernel/2016-November/471039.html
      
      - 5th version:
      http://lists.infradead.org/pipermail/linux-arm-kernel/2016-November/471478.html
      
      Changelog:
      v5 -> v6:
       - Added Tested-by from  Marcin Wojtas on the series
       - Added Reviewed-by from Jisheng Zhang on patch 3
       - Fix eth1 phy mode for Armada 3720 DB board on patch 7
      
      v4 -> v5:
       - remove unnecessary cast in patch 3
      
      v3 -> v4:
       - Adding new patch: "net: mvneta: do not allocate buffer in rxq init
         with HWBM"
      
       - Simplify the HWBM case in patch 3 as suggested by Marcin
      
      v2 -> v3:
       - Adding patch 1 "Optimize rx path for small frame"
      
       - Fix the kbuild error by moving the "phys_addr += pp->rx_offset_correction;"
        line from patch 2 to patch 3 where rx_offset_correction is introduced.
      
       - Move the memory allocation of the buf_virt_addr of the rxq to be
         called by the probe function in order to avoid a memory leak.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ARM64: dts: marvell: Add network support for Armada 3700 · ea7ae885
      Gregory CLEMENT authored
      Add neta nodes for network support both in device tree for the SoC and
      the board.
      Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mvneta: Add network support for Armada 3700 SoC · 2636ac3c
      Marcin Wojtas authored
      Armada 3700 is a new ARMv8 SoC from Marvell using same network controller
      as older Armada 370/38x/XP. There are however some differences that
      needed taking into account when adding support for it:
      
      * open default MBUS window to 4GB of DRAM - Armada 3700 SoC's Mbus
        configuration for network controller has to be done on two levels:
        global and per-port. The first one is inherited from the
        bootloader. The latter can be opened in a default way, leaving
        arbitration to the bus controller.  Hence filled mbus_dram_target_info
        structure is not needed
      
      * make per-CPU operation optional - Recent patches adding RSS and XPS
        support for Armada 38x/XP enabled per-CPU operation of the controller
        by default. Contrary to older SoCs, the Armada 3700 SoC's network
        controller is not capable of per-CPU processing due to interrupt lines'
        connectivity.  This patch restores non-per-CPU operation, which is now
        optional and depends on neta_armada3700 flag value in mvneta_port
        structure. In order not to complicate the code, separate interrupt
        subroutine is implemented.
      
      For now, on the Armada 3700, RSS is disabled as the current
      implementation depends on the per cpu interrupts.
      
      [gregory.clement@free-electrons.com: extract from a larger patch, replace
      some ifdef and port to net-next for v4.10]
      Signed-off-by: Marcin Wojtas <mw@semihalf.com>
      Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
      Tested-by: Marcin Wojtas <mw@semihalf.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mvneta: Only disable mvneta_bm for 64-bits · f34daccc
      Gregory CLEMENT authored
      Actually only the mvneta_bm support is not 64-bits compatible.
      The mvneta code itself can run on 64-bits architecture.
      Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
      Tested-by: Marcin Wojtas <mw@semihalf.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mvneta: Convert to be 64 bits compatible · 8d5047cf
      Marcin Wojtas authored
      Prepare the mvneta driver in order to be usable on the 64 bits platform
      such as the Armada 3700.
      
      [gregory.clement@free-electrons.com]: this patch was extracted from a larger
      one to ease review and maintenance.
      Signed-off-by: Marcin Wojtas <mw@semihalf.com>
      Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
      Tested-by: Marcin Wojtas <mw@semihalf.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mvneta: Use cacheable memory to store the rx buffer virtual address · f88bee1c
      Gregory CLEMENT authored
      Until now the virtual address of the received buffer was stored in the
      cookie field of the rx descriptor. However, this field is 32-bits only
      which prevents using the driver on a 64-bits architecture.
      
      With this patch the virtual address is stored in an array not shared with
      the hardware (no more need to use the DMA API). Thanks to this, it is
      possible to use cache contrary to the access of the rx descriptor member.
      
      The change is done in the swbm path only because the hwbm uses the cookie
      field, this also means that currently the hwbm is not usable in 64-bits.
      Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
      Reviewed-by: Jisheng Zhang <jszhang@marvell.com>
      Tested-by: Marcin Wojtas <mw@semihalf.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
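The approach in the entry above, keeping buffer virtual addresses in a driver-private array instead of a 32-bit descriptor field, can be sketched in plain C. The struct layouts and function names below are hypothetical and far simpler than the real driver; only the `buf_virt_addr` array name is borrowed from the series changelog.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical descriptor shared with hardware: a 32-bit cookie cannot
 * hold a 64-bit virtual address. */
struct demo_rx_desc {
    uint32_t buf_cookie;
    uint32_t buf_phys_addr;
};

/* Hypothetical queue: virtual addresses live in ordinary (cacheable)
 * memory, one slot per descriptor, never seen by the hardware. */
struct demo_rxq {
    struct demo_rx_desc *descs;
    void **buf_virt_addr;
    size_t size;
};

int demo_rxq_init(struct demo_rxq *q, size_t size)
{
    q->descs = calloc(size, sizeof(*q->descs));
    q->buf_virt_addr = calloc(size, sizeof(*q->buf_virt_addr));
    if (!q->descs || !q->buf_virt_addr) {
        free(q->descs);
        free(q->buf_virt_addr);
        q->descs = NULL;
        q->buf_virt_addr = NULL;
        q->size = 0;
        return -1;
    }
    q->size = size;
    return 0;
}

/* Record a buffer: the physical address goes to the descriptor, the
 * full-width virtual address goes to the parallel array. */
void demo_rxq_set_buf(struct demo_rxq *q, size_t idx, void *virt, uint32_t phys)
{
    assert(idx < q->size);
    q->descs[idx].buf_phys_addr = phys;
    q->buf_virt_addr[idx] = virt;
}

void *demo_rxq_get_buf(const struct demo_rxq *q, size_t idx)
{
    assert(idx < q->size);
    return q->buf_virt_addr[idx];
}

void demo_rxq_free(struct demo_rxq *q)
{
    free(q->descs);
    free(q->buf_virt_addr);
    q->descs = NULL;
    q->buf_virt_addr = NULL;
    q->size = 0;
}
```

Because the array is indexed by descriptor position, the pointer survives intact on 64-bit systems and reading it does not touch the uncached descriptor memory.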
    • net: mvneta: Do not allocate buffer in rxq init with HWBM · e9f64999
      Gregory CLEMENT authored
      For HWBM all buffers are allocated in mvneta_bm_construct() and in runtime
      they are put into descriptors by hardware. There is no need to fill them
      at this point.
      Suggested-by: Marcin Wojtas <mw@semihalf.com>
      Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
      Tested-by: Marcin Wojtas <mw@semihalf.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mvneta: Optimize rx path for small frame · ac83b7dd
      Gregory CLEMENT authored
      For small frame reuse the phys_addr variable instead of accessing the
      uncacheable value in the rx descriptor.
      Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
      Tested-by: Marcin Wojtas <mw@semihalf.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Fix up a couple of field names in the CREDITS file · ed8d747f
      Linus Torvalds authored
      Ozgur Karatas reported that the very first entry in the CREDITS file had
      the wrong tag for name (M: instead of N: - it happened when moving the
      entry from the MAINTAINERS file, where 'M:' stands for "Maintainer").
      
      And when I went looking, I found a couple of other cases of wrong
      tagging too.
      Reported-by: Ozgur Karatas <mueddib@yandex.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge branch 'bpf-support-for-sockets' · b5b5eca9
      David S. Miller authored
      David Ahern says:
      
      ====================
      net: Add bpf support for sockets
      
      The recently added VRF support in Linux leverages the bind-to-device
      API for programs to specify an L3 domain for a socket. While
      SO_BINDTODEVICE has been around for ages, not every ipv4/ipv6 capable
      program has support for it. Even for those programs that do support it,
      the API requires processes to be started as root (CAP_NET_RAW) which
      is not desirable from a general security perspective.
      
      This patch set leverages Daniel Mack's work to attach bpf programs to
      a cgroup to provide a capability to set sk_bound_dev_if for all
      AF_INET{6} sockets opened by a process in a cgroup when the sockets
      are allocated.
      
      For example:
       1. configure vrf (e.g., using ifupdown2)
              auto eth0
              iface eth0 inet dhcp
                  vrf mgmt
      
              auto mgmt
              iface mgmt
                  vrf-table auto
      
       2. configure cgroup
              mount -t cgroup2 none /tmp/cgroupv2
              mkdir /tmp/cgroupv2/mgmt
              test_cgrp2_sock /tmp/cgroupv2/mgmt 15
      
       3. set shell into cgroup (e.g., can be done at login using pam)
              echo $$ >> /tmp/cgroupv2/mgmt/cgroup.procs
      
      At this point all commands run in the shell (e.g, apt) have sockets
      automatically bound to the VRF (see output of ss -ap 'dev == <vrf>'),
      including processes not running as root.
      
      This capability enables running any program in a VRF context and is key
      to deploying Management VRF, a fundamental configuration for networking
      gear, with any Linux OS installation.
      
      This patchset also exports the socket family, type and protocol as
      read-only allowing bpf filters to deny a process in a cgroup the ability
      to open specific types of AF_INET or AF_INET6 sockets.
      
      v7
      - comments from Alexei
      
      v6
      - add export of socket family, type and protocol
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • samples/bpf: add userspace example for prohibiting sockets · 554ae6e7
      David Ahern authored
      Add examples preventing a process in a cgroup from opening a socket
      based on family, protocol and type.
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
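The decision such a filter makes can be sketched as an ordinary C function. This is a user-space simulation only, not a loadable BPF program: `demo_sock_info` and `demo_sock_filter` are hypothetical names, and the convention of returning 1 to allow and 0 to deny is an assumption of the sketch. The example rule refuses raw ICMPv6 sockets over IPv6 and allows everything else.

```c
#include <assert.h>
#include <stddef.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical read-only view of the three exported socket fields. */
struct demo_sock_info {
    int family;
    int type;
    int protocol;
};

/* Returns 1 to allow socket creation, 0 to deny it (sketch convention). */
int demo_sock_filter(const struct demo_sock_info *sk)
{
    assert(sk != NULL);

    /* Deny raw ICMPv6 sockets; every other combination is allowed. */
    if (sk->family == AF_INET6 &&
        sk->type == SOCK_RAW &&
        sk->protocol == IPPROTO_ICMPV6)
        return 0;

    return 1;
}
```

A real program of this kind would be attached to a cgroup and would receive the kernel's socket context rather than this stand-in struct.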
    • samples/bpf: Update bpf loader for cgroup section names · 4f2e7ae5
      David Ahern authored
      Add support for section names starting with cgroup/skb and cgroup/sock.
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: Add support for reading socket family, type, protocol · aa4c1037
      David Ahern authored
      Add socket family, type and protocol to bpf_sock allowing bpf programs
      read-only access.
      
      Add __sk_flags_offset[0] to struct sock before the bitfield to
      programmatically determine the offset of the unsigned int containing
      protocol and type.
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • samples: bpf: add userspace example for modifying sk_bound_dev_if · ad2805dc
      David Ahern authored
      Add a simple program to demonstrate the ability to attach a bpf program
      to a cgroup that sets sk_bound_dev_if for AF_INET{6} sockets when they
      are created.
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: Add new cgroup attach type to enable sock modifications · 61023658
      David Ahern authored
      Add new cgroup based program type, BPF_PROG_TYPE_CGROUP_SOCK. Similar to
      BPF_PROG_TYPE_CGROUP_SKB, programs can be attached to a cgroup and run
      any time a process in the cgroup opens an AF_INET or AF_INET6 socket.
      Currently only sk_bound_dev_if is exported to userspace for modification
      by a bpf program.
      
      This allows a cgroup to be configured such that AF_INET{6} sockets opened
      by processes are automatically bound to a specific device. In turn, this
      enables the running of programs that do not support SO_BINDTODEVICE in a
      specific VRF context / L3 domain.
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: Refactor cgroups code in prep for new type · b2cd1257
      David Ahern authored
      Code move and rename only; no functional change intended.
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • NET: usb: qmi_wwan: add support for Telit LE922A PID 0x1040 · 9bd813da
      Daniele Palmas authored
      This patch adds support for PID 0x1040 of Telit LE922A.
      
      The qmi adapter requires DTR to be set to work properly,
      so QMI_WWAN_QUIRK_DTR has been enabled.
      Signed-off-by: Daniele Palmas <dnlplm@gmail.com>
      Acked-by: Bjørn Mork <bjorn@mork.no>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • cdc_ether: Fix handling connection notification · d5c83d0d
      Kristian Evensen authored
      Commit bfe9b9d2 ("cdc_ether: Improve ZTE MF823/831/910 handling")
      introduced a work-around in usbnet_cdc_status() for devices that exported
      cdc carrier on twice on connect. Before the commit, this behavior caused
      the link state to be incorrect. It was assumed that all CDC Ethernet
      devices would either export this behavior, or send one off and then one on
      notification (which seems to be the default behavior).
      
      Unfortunately, it turns out multiple devices send a connection
      notification multiple times per second (via an interrupt), even when
      connection state does not change. This has been observed with several
      different USB LAN dongles (at least), for example 13b1:0041 (Linksys).
      After bfe9b9d2, the link state has been set as down and then up for
      each notification. This has caused a flood of Netlink NEWLINK messages and
      syslog to be flooded with messages similar to:
      
      cdc_ether 2-1:2.0 eth1: kevent 12 may have been dropped
      
      This commit fixes the behavior by reverting usbnet_cdc_status() to how it
      was before bfe9b9d2. The work-around has been moved to a separate
      status-function which is only called when a known, affected device is
      detected.
      
      v1->v2:
      
      * Do not open-code netif_carrier_ok() (thanks Henning Schild).
      * Call netif_carrier_off() instead of usb_link_change(). This prevents
      calling schedule_work() twice without giving the work queue a chance to be
      processed (thanks Bjørn Mork).
      
      Fixes: bfe9b9d2 ("cdc_ether: Improve ZTE MF823/831/910 handling")
      Reported-by: Henning Schild <henning.schild@siemens.com>
      Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ip6_offload: check segs for NULL in ipv6_gso_segment. · 6b6ebb6b
      Artem Savkov authored
      segs needs to be checked for being NULL in ipv6_gso_segment() before calling
      skb_shinfo(segs), otherwise kernel can run into a NULL-pointer dereference:
      
      [   97.811262] BUG: unable to handle kernel NULL pointer dereference at 00000000000000cc
      [   97.819112] IP: [<ffffffff816e52f9>] ipv6_gso_segment+0x119/0x2f0
      [   97.825214] PGD 0 [   97.827047]
      [   97.828540] Oops: 0000 [#1] SMP
      [   97.831678] Modules linked in: vhost_net vhost macvtap macvlan nfsv3 rpcsec_gss_krb5
      nfsv4 dns_resolver nfs fscache xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4
      iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack
      ipt_REJECT nf_reject_ipv4 tun ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter
      bridge stp llc snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_codec_generic snd_hda_intel
      snd_hda_codec edac_mce_amd snd_hda_core edac_core snd_hwdep kvm_amd snd_seq kvm snd_seq_device
      snd_pcm irqbypass snd_timer ppdev parport_serial snd parport_pc k10temp pcspkr soundcore parport
      sp5100_tco shpchp sg wmi i2c_piix4 acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc
      ip_tables xfs libcrc32c sr_mod cdrom sd_mod ata_generic pata_acpi amdkfd amd_iommu_v2 radeon
      broadcom bcm_phy_lib i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops
      ttm ahci serio_raw tg3 firewire_ohci libahci pata_atiixp drm ptp libata firewire_core pps_core
      i2c_core crc_itu_t fjes dm_mirror dm_region_hash dm_log dm_mod
      [   97.927721] CPU: 1 PID: 3504 Comm: vhost-3495 Not tainted 4.9.0-7.el7.test.x86_64 #1
      [   97.935457] Hardware name: AMD Snook/Snook, BIOS ESK0726A 07/26/2010
      [   97.941806] task: ffff880129a1c080 task.stack: ffffc90001bcc000
      [   97.947720] RIP: 0010:[<ffffffff816e52f9>]  [<ffffffff816e52f9>] ipv6_gso_segment+0x119/0x2f0
      [   97.956251] RSP: 0018:ffff88012fc43a10  EFLAGS: 00010207
      [   97.961557] RAX: 0000000000000000 RBX: ffff8801292c8700 RCX: 0000000000000594
      [   97.968687] RDX: 0000000000000593 RSI: ffff880129a846c0 RDI: 0000000000240000
      [   97.975814] RBP: ffff88012fc43a68 R08: ffff880129a8404e R09: 0000000000000000
      [   97.982942] R10: 0000000000000000 R11: ffff880129a84076 R12: 00000020002949b3
      [   97.990070] R13: ffff88012a580000 R14: 0000000000000000 R15: ffff88012a580000
      [   97.997198] FS:  0000000000000000(0000) GS:ffff88012fc40000(0000) knlGS:0000000000000000
      [   98.005280] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   98.011021] CR2: 00000000000000cc CR3: 0000000126c5d000 CR4: 00000000000006e0
      [   98.018149] Stack:
      [   98.020157]  00000000ffffffff ffff88012fc43ac8 ffffffffa017ad0a 000000000000000e
      [   98.027584]  0000001300000000 0000000077d59998 ffff8801292c8700 00000020002949b3
      [   98.035010]  ffff88012a580000 0000000000000000 ffff88012a580000 ffff88012fc43a98
      [   98.042437] Call Trace:
      [   98.044879]  <IRQ> [   98.046803]  [<ffffffffa017ad0a>] ? tg3_start_xmit+0x84a/0xd60 [tg3]
      [   98.053156]  [<ffffffff815eeee0>] skb_mac_gso_segment+0xb0/0x130
      [   98.059158]  [<ffffffff815eefd3>] __skb_gso_segment+0x73/0x110
      [   98.064985]  [<ffffffff815ef40d>] validate_xmit_skb+0x12d/0x2b0
      [   98.070899]  [<ffffffff815ef5d2>] validate_xmit_skb_list+0x42/0x70
      [   98.077073]  [<ffffffff81618560>] sch_direct_xmit+0xd0/0x1b0
      [   98.082726]  [<ffffffff815efd86>] __dev_queue_xmit+0x486/0x690
      [   98.088554]  [<ffffffff8135c135>] ? cpumask_next_and+0x35/0x50
      [   98.094380]  [<ffffffff815effa0>] dev_queue_xmit+0x10/0x20
      [   98.099863]  [<ffffffffa09ce057>] br_dev_queue_push_xmit+0xa7/0x170 [bridge]
      [   98.106907]  [<ffffffffa09ce161>] br_forward_finish+0x41/0xc0 [bridge]
      [   98.113430]  [<ffffffff81627cf2>] ? nf_iterate+0x52/0x60
      [   98.118735]  [<ffffffff81627d6b>] ? nf_hook_slow+0x6b/0xc0
      [   98.124216]  [<ffffffffa09ce32c>] __br_forward+0x14c/0x1e0 [bridge]
      [   98.130480]  [<ffffffffa09ce120>] ? br_dev_queue_push_xmit+0x170/0x170 [bridge]
      [   98.137785]  [<ffffffffa09ce4bd>] br_forward+0x9d/0xb0 [bridge]
      [   98.143701]  [<ffffffffa09cfbb7>] br_handle_frame_finish+0x267/0x560 [bridge]
      [   98.150834]  [<ffffffffa09d0064>] br_handle_frame+0x174/0x2f0 [bridge]
      [   98.157355]  [<ffffffff8102fb89>] ? sched_clock+0x9/0x10
      [   98.162662]  [<ffffffff810b63b2>] ? sched_clock_cpu+0x72/0xa0
      [   98.168403]  [<ffffffff815eccf5>] __netif_receive_skb_core+0x1e5/0xa20
      [   98.174926]  [<ffffffff813659f9>] ? timerqueue_add+0x59/0xb0
      [   98.180580]  [<ffffffff815ed548>] __netif_receive_skb+0x18/0x60
      [   98.186494]  [<ffffffff815ee625>] process_backlog+0x95/0x140
      [   98.192145]  [<ffffffff815edccd>] net_rx_action+0x16d/0x380
      [   98.197713]  [<ffffffff8170cff1>] __do_softirq+0xd1/0x283
      [   98.203106]  [<ffffffff8170b2bc>] do_softirq_own_stack+0x1c/0x30
      [   98.209107]  <EOI> [   98.211029]  [<ffffffff8108a5c0>] do_softirq+0x50/0x60
      [   98.216166]  [<ffffffff815ec853>] netif_rx_ni+0x33/0x80
      [   98.221386]  [<ffffffffa09eeff7>] tun_get_user+0x487/0x7f0 [tun]
      [   98.227388]  [<ffffffffa09ef3ab>] tun_sendmsg+0x4b/0x60 [tun]
      [   98.233129]  [<ffffffffa0b68932>] handle_tx+0x282/0x540 [vhost_net]
      [   98.239392]  [<ffffffffa0b68c25>] handle_tx_kick+0x15/0x20 [vhost_net]
      [   98.245916]  [<ffffffffa0abacfe>] vhost_worker+0x9e/0xf0 [vhost]
      [   98.251919]  [<ffffffffa0abac60>] ? vhost_umem_alloc+0x40/0x40 [vhost]
      [   98.258440]  [<ffffffff81003a47>] ? do_syscall_64+0x67/0x180
      [   98.264094]  [<ffffffff810a44d9>] kthread+0xd9/0xf0
      [   98.268965]  [<ffffffff810a4400>] ? kthread_park+0x60/0x60
      [   98.274444]  [<ffffffff8170a4d5>] ret_from_fork+0x25/0x30
      [   98.279836] Code: 8b 93 d8 00 00 00 48 2b 93 d0 00 00 00 4c 89 e6 48 89 df 66 89 93 c2 00 00 00 ff 10 48 3d 00 f0 ff ff 49 89 c2 0f 87 52 01 00 00 <41> 8b 92 cc 00 00 00 48 8b 80 d0 00 00 00 44 0f b7 74 10 06 66
      [   98.299425] RIP  [<ffffffff816e52f9>] ipv6_gso_segment+0x119/0x2f0
      [   98.305612]  RSP <ffff88012fc43a10>
      [   98.309094] CR2: 00000000000000cc
      [   98.312406] ---[ end trace 726a2c7a2d2d78d0 ]---
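The guard described above can be sketched in userspace C, assuming a callback that may return a valid list, NULL, or an error encoded in the pointer value (the convention behind the kernel's IS_ERR() and IS_ERR_OR_NULL() helpers). `struct seg`, `is_err_or_null()` and `count_flagged()` are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace model of error-encoded pointers: the top MAX_ERRNO addresses
 * are treated as negative errno values rather than valid pointers. */
static int is_err_value(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

static int is_err_or_null(const void *p)
{
    return p == NULL || is_err_value(p);
}

/* Hypothetical segment list node. */
struct seg {
    unsigned int gso_type;
    struct seg *next;
};

/* Count segments carrying `flag`. Checking only for an encoded error would
 * let a NULL list through and dereference it; the combined check rejects
 * both cases before the first access. Returns -1 when there is no list. */
int count_flagged(const struct seg *segs, unsigned int flag)
{
    int n = 0;

    if (is_err_or_null(segs))
        return -1;

    for (; segs != NULL; segs = segs->next)
        if (segs->gso_type & flag)
            n++;
    return n;
}
```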
      Signed-off-by: Artem Savkov <asavkov@redhat.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6b6ebb6b
    • Eric Dumazet's avatar
      mlx4: fix use-after-free in mlx4_en_fold_software_stats() · 7f7bf160
      Eric Dumazet authored
      My recent commit to get more precise rx/tx counters in ndo_get_stats64()
      can lead to crashes at device dismantle, as Jesper found out.
      
      We must prevent mlx4_en_fold_software_stats() from accessing
      tx/rx rings after they have been deleted.
      
      Fix this by adding a test against priv->port_up in
      mlx4_en_fold_software_stats().
      
      Calling mlx4_en_fold_software_stats() from mlx4_en_stop_port()
      allows us to eventually broadcast the latest/current counters to
      rtnetlink monitors.
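A minimal sketch of the idea, assuming a simplified port structure. `struct port`, `struct ring_stats`, `fold_software_stats()` and `stop_port()` are hypothetical stand-ins, not the mlx4 driver's types. Folding is skipped once the port is down, and the stop path folds one last time while the rings still exist so the final totals stay readable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-ring software counters. */
struct ring_stats {
    unsigned long packets;
    unsigned long bytes;
};

/* Hypothetical port state: rings are only valid while port_up is true. */
struct port {
    bool port_up;
    size_t num_rings;
    struct ring_stats **rings;
    unsigned long packets;      /* last folded totals */
    unsigned long bytes;
};

/* Sum per-ring counters into the port totals. If the port is down the
 * rings may already be gone, so keep the previously folded totals. */
void fold_software_stats(struct port *p)
{
    unsigned long packets = 0, bytes = 0;
    size_t i;

    assert(p != NULL);
    if (!p->port_up)
        return;

    for (i = 0; i < p->num_rings; i++) {
        packets += p->rings[i]->packets;
        bytes += p->rings[i]->bytes;
    }
    p->packets = packets;
    p->bytes = bytes;
}

/* Stop path: capture the final counters first, then drop the rings. */
void stop_port(struct port *p)
{
    fold_software_stats(p);
    p->port_up = false;
    p->rings = NULL;
    p->num_rings = 0;
}
```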
      
      Fixes: 40931b85 ("mlx4: give precise rx/tx bytes/packets counters")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-and-bisected-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Tested-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Cc: Tariq Toukan <tariqt@mellanox.com>
      Cc: Saeed Mahameed <saeedm@dev.mellanox.co.il>
      Acked-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7f7bf160
    • Sunil Goutham's avatar
      net: thunderx: Fix transmit queue timeout issue · bd3ad7d3
      Sunil Goutham authored
      Transmit queue timeouts are seen in two cases:
      - Due to a race condition between setting stop_queue in xmit()
        and checking for a stopped queue in the NAPI poll routine, at times
        transmission from an SQ comes to a halt. This is fixed
        by using barriers and by adding a check for free SQ descriptors,
        in case the SQ is stopped and there are only CQE_RX entries, i.e. no CQE_TX.
      - Contrary to an assumption, a HW erratum where HW doesn't stop transmission
        even though there are not enough CQEs available for a CQE_TX is
        not fixed in T88 pass 2.x. This results in a Qset error with
        'CQ_WR_FULL', stalling transmission. This is fixed by adjusting the
        RXQ's RED levels for the CQ level such that there is always enough
        space left for CQE_TXs.
      Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bd3ad7d3
    • Sowmini Varadhan's avatar
      RDS: TCP: unregister_netdevice_notifier() in error path of rds_tcp_init_net · 721c7443
      Sowmini Varadhan authored
      If some error is encountered in rds_tcp_init_net, make sure to
      unregister_netdevice_notifier(), else we could trigger a panic
      later on, when the modprobe from a netns fails.
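The error-path rule can be illustrated with the usual goto-based unwinding, here as a userspace sketch. `struct net_state`, `register_notifier()`, `unregister_notifier()` and `init_net()` are hypothetical stand-ins and do not mirror the actual rds_tcp_init_net() code. Whatever was registered before the failing step is undone before the error is returned.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-namespace state tracked by the init routine. */
struct net_state {
    bool notifier_registered;
    bool listening;
};

static int register_notifier(struct net_state *s)
{
    s->notifier_registered = true;
    return 0;
}

static void unregister_notifier(struct net_state *s)
{
    s->notifier_registered = false;
}

/* Init with a later step that can fail (simulated via `fail_listen`).
 * On failure, undo the earlier registration so no stale notifier is
 * left behind to fire after the caller has given up. */
int init_net(struct net_state *s, bool fail_listen)
{
    int err;

    assert(s != NULL);
    err = register_notifier(s);
    if (err)
        return err;

    if (fail_listen) {
        err = -EADDRINUSE;
        goto fail;
    }
    s->listening = true;
    return 0;

fail:
    unregister_notifier(s);
    return err;
}
```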
      Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
      Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      721c7443
    • David S. Miller's avatar
      Merge branch 'offloading-tc-rules-hw' · 9aac3c18
      David S. Miller authored
      Hadar Hen Zion says:
      
      ====================
      Offloading tc rules using the underlying hardware device
      
      This series adds flower classifier support for offloading tc rules when the
      software ingress device is different from the hardware ingress device,
      such as when dealing with IP tunnels.
      
      The first two patches are small fixes to flower, checking that the skip_hw
      flag isn't set before calling the hardware offloading functions which will
      try to offload the rule.
      
      The next two patches are infrastructure patches, in preparation for the fourth
      patch, which adds support in flower for offloading rules when the ingress
      device is not a hardware device and therefore can't offload.
      In this case ndo_setup_tc is called with the mirred (egress) device.
      
      The last three patches add mlx5e support for offloading rules using the new
      "egress_device" flag.
      
      Thanks,
      Hadar
      
      Changes from v0:
      - check if CONFIG_NET_CLS_ACT is defined before calling tc_action_ops get_dev()
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9aac3c18
    • Hadar Hen Zion's avatar
      net/mlx5e: Support adding ingress tc rule when egress device flag is set · ebe06875
      Hadar Hen Zion authored
      When ndo_setup_tc is called with an egress_dev flag set, it means that
      the ndo call was executed on the mirred action (egress) device and not
      on the ingress device.
      
      In order to support this kind of ndo_setup_tc call, and insert the
      correct decap rule to the hardware, the uplink device on the same eswitch
      should be found.
      
      Currently, we use this resolution between the mirred device and the
      uplink on the same eswitch to offload vxlan shared device decap rules.
      Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ebe06875
    • Hadar Hen Zion's avatar
      net/mlx5e: Save the represntor netdevice as part of the representor · 726293f1
      Hadar Hen Zion authored
      Replace the representor private data with a net_device pointer holding the
      representor netdevice, instead of a void pointer holding mlx5e_priv.
      
      It will be used by a new eswitch service function that returns the uplink
      representor netdevice.
      Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      726293f1
    • Hadar Hen Zion's avatar
      net/mlx5e: Bring back representor's ndos that were accidentally removed · 718f13e7
      Hadar Hen Zion authored
      The VF Representor udp tunnel ndo entries were removed by mistake;
      restore them.
      
      Fixes: 370bad0f ('net/mlx5e: Support HW (offloaded) and SW counters for SRIOV switchdev mode')
      Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      718f13e7
    • Hadar Hen Zion's avatar
      net/sched: cls_flower: Add offload support using egress Hardware device · 7091d8c7
      Hadar Hen Zion authored
      In order to support hardware offloading when the device given by the tc
      rule is different from the underlying hardware device, extract the mirred
      (egress) device from the tc action when a filter is added, using the new
      tc_action_ops, get_dev().
      
      Flower caches the information about the mirred device and uses it for
      calling ndo_setup_tc on filter change, stats update and delete.
      
      Calling ndo_setup_tc of the mirred (egress) device instead of the
      ingress device allows a resolution between the software ingress
      device and the underlying hardware device.
      
      The resolution takes place inside the offloading driver, using the
      'egress_device' flag added to the tc_to_netdev struct which is provided to
      the offloading driver.
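The device selection described above can be sketched as follows, assuming simplified types. `struct net_dev`, `struct action_ops`, `struct offload_req` and `pick_offload_dev()` are hypothetical and only loosely modeled on the commit text, not on the real tc structures. If the ingress device cannot offload, the actions are searched for one that exposes a device through an optional get_dev() callback, and the request is marked as targeting the egress device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device: only its offload capability is modeled. */
struct net_dev {
    const char *name;
    bool can_offload;
};

struct action;

/* Hypothetical action ops with an optional get_dev() callback. */
struct action_ops {
    struct net_dev *(*get_dev)(const struct action *a);
};

struct action {
    const struct action_ops *ops;
    struct net_dev *target;     /* used by redirect-style actions */
};

/* Hypothetical offload request handed to the driver. */
struct offload_req {
    struct net_dev *dev;
    bool egress_dev;    /* true when dev is the egress, not ingress, device */
};

static struct net_dev *mirred_get_dev(const struct action *a)
{
    return a->target;
}

const struct action_ops mirred_ops = { .get_dev = mirred_get_dev };
const struct action_ops drop_ops = { .get_dev = NULL };

/* Choose the device to offload through. Prefer the ingress device;
 * otherwise fall back to the first action whose get_dev() yields a
 * device that can offload. Returns false if no such device exists. */
bool pick_offload_dev(struct net_dev *ingress, const struct action *acts,
                      size_t n, struct offload_req *req)
{
    size_t i;

    assert(ingress != NULL && req != NULL);
    if (ingress->can_offload) {
        req->dev = ingress;
        req->egress_dev = false;
        return true;
    }
    for (i = 0; i < n; i++) {
        struct net_dev *dev;

        if (acts[i].ops == NULL || acts[i].ops->get_dev == NULL)
            continue;
        dev = acts[i].ops->get_dev(&acts[i]);
        if (dev != NULL && dev->can_offload) {
            req->dev = dev;
            req->egress_dev = true;
            return true;
        }
    }
    return false;
}
```

With a tunnel device as ingress and a redirect to an offload-capable port, the request ends up on the egress port with the flag set, leaving the driver to resolve the matching hardware ingress.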
      Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7091d8c7
    • Hadar Hen Zion's avatar
      net/sched: act_mirred: Add new tc_action_ops get_dev() · 255cb304
      Hadar Hen Zion authored
      Add support for a new tc_action_ops, get_dev().
      get_dev() is a general operation that returns the underlying
      device when trying to offload a tc rule.
      
      In the case of the mirred action, the returned device is the mirred
      (egress) device.
      Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      255cb304