1. 03 Apr, 2019 1 commit
    • Alexei Starovoitov's avatar
      bpf: add verifier stats and log_level bit 2 · 06ee7115
      Alexei Starovoitov authored
      In order to understand the verifier bottlenecks add various stats
      and extend log_level:
      log_level 1 and 2 are kept as-is:
      bit 0 - level=1 - print every insn and verifier state at branch points
      bit 1 - level=2 - print every insn and verifier state at every insn
      bit 2 - level=4 - print verifier error and stats at the end of verification
      
      When verifier rejects the program the libbpf is trying to load the program twice.
      Once with log_level=0 (no messages, only error code is reported to user space)
      and second time with log_level=1 to tell the user why the verifier rejected it.
      
      With introduction of bit 2 - level=4 the libbpf can choose to always use that
      level and load programs once, since the verification speed is not affected and
      in case of error the verbose message will be available.
      
      Note that the verifier stats are not part of uapi just like all other
      verbose messages. They're expected to change in the future.
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      06ee7115
  2. 02 Apr, 2019 6 commits
  3. 01 Apr, 2019 1 commit
    • Yonghong Song's avatar
      bpf: add bpffs multi-dimensional array tests in test_btf · 9de2640b
      Yonghong Song authored
      For multiple dimensional arrays like below,
        int a[2][3]
      both llvm and pahole generated one BTF_KIND_ARRAY type like
        . element_type: int
        . index_type: unsigned int
        . number of elements: 6
      
      Such a collapsed BTF_KIND_ARRAY type will cause the divergence
      in BTF vs. the user code. In the compile-once-run-everywhere
      project, the header file is generated from BTF and used for bpf
      program, and the definition in the header file will be different
      from what user expects.
      
      But the kernel actually supports chained multi-dimensional array
      types properly. The above "int a[2][3]" can be represented as
        Type #n:
          . element_type: int
          . index_type: unsigned int
          . number of elements: 3
        Type #(n+1):
          . element_type: type #n
          . index_type: unsigned int
          . number of elements: 2
      
      The following llvm commit
        https://reviews.llvm.org/rL357215
      also enables llvm to generated proper chained multi-dimensional arrays.
      
      The test_btf already has a raw test ("struct test #1") for chained
      multi-dimensional arrays. This patch added amended bpffs test for
      chained multi-dimensional arrays.
      Acked-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Signed-off-by: default avatarYonghong Song <yhs@fb.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      9de2640b
  4. 29 Mar, 2019 3 commits
    • Alexei Starovoitov's avatar
      Merge branch 'variable-stack-access' · c3969de8
      Alexei Starovoitov authored
      Andrey Ignatov says:
      
      ====================
      The patch set adds support for stack access with variable offset from helpers.
      
      Patch 1 is the main patch in the set and provides more details.
      Patch 2 adds selftests for new functionality.
      ====================
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      c3969de8
    • Andrey Ignatov's avatar
      selftests/bpf: Test variable offset stack access · 8ff80e96
      Andrey Ignatov authored
      Test different scenarios of indirect variable-offset stack access: out of
      bound access (>0), min_off below initialized part of the stack,
      max_off+size above initialized part of the stack, initialized stack.
      
      Example of output:
        ...
        #856/p indirect variable-offset stack access, out of bound OK
        #857/p indirect variable-offset stack access, max_off+size > max_initialized OK
        #858/p indirect variable-offset stack access, min_off < min_initialized OK
        #859/p indirect variable-offset stack access, ok OK
        ...
      Signed-off-by: default avatarAndrey Ignatov <rdna@fb.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      8ff80e96
    • Andrey Ignatov's avatar
      bpf: Support variable offset stack access from helpers · 2011fccf
      Andrey Ignatov authored
      Currently there is a difference in how verifier checks memory access for
      helper arguments for PTR_TO_MAP_VALUE and PTR_TO_STACK with regard to
      variable part of offset.
      
      check_map_access, that is used for PTR_TO_MAP_VALUE, can handle variable
      offsets just fine, so that BPF program can call a helper like this:
      
        some_helper(map_value_ptr + off, size);
      
      , where offset is unknown at load time, but is checked by program to be
      in a safe rage (off >= 0 && off + size < map_value_size).
      
      But it's not the case for check_stack_boundary, that is used for
      PTR_TO_STACK, and same code with pointer to stack is rejected by
      verifier:
      
        some_helper(stack_value_ptr + off, size);
      
      For example:
        0: (7a) *(u64 *)(r10 -16) = 0
        1: (7a) *(u64 *)(r10 -8) = 0
        2: (61) r2 = *(u32 *)(r1 +0)
        3: (57) r2 &= 4
        4: (17) r2 -= 16
        5: (0f) r2 += r10
        6: (18) r1 = 0xffff888111343a80
        8: (85) call bpf_map_lookup_elem#1
        invalid variable stack read R2 var_off=(0xfffffffffffffff0; 0x4)
      
      Add support for variable offset access to check_stack_boundary so that
      if offset is checked by program to be in a safe range it's accepted by
      verifier.
      Signed-off-by: default avatarAndrey Ignatov <rdna@fb.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      2011fccf
  5. 28 Mar, 2019 2 commits
  6. 27 Mar, 2019 27 commits
    • Eric Dumazet's avatar
      inet: switch IP ID generator to siphash · df453700
      Eric Dumazet authored
      According to Amit Klein and Benny Pinkas, IP ID generation is too weak
      and might be used by attackers.
      
      Even with recent net_hash_mix() fix (netns: provide pure entropy for net_hash_mix())
      having 64bit key and Jenkins hash is risky.
      
      It is time to switch to siphash and its 128bit keys.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarAmit Klein <aksecurity@gmail.com>
      Reported-by: default avatarBenny Pinkas <benny@pinkas.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      df453700
    • Colin Ian King's avatar
      net: phy: mdio-bcm-unimac: remove redundant !timeout check · 180a8c3d
      Colin Ian King authored
      The check for zero timeout is always true at the end of the proceeding
      while loop; the only other exit path in the loop is if the unimac MDIO
      is not busy.  Remove the redundant zero timeout check and always
      return -ETIMEDOUT on this timeout return path.
      Signed-off-by: default avatarColin Ian King <colin.king@canonical.com>
      Reviewed-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      180a8c3d
    • Eric Dumazet's avatar
      tcp: fix zerocopy and notsent_lowat issues · 4f661542
      Eric Dumazet authored
      My recent patch had at least three problems :
      
      1) TX zerocopy wants notification when skb is acknowledged,
         thus we need to call skb_zcopy_clear() if the skb is
         cached into sk->sk_tx_skb_cache
      
      2) Some applications might expect precise EPOLLOUT
         notifications, so we need to update sk->sk_wmem_queued
         and call sk_mem_uncharge() from sk_wmem_free_skb()
         in all cases. The SOCK_QUEUE_SHRUNK flag must also be set.
      
      3) Reuse of saved skb should have used skb_cloned() instead
        of simply checking if the fast clone has been freed.
      
      Fixes: 472c2e07 ("tcp: add one skb cache for tx")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Cc: Soheil Hassas Yeganeh <soheil@google.com>
      Acked-by: default avatarSoheil Hassas Yeganeh <soheil@google.com>
      Tested-by: default avatarHolger Hoffstätte <holger@applied-asynchrony.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4f661542
    • Numan Siddique's avatar
      net: openvswitch: Add a new action check_pkt_len · 4d5ec89f
      Numan Siddique authored
      This patch adds a new action - 'check_pkt_len' which checks the
      packet length and executes a set of actions if the packet
      length is greater than the specified length or executes
      another set of actions if the packet length is lesser or equal to.
      
      This action takes below nlattrs
        * OVS_CHECK_PKT_LEN_ATTR_PKT_LEN - 'pkt_len' to check for
      
        * OVS_CHECK_PKT_LEN_ATTR_ACTIONS_IF_GREATER - Nested actions
          to apply if the packet length is greater than the specified 'pkt_len'
      
        * OVS_CHECK_PKT_LEN_ATTR_ACTIONS_IF_LESS_EQUAL - Nested
          actions to apply if the packet length is lesser or equal to the
          specified 'pkt_len'.
      
      The main use case for adding this action is to solve the packet
      drops because of MTU mismatch in OVN virtual networking solution.
      When a VM (which belongs to a logical switch of OVN) sends a packet
      destined to go via the gateway router and if the nic which provides
      external connectivity, has a lesser MTU, OVS drops the packet
      if the packet length is greater than this MTU.
      
      With the help of this action, OVN will check the packet length
      and if it is greater than the MTU size, it will generate an
      ICMP packet (type 3, code 4) and includes the next hop mtu in it
      so that the sender can fragment the packets.
      
      Reported-at:
      https://mail.openvswitch.org/pipermail/ovs-discuss/2018-July/047039.htmlSuggested-by: default avatarBen Pfaff <blp@ovn.org>
      Signed-off-by: default avatarNuman Siddique <nusiddiq@redhat.com>
      CC: Gregory Rose <gvrose8192@gmail.com>
      CC: Pravin B Shelar <pshelar@ovn.org>
      Acked-by: default avatarPravin B Shelar <pshelar@ovn.org>
      Tested-by: default avatarGreg Rose <gvrose8192@gmail.com>
      Reviewed-by: default avatarGreg Rose <gvrose8192@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4d5ec89f
    • David S. Miller's avatar
      Merge branch 'ethtool-add-support-for-Fast-Link-Down-as-new-PHY-tunable' · d7aa0338
      David S. Miller authored
      Heiner Kallweit says:
      
      ====================
      ethtool: add support for Fast Link Down as new PHY tunable
      
      This adds support for Fast Link Down as new PHY tunable.
      Fast Link Down reduces the time until a link down event is reported
      for 1000BaseT. According to the standard it's 750ms what is too long
      for several use cases.
      
      This is the kernel-related series, the ethtool userspace extension
      I'd submit once the kernel part has been applied.
      
      v2:
      - add describing comment in patch 1
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d7aa0338
    • Heiner Kallweit's avatar
      net: phy: marvell: add PHY tunable fast link down support for 88E1540 · 69f42be8
      Heiner Kallweit authored
      1000BaseT standard requires that a link is reported as down earliest
      after 750ms. Several use case however require a much faster detecion
      of a broken link. Fast Link Down supports this by intentionally
      violating a the standard. This patch exposes the Fast Link Down
      feature of 88E1540 and 88E6390. These PHY's can be found as internal
      PHY's in several switches: 88E6352, 88E6240, 88E6176, 88E6172,
      and 88E6390(X). Fast Link Down and EEE are mutually exclusive.
      Signed-off-by: default avatarHeiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      69f42be8
    • Heiner Kallweit's avatar
      ethtool: add PHY Fast Link Down support · 3aeb0803
      Heiner Kallweit authored
      This adds support for Fast Link Down as new PHY tunable.
      Fast Link Down reduces the time until a link down event is reported
      for 1000BaseT. According to the standard it's 750ms what is too long
      for several use cases.
      
      v2:
      - add comment describing the constants
      Signed-off-by: default avatarHeiner Kallweit <hkallweit1@gmail.com>
      Reviewed-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: default avatarMichal Kubecek <mkubecek@suse.cz>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3aeb0803
    • Bart Van Assche's avatar
      net/core: Allow the compiler to verify declaration and definition consistency · 7b7ed885
      Bart Van Assche authored
      Instead of declaring a function in a .c file, declare it in a header
      file and include that header file from the source files that define
      and that use the function. That allows the compiler to verify
      consistency of declaration and definition. See also commit
      52267790 ("sock: add MSG_ZEROCOPY") # v4.14.
      
      Cc: Willem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarBart Van Assche <bvanassche@acm.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7b7ed885
    • Bart Van Assche's avatar
      net/core: Fix rtnetlink kernel-doc headers · a986967e
      Bart Van Assche authored
      This patch avoids that the following warnings are reported when building
      with W=1:
      
      net/core/rtnetlink.c:3580: warning: Function parameter or member 'ndm' not described in 'ndo_dflt_fdb_add'
      net/core/rtnetlink.c:3580: warning: Function parameter or member 'tb' not described in 'ndo_dflt_fdb_add'
      net/core/rtnetlink.c:3580: warning: Function parameter or member 'dev' not described in 'ndo_dflt_fdb_add'
      net/core/rtnetlink.c:3580: warning: Function parameter or member 'addr' not described in 'ndo_dflt_fdb_add'
      net/core/rtnetlink.c:3580: warning: Function parameter or member 'vid' not described in 'ndo_dflt_fdb_add'
      net/core/rtnetlink.c:3580: warning: Function parameter or member 'flags' not described in 'ndo_dflt_fdb_add'
      net/core/rtnetlink.c:3718: warning: Function parameter or member 'ndm' not described in 'ndo_dflt_fdb_del'
      net/core/rtnetlink.c:3718: warning: Function parameter or member 'tb' not described in 'ndo_dflt_fdb_del'
      net/core/rtnetlink.c:3718: warning: Function parameter or member 'dev' not described in 'ndo_dflt_fdb_del'
      net/core/rtnetlink.c:3718: warning: Function parameter or member 'addr' not described in 'ndo_dflt_fdb_del'
      net/core/rtnetlink.c:3718: warning: Function parameter or member 'vid' not described in 'ndo_dflt_fdb_del'
      net/core/rtnetlink.c:3861: warning: Function parameter or member 'skb' not described in 'ndo_dflt_fdb_dump'
      net/core/rtnetlink.c:3861: warning: Function parameter or member 'cb' not described in 'ndo_dflt_fdb_dump'
      net/core/rtnetlink.c:3861: warning: Function parameter or member 'filter_dev' not described in 'ndo_dflt_fdb_dump'
      net/core/rtnetlink.c:3861: warning: Function parameter or member 'idx' not described in 'ndo_dflt_fdb_dump'
      net/core/rtnetlink.c:3861: warning: Excess function parameter 'nlh' description in 'ndo_dflt_fdb_dump'
      
      Cc: Hubert Sokolowski <hubert.sokolowski@intel.com>
      Signed-off-by: default avatarBart Van Assche <bvanassche@acm.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a986967e
    • Bart Van Assche's avatar
      net/core: Document __skb_flow_dissect() flags argument · d79b3baf
      Bart Van Assche authored
      This patch avoids that the following warning is reported when building
      with W=1:
      
      warning: Function parameter or member 'flags' not described in '__skb_flow_dissect'
      
      Cc: Tom Herbert <tom@herbertland.com>
      Fixes: cd79a238 ("flow_dissector: Add flags argument to skb_flow_dissector functions") # v4.3.
      Signed-off-by: default avatarBart Van Assche <bvanassche@acm.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d79b3baf
    • Bart Van Assche's avatar
      net/core: Document all dev_ioctl() arguments · b3c0fd61
      Bart Van Assche authored
      This patch avoids that the following warnings are reported when building
      with W=1:
      
      net/core/dev_ioctl.c:378: warning: Function parameter or member 'ifr' not described in 'dev_ioctl'
      net/core/dev_ioctl.c:378: warning: Function parameter or member 'need_copyout' not described in 'dev_ioctl'
      net/core/dev_ioctl.c:378: warning: Excess function parameter 'arg' description in 'dev_ioctl'
      
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Fixes: 44c02a2c ("dev_ioctl(): move copyin/copyout to callers") # v4.16.
      Signed-off-by: default avatarBart Van Assche <bvanassche@acm.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b3c0fd61
    • Bart Van Assche's avatar
      net/core: Document reuseport_add_sock() bind_inany argument · 37f3c421
      Bart Van Assche authored
      This patch avoids that the following warning is reported when building
      with W=1:
      
      warning: Function parameter or member 'bind_inany' not described in 'reuseport_add_sock'
      
      Cc: Martin KaFai Lau <kafai@fb.com>
      Fixes: 2dbb9b9e ("bpf: Introduce BPF_PROG_TYPE_SK_REUSEPORT") # v4.19.
      Signed-off-by: default avatarBart Van Assche <bvanassche@acm.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      37f3c421
    • Heiner Kallweit's avatar
      net: dsa: mv88e6xxx: remove unneeded cmode initialization · 863d1a8d
      Heiner Kallweit authored
      This partially reverts ed8fe202 ("net: dsa: mv88e6xxx: prevent
      interrupt storm caused by mv88e6390x_port_set_cmode"). I missed
      that chip->ports[].cmode is overwritten anyway by the cmode
      caching in mv88e6xxx_setup().
      Signed-off-by: default avatarHeiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      863d1a8d
    • Sudarsana Reddy Kalluru's avatar
      bnx2x: Utilize FW 7.13.11.0. · 32705592
      Sudarsana Reddy Kalluru authored
      Commit 8fcf0ec44c11f "bnx2x: Add FW 7.13.11.0" added said .bin FW to
      linux-firmware; This patch incorporates the FW in the bnx2x driver.
      This introduces few FW fixes and the support for Tx VLAN filtering.
      
      Please consider applying it to 'net-next' tree.
      Signed-off-by: default avatarSudarsana Reddy Kalluru <skalluru@marvell.com>
      Signed-off-by: default avatarAriel Elior <aelior@marvell.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      32705592
    • Kristian Evensen's avatar
      fou: Support binding FoU socket · 1713cb37
      Kristian Evensen authored
      An FoU socket is currently bound to the wildcard-address. While this
      works fine, there are several use-cases where the use of the
      wildcard-address is not desirable. For example, I use FoU on some
      multi-homed servers and would like to use FoU on only one of the
      interfaces.
      
      This commit adds support for binding FoU sockets to a given source
      address/interface, as well as connecting the socket to a given
      destination address/port. udp_tunnel already provides the required
      infrastructure, so most of the code added is for exposing and setting
      the different attributes (local address, peer address, etc.).
      
      The lookups performed when we add, delete or get an FoU-socket has also
      been updated to compare all the attributes a user can set. Since the
      comparison now involves several elements, I have added a separate
      comparison-function instead of open-coding.
      
      In order to test the code and ensure that the new comparison code works
      correctly, I started by creating a wildcard socket bound to port 1234 on
      my machine. I then tried to create a non-wildcarded socket bound to the
      same port, as well as fetching and deleting the socket (including source
      address, peer address or interface index in the netlink request).  Both
      the create, fetch and delete request failed. Deleting/fetching the
      socket was only successful when my netlink request attributes matched
      those used to create the socket.
      
      I then repeated the tests, but with a socket bound to a local ip
      address, a socket bound to a local address + interface, and a bound
      socket that was also «connected» to a peer. Add only worked when no
      socket with the matching source address/interface (or wildcard) existed,
      while fetch/delete was only successful when all attributes matched.
      
      In addition to testing that the new code work, I also checked that the
      current behavior is kept. If none of the new attributes are provided,
      then an FoU-socket is configured as before (i.e., wildcarded).  If any
      of the new attributes are provided, the FoU-socket is configured as
      expected.
      
      v1->v2:
      * Fixed building with IPv6 disabled (kbuild).
      * Fixed a return type warning and make the ugly comparison function more
      readable (kbuild).
      * Describe more in detail what has been tested (thanks David Miller).
      * Make peer port required if peer address is specified.
      Signed-off-by: default avatarKristian Evensen <kristian.evensen@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      1713cb37
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 1a9df9e2
      Linus Torvalds authored
      Pull networking fixes from David Miller:
       "Fixes here and there, a couple new device IDs, as usual:
      
         1) Fix BQL race in dpaa2-eth driver, from Ioana Ciornei.
      
         2) Fix 64-bit division in iwlwifi, from Arnd Bergmann.
      
         3) Fix documentation for some eBPF helpers, from Quentin Monnet.
      
         4) Some UAPI bpf header sync with tools, also from Quentin Monnet.
      
         5) Set descriptor ownership bit at the right time for jumbo frames in
            stmmac driver, from Aaro Koskinen.
      
         6) Set IFF_UP properly in tun driver, from Eric Dumazet.
      
         7) Fix load/store doubleword instruction generation in powerpc eBPF
            JIT, from Naveen N. Rao.
      
         8) nla_nest_start() return value checks all over, from Kangjie Lu.
      
         9) Fix asoc_id handling in SCTP after the SCTP_*_ASSOC changes this
            merge window. From Marcelo Ricardo Leitner and Xin Long.
      
        10) Fix memory corruption with large MTUs in stmmac, from Aaro
            Koskinen.
      
        11) Do not use ipv4 header for ipv6 flows in TCP and DCCP, from Eric
            Dumazet.
      
        12) Fix topology subscription cancellation in tipc, from Erik Hugne.
      
        13) Memory leak in genetlink error path, from Yue Haibing.
      
        14) Valid control actions properly in packet scheduler, from Davide
            Caratti.
      
        15) Even if we get EEXIST, we still need to rehash if a shrink was
            delayed. From Herbert Xu.
      
        16) Fix interrupt mask handling in interrupt handler of r8169, from
            Heiner Kallweit.
      
        17) Fix leak in ehea driver, from Wen Yang"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (168 commits)
        dpaa2-eth: fix race condition with bql frame accounting
        chelsio: use BUG() instead of BUG_ON(1)
        net: devlink: skip info_get op call if it is not defined in dumpit
        net: phy: bcm54xx: Encode link speed and activity into LEDs
        tipc: change to check tipc_own_id to return in tipc_net_stop
        net: usb: aqc111: Extend HWID table by QNAP device
        net: sched: Kconfig: update reference link for PIE
        net: dsa: qca8k: extend slave-bus implementations
        net: dsa: qca8k: remove leftover phy accessors
        dt-bindings: net: dsa: qca8k: support internal mdio-bus
        dt-bindings: net: dsa: qca8k: fix example
        net: phy: don't clear BMCR in genphy_soft_reset
        bpf, libbpf: clarify bump in libbpf version info
        bpf, libbpf: fix version info and add it to shared object
        rxrpc: avoid clang -Wuninitialized warning
        tipc: tipc clang warning
        net: sched: fix cleanup NULL pointer exception in act_mirr
        r8169: fix cable re-plugging issue
        net: ethernet: ti: fix possible object reference leak
        net: ibm: fix possible object reference leak
        ...
      1a9df9e2
    • David S. Miller's avatar
      Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue · eec7e295
      David S. Miller authored
      Jeff Kirsher says:
      
      ====================
      100GbE Intel Wired LAN Driver Updates 2019-03-26
      
      This series contains more updates to the ice driver only.
      
      Jeremiah provides his first patch to the Linux kernel to clean up
      un-necessary newlines in driver log messages.
      
      Mitch updates the ice driver to use existing status codes in the iavf
      driver so that when errors occur, it will not report nonsensical
      results.  Adds support for VF admin queue interrupts by programming the
      VPINT_MBX_CTL register array.
      
      Brett adds a check for a bit that we set while preparing for a reset, to
      ensure we are prepared to do a proper reset.  Also implemented PCI error
      handling operations.  Went through and audited the hot path with pahole
      and made modifications based on the results since 2 structures were
      taking up more space than necessary due to cache alignment issues.
      Fixed an issue where when flow control was disabled, the state of flow
      control was being displayed as "Unknown".
      
      Anirudh fixes adaptive interrupt moderation changes by adding code that
      was missed, that should have been added in the initial patch to add that
      support.  Cleaned up a function prototype that was never implemented.
      Did additional code cleanup by removing unneeded braces and redundant
      code comments.
      
      Akeem fixes an issue that occurs when the VF is attempting to remove the
      default LAN/MAC address, which is programmed by the administrator by
      updating the error message to explicitly say that the VF cannot change
      the MAC programmed by the PF.
      
      Preethi fixes the driver to not fall into the error path when a added
      filter already exists, but instead continue to process the rest of the
      function and add appropriate checks after adding MAC filters.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      eec7e295
    • David S. Miller's avatar
      Merge branch 'net-mvpp2-Classifier-updates-and-cleanups' · b0be25c5
      David S. Miller authored
      Maxime Chevallier says:
      
      ====================
      net: mvpp2: Classifier updates and cleanups
      
      In preparation for the future addition of classification offload
      support, this series cleans-up the current code dealing with the
      classification engines.
      
      As of today, the classification code only deals with RSS, which is a
      very narrow use-case when considering all of the classifier's features.
      
      This lead to design and naming decisions that don't really stand when
      considering using more classification features.
      
      This series therefore includes quite a lot a renaming of functions and
      macros, and tries to make the naming schemes more consistent and the
      code more readable.
      
      The debugfs interface that allows reading the various Parsing and
      Classification engines has been cleaned-up and made more generic,
      allowing to read the Flow Table and C2 engine hit counters on a
      per-entry basis.
      
      The Classifier itself has been made more robust by introducing the
      lu_type field in classification lookups, that prevents false-positive
      matches in the future. We also initialise the various engine in a more
      extensive and less error-prone way by assining default values to all
      entries in the C2 and Flow table.
      
      This is a pretty big series considering it's mostly cleanup, but my goal
      is to make the future series that will contain new big features easier
      to review, focusing on the real logic.
      
      Besides the debugfs interface, this series doesn't intend to introduce any
      new features or change in behaviours.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b0be25c5
    • Maxime Chevallier's avatar
      net: mvpp2: cls: Rework C2 engine macros · c2d3d8ee
      Maxime Chevallier authored
      The C2 classification engine has a 256 entry TCAM, used for ternary
      matches on an 8 byte Header Extracted Key. For now, we compute the
      various indices for classification and RSS that use this engine thanks
      to a set of macros.
      
      This commit mainly renames the macros used to make it clear that they
      should be used with the C2 engine, but also make use of the full 256
      entries in the engine. For now, the C2 entries are only used for RSS.
      
      These entries are put at the end of the TCAM range, in case we want to
      add higher priority matches later on.
      Signed-off-by: default avatarMaxime Chevallier <maxime.chevallier@bootlin.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c2d3d8ee
    • Maxime Chevallier's avatar
      net: mvpp2: cls: Initialize lookup priorities for all entries in the flow · 693131db
      Maxime Chevallier authored
      When classifying a packet pertaining to a given flow, the classifier
      will issue multiple lookup commands until it finds one with the 'last'
      bit set. It expects all prorities to be assign continuously (although
      not necessarily in an ordered fashion) from 0 to the number of lookups.
      
      We can initialize this once, and make sure unused lookups are given an
      empty port map. This avoids having to maintain priorities and the
      information of which lookup is the last.
      Signed-off-by: default avatarMaxime Chevallier <maxime.chevallier@bootlin.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      693131db
    • Maxime Chevallier's avatar
      net: mvpp2: cls: Invalidate all C2 entries except the ones we use · 8d2847d9
      Maxime Chevallier authored
      C2 TCAM entries can be invalidated to avoid unwanted matches. Make sure
      all entries are invalidated at init, then validate only the ones we use.
      Signed-off-by: default avatarMaxime Chevallier <maxime.chevallier@bootlin.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8d2847d9
    • Maxime Chevallier's avatar
      net: mvpp2: cls: Rename the flow table macros · ff2f3cb6
      Maxime Chevallier authored
      The Flow Table dictates what lookups will be issued for each flow type.
      The lookup sequence for each flow is similar, and the index of each
      lookup is computed by some macros.
      
      There are similar mechanisms for the C2 TCAM lookups, so in order to
      avoid confusion, rename the flow table index computing macros with a
      common prefix.
      
      The only difference in behaviour is that we now use the very first entry
      in the flow for the RSS lookup (the first entry was previously unused).
      Signed-off-by: default avatarMaxime Chevallier <maxime.chevallier@bootlin.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ff2f3cb6
    • Maxime Chevallier's avatar
      net: mvpp2: cls: Don't use the sequence attribute for classification · 5b353806
      Maxime Chevallier authored
      The classifier allows to combine multiple lookups in one "sequence" that
      is counted as a single lookup to an engine, with a single result.
      
      We don't actually use that feature, so remove any places where we set
      this field, so that the classifier doesn't try to interpret these
      fields.
      Signed-off-by: default avatarMaxime Chevallier <maxime.chevallier@bootlin.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      5b353806
    • Maxime Chevallier's avatar
      net: mvpp2: cls: Rename classifer per-port functions · 6310f77d
      Maxime Chevallier authored
      This commit renames some of the classifier functions to follow the
      naming 'mvpp2_port_*' that's used for function that act on a given port.
      
      This commit is purely cosmetic.
      Signed-off-by: default avatarMaxime Chevallier <maxime.chevallier@bootlin.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6310f77d
    • Maxime Chevallier's avatar
      net: mvpp2: cls: Move C2 read/write helpers around · b11ffdc5
      Maxime Chevallier authored
      Move C2 read/write helpers higher in the file to ease future work that
      rely on these helpers
      Signed-off-by: default avatarMaxime Chevallier <maxime.chevallier@bootlin.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b11ffdc5
    • Maxime Chevallier's avatar
      net: mvpp2: cls: Write C2 TCAM data last when writing a C2 entry · 147c538e
      Maxime Chevallier authored
      When writing a C2 entry to hardware, some registers writes will only
      take effect when the TCAM_DATA4 register is written. This includes all
      C2 TCAM registers, and the C2 invalidate register.
      
      To make sure we always write C2 entries correctly, document that
      behaviour with a comment, and move TCAM writes to the end of the
      mvpp2_cls_c2_write helper.
      Signed-off-by: default avatarMaxime Chevallier <maxime.chevallier@bootlin.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      147c538e
    • Maxime Chevallier's avatar
      net: mvpp2: cls: Use iterators to go through the cls_table · e4bfb4ac
      Maxime Chevallier authored
      The cls_table is a global read-only table containing the different
      parameters that are used by various tables in the classifier. It
      describes the links between the Header Parser, the decoding table and
      the flow_table.
      
      There are several possible way we want to iterate over that table,
      depending on wich classifier engine we want to configure. For the Header
      Parser, we want to iterate over each entry. For the Decoding table, we
      want to iterate over each entry having a unique flow_id. Finally, when
      configuring an ethtool flow, we want to iterate over each entry having a
      unique flow_id and that has a given flow_type.
      
      This commit introduces some iterator to both provide syntactic sugar and
      also clarify the way we want to iterate over the table.
      Signed-off-by: default avatarMaxime Chevallier <maxime.chevallier@bootlin.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e4bfb4ac