1. 24 Jan, 2018 40 commits
    • igb: Free IRQs when device is hotplugged · 888f2293
      Lyude Paul authored
      Recently I got a Caldigit TS3 Thunderbolt 3 dock, and noticed that upon
      hotplugging my kernel would immediately crash due to igb:
      
      [  680.825801] kernel BUG at drivers/pci/msi.c:352!
      [  680.828388] invalid opcode: 0000 [#1] SMP
      [  680.829194] Modules linked in: igb(O) thunderbolt i2c_algo_bit joydev vfat fat btusb btrtl btbcm btintel bluetooth ecdh_generic hp_wmi sparse_keymap rfkill wmi_bmof iTCO_wdt intel_rapl x86_pkg_temp_thermal coretemp crc32_pclmul snd_pcm rtsx_pci_ms mei_me snd_timer memstick snd pcspkr mei soundcore i2c_i801 tpm_tis psmouse shpchp wmi tpm_tis_core tpm video hp_wireless acpi_pad rtsx_pci_sdmmc mmc_core crc32c_intel serio_raw rtsx_pci mfd_core xhci_pci xhci_hcd i2c_hid i2c_core [last unloaded: igb]
      [  680.831085] CPU: 1 PID: 78 Comm: kworker/u16:1 Tainted: G           O     4.15.0-rc3Lyude-Test+ #6
      [  680.831596] Hardware name: HP HP ZBook Studio G4/826B, BIOS P71 Ver. 01.03 06/09/2017
      [  680.832168] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
      [  680.832687] RIP: 0010:free_msi_irqs+0x180/0x1b0
      [  680.833271] RSP: 0018:ffffc9000030fbf0 EFLAGS: 00010286
      [  680.833761] RAX: ffff8803405f9c00 RBX: ffff88033e3d2e40 RCX: 000000000000002c
      [  680.834278] RDX: 0000000000000000 RSI: 00000000000000ac RDI: ffff880340be2178
      [  680.834832] RBP: 0000000000000000 R08: ffff880340be1ff0 R09: ffff8803405f9c00
      [  680.835342] R10: 0000000000000000 R11: 0000000000000040 R12: ffff88033d63a298
      [  680.835822] R13: ffff88033d63a000 R14: 0000000000000060 R15: ffff880341959000
      [  680.836332] FS:  0000000000000000(0000) GS:ffff88034f440000(0000) knlGS:0000000000000000
      [  680.836817] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  680.837360] CR2: 000055e64044afdf CR3: 0000000001c09002 CR4: 00000000003606e0
      [  680.837954] Call Trace:
      [  680.838853]  pci_disable_msix+0xce/0xf0
      [  680.839616]  igb_reset_interrupt_capability+0x5d/0x60 [igb]
      [  680.840278]  igb_remove+0x9d/0x110 [igb]
      [  680.840764]  pci_device_remove+0x36/0xb0
      [  680.841279]  device_release_driver_internal+0x157/0x220
      [  680.841739]  pci_stop_bus_device+0x7d/0xa0
      [  680.842255]  pci_stop_bus_device+0x2b/0xa0
      [  680.842722]  pci_stop_bus_device+0x3d/0xa0
      [  680.843189]  pci_stop_and_remove_bus_device+0xe/0x20
      [  680.843627]  trim_stale_devices+0xf3/0x140
      [  680.844086]  trim_stale_devices+0x94/0x140
      [  680.844532]  trim_stale_devices+0xa6/0x140
      [  680.845031]  ? get_slot_status+0x90/0xc0
      [  680.845536]  acpiphp_check_bridge.part.5+0xfe/0x140
      [  680.846021]  acpiphp_hotplug_notify+0x175/0x200
      [  680.846581]  ? free_bridge+0x100/0x100
      [  680.847113]  acpi_device_hotplug+0x8a/0x490
      [  680.847535]  acpi_hotplug_work_fn+0x1a/0x30
      [  680.848076]  process_one_work+0x182/0x3a0
      [  680.848543]  worker_thread+0x2e/0x380
      [  680.848963]  ? process_one_work+0x3a0/0x3a0
      [  680.849373]  kthread+0x111/0x130
      [  680.849776]  ? kthread_create_worker_on_cpu+0x50/0x50
      [  680.850188]  ret_from_fork+0x1f/0x30
      [  680.850601] Code: 43 14 85 c0 0f 84 d5 fe ff ff 31 ed eb 0f 83 c5 01 39 6b 14 0f 86 c5 fe ff ff 8b 7b 10 01 ef e8 b7 e4 d2 ff 48 83 78 70 00 74 e3 <0f> 0b 49 8d b5 a0 00 00 00 e8 62 6f d3 ff e9 c7 fe ff ff 48 8b
      [  680.851497] RIP: free_msi_irqs+0x180/0x1b0 RSP: ffffc9000030fbf0
      
      As it turns out, normally the freeing of IRQs that would fix this is called
      inside of the scope of __igb_close(). However, since the device is
      already gone by the point we try to unregister the netdevice from the
      driver due to a hotplug we end up seeing that the netif isn't present
      and thus, forget to free any of the device IRQs.
      
      So: make sure that if we're in the process of dismantling the netdev, we
      always allow __igb_close() to be called so that IRQs may be freed
      normally. Additionally, only allow igb_close() to be called from
      __igb_close() if it hasn't already been called for the given adapter.
      Signed-off-by: Lyude Paul <lyude@redhat.com>
      Fixes: 9474933c ("igb: close/suspend race in netif_device_detach")
      Cc: Todd Fujinaka <todd.fujinaka@intel.com>
      Cc: Stephen Hemminger <stephen@networkplumber.org>
      Cc: stable@vger.kernel.org
      Tested-by: Aaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • e1000e: Alert the user that C-states will be disabled by enabling jumbo frames · 8299b006
      Matt Turner authored
      I personally spent a long time trying to decipher why my CPU would not
      reach deeper C-states. Let's just tell the next user what's going on.
      Signed-off-by: Matt Turner <matt.turner@intel.com>
      Acked-by: Shannon Nelson <shannon.nelson@oracle.com>
      Tested-by: Aaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • igb: Clarify idleslope config constraints · 0da6090f
      Jesus Sanchez-Palencia authored
      By design, the idleslope increments are restricted to 16.384kbps steps.
      Add a comment to igb_main.c making that explicit and add one example
      that illustrates the impact of that.
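
      The effect of the 16.384 kbps step size can be sketched with a little
      arithmetic. This is only an illustration, not the driver code: the
      function name and the 2576 kbps request are assumptions, and the commit
      message does not say whether the driver rounds up or down, so both
      candidates are shown.

```python
# Illustrative arithmetic only; this is not the igb driver code.
STEP_BPS = 16_384  # one idleslope step: 16.384 kbps, as stated in the commit message


def idleslope_candidates(requested_bps):
    """Return the nearest representable rates (rounded down, rounded up) in bit/s."""
    steps_down = requested_bps // STEP_BPS
    steps_up = -(-requested_bps // STEP_BPS)  # ceiling division on integers
    return steps_down * STEP_BPS, steps_up * STEP_BPS


if __name__ == "__main__":
    low, high = idleslope_candidates(2_576_000)  # hypothetical request: 2576 kbps
    print(f"rounded down: {low / 1000:.3f} kbps, rounded up: {high / 1000:.3f} kbps")
```

      A request that is not a multiple of the step can only be approximated,
      either slightly below or slightly above the requested bandwidth.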
      Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com>
      Tested-by: Aaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • e1000e: Set HTHRESH when PTHRESH is used · b701cacd
      Matt Turner authored
      According to section 12.0.3.4.13 "Receive Descriptor Control - RXDCTL"
      of the Intel® 82579 Gigabit Ethernet PHY Datasheet v2.1:
      
          "HTHRESH should be given a non zero value when ever PTHRESH is
           used."
      
      In RXDCTL(0), PTHRESH lives at bits 5:0, and HTHRESH lives at bits 13:8.
      Set only bit 8 of HTHRESH as is done in e1000_flush_rx_ring(). Found by
      inspection.
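
      The bit layout described above can be sketched as follows. The field
      positions come from the commit message; the constant and helper names
      are hypothetical, and the snippet models the register as a plain
      integer rather than reproducing the e1000e code.

```python
# Bit layout taken from the commit message; all names here are hypothetical.
PTHRESH_MASK = 0x3F                    # bits 5:0
HTHRESH_SHIFT = 8
HTHRESH_MASK = 0x3F << HTHRESH_SHIFT   # bits 13:8


def set_hthresh_bit8(rxdctl):
    """Return rxdctl with bit 8 (the lowest HTHRESH bit) set, other bits untouched."""
    return rxdctl | (1 << HTHRESH_SHIFT)


def pthresh(rxdctl):
    """Extract the PTHRESH field."""
    return rxdctl & PTHRESH_MASK


def hthresh(rxdctl):
    """Extract the HTHRESH field."""
    return (rxdctl & HTHRESH_MASK) >> HTHRESH_SHIFT
```

      With PTHRESH already programmed, setting bit 8 gives HTHRESH the
      smallest non-zero value while leaving PTHRESH as it was.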
      Signed-off-by: Matt Turner <matt.turner@intel.com>
      Tested-by: Aaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • igb: add function to get maximum RSS queues · 28cb2d1b
      Zhang Shengju authored
      This patch adds a new function, igb_get_max_rss_queues(), to get the
      maximum number of RSS queues. This reduces duplicate code and
      facilitates future maintenance.
      Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
      Tested-by: Aaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • igb: Allow to remove administratively set MAC on VFs · 177132df
      Corinna Vinschen authored
      Before libvirt modifies the MAC address and vlan tag for an SRIOV VF
      for use by a virtual machine (either using vfio device assignment or
      macvtap passthru mode), it saves the current MAC address and vlan tag
      so that it can reset them to their original value when the guest is
      done.  Libvirt can't leave the VF MAC set to the value used by the
      now-defunct guest since it may be started again later using a
      different VF, but it certainly shouldn't just pick any random value,
      either. So it saves the state of everything prior to using the VF, and
      resets it to that.
      
      The igb driver initializes the MAC addresses of all VFs to
      00:00:00:00:00:00, and reports that when asked (via an RTM_GETLINK
      netlink message, also visible in the list of VFs in the output of "ip
      link show"). But when libvirt attempts to restore the MAC address back
      to 00:00:00:00:00:00 (using an RTM_SETLINK netlink message) the kernel
      responds with "Invalid argument".
      
      Forbidding a reset back to the original value leaves the VF MAC at the
      value set for the now-defunct virtual machine. Especially on a system
      with NetworkManager enabled, this has very bad consequences, since
      NetworkManager forces all interfaces to be IFF_UP all the time - if
      the same virtual machine is restarted using a different VF (or even on
      a different host), there will be multiple interfaces watching for
      traffic with the same MAC address.
      
      To allow libvirt to revert to the original state, we need a way to
      remove the administratively set MAC on a VF, to allow normal host
      operation again, and to reset/overwrite the VF MAC via VF netdev.
      
      This patch implements the outlined scenario by allowing the
      VF MAC to be set to 00:00:00:00:00:00 via RTM_SETLINK on the PF.
      igb_ndo_set_vf_mac resets the IGB_VF_FLAG_PF_SET_MAC flag to 0,
      so it's possible to reset the VF MAC back to the original value via
      the VF netdev.
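
      The intended behaviour can be modelled roughly as below. This is a
      simplified sketch, not the igb implementation: the class and function
      names are hypothetical, `pf_set_mac` stands in for
      IGB_VF_FLAG_PF_SET_MAC, and the unicast check is an assumption about
      what counts as a valid address.

```python
# Simplified model; class and function names are hypothetical, not igb's.
from dataclasses import dataclass

ZERO_MAC = bytes(6)  # 00:00:00:00:00:00


def is_valid_unicast(mac):
    """A usable unicast MAC: 6 bytes, not all-zero, multicast bit clear."""
    return len(mac) == 6 and mac != ZERO_MAC and not (mac[0] & 0x01)


@dataclass
class VfState:
    mac: bytes = ZERO_MAC
    pf_set_mac: bool = False  # models IGB_VF_FLAG_PF_SET_MAC


def pf_set_vf_mac(vf, mac):
    """PF-side request: all-zero clears the administrative MAC, unicast sets it."""
    if mac == ZERO_MAC:
        vf.mac = ZERO_MAC
        vf.pf_set_mac = False
    elif is_valid_unicast(mac):
        vf.mac = mac
        vf.pf_set_mac = True
    else:
        raise ValueError("Invalid argument")
```

      In this model, a tool such as libvirt can set a guest MAC, then later
      pass the all-zero address to return the VF to its initial state.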
      
      Note: Recent patches to libvirt allow for a workaround if the NIC
      isn't capable of resetting the administrative MAC back to all 0, but
      in theory the NIC should allow resetting the MAC in the first place.
      Signed-off-by: Corinna Vinschen <vinschen@redhat.com>
      Tested-by: Aaron Brown <arron.f.brown@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • Merge branch 'pktgen-Behavior-flags-fixes' · 46410c2e
      David S. Miller authored
      Dmitry Safonov says:
      
      ====================
      pktgen: Behavior flags fixes
      
      v2:
        o fixed a nitpick from David Miller
      
      This series contains a number of fixes, cleanups, and documentation
      updates. The diffstat speaks for itself, apart from the added docs and
      the missing flag parameters.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • pktgen: Clean read user supplied flag mess · 52e12d5d
      Dmitry Safonov authored
      Don't use the error-prone brute-force way.
      Signed-off-by: Dmitry Safonov <dima@arista.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • pktgen: Remove brute-force printing of flags · 99c6d3d2
      Dmitry Safonov authored
      Add macro generated pkt_flag_names array, with a little help of which
      the flags can be printed by using an index.
      Signed-off-by: Dmitry Safonov <dima@arista.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • pktgen: Add behaviour flags macro to generate flags/names · 6f107c74
      Dmitry Safonov authored
      The PKT_FLAGS macro will be used to add packet behavior name definitions
      to simplify the code that prints/reads pkt flags.
      The array is sorted in the order in which pktgen_if_show() prints the flags.
      Note: Renamed IPSEC_ON => IPSEC for simplicity.
      
      No visible behavior change expected.
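
      The idea of deriving both the flag values and their printable names
      from a single list can be sketched in Python. This is only an analogue
      of the C macro technique: the flag subset, its order, and the helper
      names are illustrative assumptions, not pktgen's actual table.

```python
# Python analogue of a macro-generated table; the subset and bit order are illustrative.
PKT_FLAG_NAMES = ["IPV6", "IPSEC", "FLOW_SEQ"]

# Bit value for each flag, derived from its index in the single list above.
PKT_FLAG_BITS = {name: 1 << i for i, name in enumerate(PKT_FLAG_NAMES)}


def format_flags(flags):
    """List the names of all set flags, in table order, by indexing the name array."""
    return " ".join(
        name for i, name in enumerate(PKT_FLAG_NAMES) if flags & (1 << i)
    )


if __name__ == "__main__":
    print(format_flags(PKT_FLAG_BITS["IPV6"] | PKT_FLAG_BITS["IPSEC"]))
```

      Because names and bit values come from one definition, adding a flag
      touches a single place and the two can no longer drift apart.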
      Signed-off-by: Dmitry Safonov <dima@arista.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • pktgen: Add missing !flag parameters · 57a5749b
      Dmitry Safonov authored
      o FLOW_SEQ now can be disabled with pgset "flag !FLOW_SEQ"
      o FLOW_SEQ and FLOW_RND are antonyms, as it's shown by pktgen_if_show()
      o IPSEC now may be disabled
      
      Note that IPV6 is enabled with dst6/src6 parameters, not with
      a flag parameter.
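
      The "!" prefix convention for disabling a flag can be sketched like
      this. It is a minimal model under stated assumptions: the flag table,
      bit positions, and function name are hypothetical and do not mirror
      pktgen's parser.

```python
# Hypothetical flag subset; bit positions are illustrative, not pktgen's.
FLAG_BITS = {"FLOW_SEQ": 1 << 0, "IPSEC": 1 << 1}


def apply_flag(flags, token):
    """Set the named flag, or clear it when the token starts with '!'."""
    disable = token.startswith("!")
    name = token[1:] if disable else token
    if name not in FLAG_BITS:
        raise ValueError(f"unknown flag: {name}")
    bit = FLAG_BITS[name]
    return flags & ~bit if disable else flags | bit
```

      Calling `apply_flag(flags, "!FLOW_SEQ")` clears the bit that
      `apply_flag(flags, "FLOW_SEQ")` would set, so every flag in the table
      gets a matching way to be switched off.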
      Signed-off-by: Dmitry Safonov <dima@arista.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Documentation/pktgen: Clearify how-to use pktgen samples · d2ee7973
      Dmitry Safonov authored
      o Change process name in ps output: looks like, these days the process
        is named kpktgend_<cpu>, rather than pktgen/<cpu>.
      o Use pg_ctrl for start/stop as it can work well with pgset without
        changes to $(PGDEV) variable.
      o Clarify the $(PGDEV) definition needed for sample scripts and that
        one needs to `source functions.sh`.
      o Document how to unset a behaviour flag, note about history expansion.
      o Fix pgset spi parameter value.
      
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: linux-doc@vger.kernel.org
      Signed-off-by: Dmitry Safonov <dima@arista.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'cxgb4-fix-build-error' · 969ade40
      David S. Miller authored
      Rahul Lakkireddy says:
      
      ====================
      cxgb4: fix build error
      
      Patch 1 fixes a build error when compiling cudbg_zlib.c while the
      CONFIG_ZLIB_DEFLATE macro is not defined.

      Patch 2 fixes the following sparse warning:
      "Using plain integer as NULL pointer"
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • cxgb4: properly initialize variables · 325694e6
      Rahul Lakkireddy authored
      memset variables to 0 to fix sparse warnings:
      
      drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.c:409:42: sparse: Using
      plain integer as NULL pointer
      
      drivers/net/ethernet/chelsio/cxgb4/cudbg_zlib.c:43:47: sparse: Using
      plain integer as NULL pointer
      
      Fixes: ad75b7d3 ("cxgb4: implement ethtool dump data operations")
      Fixes: 91c1953d ("cxgb4: use zlib deflate to compress firmware dump")
      Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
      Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • cxgb4: enable ZLIB_DEFLATE when building cxgb4 · a1cf9c9f
      Rahul Lakkireddy authored
      Fixes:
      drivers/net/ethernet/chelsio/cxgb4/cudbg_zlib.c:39:5: error:
      redefinition of 'cudbg_compress_buff'
          int cudbg_compress_buff(struct cudbg_init *pdbg_init,
              ^~~~~~~~~~~~~~~~~~~
         In file included from
      drivers/net/ethernet/chelsio/cxgb4/cudbg_zlib.c:23:0:
         drivers/net/ethernet/chelsio/cxgb4/cudbg_zlib.h:45:19: note: previous
      definition of 'cudbg_compress_buff' was here
          static inline int cudbg_compress_buff(struct cudbg_init *pdbg_init,
                            ^~~~~~~~~~~~~~~~~~~
      
      Fixes: 91c1953d ("cxgb4: use zlib deflate to compress firmware dump")
      Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
      Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'net-smc-socket-closing-improvements' · 9e1a27cd
      David S. Miller authored
      Ursula Braun says:
      
      ====================
      net/smc: socket closing improvements
      
      While the first 2 patches are just small cleanups, the remaining
      patches affect socket closing.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/smc: continue waiting if peer signals write_shutdown · aa377e68
      Ursula Braun authored
      If the peer sends a shutdown WRITE, this should not affect sending
      in general, and waiting for send buffer space in particular.
      The local socket stops waiting for send buffer space only if the peer
      signals closing, not if the peer signals just shutdown WRITE.
      Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/smc: improve state change handling after close wait · bbb96bf2
      Ursula Braun authored
      When a socket is closed or shutdown, smc waits for data being transmitted
      in certain states. If the state changes during this wait, the close
      switch depending on state should be reentered.
      In addition, state change is avoided if sending of close or shutdown fails.
      Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/smc: make wait for work request uninterruptible · 86e780d3
      Ursula Braun authored
      Work requests are needed for every ib_post_send(), among them the
      ib_post_send() to signal closing. If an smc socket program is cancelled,
      the smc connections should be cleaned up, and require sending of closing
      signals to the peer. This may fail, if a wait for
      a free work request is needed, but is cancelled immediately due to the
      cancel interrupt. To guarantee notification of the peer, the wait for
      a work request is changed to uninterruptible.
      
      And the area to receive work request completion info with
      ib_poll_cq() is cleared first.
      And _tx_ variable names are used in the _tx_routines for the
      demultiplexing common type in the header.
      Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/smc: get rid of tx_pend waits in socket closing · 8429c134
      Ursula Braun authored
      There is no need to wait for confirmation of pending tx requests
      for a closing connection, since pending tx slots are dismissed
      when finishing a connection.
      Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/smc: simplify function smc_clcsock_accept() · 35a6b178
      Ursula Braun authored
      Cleanup to avoid duplicate code in smc_clcsock_accept().
      No functional change.
      Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/smc: use local struct sock variables consistently · 3163c507
      Ursula Braun authored
      Cleanup to consistently exploit the local struct sock definitions.
      No functional change.
      Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • cxgb4/cxgb4vf: add support for ndo_set_vf_vlan · 9d5fd927
      Ganesh Goudar authored
      implement ndo_set_vf_vlan for mgmt netdevice to configure
      the PCIe VF.
      
      Original work by: Casey Leedom <leedom@chelsio.com>
      Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'bpf-and-netdevsim-test-updates' · 43df215d
      David S. Miller authored
      Jakub Kicinski says:
      
      ====================
      bpf and netdevsim test updates
      
      A number of test improvements (delayed by merges).  Quentin provides
      patches for checking printing to the verifier log from the drivers
      and validating extack messages are propagated.  There is also a test
      for replacing TC filters to avoid adding back the bug Daniel recently
      fixed in net and stable.
      ====================
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests/bpf: validate replace of TC filters is working · 6d2d58f1
      Jakub Kicinski authored
      Daniel discovered recently I broke TC filter replace (and fixed
      it in commit ad9294db ("bpf: fix cls_bpf on filter replace")).
      Add a test to make sure it never happens again.
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests/bpf: check bpf verifier log buffer usage works for HW offload · 9045bdc8
      Quentin Monnet authored
      Make netdevsim print a message to the BPF verifier log buffer when a
      program is offloaded.
      
      Then use this message in hardware offload selftests to make sure that
      using this buffer actually prints the message to the console for
      eBPF hardware offload.
      
      The message is appended after the last instruction is processed with the
      verifying function from netdevsim. Output looks like the following:
      
          $ tc filter add dev foo ingress bpf obj sample_ret0.o \
              sec .text verbose skip_sw
      
          Prog section '.text' loaded (5)!
           - Type:         3
           - Instructions: 2 (0 over limit)
           - License:
      
          Verifier analysis:
      
          0: (b7) r0 = 0
          1: (95) exit
          [netdevsim] Hello from netdevsim!
          processed 2 insns, stack depth 0
      
      "verbose" flag is required to see it in the console since netdevsim does
      not throw an error after printing the message.
      Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
      Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netdevsim: don't compile BPF code if syscall not enabled · 7c5db7e7
      Jakub Kicinski authored
      We should not compile netdevsim/bpf.c if BPF syscall is not
      enabled.  Otherwise bpf core would have to provide wrappers
      for all functions offload drivers may call, even though
      system will never see a BPF object.
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests/bpf: add checks on extack messages for eBPF hw offload tests · caf95228
      Quentin Monnet authored
      Add checks to test that netlink extack messages are correctly displayed
      in some expected error cases for eBPF offload to netdevsim with TC and
      XDP.
      
      iproute2 may be built without libmnl support, in which case the extack
      messages will not be reported.  Try to detect this condition, and when
      encountered print a mild warning to the user and skip the extack validation.
      Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
      Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netdevsim: add extack support for TC eBPF offload · 728461f2
      Quentin Monnet authored
      Use the recently added extack support for TC eBPF filters in netdevsim.
      Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
      Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue · 52150464
      David S. Miller authored
      Jeff Kirsher says:
      
      ====================
      40GbE Intel Wired LAN Driver Updates 2018-01-23
      
      This series contains updates to i40e and i40evf only.
      
      Pawel enables FlatNVM support on x722 devices by allowing nvmupdate tool
      to configure the preservation flags in the AdminQ command.
      
      Mitch fixes a potential divide by zero error when DCB is enabled and
      the firmware fails to configure the VSI, so check for this state.
      Fixed a bug where the driver could fail to adhere to ETS bandwidth
      allocations if 8 traffic classes were configured on the switch.
      
      Sudheer fixes a potential deadlock by not calling
      flush_scheduled_work() in i40evf_remove(), since cancel_work_sync()
      and cancel_delayed_work_sync() already clean up the necessary work items.
      Fixed an issue with the problematic detection and recovery from
      hung queues in the PF which was causing lost interrupts.  This is done
      by triggering a software interrupt so that interrupts are forced on
      and if we are already in napi_poll and an interrupt fires, napi_poll
      will not be rescheduled and the interrupt is lost.
      
      Avinash fixes an issue in the VF where it was possible to issue a
      reset_task while the device is currently being removed.
      
      Michal fixes an issue occurring while calling i40e_led_set() with
      the blink parameter set to true, which was causing the activity LED
      instead of the link LED to blink for port identification.
      
      Shiraz changes the client interface to not call client close/open on
      netdev down/up events, since this causes a lot of thrash that is
      not needed.  Instead, disable the PE TCP-ENA flag during a netdev
      down event and re-enable on a netdev up event, since this blocks all
      TCP traffic to the RDMA protocol engine.
      
      Alan fixes an issue which was causing a potential transmit hang by
      ignoring the PF link up message if the VF state is not yet in the
      RUNNING state.
      
      Amritha fixes the channel VSI recreation during the reset flow to
      reconfigure the transmit rings and the queue context associated with
      the channel VSI.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'act_csum-spinlock-remove' · 6b44d0f9
      David S. Miller authored
      Davide Caratti says:
      
      ====================
      net/sched: remove spinlock from 'csum' action
      
      Similarly to what has been done earlier with other actions [1][2], this
      series tries to improve the performance of 'csum' tc action, removing a
      spinlock in the data path. Patch 1 lets act_csum use per-CPU counters;
      patch 2 removes spin_{,un}lock_bh() calls from the act() method.
      
      test procedure (using pktgen from https://github.com/netoptimizer):
      
       # ip link add name eth1 type dummy
       # ip link set dev eth1 up
       # tc qdisc add dev eth1 root handle 1: prio
       # for a in pass drop; do
       > tc filter del dev eth1 parent 1: pref 10 matchall action csum udp
       > tc filter add dev eth1 parent 1: pref 10 matchall action csum udp $a
       > for n in 2 4; do
       > ./pktgen_bench_xmit_mode_queue_xmit.sh -v -s 64 -t $n -n 1000000 -i eth1
       > done
       > done
      
      test results:
      
            |    |  before patch   |   after patch
        $a  | $n | avg. pps/thread | avg. pps/thread
       -----+----+-----------------+----------------
       pass |  2 |    1671463 ± 4% |    1920789 ± 3%
       pass |  4 |     648797 ± 1% |     738190 ± 1%
       drop |  2 |    3212692 ± 2% |    3719811 ± 2%
       drop |  4 |    1078824 ± 1% |    1328099 ± 1%
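
      The relative gain implied by the table can be worked out with a short
      script. The numbers are copied from the table above; the function name
      is an assumption, and the ± margins reported in the table are ignored,
      so the percentages are only indicative.

```python
# Numbers copied from the results table (average pps per thread, before/after).
RESULTS = {
    ("pass", 2): (1671463, 1920789),
    ("pass", 4): (648797, 738190),
    ("drop", 2): (3212692, 3719811),
    ("drop", 4): (1078824, 1328099),
}


def gain_percent(before, after):
    """Relative throughput change in percent."""
    return (after - before) / before * 100.0


if __name__ == "__main__":
    for (action, threads), (before, after) in RESULTS.items():
        print(f"{action} x{threads}: {gain_percent(before, after):+.1f}%")
```

      On these figures the improvement is roughly 14% to 23% per thread,
      with the largest gain in the 4-thread 'drop' case.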
      
      references:
      
      [1] https://www.spinics.net/lists/netdev/msg334760.html
      [2] https://www.spinics.net/lists/netdev/msg465862.html
      
      v3 changes:
       - use rtnl_dereference() in place of rcu_dereference() in tcf_csum_dump()
      
      v2 changes:
       - add 'drop' test, it produces more contentions
       - use RCU-protected struct to store 'action' and 'update_flags', to avoid
         reading the values from subsequent configurations
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/sched: act_csum: don't use spinlock in the fast path · 9c5f69bb
      Davide Caratti authored
      use RCU instead of spin_{,un}lock_bh() to protect concurrent read/write on
      act_csum configuration, to reduce the effects of contention in the data
      path when multiple readers are present.
      Signed-off-by: Davide Caratti <dcaratti@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/sched: act_csum: use per-core statistics · f6052cf2
      Davide Caratti authored
      use per-CPU counters, like other TC actions do, instead of maintaining one
      set of stats across all cores. This allows updating act_csum stats without
      the need of protecting them using spin_{,un}lock_bh() invocations.
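
      The per-CPU counter idea can be illustrated with a small conceptual
      model. This is not kernel code: the class and method names are
      hypothetical, and Python objects only stand in for per-CPU data to
      show why no shared lock is needed on the update path.

```python
# Conceptual model only: kernel per-CPU counters are not Python objects.
class PerCpuCounter:
    """One slot per CPU; writers touch only their own slot, readers sum all slots."""

    def __init__(self, num_cpus):
        self.slots = [0] * num_cpus

    def add(self, cpu, value):
        # Each CPU updates only its own slot, so updates never contend.
        self.slots[cpu] += value

    def total(self):
        # The reader pays the cost of aggregating across all slots.
        return sum(self.slots)
```

      The trade-off is that updates become cheap and contention-free, while
      reading the statistics requires summing every slot.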
      Signed-off-by: Davide Caratti <dcaratti@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: link_watch: mark bonding link events urgent · b76f4189
      Roopa Prabhu authored
      It takes 1 second for a bond link down notification to reach user space
      when all slaves of the bond go down. 1 second is too long for
      protocol daemons in user space relying on bond notification
      to recover (e.g. multichassis LAG implementations in user space).
      Since the link event code already marks team device port link events
      as urgent, this patch moves the code to cover all LAG ports and master.
      Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue · 0542e13b
      David S. Miller authored
      Jeff Kirsher says:
      
      ====================
      10GbE Intel Wired LAN Driver Updates 2018-01-23
      
      This series contains updates to ixgbe only.
      
      Shannon Nelson provides an implementation of the ipsec hardware offload
      feature for the ixgbe driver for these devices: x540, x550, 82599.
      
      The ixgbe NICs support ipsec offload for 1024 Rx and 1024 Tx Security
      Associations (SAs), using up to 128 inbound IP addresses, and using the
      rfc4106(gcm(aes)) encryption.  This code does not yet support checksum
      offload, or TSO in conjunction with the ipsec offload - those will be
      added in the future.
      
      This code shows improvements in both packet throughput and CPU utilization.
      For example, here are some quick numbers that show the magnitude of the
      performance gain on a single run of "iperf -c <dest>" with the ipsec
      offload on both ends of a point-to-point connection:
      
      	9.4 Gbps - normal case
      	7.6 Gbps - ipsec with offload
      	343 Mbps - ipsec no offload
      ====================
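
      To put the three figures above in proportion, a small helper
      (hypothetical name, written only for this note) can compute the ratios
      from the quoted single-run numbers:

```c
#include <assert.h>

/* Ratio of two throughput figures expressed in the same unit (Mbps here). */
static double throughput_ratio(double numerator_mbps, double denominator_mbps)
{
	assert(denominator_mbps > 0.0);
	return numerator_mbps / denominator_mbps;
}
```

      throughput_ratio(7600.0, 343.0) comes out at roughly 22, so the
      offloaded case is about 22 times faster than ipsec without offload,
      and throughput_ratio(7600.0, 9400.0) is about 0.81, so it reaches
      roughly 81% of the unencrypted rate in that run.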
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0542e13b
    • David S. Miller's avatar
      Merge branch 'GEHC-Bx50-Switch-Support' · c89b517d
      David S. Miller authored
      Sebastian Reichel says:
      
      ====================
      GEHC Bx50 Switch Support
      
      This adds support for the internal switch found in GE Healthcare
      B450v3, B650v3 and B850v3. All devices use a GPIO bitbanged MDIO
      bus to communicate with the switch and a PCIe based network card
      for exchanging network data. The CPU network data link requires
      that the switch's internal PHY interface be enabled, so support
      for that is added by the first patch in this series.
      
      The patch series is based on v4.15-rc8.
      
      Changes since PATCHv4:
       * Introduce dsa_port_link_(un)register_of and mark the fixed
         variant static.
       * Update patch description to describe the phy<->phy connection
         from i210 to the Marvell switch
      Changes since PATCHv3:
       * Enable the phy in dsa_port_setup() instead of abusing the
         fixed link setup function
      Changes since PATCHv2:
       * Add phy nodes to switch in bx50.dtsi and reference them
         from switch ports
       * Enable cpu-port's phy based on 'phy-handle' instead of 'phy-mode'
      Changes since PATCHv1:
       * Use 'marvell,mv88e6085' instead of introducing compatible
         string for mv88e6240.
       * Fix indentation of DT nodes
       * Only enable the 'cpu' phy if explicitly set to "internal".
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c89b517d
    • Sebastian Reichel's avatar
      ARM: dts: imx6q-b450v3: Add switch port configuration · 658d063d
      Sebastian Reichel authored
      This adds support for the Marvell switch and names the network
      ports according to the labels that can be found next to the
      connectors. The switch is connected to the host system using a
      PCI-based network card.
      
      The PCI bus configuration has been written using the following
      information:
      
      root@b450v3# lspci -tv
      -[0000:00]---00.0-[01]----00.0  Intel Corporation I210 Gigabit Network Connection
      root@b450v3# lspci -nn
      00:00.0 PCI bridge [0604]: Synopsys, Inc. Device [16c3:abcd] (rev 01)
      01:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)
      Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      658d063d
    • Sebastian Reichel's avatar
      ARM: dts: imx6q-b650v3: Add switch port configuration · b2ea7f83
      Sebastian Reichel authored
      This adds support for the Marvell switch and names the network
      ports according to the labels that can be found next to the
      connectors. The switch is connected to the host system using a
      PCI-based network card.
      
      The PCI bus configuration has been written using the following
      information:
      
      root@b650v3# lspci -tv
      -[0000:00]---00.0-[01]----00.0  Intel Corporation I210 Gigabit Network Connection
      root@b650v3# lspci -nn
      00:00.0 PCI bridge [0604]: Synopsys, Inc. Device [16c3:abcd] (rev 01)
      01:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)
      Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b2ea7f83
    • Sebastian Reichel's avatar
      ARM: dts: imx6q-b850v3: Add switch port configuration · e6b22e41
      Sebastian Reichel authored
      This adds support for the Marvell switch and names the network
      ports according to the labels that can be found next to the
      connectors ("ID", "IX", "ePort 1", "ePort 2"). The switch is
      connected to the host system using a PCI-based network card.
      
      The PCI bus configuration has been written using the following
      information:
      
      root@b850v3# lspci -tv
      -[0000:00]---00.0-[01]----00.0-[02-05]--+-01.0-[03]----00.0  Intel Corporation I210 Gigabit Network Connection
                                              +-02.0-[04]----00.0  Intel Corporation I210 Gigabit Network Connection
                                              \-03.0-[05]--
      root@b850v3# lspci -nn
      00:00.0 PCI bridge [0604]: Synopsys, Inc. Device [16c3:abcd] (rev 01)
      01:00.0 PCI bridge [0604]: PLX Technology, Inc. PEX 8605 PCI Express 4-port Gen2 Switch [10b5:8605] (rev ab)
      02:01.0 PCI bridge [0604]: PLX Technology, Inc. PEX 8605 PCI Express 4-port Gen2 Switch [10b5:8605] (rev ab)
      02:02.0 PCI bridge [0604]: PLX Technology, Inc. PEX 8605 PCI Express 4-port Gen2 Switch [10b5:8605] (rev ab)
      02:03.0 PCI bridge [0604]: PLX Technology, Inc. PEX 8605 PCI Express 4-port Gen2 Switch [10b5:8605] (rev ab)
      03:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)
      04:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)
      Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e6b22e41
    • Sebastian Reichel's avatar
      ARM: dts: imx6q-bx50v3: Add internal switch · e26dead4
      Sebastian Reichel authored
      B850v3, B650v3 and B450v3 all have a GPIO bit-banged MDIO bus to
      communicate with a Marvell switch. On all devices the switch is
      connected to a PCI-based network card, which needs to be referenced
      by DT, so this also adds the common PCI root node.
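
      For readers unfamiliar with bit-banged MDIO, the sketch below builds
      the 32-bit frame that follows the preamble, assuming IEEE 802.3
      Clause 22 framing (this message does not state which framing the
      boards use). The function name is hypothetical and this is not the
      kernel's MDIO code:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Build a Clause 22 MDIO frame, to be shifted out MSB first on the data
 * line after the 32-bit preamble of ones.
 * Layout: ST(2) OP(2) PHYAD(5) REGAD(5) TA(2) DATA(16).
 * For a read, the host drives only the first 14 bits; the turnaround and
 * data bits are left as zero here because the PHY supplies them.
 */
static uint32_t mdio_c22_frame(int is_read, unsigned phy, unsigned reg,
			       uint16_t data)
{
	assert(phy < 32 && reg < 32);

	uint32_t op = is_read ? 0x2u : 0x1u;   /* OP: 10 = read, 01 = write */
	uint32_t frame = (0x1u << 30)          /* ST = 01 */
		       | (op << 28)
		       | ((uint32_t)phy << 23)
		       | ((uint32_t)reg << 18);

	if (!is_read)
		frame |= (0x2u << 16) | data;  /* TA = 10, then payload */

	return frame;
}
```

      A GPIO implementation would then present each bit of the frame on the
      data pin and pulse the clock pin once per bit.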
      Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e26dead4