An error occurred fetching the project authors.
  1. 28 Jul, 2022 3 commits
  2. 15 Jul, 2022 1 commit
    • Zhuo Chen's avatar
      ice: Remove pci_aer_clear_nonfatal_status() call · ca415ea1
      Zhuo Chen authored
      After commit 62b36c3e ("PCI/AER: Remove
      pci_cleanup_aer_uncorrect_error_status() calls"), calls to
      pci_cleanup_aer_uncorrect_error_status() have already been removed. But in
      commit 5995b6d0 ("ice: Implement pci_error_handler ops")
      pci_cleanup_aer_uncorrect_error_status  was used again, so remove it in
      this patch.
      Signed-off-by: default avatarZhuo Chen <chenzhuo.1@bytedance.com>
      Cc: Muchun Song <songmuchun@bytedance.com>
      Cc: Sen Wang <wangsen.harry@bytedance.com>
      Cc: Wenliang Wang <wangwenliang.1995@bytedance.com>
      Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      ca415ea1
  3. 12 Jul, 2022 1 commit
  4. 14 Jun, 2022 1 commit
  5. 17 May, 2022 1 commit
    • Paul Greenwalt's avatar
      ice: fix possible under reporting of ethtool Tx and Rx statistics · 31b6298f
      Paul Greenwalt authored
      The hardware statistics counters are not cleared during resets so the
      drivers first access is to initialize the baseline and then subsequent
      reads are for reporting the counters. The statistics counters are read
      during the watchdog subtask when the interface is up. If the baseline
      is not initialized before the interface is up, then there can be a brief
      window in which some traffic can be transmitted/received before the
      initial baseline reading takes place.
      
      Directly initialize ethtool statistics in driver open so the baseline will
      be initialized when the interface is up, and any dropped packets
      incremented before the interface is up won't be reported.
      
      Fixes: 28dc1b86 ("ice: ignore dropped packets during init")
      Signed-off-by: default avatarPaul Greenwalt <paul.greenwalt@intel.com>
      Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      31b6298f
  6. 09 May, 2022 1 commit
  7. 06 May, 2022 1 commit
    • Ivan Vecera's avatar
      ice: Fix race during aux device (un)plugging · 486b9eee
      Ivan Vecera authored
      Function ice_plug_aux_dev() assigns pf->adev field too early prior
      aux device initialization and on other side ice_unplug_aux_dev()
      starts aux device deinit and at the end assigns NULL to pf->adev.
      This is wrong because pf->adev should always be non-NULL only when
      aux device is fully initialized and ready. This wrong order causes
      a crash when ice_send_event_to_aux() call occurs because that function
      depends on non-NULL value of pf->adev and does not assume that
      aux device is half-initialized or half-destroyed.
      After order correction the race window is tiny but it is still there,
      as Leon mentioned and manipulation with pf->adev needs to be protected
      by mutex.
      
      Fix (un-)plugging functions so pf->adev field is set after aux device
      init and prior aux device destroy and protect pf->adev assignment by
      new mutex. This mutex is also held during ice_send_event_to_aux()
      call to ensure that aux device is valid during that call.
      Note that device lock used ice_send_event_to_aux() needs to be kept
      to avoid race with aux drv unload.
      
      Reproducer:
      cycle=1
      while :;do
              echo "#### Cycle: $cycle"
      
              ip link set ens7f0 mtu 9000
              ip link add bond0 type bond mode 1 miimon 100
              ip link set bond0 up
              ifenslave bond0 ens7f0
              ip link set bond0 mtu 9000
              ethtool -L ens7f0 combined 1
              ip link del bond0
              ip link set ens7f0 mtu 1500
              sleep 1
      
              let cycle++
      done
      
      In short when the device is added/removed to/from bond the aux device
      is unplugged/plugged. When MTU of the device is changed an event is
      sent to aux device asynchronously. This can race with (un)plugging
      operation and because pf->adev is set too early (plug) or too late
      (unplug) the function ice_send_event_to_aux() can touch uninitialized
      or destroyed fields. In the case of crash below pf->adev->dev.mutex.
      
      Crash:
      [   53.372066] bond0: (slave ens7f0): making interface the new active one
      [   53.378622] bond0: (slave ens7f0): Enslaving as an active interface with an u
      p link
      [   53.386294] IPv6: ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
      [   53.549104] bond0: (slave ens7f1): Enslaving as a backup interface with an up
       link
      [   54.118906] ice 0000:ca:00.0 ens7f0: Number of in use tx queues changed inval
      idating tc mappings. Priority traffic classification disabled!
      [   54.233374] ice 0000:ca:00.1 ens7f1: Number of in use tx queues changed inval
      idating tc mappings. Priority traffic classification disabled!
      [   54.248204] bond0: (slave ens7f0): Releasing backup interface
      [   54.253955] bond0: (slave ens7f1): making interface the new active one
      [   54.274875] bond0: (slave ens7f1): Releasing backup interface
      [   54.289153] bond0 (unregistering): Released all slaves
      [   55.383179] MII link monitoring set to 100 ms
      [   55.398696] bond0: (slave ens7f0): making interface the new active one
      [   55.405241] BUG: kernel NULL pointer dereference, address: 0000000000000080
      [   55.405289] bond0: (slave ens7f0): Enslaving as an active interface with an u
      p link
      [   55.412198] #PF: supervisor write access in kernel mode
      [   55.412200] #PF: error_code(0x0002) - not-present page
      [   55.412201] PGD 25d2ad067 P4D 0
      [   55.412204] Oops: 0002 [#1] PREEMPT SMP NOPTI
      [   55.412207] CPU: 0 PID: 403 Comm: kworker/0:2 Kdump: loaded Tainted: G S
                 5.17.0-13579-g57f2d6540f03 #1
      [   55.429094] bond0: (slave ens7f1): Enslaving as a backup interface with an up
       link
      [   55.430224] Hardware name: Dell Inc. PowerEdge R750/06V45N, BIOS 1.4.4 10/07/
      2021
      [   55.430226] Workqueue: ice ice_service_task [ice]
      [   55.468169] RIP: 0010:mutex_unlock+0x10/0x20
      [   55.472439] Code: 0f b1 13 74 96 eb e0 4c 89 ee eb d8 e8 79 54 ff ff 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 65 48 8b 04 25 40 ef 01 00 31 d2 <f0> 48 0f b1 17 75 01 c3 e9 e3 fe ff ff 0f 1f 00 0f 1f 44 00 00 48
      [   55.491186] RSP: 0018:ff4454230d7d7e28 EFLAGS: 00010246
      [   55.496413] RAX: ff1a79b208b08000 RBX: ff1a79b2182e8880 RCX: 0000000000000001
      [   55.503545] RDX: 0000000000000000 RSI: ff4454230d7d7db0 RDI: 0000000000000080
      [   55.510678] RBP: ff1a79d1c7e48b68 R08: ff4454230d7d7db0 R09: 0000000000000041
      [   55.517812] R10: 00000000000000a5 R11: 00000000000006e6 R12: ff1a79d1c7e48bc0
      [   55.524945] R13: 0000000000000000 R14: ff1a79d0ffc305c0 R15: 0000000000000000
      [   55.532076] FS:  0000000000000000(0000) GS:ff1a79d0ffc00000(0000) knlGS:0000000000000000
      [   55.540163] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   55.545908] CR2: 0000000000000080 CR3: 00000003487ae003 CR4: 0000000000771ef0
      [   55.553041] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   55.560173] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [   55.567305] PKRU: 55555554
      [   55.570018] Call Trace:
      [   55.572474]  <TASK>
      [   55.574579]  ice_service_task+0xaab/0xef0 [ice]
      [   55.579130]  process_one_work+0x1c5/0x390
      [   55.583141]  ? process_one_work+0x390/0x390
      [   55.587326]  worker_thread+0x30/0x360
      [   55.590994]  ? process_one_work+0x390/0x390
      [   55.595180]  kthread+0xe6/0x110
      [   55.598325]  ? kthread_complete_and_exit+0x20/0x20
      [   55.603116]  ret_from_fork+0x1f/0x30
      [   55.606698]  </TASK>
      
      Fixes: f9f5301e ("ice: Register auxiliary device to provide RDMA")
      Reviewed-by: default avatarLeon Romanovsky <leonro@nvidia.com>
      Signed-off-by: default avatarIvan Vecera <ivecera@redhat.com>
      Reviewed-by: default avatarDave Ertman <david.m.ertman@intel.com>
      Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      486b9eee
  8. 05 May, 2022 1 commit
  9. 26 Apr, 2022 1 commit
    • Petr Oros's avatar
      ice: wait 5 s for EMP reset after firmware flash · b537752e
      Petr Oros authored
      We need to wait 5 s for EMP reset after firmware flash. Code was extracted
      from OOT driver (ice v1.8.3 downloaded from sourceforge). Without this
      wait, fw_activate let card in inconsistent state and recoverable only
      by second flash/activate. Flash was tested on these fw's:
      From -> To
       3.00 -> 3.10/3.20
       3.10 -> 3.00/3.20
       3.20 -> 3.00/3.10
      
      Reproducer:
      [root@host ~]# devlink dev flash pci/0000:ca:00.0 file E810_XXVDA4_FH_O_SEC_FW_1p6p1p9_NVM_3p10_PLDMoMCTP_0.11_8000AD7B.bin
      Preparing to flash
      [fw.mgmt] Erasing
      [fw.mgmt] Erasing done
      [fw.mgmt] Flashing 100%
      [fw.mgmt] Flashing done 100%
      [fw.undi] Erasing
      [fw.undi] Erasing done
      [fw.undi] Flashing 100%
      [fw.undi] Flashing done 100%
      [fw.netlist] Erasing
      [fw.netlist] Erasing done
      [fw.netlist] Flashing 100%
      [fw.netlist] Flashing done 100%
      Activate new firmware by devlink reload
      [root@host ~]# devlink dev reload pci/0000:ca:00.0 action fw_activate
      reload_actions_performed:
          fw_activate
      [root@host ~]# ip link show ens7f0
      71: ens7f0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
          link/ether b4:96:91:dc:72:e0 brd ff:ff:ff:ff:ff:ff
          altname enp202s0f0
      
      dmesg after flash:
      [   55.120788] ice: Copyright (c) 2018, Intel Corporation.
      [   55.274734] ice 0000:ca:00.0: Get PHY capabilities failed status = -5, continuing anyway
      [   55.569797] ice 0000:ca:00.0: The DDP package was successfully loaded: ICE OS Default Package version 1.3.28.0
      [   55.603629] ice 0000:ca:00.0: Get PHY capability failed.
      [   55.608951] ice 0000:ca:00.0: ice_init_nvm_phy_type failed: -5
      [   55.647348] ice 0000:ca:00.0: PTP init successful
      [   55.675536] ice 0000:ca:00.0: DCB is enabled in the hardware, max number of TCs supported on this port are 8
      [   55.685365] ice 0000:ca:00.0: FW LLDP is disabled, DCBx/LLDP in SW mode.
      [   55.692179] ice 0000:ca:00.0: Commit DCB Configuration to the hardware
      [   55.701382] ice 0000:ca:00.0: 126.024 Gb/s available PCIe bandwidth, limited by 16.0 GT/s PCIe x8 link at 0000:c9:02.0 (capable of 252.048 Gb/s with 16.0 GT/s PCIe x16 link)
      Reboot doesn’t help, only second flash/activate with OOT or patched
      driver put card back in consistent state.
      
      After patch:
      [root@host ~]# devlink dev flash pci/0000:ca:00.0 file E810_XXVDA4_FH_O_SEC_FW_1p6p1p9_NVM_3p10_PLDMoMCTP_0.11_8000AD7B.bin
      Preparing to flash
      [fw.mgmt] Erasing
      [fw.mgmt] Erasing done
      [fw.mgmt] Flashing 100%
      [fw.mgmt] Flashing done 100%
      [fw.undi] Erasing
      [fw.undi] Erasing done
      [fw.undi] Flashing 100%
      [fw.undi] Flashing done 100%
      [fw.netlist] Erasing
      [fw.netlist] Erasing done
      [fw.netlist] Flashing 100%
      [fw.netlist] Flashing done 100%
      Activate new firmware by devlink reload
      [root@host ~]# devlink dev reload pci/0000:ca:00.0 action fw_activate
      reload_actions_performed:
          fw_activate
      [root@host ~]# ip link show ens7f0
      19: ens7f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
          link/ether b4:96:91:dc:72:e0 brd ff:ff:ff:ff:ff:ff
          altname enp202s0f0
      
      Fixes: 399e27db ("ice: support immediate firmware activation via devlink reload")
      Signed-off-by: default avatarPetr Oros <poros@redhat.com>
      Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      b537752e
  10. 12 Apr, 2022 1 commit
    • Joe Damato's avatar
      ice: Add mpls+tso support · 69e66c04
      Joe Damato authored
      Attempt to add mpls+tso support.
      
      I don't have ice hardware available to test myself, but I just implemented
      this feature in i40e and thought it might be useful to implement for ice
      while this is fresh in my brain.
      
      Hoping some one at intel will be able to test this on my behalf.
      Signed-off-by: default avatarJoe Damato <jdamato@fastly.com>
      Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      69e66c04
  11. 08 Apr, 2022 1 commit
    • Alexander Lobakin's avatar
      ice: arfs: fix use-after-free when freeing @rx_cpu_rmap · d7442f51
      Alexander Lobakin authored
      The CI testing bots triggered the following splat:
      
      [  718.203054] BUG: KASAN: use-after-free in free_irq_cpu_rmap+0x53/0x80
      [  718.206349] Read of size 4 at addr ffff8881bd127e00 by task sh/20834
      [  718.212852] CPU: 28 PID: 20834 Comm: sh Kdump: loaded Tainted: G S      W IOE     5.17.0-rc8_nextqueue-devqueue-02643-g23f3121aca93 #1
      [  718.219695] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0012.070720200218 07/07/2020
      [  718.223418] Call Trace:
      [  718.227139]
      [  718.230783]  dump_stack_lvl+0x33/0x42
      [  718.234431]  print_address_description.constprop.9+0x21/0x170
      [  718.238177]  ? free_irq_cpu_rmap+0x53/0x80
      [  718.241885]  ? free_irq_cpu_rmap+0x53/0x80
      [  718.245539]  kasan_report.cold.18+0x7f/0x11b
      [  718.249197]  ? free_irq_cpu_rmap+0x53/0x80
      [  718.252852]  free_irq_cpu_rmap+0x53/0x80
      [  718.256471]  ice_free_cpu_rx_rmap.part.11+0x37/0x50 [ice]
      [  718.260174]  ice_remove_arfs+0x5f/0x70 [ice]
      [  718.263810]  ice_rebuild_arfs+0x3b/0x70 [ice]
      [  718.267419]  ice_rebuild+0x39c/0xb60 [ice]
      [  718.270974]  ? asm_sysvec_apic_timer_interrupt+0x12/0x20
      [  718.274472]  ? ice_init_phy_user_cfg+0x360/0x360 [ice]
      [  718.278033]  ? delay_tsc+0x4a/0xb0
      [  718.281513]  ? preempt_count_sub+0x14/0xc0
      [  718.284984]  ? delay_tsc+0x8f/0xb0
      [  718.288463]  ice_do_reset+0x92/0xf0 [ice]
      [  718.292014]  ice_pci_err_resume+0x91/0xf0 [ice]
      [  718.295561]  pci_reset_function+0x53/0x80
      <...>
      [  718.393035] Allocated by task 690:
      [  718.433497] Freed by task 20834:
      [  718.495688] Last potentially related work creation:
      [  718.568966] The buggy address belongs to the object at ffff8881bd127e00
                      which belongs to the cache kmalloc-96 of size 96
      [  718.574085] The buggy address is located 0 bytes inside of
                      96-byte region [ffff8881bd127e00, ffff8881bd127e60)
      [  718.579265] The buggy address belongs to the page:
      [  718.598905] Memory state around the buggy address:
      [  718.601809]  ffff8881bd127d00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
      [  718.604796]  ffff8881bd127d80: 00 00 00 00 00 00 00 00 00 00 fc fc fc fc fc fc
      [  718.607794] >ffff8881bd127e00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
      [  718.610811]                    ^
      [  718.613819]  ffff8881bd127e80: 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc
      [  718.617107]  ffff8881bd127f00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
      
      This is due to that free_irq_cpu_rmap() is always being called
      *after* (devm_)free_irq() and thus it tries to work with IRQ descs
      already freed. For example, on device reset the driver frees the
      rmap right before allocating a new one (the splat above).
      Make rmap creation and freeing function symmetrical with
      {request,free}_irq() calls i.e. do that on ifup/ifdown instead
      of device probe/remove/resume. These operations can be performed
      independently from the actual device aRFS configuration.
      Also, make sure ice_vsi_free_irq() clears IRQ affinity notifiers
      only when aRFS is disabled -- otherwise, CPU rmap sets and clears
      its own and they must not be touched manually.
      
      Fixes: 28bf2672 ("ice: Implement aRFS")
      Co-developed-by: default avatarIvan Vecera <ivecera@redhat.com>
      Signed-off-by: default avatarIvan Vecera <ivecera@redhat.com>
      Signed-off-by: default avatarAlexander Lobakin <alexandr.lobakin@intel.com>
      Tested-by: default avatarIvan Vecera <ivecera@redhat.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      d7442f51
  12. 05 Apr, 2022 2 commits
  13. 01 Apr, 2022 2 commits
    • Ivan Vecera's avatar
      ice: Fix broken IFF_ALLMULTI handling · 1273f895
      Ivan Vecera authored
      Handling of all-multicast flag and associated multicast promiscuous
      mode is broken in ice driver. When an user switches allmulticast
      flag on or off the driver checks whether any VLANs are configured
      over the interface (except default VLAN 0).
      
      If any extra VLANs are registered it enables multicast promiscuous
      mode for all these VLANs (including default VLAN 0) using
      ICE_SW_LKUP_PROMISC_VLAN look-up type. In this situation all
      multicast packets tagged with known VLAN ID or untagged are received
      and multicast packets tagged with unknown VLAN ID ignored.
      
      If no extra VLANs are registered (so only VLAN 0 exists) it enables
      multicast promiscuous mode for VLAN 0 and uses ICE_SW_LKUP_PROMISC
      look-up type. In this situation any multicast packets including
      tagged ones are received.
      
      The driver handles IFF_ALLMULTI in ice_vsi_sync_fltr() this way:
      
      ice_vsi_sync_fltr() {
        ...
        if (changed_flags & IFF_ALLMULTI) {
          if (netdev->flags & IFF_ALLMULTI) {
            if (vsi->num_vlans > 1)
              ice_set_promisc(..., ICE_MCAST_VLAN_PROMISC_BITS);
            else
              ice_set_promisc(..., ICE_MCAST_PROMISC_BITS);
          } else {
            if (vsi->num_vlans > 1)
              ice_clear_promisc(..., ICE_MCAST_VLAN_PROMISC_BITS);
            else
              ice_clear_promisc(..., ICE_MCAST_PROMISC_BITS);
          }
        }
        ...
      }
      
      The code above depends on value vsi->num_vlan that specifies number
      of VLANs configured over the interface (including VLAN 0) and
      this is problem because that value is modified in NDO callbacks
      ice_vlan_rx_add_vid() and ice_vlan_rx_kill_vid().
      
      Scenario 1:
      1. ip link set ens7f0 allmulticast on
      2. ip link add vlan10 link ens7f0 type vlan id 10
      3. ip link set ens7f0 allmulticast off
      4. ip link set ens7f0 allmulticast on
      
      [1] In this scenario IFF_ALLMULTI is enabled and the driver calls
          ice_set_promisc(..., ICE_MCAST_PROMISC_BITS) that installs
          multicast promisc rule with non-VLAN look-up type.
      [2] Then VLAN with ID 10 is added and vsi->num_vlan incremented to 2
      [3] Command switches IFF_ALLMULTI off and the driver calls
          ice_clear_promisc(..., ICE_MCAST_VLAN_PROMISC_BITS) but this
          call is effectively NOP because it looks for multicast promisc
          rules for VLAN 0 and VLAN 10 with VLAN look-up type but no such
          rules exist. So the all-multicast remains enabled silently
          in hardware.
      [4] Command tries to switch IFF_ALLMULTI on and the driver calls
          ice_clear_promisc(..., ICE_MCAST_PROMISC_BITS) but this call
          fails (-EEXIST) because non-VLAN multicast promisc rule already
          exists.
      
      Scenario 2:
      1. ip link add vlan10 link ens7f0 type vlan id 10
      2. ip link set ens7f0 allmulticast on
      3. ip link add vlan20 link ens7f0 type vlan id 20
      4. ip link del vlan10 ; ip link del vlan20
      5. ip link set ens7f0 allmulticast off
      
      [1] VLAN with ID 10 is added and vsi->num_vlan==2
      [2] Command switches IFF_ALLMULTI on and driver installs multicast
          promisc rules with VLAN look-up type for VLAN 0 and 10
      [3] VLAN with ID 20 is added and vsi->num_vlan==3 but no multicast
          promisc rules is added for this new VLAN so the interface does
          not receive MC packets from VLAN 20
      [4] Both VLANs are removed but multicast rule for VLAN 10 remains
          installed so interface receives multicast packets from VLAN 10
      [5] Command switches IFF_ALLMULTI off and because vsi->num_vlan is 1
          the driver tries to remove multicast promisc rule for VLAN 0
          with non-VLAN look-up that does not exist.
          All-multicast looks disabled from user point of view but it
          is partially enabled in HW (interface receives all multicast
          packets either untagged or tagged with VLAN ID 10)
      
      To resolve these issues the patch introduces these changes:
      1. Adds handling for IFF_ALLMULTI to ice_vlan_rx_add_vid() and
         ice_vlan_rx_kill_vid() callbacks. So when VLAN is added/removed
         and IFF_ALLMULTI is enabled an appropriate multicast promisc
         rule for that VLAN ID is added/removed.
      2. In ice_vlan_rx_add_vid() when first VLAN besides VLAN 0 is added
         so (vsi->num_vlan == 2) and IFF_ALLMULTI is enabled then look-up
         type for existing multicast promisc rule for VLAN 0 is updated
         to ICE_MCAST_VLAN_PROMISC_BITS.
      3. In ice_vlan_rx_kill_vid() when last VLAN besides VLAN 0 is removed
         so (vsi->num_vlan == 1) and IFF_ALLMULTI is enabled then look-up
         type for existing multicast promisc rule for VLAN 0 is updated
         to ICE_MCAST_PROMISC_BITS.
      4. Both ice_vlan_rx_{add,kill}_vid() have to run under ICE_CFG_BUSY
         bit protection to avoid races with ice_vsi_sync_fltr() that runs
         in ice_service_task() context.
      5. Bit ICE_VSI_VLAN_FLTR_CHANGED is use-less and can be removed.
      6. Error messages added to ice_fltr_*_vsi_promisc() helper functions
         to avoid them in their callers
      7. Small improvements to increase readability
      
      Fixes: 5eda8afd ("ice: Add support for PF/VF promiscuous mode")
      Signed-off-by: default avatarIvan Vecera <ivecera@redhat.com>
      Reviewed-by: default avatarJacob Keller <jacob.e.keller@intel.com>
      Signed-off-by: default avatarAlice Michael <alice.michael@intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      1273f895
    • Ivan Vecera's avatar
      ice: Fix MAC address setting · 2c0069f3
      Ivan Vecera authored
      Commit 2ccc1c1c ("ice: Remove excess error variables") merged
      the usage of 'status' and 'err' variables into single one in
      function ice_set_mac_address(). Unfortunately this causes
      a regression when call of ice_fltr_add_mac() returns -EEXIST because
      this return value does not indicate an error in this case but
      value of 'err' remains to be -EEXIST till the end of the function
      and is returned to caller.
      
      Prior mentioned commit this does not happen because return value of
      ice_fltr_add_mac() was stored to 'status' variable first and
      if it was -EEXIST then 'err' remains to be zero.
      
      Fix the problem by reset 'err' to zero when ice_fltr_add_mac()
      returns -EEXIST.
      
      Fixes: 2ccc1c1c ("ice: Remove excess error variables")
      Signed-off-by: default avatarIvan Vecera <ivecera@redhat.com>
      Reviewed-by: default avatarJacob Keller <jacob.e.keller@intel.com>
      Acked-by: default avatarAlexander Lobakin <alexandr.lobakin@intel.com>
      Signed-off-by: default avatarAlice Michael <alice.michael@intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      2c0069f3
  14. 23 Mar, 2022 1 commit
    • Alexander Lobakin's avatar
      ice: fix 'scheduling while atomic' on aux critical err interrupt · 32d53c0a
      Alexander Lobakin authored
      There's a kernel BUG splat on processing aux critical error
      interrupts in ice_misc_intr():
      
      [ 2100.917085] BUG: scheduling while atomic: swapper/15/0/0x00010000
      ...
      [ 2101.060770] Call Trace:
      [ 2101.063229]  <IRQ>
      [ 2101.065252]  dump_stack+0x41/0x60
      [ 2101.068587]  __schedule_bug.cold.100+0x4c/0x58
      [ 2101.073060]  __schedule+0x6a4/0x830
      [ 2101.076570]  schedule+0x35/0xa0
      [ 2101.079727]  schedule_preempt_disabled+0xa/0x10
      [ 2101.084284]  __mutex_lock.isra.7+0x310/0x420
      [ 2101.088580]  ? ice_misc_intr+0x201/0x2e0 [ice]
      [ 2101.093078]  ice_send_event_to_aux+0x25/0x70 [ice]
      [ 2101.097921]  ice_misc_intr+0x220/0x2e0 [ice]
      [ 2101.102232]  __handle_irq_event_percpu+0x40/0x180
      [ 2101.106965]  handle_irq_event_percpu+0x30/0x80
      [ 2101.111434]  handle_irq_event+0x36/0x53
      [ 2101.115292]  handle_edge_irq+0x82/0x190
      [ 2101.119148]  handle_irq+0x1c/0x30
      [ 2101.122480]  do_IRQ+0x49/0xd0
      [ 2101.125465]  common_interrupt+0xf/0xf
      [ 2101.129146]  </IRQ>
      ...
      
      As Andrew correctly mentioned previously[0], the following call
      ladder happens:
      
      ice_misc_intr() <- hardirq
        ice_send_event_to_aux()
          device_lock()
            mutex_lock()
              might_sleep()
                might_resched() <- oops
      
      Add a new PF state bit which indicates that an aux critical error
      occurred and serve it in ice_service_task() in process context.
      The new ice_pf::oicr_err_reg is read-write in both hardirq and
      process contexts, but only 3 bits of non-critical data probably
      aren't worth explicit synchronizing (and they're even in the same
      byte [31:24]).
      
      [0] https://lore.kernel.org/all/YeSRUVmrdmlUXHDn@lunn.ch
      
      Fixes: 348048e7 ("ice: Implement iidc operations")
      Signed-off-by: default avatarAlexander Lobakin <alexandr.lobakin@intel.com>
      Tested-by: default avatarMichal Kubiak <michal.kubiak@intel.com>
      Acked-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      32d53c0a
  15. 15 Mar, 2022 7 commits
    • Sudheer Mogilappagari's avatar
      ice: destroy flow director filter mutex after releasing VSIs · 1b4ae7d9
      Sudheer Mogilappagari authored
      Currently fdir_fltr_lock is accessed in ice_vsi_release_all() function
      after it is destroyed. Instead destroy mutex after ice_vsi_release_all.
      
      Fixes: 40319796 ("ice: Add flow director support for channel mode")
      Signed-off-by: default avatarSudheer Mogilappagari <sudheer.mogilappagari@intel.com>
      Tested-by: default avatarBharathi Sreenivas <bharathi.sreenivas@intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      1b4ae7d9
    • Maciej Fijalkowski's avatar
      ice: fix NULL pointer dereference in ice_update_vsi_tx_ring_stats() · f1535469
      Maciej Fijalkowski authored
      It is possible to do NULL pointer dereference in routine that updates
      Tx ring stats. Currently only stats and bytes are updated when ring
      pointer is valid, but later on ring is accessed to propagate gathered Tx
      stats onto VSI stats.
      
      Change the existing logic to move to next ring when ring is NULL.
      
      Fixes: e72bba21 ("ice: split ice_ring onto Tx/Rx separate structs")
      Reported-by: default avatarkernel test robot <lkp@intel.com>
      Reported-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarMaciej Fijalkowski <maciej.fijalkowski@intel.com>
      Acked-by: default avatarAlexander Lobakin <alexandr.lobakin@intel.com>
      Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      f1535469
    • Jacob Keller's avatar
      ice: introduce ICE_VF_RESET_LOCK flag · f5f085c0
      Jacob Keller authored
      The ice_reset_vf function performs actions which must be taken only
      while holding the VF configuration lock. Some flows already acquired the
      lock, while other flows must acquire it just for the reset function. Add
      the ICE_VF_RESET_LOCK flag to the function so that it can handle taking
      and releasing the lock instead at the appropriate scope.
      Signed-off-by: default avatarJacob Keller <jacob.e.keller@intel.com>
      Tested-by: default avatarKonrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      f5f085c0
    • Jacob Keller's avatar
      ice: convert ice_reset_vf to take flags · 7eb517e4
      Jacob Keller authored
      The ice_reset_vf function takes a boolean parameter which indicates
      whether or not the reset is due to a VFLR event.
      
      This is somewhat confusing to read because readers must interpret what
      "true" and "false" mean when seeing a line of code like
      "ice_reset_vf(vf, false)".
      
      We will want to add another toggle to the ice_reset_vf in a following
      change. To avoid proliferating many arguments, convert this function to
      take flags instead. ICE_VF_RESET_VFLR will indicate if this is a VFLR
      reset. A value of 0 indicates no flags.
      
      One could argue that "ice_reset_vf(vf, 0)" is no more readable than
      "ice_reset_vf(vf, false)".. However, this type of flags interface is
      somewhat common and using 0 to mean "no flags" makes sense in this
      context. We could bother to add a define for "ICE_VF_RESET_PLAIN" or
      something similar, but this can be confusing since its not an actual bit
      flag.
      
      This paves the way to add another flag to the function in a following
      change.
      Signed-off-by: default avatarJacob Keller <jacob.e.keller@intel.com>
      Tested-by: default avatarKonrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      7eb517e4
    • Jacob Keller's avatar
      ice: drop is_vflr parameter from ice_reset_all_vfs · dac57288
      Jacob Keller authored
      The ice_reset_all_vfs function takes a parameter to handle whether its
      operating after a VFLR event or not. This is not necessary as every
      caller always passes true. Simplify the interface by removing the
      parameter.
      Signed-off-by: default avatarJacob Keller <jacob.e.keller@intel.com>
      Tested-by: default avatarKonrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      dac57288
    • Jacob Keller's avatar
      ice: rename ICE_MAX_VF_COUNT to avoid confusion · dc36796e
      Jacob Keller authored
      The ICE_MAX_VF_COUNT field is defined in ice_sriov.h. This count is true
      for SR-IOV but will not be true for all VF implementations, such as when
      the ice driver supports Scalable IOV.
      
      Rename this definition to clearly indicate ICE_MAX_SRIOV_VFS.
      Signed-off-by: default avatarJacob Keller <jacob.e.keller@intel.com>
      Tested-by: default avatarKonrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      dc36796e
    • Jacob Keller's avatar
      ice: remove circular header dependencies on ice.h · 649c87c6
      Jacob Keller authored
      Several headers in the ice driver include ice.h even though they are
      themselves included by that header. The most notable of these is
      ice_common.h, but several other headers also do this.
      
      Such a recursive inclusion is problematic as it forces headers to be
      included in a strict order, otherwise compilation errors can result. The
      circular inclusions do not trigger an endless loop due to standard
      header inclusion guards, however other errors can occur.
      
      For example, ice_flow.h defines ice_rss_hash_cfg, which is used by
      ice_sriov.h as part of the definition of ice_vf_hash_ip_ctx.
      
      ice_flow.h includes ice_acl.h, which includes ice_common.h, and which
      finally includes ice.h. Since ice.h itself includes ice_sriov.h, this
      creates a circular dependency.
      
      The definition in ice_sriov.h requires things from ice_flow.h, but
      ice_flow.h itself will lead to trying to load ice_sriov.h as part of its
      process for expanding ice.h. The current code avoids this issue by
      having an implicit dependency without the include of ice_flow.h.
      
      If we were to fix that so that ice_sriov.h explicitly depends on
      ice_flow.h the following pattern would occur:
      
        ice_flow.h -> ice_acl.h -> ice_common.h -> ice.h -> ice_sriov.h
      
      At this point, during the expansion of, the header guard for ice_flow.h
      is already set, so when ice_sriov.h attempts to load the ice_flow.h
      header it is skipped. Then, we go on to begin including the rest of
      ice_sriov.h, including structure definitions which depend on
      ice_rss_hash_cfg. This produces a compiler warning because
      ice_rss_hash_cfg hasn't yet been included. Remember, we're just at the
      start of ice_flow.h!
      
      If the order of headers is incorrect (ice_flow.h is not implicitly
      loaded first in all files which include ice_sriov.h) then we get the
      same failure.
      
      Removing this recursive inclusion requires fixing a few cases where some
      headers depended on the header inclusions from ice.h. In addition, a few
      other changes are also required.
      
      Most notably, ice_hw_to_dev is implemented as a macro in ice_osdep.h,
      which is the likely reason that ice_common.h includes ice.h at all. This
      macro implementation requires the full definition of ice_pf in order to
      properly compile.
      
      Fix this by moving it to a function declared in ice_main.c, so that we
      do not require all files to depend on the layout of the ice_pf
      structure.
      
      Note that this change only fixes circular dependencies, but it does not
      fully resolve all implicit dependencies where one header may depend on
      the inclusion of another. I tried to fix as many of the implicit
      dependencies as I noticed, but fixing them all requires a somewhat
      tedious analysis of each header and attempting to compile it separately.
      Signed-off-by: default avatarJacob Keller <jacob.e.keller@intel.com>
      Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      649c87c6
  16. 10 Mar, 2022 1 commit
    • Ivan Vecera's avatar
      ice: Fix race condition during interface enslave · 5cb1ebdb
      Ivan Vecera authored
      Commit 5dbbbd01 ("ice: Avoid RTNL lock when re-creating
      auxiliary device") changes a process of re-creation of aux device
      so ice_plug_aux_dev() is called from ice_service_task() context.
      This unfortunately opens a race window that can result in dead-lock
      when interface has left LAG and immediately enters LAG again.
      
      Reproducer:
      ```
      #!/bin/sh
      
      ip link add lag0 type bond mode 1 miimon 100
      ip link set lag0
      
      for n in {1..10}; do
              echo Cycle: $n
              ip link set ens7f0 master lag0
              sleep 1
              ip link set ens7f0 nomaster
      done
      ```
      
      This results in:
      [20976.208697] Workqueue: ice ice_service_task [ice]
      [20976.213422] Call Trace:
      [20976.215871]  __schedule+0x2d1/0x830
      [20976.219364]  schedule+0x35/0xa0
      [20976.222510]  schedule_preempt_disabled+0xa/0x10
      [20976.227043]  __mutex_lock.isra.7+0x310/0x420
      [20976.235071]  enum_all_gids_of_dev_cb+0x1c/0x100 [ib_core]
      [20976.251215]  ib_enum_roce_netdev+0xa4/0xe0 [ib_core]
      [20976.256192]  ib_cache_setup_one+0x33/0xa0 [ib_core]
      [20976.261079]  ib_register_device+0x40d/0x580 [ib_core]
      [20976.266139]  irdma_ib_register_device+0x129/0x250 [irdma]
      [20976.281409]  irdma_probe+0x2c1/0x360 [irdma]
      [20976.285691]  auxiliary_bus_probe+0x45/0x70
      [20976.289790]  really_probe+0x1f2/0x480
      [20976.298509]  driver_probe_device+0x49/0xc0
      [20976.302609]  bus_for_each_drv+0x79/0xc0
      [20976.306448]  __device_attach+0xdc/0x160
      [20976.310286]  bus_probe_device+0x9d/0xb0
      [20976.314128]  device_add+0x43c/0x890
      [20976.321287]  __auxiliary_device_add+0x43/0x60
      [20976.325644]  ice_plug_aux_dev+0xb2/0x100 [ice]
      [20976.330109]  ice_service_task+0xd0c/0xed0 [ice]
      [20976.342591]  process_one_work+0x1a7/0x360
      [20976.350536]  worker_thread+0x30/0x390
      [20976.358128]  kthread+0x10a/0x120
      [20976.365547]  ret_from_fork+0x1f/0x40
      ...
      [20976.438030] task:ip              state:D stack:    0 pid:213658 ppid:213627 flags:0x00004084
      [20976.446469] Call Trace:
      [20976.448921]  __schedule+0x2d1/0x830
      [20976.452414]  schedule+0x35/0xa0
      [20976.455559]  schedule_preempt_disabled+0xa/0x10
      [20976.460090]  __mutex_lock.isra.7+0x310/0x420
      [20976.464364]  device_del+0x36/0x3c0
      [20976.467772]  ice_unplug_aux_dev+0x1a/0x40 [ice]
      [20976.472313]  ice_lag_event_handler+0x2a2/0x520 [ice]
      [20976.477288]  notifier_call_chain+0x47/0x70
      [20976.481386]  __netdev_upper_dev_link+0x18b/0x280
      [20976.489845]  bond_enslave+0xe05/0x1790 [bonding]
      [20976.494475]  do_setlink+0x336/0xf50
      [20976.502517]  __rtnl_newlink+0x529/0x8b0
      [20976.543441]  rtnl_newlink+0x43/0x60
      [20976.546934]  rtnetlink_rcv_msg+0x2b1/0x360
      [20976.559238]  netlink_rcv_skb+0x4c/0x120
      [20976.563079]  netlink_unicast+0x196/0x230
      [20976.567005]  netlink_sendmsg+0x204/0x3d0
      [20976.570930]  sock_sendmsg+0x4c/0x50
      [20976.574423]  ____sys_sendmsg+0x1eb/0x250
      [20976.586807]  ___sys_sendmsg+0x7c/0xc0
      [20976.606353]  __sys_sendmsg+0x57/0xa0
      [20976.609930]  do_syscall_64+0x5b/0x1a0
      [20976.613598]  entry_SYSCALL_64_after_hwframe+0x65/0xca
      
      1. Command 'ip link ... set nomaster' causes that ice_plug_aux_dev()
         is called from ice_service_task() context, aux device is created
         and associated device->lock is taken.
      2. Command 'ip link ... set master...' calls ice's notifier under
         RTNL lock and that notifier calls ice_unplug_aux_dev(). That
         function tries to take aux device->lock but this is already taken
         by ice_plug_aux_dev() in step 1
      3. Later ice_plug_aux_dev() tries to take RTNL lock but this is already
         taken in step 2
      4. Dead-lock
      
      The patch fixes this issue by following changes:
      - Bit ICE_FLAG_PLUG_AUX_DEV is kept to be set during ice_plug_aux_dev()
        call in ice_service_task()
      - The bit is checked in ice_clear_rdma_cap() and only if it is not set
        then ice_unplug_aux_dev() is called. If it is set (in other words
        plugging of aux device was requested and ice_plug_aux_dev() is
        potentially running) then the function only clears the bit
      - Once ice_plug_aux_dev() call (in ice_service_task) is finished
        the bit ICE_FLAG_PLUG_AUX_DEV is cleared but it is also checked
        whether it was already cleared by ice_clear_rdma_cap(). If so then
        aux device is unplugged.
      Signed-off-by: default avatarIvan Vecera <ivecera@redhat.com>
      Co-developed-by: default avatarPetr Oros <poros@redhat.com>
      Signed-off-by: default avatarPetr Oros <poros@redhat.com>
      Reviewed-by: default avatarDave Ertman <david.m.ertman@intel.com>
      Link: https://lore.kernel.org/r/20220310171641.3863659-1-ivecera@redhat.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      5cb1ebdb
  17. 09 Mar, 2022 1 commit
  18. 08 Mar, 2022 2 commits
  19. 03 Mar, 2022 5 commits
    • Jacob Keller's avatar
      ice: convert VF storage to hash table with krefs and RCU · 3d5985a1
      Jacob Keller authored
      The ice driver stores VF structures in a simple array which is allocated
      once at the time of VF creation. The VF structures are then accessed
      from the array by their VF ID. The ID must be between 0 and the number
      of allocated VFs.
      
      Multiple threads can access this table:
      
       * .ndo operations such as .ndo_get_vf_cfg or .ndo_set_vf_trust
       * interrupts, such as due to messages from the VF using the virtchnl
         communication
       * processing such as device reset
       * commands to add or remove VFs
      
      The current implementation does not keep track of when all threads are
      done operating on a VF and can potentially result in use-after-free
      issues caused by one thread accessing a VF structure after it has been
      released when removing VFs. Some of these are prevented with various
      state flags and checks.
      
      In addition, this structure is quite static and does not support a
      planned future where virtualization can be more dynamic. As we begin to
      look at supporting Scalable IOV with the ice driver (as opposed to just
      supporting Single Root IOV), this structure is not sufficient.
      
      In the future, VFs will be able to be added and removed individually and
      dynamically.
      
      To allow for this, and to better protect against a whole class of
      use-after-free bugs, replace the VF storage with a combination of a hash
      table and krefs to reference track all of the accesses to VFs through
      the hash table.
      
      A hash table still allows efficient look up of the VF given its ID, but
      also allows adding and removing VFs. It does not require contiguous VF
      IDs.
      
      The use of krefs allows the cleanup of the VF memory to be delayed until
      after all threads have released their reference (by calling ice_put_vf).
      
      To prevent corruption of the hash table, a combination of RCU and the
      mutex table_lock are used. Addition and removal from the hash table use
      the RCU-aware hash macros. This allows simple read-only look ups that
      iterate to locate a single VF can be fast using RCU. Accesses which
      modify the hash table, or which can't take RCU because they sleep, will
      hold the mutex lock.
      
      By using this design, we have a stronger guarantee that the VF structure
      can't be released until after all threads are finished operating on it.
      We also pave the way for the more dynamic Scalable IOV implementation in
      the future.
      Signed-off-by: default avatarJacob Keller <jacob.e.keller@intel.com>
      Tested-by: default avatarKonrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      3d5985a1
    • Jacob Keller's avatar
      ice: factor VF variables to separate structure · 000773c0
      Jacob Keller authored
      We maintain a number of values for VFs within the ice_pf structure. This
      includes the VF table, the number of allocated VFs, the maximum number
      of supported SR-IOV VFs, the number of queue pairs per VF, the number of
      MSI-X vectors per VF, and a bitmap of the VFs with detected MDD events.
      
      We're about to add a few more variables to this list. Clean this up
      first by extracting these members out into a new ice_vfs structure
      defined in ice_virtchnl_pf.h
      Signed-off-by: default avatarJacob Keller <jacob.e.keller@intel.com>
      Tested-by: default avatarKonrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      000773c0
    • Jacob Keller's avatar
      ice: convert ice_for_each_vf to include VF entry iterator · c4c2c7db
      Jacob Keller authored
      The ice_for_each_vf macro is intended to be used to loop over all VFs.
      The current implementation relies on an iterator that is the index into
      the VF array in the PF structure. This forces all users to perform a
      look up themselves.
      
      This abstraction forces a lot of duplicate work on callers and leaks the
      interface implementation to the caller. Replace this with an
      implementation that includes the VF pointer the primary iterator. This
      version simplifies callers which just want to iterate over every VF, as
      they no longer need to perform their own lookup.
      
      The "i" iterator value is replaced with a new unsigned int "bkt"
      parameter, as this will match the necessary interface for replacing
      the VF array with a hash table. For now, the bkt is the VF ID, but in
      the future it will simply be the hash bucket index. Document that it
      should not be treated as a VF ID.
      
      This change aims to simplify switching from the array to a hash table. I
      considered alternative implementations such as an xarray but decided
      that the hash table was the simplest and most suitable implementation. I
      also looked at methods to hide the bkt iterator entirely, but I couldn't
      come up with a feasible solution that worked for hash table iterators.
      Signed-off-by: default avatarJacob Keller <jacob.e.keller@intel.com>
      Tested-by: default avatarKonrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      c4c2c7db
    • Jacob Keller's avatar
      ice: store VF pointer instead of VF ID · b03d519d
      Jacob Keller authored
      The VSI structure contains a vf_id field used to associate a VSI with a
      VF. This is used mainly for ICE_VSI_VF as well as partially for
      ICE_VSI_CTRL associated with the VFs.
      
      This API was designed with the idea that VFs are stored in a simple
      array that was expected to be static throughout most of the driver's
      life.
      
      We plan on refactoring VF storage in a few key ways:
      
        1) converting from a simple static array to a hash table
        2) using krefs to track VF references obtained from the hash table
        3) use RCU to delay release of VF memory until after all references
           are dropped
      
      This is motivated by the goal to ensure that the lifetime of VF
      structures is accounted for, and prevent various use-after-free bugs.
      
      With the existing vsi->vf_id, the reference tracking for VFs would
      become somewhat convoluted, because each VSI maintains a vf_id field
      which will then require performing a look up. This means all these flows
      will require reference tracking and proper usage of rcu_read_lock, etc.
      
      We know that the VF VSI will always be backed by a valid VF structure,
      because the VSI is created during VF initialization and removed before
      the VF is destroyed. Rely on this and store a reference to the VF in the
      VSI structure instead of storing a VF ID. This will simplify the usage
      and avoid the need to perform lookups on the hash table in the future.
      
      For ICE_VSI_VF, it is expected that vsi->vf is always non-NULL after
      ice_vsi_alloc succeeds. Because of this, use WARN_ON when checking if a
      vsi->vf pointer is valid when dealing with VF VSIs. This will aid in
      debugging code which violates this assumption and avoid more disastrous
      panics.
      Signed-off-by: default avatarJacob Keller <jacob.e.keller@intel.com>
      Tested-by: default avatarKonrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      b03d519d
    • Karol Kolacinski's avatar
      ice: add TTY for GNSS module for E810T device · 43113ff7
      Karol Kolacinski authored
      Add a new ice_gnss.c file for holding the basic GNSS module functions.
      If the device supports GNSS module, call the new ice_gnss_init and
      ice_gnss_release functions where appropriate.
      
      Implement basic functionality for reading the data from GNSS module
      using TTY device.
      
      Add I2C read AQ command. It is now required for controlling the external
      physical connectors via external I2C port expander on E810-T adapters.
      
      Future changes will introduce write functionality.
      Signed-off-by: default avatarKarol Kolacinski <karol.kolacinski@intel.com>
      Signed-off-by: default avatarSudhansu Sekhar Mishra <sudhansu.mishra@intel.com>
      Tested-by: default avatarSunitha Mekala <sunithax.d.mekala@intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      43113ff7
  20. 18 Feb, 2022 1 commit
    • Jacob Keller's avatar
      ice: fix concurrent reset and removal of VFs · fadead80
      Jacob Keller authored
      Commit c503e632 ("ice: Stop processing VF messages during teardown")
      introduced a driver state flag, ICE_VF_DEINIT_IN_PROGRESS, which is
      intended to prevent some issues with concurrently handling messages from
      VFs while tearing down the VFs.
      
      This change was motivated by crashes caused while tearing down and
      bringing up VFs in rapid succession.
      
      It turns out that the fix actually introduces issues with the VF driver
      caused because the PF no longer responds to any messages sent by the VF
      during its .remove routine. This results in the VF potentially removing
      its DMA memory before the PF has shut down the device queues.
      
      Additionally, the fix doesn't actually resolve concurrency issues within
      the ice driver. It is possible for a VF to initiate a reset just prior
      to the ice driver removing VFs. This can result in the remove task
      concurrently operating while the VF is being reset. This results in
      similar memory corruption and panics purportedly fixed by that commit.
      
      Fix this concurrency at its root by protecting both the reset and
      removal flows using the existing VF cfg_lock. This ensures that we
      cannot remove the VF while any outstanding critical tasks such as a
      virtchnl message or a reset are occurring.
      
      This locking change also fixes the root cause originally fixed by commit
      c503e632 ("ice: Stop processing VF messages during teardown"), so we
      can simply revert it.
      
      Note that I kept these two changes together because simply reverting the
      original commit alone would leave the driver vulnerable to worse race
      conditions.
      
      Fixes: c503e632 ("ice: Stop processing VF messages during teardown")
      Signed-off-by: default avatarJacob Keller <jacob.e.keller@intel.com>
      Tested-by: default avatarKonrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      fadead80
  21. 14 Feb, 2022 1 commit
  22. 10 Feb, 2022 2 commits
  23. 09 Feb, 2022 2 commits
    • Brett Creeley's avatar
      ice: Advertise 802.1ad VLAN filtering and offloads for PF netdev · 1babaf77
      Brett Creeley authored
      In order for the driver to support 802.1ad VLAN filtering and offloads,
      it needs to advertise those VLAN features and also support modifying
      those VLAN features, so make the necessary changes to
      ice_set_netdev_features(). By default, enable CTAG insertion/stripping
      and CTAG filtering for both Single and Double VLAN Modes (SVM/DVM).
      Also, in DVM, enable STAG filtering by default. This is done by
      setting the feature bits in netdev->features. Also, in DVM, support
      toggling of STAG insertion/stripping, but don't enable them by
      default. This is done by setting the feature bits in
      netdev->hw_features.
      
      Since 802.1ad VLAN filtering and offloads are only supported in DVM, make
      sure they are not enabled by default and that they cannot be enabled
      during runtime, when the device is in SVM.
      
      Add an implementation for the ndo_fix_features() callback. This is
      needed since the hardware cannot support multiple VLAN ethertypes for
      VLAN insertion/stripping simultaneously and all supported VLAN filtering
      must either be enabled or disabled together.
      
      Disable inner VLAN stripping by default when DVM is enabled. If a VSI
      supports stripping the inner VLAN in DVM, then it will have to configure
      that during runtime. For example if a VF is configured in a port VLAN
      while DVM is enabled it will be allowed to offload inner VLANs.
      Signed-off-by: default avatarBrett Creeley <brett.creeley@intel.com>
      Tested-by: default avatarGurucharan G <gurucharanx.g@intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      1babaf77
    • Brett Creeley's avatar
      ice: Support configuring the device to Double VLAN Mode · a1ffafb0
      Brett Creeley authored
      In order to support configuring the device in Double VLAN Mode (DVM),
      the DDP and FW have to support DVM. If both support DVM, the PF that
      downloads the package needs to update the default recipes, set the
      VLAN mode, and update boost TCAM entries.
      
      To support updating the default recipes in DVM, add support for
      updating an existing switch recipe's lkup_idx and mask. This is done
      by first calling the get recipe AQ (0x0292) with the desired recipe
      ID. Then, if that is successful update one of the lookup indices
      (lkup_idx) and its associated mask if the mask is valid otherwise
      the already existing mask will be used.
      
      The VLAN mode of the device has to be configured while the global
      configuration lock is held while downloading the DDP, specifically after
      the DDP has been downloaded. If supported, the device will default to
      DVM.
      Co-developed-by: default avatarDan Nowlin <dan.nowlin@intel.com>
      Signed-off-by: default avatarDan Nowlin <dan.nowlin@intel.com>
      Signed-off-by: default avatarBrett Creeley <brett.creeley@intel.com>
      Tested-by: default avatarGurucharan G <gurucharanx.g@intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      a1ffafb0