  1. 19 Mar, 2024 1 commit
    • vdpa/mlx5: Allow CVQ size changes · 749a4016
      Jonah Palmer authored
      The MLX driver was not updating its control virtqueue size at set_vq_num
      and instead always initialized to MLX5_CVQ_MAX_ENT (16) at
      setup_cvq_vring.
      
      QEMU would try to set the size to 64 by default. However, because the
      CVQ size was always initialized to 16, an error would be thrown when
      sending more than 16 control messages (as used-ring entry 17 is initialized to 0).
      For example, starting a guest with x-svq=on and then executing the
      following command would produce the error below:
      
       # for i in {1..20}; do ifconfig eth0 hw ether XX:xx:XX:xx:XX:XX; done
      
       qemu-system-x86_64: Insufficient written data (0)
       [  435.331223] virtio_net virtio0: Failed to set mac address by vq command.
       SIOCSIFHWADDR: Invalid argument
      Acked-by: Dragos Tatulea <dtatulea@nvidia.com>
      Acked-by: Eugenio Pérez <eperezma@redhat.com>
      Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
      Message-Id: <20240216142502.78095-1-jonah.palmer@oracle.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Tested-by: Lei Yang <leiyang@redhat.com>
      Fixes: 5262912e ("vdpa/mlx5: Add support for control VQ and MAC setting")
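The failure mode described in this commit can be illustrated with a toy model of the used ring. This is a sketch in Python, not driver code: `UsedRing`, `device_complete`, `driver_read`, and `run_commands` are hypothetical names, and the ring is reduced to a list of written lengths. It assumes the device side wraps its used index at the stale size (16) while the other side wraps at the size it requested (64).

```python
DEVICE_RING_SIZE = 16   # stale size the device side kept using (MLX5_CVQ_MAX_ENT)
DRIVER_RING_SIZE = 64   # size requested through set_vq_num


class UsedRing:
    """Toy used ring: one 'written length' slot per entry, zero-initialized."""

    def __init__(self, size):
        self.slots = [0] * size


def device_complete(ring, used_idx, device_size, written_len=1):
    """Device side: publish a completion, wrapping at the size it believes in."""
    ring.slots[used_idx % device_size] = written_len


def driver_read(ring, used_idx, driver_size):
    """Driver side: read the completion, wrapping at the size it requested."""
    return ring.slots[used_idx % driver_size]


def run_commands(count, device_size, driver_size):
    """Return the written length the driver observes for each command."""
    ring = UsedRing(driver_size)
    seen = []
    for idx in range(count):
        device_complete(ring, idx, device_size)
        seen.append(driver_read(ring, idx, driver_size))
    return seen


mismatched = run_commands(20, DEVICE_RING_SIZE, DRIVER_RING_SIZE)
matched = run_commands(20, DRIVER_RING_SIZE, DRIVER_RING_SIZE)
```

In this model the first 16 commands are read back with a non-zero length, while command 17 onward reads a slot that was never written, which corresponds to the "Insufficient written data (0)" message. With matching sizes every command is read back correctly.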
  2. 10 Jan, 2024 7 commits
  3. 01 Dec, 2023 1 commit
  4. 01 Nov, 2023 10 commits
  5. 18 Oct, 2023 2 commits
    • vdpa/mlx5: Fix firmware error on creation of 1k VQs · abb0dcf9
      Dragos Tatulea authored
      A firmware error is triggered when configuring a 9k MTU on the PF after
      switching to switchdev mode and then using a vdpa device with larger
      (1k) rings:
      mlx5_cmd_out_err: CREATE_GENERAL_OBJECT(0xa00) op_mod(0xd) failed, status bad resource(0x5), syndrome (0xf6db90), err(-22)
      
      This happens because the hw VQ size parameters are computed from the
      umem_1/2/3_buffer_param_a/b capabilities, and device capabilities are
      read only at the point when the driver is moved to switchdev mode.
      
      The problematic configuration flow looks like this:
      1) Create VF
      2) Unbind VF
      3) Switch PF to switchdev mode.
      4) Bind VF
      5) Set PF MTU to 9k
      6) Create vDPA device
      7) Start VM with vDPA device and 1k queue size
      
      Note that setting the MTU before step 3) doesn't trigger this issue.
      
      This patch reads the aforementioned umem parameters at the latest point
      possible before the VQs of the device are created.
      
      v2:
      - Allocate output with kmalloc to reduce stack frame size.
      - Removed stable from cc.
      
      Fixes: 1a86b377 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices")
      Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
      Message-Id: <20230831155702.1080754-1-dtatulea@nvidia.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: Jason Wang <jasowang@redhat.com>
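The stale-capability problem in this commit can be sketched as follows. This is an illustrative Python model, not the driver: it assumes each umem size is a linear function of the ring size (`param_a * num_ent + param_b`), and the parameter values, `CapabilityCache`, and `read_caps` are made up for the example.

```python
def umem_size(param_a, param_b, num_ent):
    """Buffer size for one umem, assumed linear in the number of ring entries."""
    return param_a * num_ent + param_b


class CapabilityCache:
    """Hypothetical stand-in for capabilities captured at one point in time."""

    def __init__(self, read_caps):
        self._read_caps = read_caps
        self._snapshot = read_caps()   # captured once, may go stale later

    def cached(self):
        return self._snapshot

    def fresh(self):
        return self._read_caps()       # re-read right before VQ creation


firmware = {"param_a": 128, "param_b": 4096}   # made-up values


def read_caps():
    return dict(firmware)


caps = CapabilityCache(read_caps)      # snapshot taken early
firmware["param_a"] = 256              # parameters change afterwards

stale = umem_size(caps.cached()["param_a"], caps.cached()["param_b"], 1024)
current = umem_size(caps.fresh()["param_a"], caps.fresh()["param_b"], 1024)
```

With the cached snapshot the computed buffer is smaller than what the device now expects for a 1k ring, which is the kind of mismatch that reading the parameters late avoids.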
    • vdpa/mlx5: Fix double release of debugfs entry · f8a3db47
      Dragos Tatulea authored
      The error path in setup_driver deletes the debugfs entry but doesn't
      clear the pointer. During .dev_del the invalid pointer will be released
      again causing a crash.
      
      This patch fixes the issue by always clearing the debugfs entry in
      mlx5_vdpa_remove_debugfs. Also, stop removing the debugfs entry in
      .dev_del op: the debugfs entry is already handled within the
      setup_driver/teardown_driver scope.
      
      Cc: stable@vger.kernel.org
      Fixes: f0417e72 ("vdpa/mlx5: Add and remove debugfs in setup/teardown driver")
      Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
      Reviewed-by: Gal Pressman <gal@nvidia.com>
      Message-Id: <20230829174014.928189-2-dtatulea@nvidia.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: Jason Wang <jasowang@redhat.com>
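The double-release pattern and its fix can be modelled in a few lines. This is a Python sketch with hypothetical names (`Debugfs`, `Device`, `remove_debugfs_buggy`), not the kernel code; the point is only that clearing the stored reference on removal makes a second removal harmless.

```python
class Debugfs:
    """Toy registry that raises if the same entry is released twice."""

    def __init__(self):
        self.live = set()
        self._next_id = 0

    def create(self):
        self._next_id += 1
        self.live.add(self._next_id)
        return self._next_id

    def remove(self, entry):
        if entry not in self.live:
            raise RuntimeError("double release of debugfs entry")
        self.live.remove(entry)


class Device:
    def __init__(self, fs):
        self.fs = fs
        self.debugfs_entry = None

    def add_debugfs(self):
        self.debugfs_entry = self.fs.create()

    def remove_debugfs_buggy(self):
        # Releases the entry but leaves the stale reference behind.
        self.fs.remove(self.debugfs_entry)

    def remove_debugfs(self):
        # Releases the entry (if any) and always clears the reference.
        if self.debugfs_entry is not None:
            self.fs.remove(self.debugfs_entry)
        self.debugfs_entry = None
```

Calling `remove_debugfs()` twice is safe, whereas calling `remove_debugfs_buggy()` twice raises in this model, mirroring the error path followed by `.dev_del`.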
  6. 10 Aug, 2023 3 commits
    • vdpa/mlx5: Fix crash on shutdown for when no ndev exists · 810b0cc1
      Dragos Tatulea authored
      The ndev was accessed on shutdown without checking whether it actually
      exists. This triggered the crash pasted below.
      
      Instead of doing the ndev check, delete the shutdown handler altogether.
      The irqs will be released at the parent VF level (mlx5_core).
      
       BUG: kernel NULL pointer dereference, address: 0000000000000300
       #PF: supervisor read access in kernel mode
       #PF: error_code(0x0000) - not-present page
       PGD 0 P4D 0
       Oops: 0000 [#1] SMP
       CPU: 0 PID: 1 Comm: systemd-shutdow Not tainted 6.5.0-rc2_for_upstream_min_debug_2023_07_17_15_05 #1
       Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
       RIP: 0010:mlx5v_shutdown+0xe/0x50 [mlx5_vdpa]
       RSP: 0018:ffff8881003bfdc0 EFLAGS: 00010286
       RAX: ffff888103befba0 RBX: ffff888109d28008 RCX: 0000000000000017
       RDX: 0000000000000001 RSI: 0000000000000212 RDI: ffff888109d28000
       RBP: 0000000000000000 R08: 0000000d3a3a3882 R09: 0000000000000001
       R10: 0000000000000000 R11: 0000000000000000 R12: ffff888109d28000
       R13: ffff888109d28080 R14: 00000000fee1dead R15: 0000000000000000
       FS:  00007f4969e0be40(0000) GS:ffff88852c800000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 0000000000000300 CR3: 00000001051cd006 CR4: 0000000000370eb0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
       Call Trace:
        <TASK>
        ? __die+0x20/0x60
        ? page_fault_oops+0x14c/0x3c0
        ? exc_page_fault+0x75/0x140
        ? asm_exc_page_fault+0x22/0x30
        ? mlx5v_shutdown+0xe/0x50 [mlx5_vdpa]
        device_shutdown+0x13e/0x1e0
        kernel_restart+0x36/0x90
        __do_sys_reboot+0x141/0x210
        ? vfs_writev+0xcd/0x140
        ? handle_mm_fault+0x161/0x260
        ? do_writev+0x6b/0x110
        do_syscall_64+0x3d/0x90
        entry_SYSCALL_64_after_hwframe+0x46/0xb0
       RIP: 0033:0x7f496990fb56
       RSP: 002b:00007fffc7bdde88 EFLAGS: 00000206 ORIG_RAX: 00000000000000a9
       RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f496990fb56
       RDX: 0000000001234567 RSI: 0000000028121969 RDI: fffffffffee1dead
       RBP: 00007fffc7bde1d0 R08: 0000000000000000 R09: 0000000000000000
       R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
       R13: 00007fffc7bddf10 R14: 0000000000000000 R15: 00007fffc7bde2b8
        </TASK>
       CR2: 0000000000000300
       ---[ end trace 0000000000000000 ]---
      
      Fixes: bc9a2b3e ("vdpa/mlx5: Support interrupt bypassing")
      Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
      Message-Id: <20230803152648.199297-1-dtatulea@nvidia.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • vdpa/mlx5: Delete control vq iotlb in destroy_mr only when necessary · ad03a0f4
      Eugenio Pérez authored
      mlx5_vdpa_destroy_mr can be called from .set_map with data ASID after
      the control virtqueue ASID iotlb has been populated. The control vq
      iotlb must not be cleared, since it will not be populated again.
      
      So call the ASID aware destroy function which makes sure that the
      right vq resource is destroyed.
      
      Fixes: 8fcd20c3 ("vdpa/mlx5: Support different address spaces for control and data")
      Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
      Reviewed-by: Gal Pressman <gal@nvidia.com>
      Message-Id: <20230802171231.11001-5-dtatulea@nvidia.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: Jason Wang <jasowang@redhat.com>
    • vdpa/mlx5: Correct default number of queues when MQ is on · 3fe02419
      Dragos Tatulea authored
      The standard specifies that the initial number of queues is the
      default, which is 1 (1 tx, 1 rx).
      Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
      Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
      Message-Id: <20230727172354.68243-2-dtatulea@nvidia.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: Jason Wang <jasowang@redhat.com>
      Tested-by: Lei Yang <leiyang@redhat.com>
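The rule this commit refers to can be written down as a small function. This is a Python sketch under the assumption that the number of active queue pairs stays at 1 until the driver explicitly requests more through the control virtqueue; `active_queue_pairs` and its parameters are hypothetical names.

```python
def active_queue_pairs(mq_negotiated, max_pairs, requested_pairs=None):
    """Return the number of queue pairs in use.

    Even when MQ is negotiated, a single pair (1 tx, 1 rx) is used until
    the driver asks for a different number.
    """
    if not mq_negotiated or requested_pairs is None:
        return 1
    if not 1 <= requested_pairs <= max_pairs:
        raise ValueError("requested pairs out of range")
    return requested_pairs
```

For example, `active_queue_pairs(True, 8)` yields 1, and only `active_queue_pairs(True, 8, 4)` yields 4.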
  7. 07 Aug, 2023 1 commit
    • net/mlx5: Allocate completion EQs dynamically · f14c1a14
      Maher Sanalla authored
      This commit enables the dynamic allocation of EQs at runtime, allowing
      for more flexibility in managing completion EQs and reducing the memory
      overhead of driver load. Whenever a CQ is created for a given vector
      index, the driver looks up whether a completion EQ is already mapped
      for that vector and, if so, uses it. Otherwise, it allocates a new EQ
      on demand and uses it for the CQ completion events.
      
      Add a protection lock to the EQ table to protect from concurrent EQ
      creation attempts.
      
      While at it, replace mlx5_vector2irqn()/mlx5_vector2eqn() with
      mlx5_comp_eqn_get() and mlx5_comp_irqn_get() which will allocate an
      EQ on demand if no EQ is found for the given vector.
      Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
      Reviewed-by: Shay Drory <shayd@nvidia.com>
      Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
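The lookup-or-allocate scheme described in this commit is a standard lazy-initialization pattern guarded by a lock. The following Python sketch shows the idea only: `CompEqTable`, `comp_eqn_get`, and the `create_eq` callback are hypothetical names and do not mirror the mlx5 API.

```python
import threading


class CompEqTable:
    """Toy table mapping a vector index to a lazily created completion EQ."""

    def __init__(self, create_eq):
        self._create_eq = create_eq      # callback that allocates a new EQ
        self._eqs = {}
        self._lock = threading.Lock()    # protects against concurrent creation

    def comp_eqn_get(self, vecidx):
        """Return the EQ for vecidx, allocating it on first use."""
        with self._lock:
            eq = self._eqs.get(vecidx)
            if eq is None:
                eq = self._create_eq(vecidx)
                self._eqs[vecidx] = eq
            return eq
```

Because the lookup and the creation happen under the same lock, concurrent callers asking for the same vector end up sharing a single EQ instead of each creating their own.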
  8. 27 Jun, 2023 1 commit
    • vdpa/mlx5: Support interrupt bypassing · bc9a2b3e
      Eli Cohen authored
      Add support for generating interrupts directly from the device to the
      VM's vCPU, thus avoiding overhead on the host CPU.
      
      When supported, the driver will attempt to allocate vectors for each
      data virtqueue. If a vector cannot be provided for a virtqueue, that
      virtqueue will use QP mode, where notifications go through the driver.
      
      In addition, we add a shutdown callback to make sure allocated
      interrupts are released in case of shutdown to allow clean shutdown.
      Signed-off-by: Eli Cohen <elic@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      Message-Id: <20230607190007.290505-1-dtatulea@nvidia.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
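The per-virtqueue fallback described in this commit can be sketched as a simple allocation loop. This is an illustrative Python model; `assign_notification_modes` and the `"irq"`/`"qp"` labels are hypothetical, and the vector numbers are arbitrary.

```python
def assign_notification_modes(num_data_vqs, free_vectors):
    """Return one (mode, vector) tuple per data virtqueue.

    A virtqueue gets ("irq", vector) while vectors are available and falls
    back to ("qp", None) once the pool is exhausted.
    """
    pool = list(free_vectors)
    modes = []
    for _ in range(num_data_vqs):
        if pool:
            modes.append(("irq", pool.pop(0)))
        else:
            modes.append(("qp", None))
    return modes
```

With four data virtqueues and only two free vectors, the first two virtqueues get a dedicated vector and the remaining two use QP mode.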
  9. 08 Jun, 2023 1 commit
    • vdpa/mlx5: Fix hang when cvq commands are triggered during device unregister · 73790bdf
      Dragos Tatulea authored
      Currently the vdpa device is unregistered after the workqueue that
      processes vq commands is disabled. However, the device unregister
      process can still send commands to the cvq (a vlan delete for example),
      which leads to a hang because the handling workqueue has been disabled
      and the command never finishes:
      
       [ 2263.095764] rcu: INFO: rcu_sched self-detected stall on CPU
       [ 2263.096307] rcu:        9-....: (5250 ticks this GP) idle=dac4/1/0x4000000000000000 softirq=111009/111009 fqs=2544
       [ 2263.097154] rcu:        (t=5251 jiffies g=393549 q=347 ncpus=10)
       [ 2263.097648] CPU: 9 PID: 94300 Comm: kworker/u20:2 Not tainted 6.3.0-rc6_for_upstream_min_debug_2023_04_14_00_02 #1
       [ 2263.098535] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
       [ 2263.099481] Workqueue: mlx5_events mlx5_vhca_state_work_handler [mlx5_core]
       [ 2263.100143] RIP: 0010:virtnet_send_command+0x109/0x170
       [ 2263.100621] Code: 1d df f5 ff 85 c0 78 5c 48 8b 7b 08 e8 d0 c5 f5 ff 84 c0 75 11 eb 22 48 8b 7b 08 e8 01 b7 f5 ff 84 c0 75 15 f3 90 48 8b 7b 08 <48> 8d 74 24 04 e8 8d c5 f5 ff 48 85 c0 74 de 48 8b 83 f8 00 00 00
       [ 2263.102148] RSP: 0018:ffff888139cf36e8 EFLAGS: 00000246
       [ 2263.102624] RAX: 0000000000000000 RBX: ffff888166bea940 RCX: 0000000000000001
       [ 2263.103244] RDX: 0000000000000000 RSI: ffff888139cf36ec RDI: ffff888146763800
       [ 2263.103864] RBP: ffff888139cf3710 R08: ffff88810d201000 R09: 0000000000000000
       [ 2263.104473] R10: 0000000000000002 R11: 0000000000000003 R12: 0000000000000002
       [ 2263.105082] R13: 0000000000000002 R14: ffff888114528400 R15: ffff888166bea000
       [ 2263.105689] FS:  0000000000000000(0000) GS:ffff88852cc80000(0000) knlGS:0000000000000000
       [ 2263.106404] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       [ 2263.106925] CR2: 00007f31f394b000 CR3: 000000010615b006 CR4: 0000000000370ea0
       [ 2263.107542] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       [ 2263.108163] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
       [ 2263.108769] Call Trace:
       [ 2263.109059]  <TASK>
       [ 2263.109320]  ? check_preempt_wakeup+0x11f/0x230
       [ 2263.109750]  virtnet_vlan_rx_kill_vid+0x5a/0xa0
       [ 2263.110180]  vlan_vid_del+0x9c/0x170
       [ 2263.110546]  vlan_device_event+0x351/0x760 [8021q]
       [ 2263.111004]  raw_notifier_call_chain+0x41/0x60
       [ 2263.111426]  dev_close_many+0xcb/0x120
       [ 2263.111808]  unregister_netdevice_many_notify+0x130/0x770
       [ 2263.112297]  ? wq_worker_running+0xa/0x30
       [ 2263.112688]  unregister_netdevice_queue+0x89/0xc0
       [ 2263.113128]  unregister_netdev+0x18/0x20
       [ 2263.113512]  virtnet_remove+0x4f/0x230
       [ 2263.113885]  virtio_dev_remove+0x31/0x70
       [ 2263.114273]  device_release_driver_internal+0x18f/0x1f0
       [ 2263.114746]  bus_remove_device+0xc6/0x130
       [ 2263.115146]  device_del+0x173/0x3c0
       [ 2263.115502]  ? kernfs_find_ns+0x35/0xd0
       [ 2263.115895]  device_unregister+0x1a/0x60
       [ 2263.116279]  unregister_virtio_device+0x11/0x20
       [ 2263.116706]  device_release_driver_internal+0x18f/0x1f0
       [ 2263.117182]  bus_remove_device+0xc6/0x130
       [ 2263.117576]  device_del+0x173/0x3c0
       [ 2263.117929]  ? vdpa_dev_remove+0x20/0x20 [vdpa]
       [ 2263.118364]  device_unregister+0x1a/0x60
       [ 2263.118752]  mlx5_vdpa_dev_del+0x4c/0x80 [mlx5_vdpa]
       [ 2263.119232]  vdpa_match_remove+0x21/0x30 [vdpa]
       [ 2263.119663]  bus_for_each_dev+0x71/0xc0
       [ 2263.120054]  vdpa_mgmtdev_unregister+0x57/0x70 [vdpa]
       [ 2263.120520]  mlx5v_remove+0x12/0x20 [mlx5_vdpa]
       [ 2263.120953]  auxiliary_bus_remove+0x18/0x30
       [ 2263.121356]  device_release_driver_internal+0x18f/0x1f0
       [ 2263.121830]  bus_remove_device+0xc6/0x130
       [ 2263.122223]  device_del+0x173/0x3c0
       [ 2263.122581]  ? devl_param_driverinit_value_get+0x29/0x90
       [ 2263.123070]  mlx5_rescan_drivers_locked+0xc4/0x2d0 [mlx5_core]
       [ 2263.123633]  mlx5_unregister_device+0x54/0x80 [mlx5_core]
       [ 2263.124169]  mlx5_uninit_one+0x54/0x150 [mlx5_core]
       [ 2263.124656]  mlx5_sf_dev_remove+0x45/0x90 [mlx5_core]
       [ 2263.125153]  auxiliary_bus_remove+0x18/0x30
       [ 2263.125560]  device_release_driver_internal+0x18f/0x1f0
       [ 2263.126052]  bus_remove_device+0xc6/0x130
       [ 2263.126451]  device_del+0x173/0x3c0
       [ 2263.126815]  mlx5_sf_dev_remove+0x39/0xf0 [mlx5_core]
       [ 2263.127318]  mlx5_sf_dev_state_change_handler+0x178/0x270 [mlx5_core]
       [ 2263.127920]  blocking_notifier_call_chain+0x5a/0x80
       [ 2263.128379]  mlx5_vhca_state_work_handler+0x151/0x200 [mlx5_core]
       [ 2263.128951]  process_one_work+0x1bb/0x3c0
       [ 2263.129355]  ? process_one_work+0x3c0/0x3c0
       [ 2263.129766]  worker_thread+0x4d/0x3c0
       [ 2263.130140]  ? process_one_work+0x3c0/0x3c0
       [ 2263.130548]  kthread+0xb9/0xe0
       [ 2263.130895]  ? kthread_complete_and_exit+0x20/0x20
       [ 2263.131349]  ret_from_fork+0x1f/0x30
       [ 2263.131717]  </TASK>
      
      The fix is to disable and destroy the workqueue after the device
      unregister. It is expected that vhost will not trigger kicks after
      the unregister. Even if it did, the wq is already disabled by
      setting the pointer to NULL (done in the referenced commit).
      
      Fixes: ad6dc1da ("vdpa/mlx5: Avoid processing works if workqueue was destroyed")
      Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
      Message-Id: <20230516095800.3549932-1-dtatulea@nvidia.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Acked-by: Jason Wang <jasowang@redhat.com>
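The ordering issue fixed by this commit can be shown with a toy teardown sequence. This Python sketch uses hypothetical names (`Cvq`, `unregister_device`, `teardown_old`, `teardown_fixed`) and models a hang as a command that is never processed.

```python
class Cvq:
    """Toy control virtqueue whose commands are handled by a workqueue."""

    def __init__(self):
        self.wq_enabled = True
        self.completed = []

    def kick(self, cmd):
        """Return True if the command was processed, False if it would hang."""
        if not self.wq_enabled:
            return False
        self.completed.append(cmd)
        return True


def unregister_device(cvq):
    # Unregistering can still issue control commands, e.g. a VLAN delete.
    return cvq.kick("vlan_del")


def teardown_old(cvq):
    cvq.wq_enabled = False            # workqueue disabled first
    return unregister_device(cvq)     # the command is never processed


def teardown_fixed(cvq):
    ok = unregister_device(cvq)       # the command is processed normally
    cvq.wq_enabled = False            # workqueue disabled afterwards
    return ok
```

In this model `teardown_old` leaves the VLAN delete unprocessed, while `teardown_fixed` completes it before the workqueue goes away.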
  10. 21 Apr, 2023 3 commits
  11. 04 Apr, 2023 1 commit
  12. 10 Mar, 2023 1 commit
  13. 21 Feb, 2023 6 commits
    • vdpa/mlx5: support device features provisioning · deeacf35
      Si-Wei Liu authored
      This patch implements features provisioning for mlx5_vdpa.
      
      1) Validate that the provisioned features are a subset of the parent
          features.
      2) Clear features that are not wanted by userspace.
      
      For example:
      
          # vdpa mgmtdev show
          pci/0000:41:04.2:
            supported_classes net
            max_supported_vqs 65
            dev_features CSUM GUEST_CSUM MTU MAC HOST_TSO4 HOST_TSO6 STATUS CTRL_VQ CTRL_VLAN MQ CTRL_MAC_ADDR VERSION_1 ACCESS_PLATFORM
      
      1) Provision vDPA device with all features derived from the parent
      
          # vdpa dev add name vdpa1 mgmtdev pci/0000:41:04.2
          # vdpa dev config show
          vdpa1: mac e4:11:c6:d3:45:f0 link up link_announce false max_vq_pairs 1 mtu 1500
            negotiated_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS CTRL_VQ CTRL_VLAN MQ CTRL_MAC_ADDR VERSION_1 ACCESS_PLATFORM
      
      2) Provision vDPA device with a subset of parent features
      
          # vdpa dev add name vdpa1 mgmtdev pci/0000:41:04.2 device_features 0x300020000
          # vdpa dev config show
          vdpa1:
            negotiated_features CTRL_VQ VERSION_1 ACCESS_PLATFORM
      Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
      Reviewed-by: Eli Cohen <elic@nvidia.com>
      Message-Id: <1675725124-7375-7-git-send-email-si-wei.liu@oracle.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
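The subset check and the `device_features 0x300020000` value from the example in this commit can be reproduced with plain bit arithmetic. This Python sketch is not the driver code: the bit numbers follow the virtio feature-bit assignments, while `to_mask`, `to_names`, and `provision` are hypothetical helpers.

```python
# Virtio feature bit numbers for the features listed in the mgmtdev output.
FEATURE_BITS = {
    "CSUM": 0, "GUEST_CSUM": 1, "MTU": 3, "MAC": 5,
    "HOST_TSO4": 11, "HOST_TSO6": 12, "STATUS": 16, "CTRL_VQ": 17,
    "CTRL_VLAN": 19, "MQ": 22, "CTRL_MAC_ADDR": 23,
    "VERSION_1": 32, "ACCESS_PLATFORM": 33,
}


def to_mask(names):
    """Build a feature bitmask from feature names."""
    mask = 0
    for name in names:
        mask |= 1 << FEATURE_BITS[name]
    return mask


def to_names(mask):
    """List the known feature names set in a bitmask, in bit order."""
    return [name for name, bit in FEATURE_BITS.items() if mask & (1 << bit)]


def provision(parent_features, device_features=None):
    """Return the features to expose, rejecting anything the parent lacks."""
    if device_features is None:
        return parent_features            # inherit everything from the parent
    if device_features & ~parent_features:
        raise ValueError("provisioned features are not a subset of the parent's")
    return device_features


PARENT = to_mask(FEATURE_BITS)            # all dev_features shown by mgmtdev
subset = provision(PARENT, 0x300020000)
```

Here `to_names(subset)` gives `CTRL_VQ`, `VERSION_1`, and `ACCESS_PLATFORM`, since 0x300020000 sets bits 17, 32, and 33.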
    • vdpa/mlx5: make MTU/STATUS presence conditional on feature bits · 033779a7
      Si-Wei Liu authored
      The spec says:
          mtu only exists if VIRTIO_NET_F_MTU is set
          status only exists if VIRTIO_NET_F_STATUS is set
      
      We should only present MTU and STATUS conditionally depending on
      the feature bits.
      Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
      Reviewed-by: Eli Cohen <elic@nvidia.com>
      Message-Id: <1675725124-7375-6-git-send-email-si-wei.liu@oracle.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
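The conditional presentation of the two config fields can be sketched as follows. This is a minimal Python illustration; `visible_config` is a hypothetical helper, and the bit numbers are the virtio assignments for VIRTIO_NET_F_MTU and VIRTIO_NET_F_STATUS.

```python
VIRTIO_NET_F_MTU = 3
VIRTIO_NET_F_STATUS = 16


def visible_config(features, mtu, status):
    """Return only the config fields that the feature bits make valid."""
    config = {}
    if features & (1 << VIRTIO_NET_F_MTU):
        config["mtu"] = mtu
    if features & (1 << VIRTIO_NET_F_STATUS):
        config["status"] = status
    return config
```

A device provisioned without either bit would then report neither field, rather than presenting values that the spec says do not exist.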
    • vdpa: mlx5: support per virtqueue dma device · 36871fb9
      Jason Wang authored
      This patch implements per virtqueue dma device for mlx5_vdpa. This is
      needed for virtio_vdpa to work for CVQ which is backed by vringh but
      not DMA. We simply advertise the vDPA device itself as the DMA device
      for CVQ then DMA API can simply use PA so the identical mapping for
      CVQ can still be used. Otherwise the identical (1:1) mapping won't
      work when platform IOMMU is enabled since the IOVA is allocated on
      demand which is not necessarily the PA.
      
      This fixes the following crash when mlx5 vDPA device is bound to
      virtio-vdpa with platform IOMMU enabled but not in passthrough mode:
      
      BUG: unable to handle page fault for address: ff2fb3063deb1002
      #PF: supervisor read access in kernel mode
      #PF: error_code(0x0000) - not-present page
      PGD 1393001067 P4D 1393002067 PUD 0
      Oops: 0000 [#1] PREEMPT SMP NOPTI
      CPU: 55 PID: 8923 Comm: kworker/u112:3 Kdump: loaded Not tainted 6.1.0+ #7
      Hardware name: Dell Inc. PowerEdge R750/0PJ80M, BIOS 1.5.4 12/17/2021
      Workqueue: mlx5_vdpa_wq mlx5_cvq_kick_handler [mlx5_vdpa]
      RIP: 0010:vringh_getdesc_iotlb+0x93/0x1d0 [vringh]
      Code: 14 25 40 ef 01 00 83 82 c0 0a 00 00 01 48 2b 05 93 5a 1b ea 8b 4c 24 14 48 c1 f8 06 48 c1 e0 0c 48 03 05 90 5a 1b ea 48 01 c8 <0f> b7 00 83 aa c0 0a 00 00 01 65 ff 0d bc e4 41 3f 0f 84 05 01 00
      RSP: 0018:ff46821ba664fdf8 EFLAGS: 00010282
      RAX: ff2fb3063deb1002 RBX: 0000000000000a20 RCX: 0000000000000002
      RDX: ff2fb318d2f94380 RSI: 0000000000000002 RDI: 0000000000000001
      RBP: ff2fb3065e832410 R08: ff46821ba664fe00 R09: 0000000000000001
      R10: 0000000000000000 R11: 000000000000000d R12: ff2fb3065e832488
      R13: ff2fb3065e8324a8 R14: ff2fb3065e8324c8 R15: ff2fb3065e8324a8
      FS:  0000000000000000(0000) GS:ff2fb3257fac0000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: ff2fb3063deb1002 CR3: 0000001392010006 CR4: 0000000000771ee0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      PKRU: 55555554
      Call Trace:
      <TASK>
        mlx5_cvq_kick_handler+0x89/0x2b0 [mlx5_vdpa]
        process_one_work+0x1e2/0x3b0
        ? rescuer_thread+0x390/0x390
        worker_thread+0x50/0x3a0
        ? rescuer_thread+0x390/0x390
        kthread+0xd6/0x100
        ? kthread_complete_and_exit+0x20/0x20
        ret_from_fork+0x1f/0x30
        </TASK>
      Reviewed-by: Eli Cohen <elic@nvidia.com>
      Tested-by: Eli Cohen <elic@nvidia.com>
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Message-Id: <20230119061525.75068-6-jasowang@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
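The per-virtqueue selection described in this commit amounts to a small dispatch on the queue index. The Python sketch below is illustrative only: `get_vq_dma_dev` and its parameters are hypothetical, and it assumes that data virtqueues keep using the parent's DMA device, which the commit message does not spell out.

```python
def get_vq_dma_dev(vq_idx, ctrl_vq_idx, vdpa_dev, parent_dma_dev):
    """Pick the DMA device for a virtqueue.

    The control virtqueue is handled in software, so it is given the vDPA
    device itself. Data virtqueues are assumed to use the parent DMA device.
    """
    if vq_idx == ctrl_vq_idx:
        return vdpa_dev
    return parent_dma_dev
```

For a device with two data virtqueues and the control virtqueue at index 2, only index 2 resolves to the vDPA device.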
    • vdpa/mlx5: Add RX counters to debugfs · 0a599750
      Eli Cohen authored
      For each interface, either VLAN tagged or untagged, add two hardware
      counters: one for unicast and another for multicast. The counters count
      RX packets and bytes and can be read through debugfs:
      
      $ cat /sys/kernel/debug/mlx5/mlx5_core.sf.1/vdpa-0/rx/untagged/mcast/packets
      $ cat /sys/kernel/debug/mlx5/mlx5_core.sf.1/vdpa-0/rx/untagged/ucast/bytes
      
      This feature is controlled via the config option
      MLX5_VDPA_STEERING_DEBUG. It is off by default as it may have some
      impact on performance.
      
      Includes a fixup by Yang Yingliang <yangyingliang@huawei.com>:
      
      vdpa/mlx5: fix check wrong pointer in mlx5_vdpa_add_mac_vlan_rules()
      
      The local variable 'rule' is not used anymore, fix return value
      check after calling mlx5_add_flow_rules().
      Signed-off-by: Eli Cohen <elic@nvidia.com>
      Message-Id: <20221114131759.57883-9-elic@nvidia.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      Message-Id: <20230104074418.1737510-1-yangyingliang@huawei.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: Eli Cohen <elic@nvidia.com>
      Acked-by: Jason Wang <jasowang@redhat.com>
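The debugfs layout shown in this commit for untagged traffic can be enumerated with a small path helper. This Python sketch only builds paths and does not read anything; `untagged_rx_counter_path` is a hypothetical name, and the directory layout is taken from the two `cat` examples in the commit message.

```python
from pathlib import PurePosixPath

CASTS = ("ucast", "mcast")
METRICS = ("packets", "bytes")


def untagged_rx_counter_path(vdpa_debugfs_dir, cast, metric):
    """Build the debugfs path of one RX counter for untagged traffic."""
    if cast not in CASTS:
        raise ValueError("cast must be 'ucast' or 'mcast'")
    if metric not in METRICS:
        raise ValueError("metric must be 'packets' or 'bytes'")
    return PurePosixPath(vdpa_debugfs_dir) / "rx" / "untagged" / cast / metric


BASE = "/sys/kernel/debug/mlx5/mlx5_core.sf.1/vdpa-0"
all_paths = [
    str(untagged_rx_counter_path(BASE, cast, metric))
    for cast in CASTS
    for metric in METRICS
]
```

The resulting list covers the four untagged counters (unicast and multicast, packets and bytes) for the device directory used in the example.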
    • vdpa/mlx5: Add debugfs subtree · 29422100
      Eli Cohen authored
      Add debugfs subtree and expose flow table ID and TIR number. This
      information can be used by external tools to do extended
      troubleshooting.
      
      The information can be retrieved like so:
      $ cat /sys/kernel/debug/mlx5/mlx5_core.sf.1/vdpa-0/rx/table_id
      $ cat /sys/kernel/debug/mlx5/mlx5_core.sf.1/vdpa-0/rx/tirn
      Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
      Acked-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: Eli Cohen <elic@nvidia.com>
      Message-Id: <20221114131759.57883-8-elic@nvidia.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • vdpa/mlx5: Move some definitions to a new header file · 72c67e9b
      Eli Cohen authored
      Move some definitions from mlx5_vnet.c to newly added header file
      mlx5_vnet.h. We need these definitions for the following patches that
      add debugfs tree to expose information vital for debug.
      Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
      Acked-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: Eli Cohen <elic@nvidia.com>
      Message-Id: <20221114131759.57883-7-elic@nvidia.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
  14. 28 Dec, 2022 2 commits