1. 02 Jan, 2023 2 commits
    • Jamal Hadi Salim's avatar
      net: sched: cbq: dont intepret cls results when asked to drop · caa4b35b
      Jamal Hadi Salim authored
      If asked to drop a packet via TC_ACT_SHOT it is unsafe to assume that
      res.class contains a valid pointer
      
      Sample splat reported by Kyle Zeng
      
      [    5.405624] 0: reclassify loop, rule prio 0, protocol 800
      [    5.406326] ==================================================================
      [    5.407240] BUG: KASAN: slab-out-of-bounds in cbq_enqueue+0x54b/0xea0
      [    5.407987] Read of size 1 at addr ffff88800e3122aa by task poc/299
      [    5.408731]
      [    5.408897] CPU: 0 PID: 299 Comm: poc Not tainted 5.10.155+ #15
      [    5.409516] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
      BIOS 1.15.0-1 04/01/2014
      [    5.410439] Call Trace:
      [    5.410764]  dump_stack+0x87/0xcd
      [    5.411153]  print_address_description+0x7a/0x6b0
      [    5.411687]  ? vprintk_func+0xb9/0xc0
      [    5.411905]  ? printk+0x76/0x96
      [    5.412110]  ? cbq_enqueue+0x54b/0xea0
      [    5.412323]  kasan_report+0x17d/0x220
      [    5.412591]  ? cbq_enqueue+0x54b/0xea0
      [    5.412803]  __asan_report_load1_noabort+0x10/0x20
      [    5.413119]  cbq_enqueue+0x54b/0xea0
      [    5.413400]  ? __kasan_check_write+0x10/0x20
      [    5.413679]  __dev_queue_xmit+0x9c0/0x1db0
      [    5.413922]  dev_queue_xmit+0xc/0x10
      [    5.414136]  ip_finish_output2+0x8bc/0xcd0
      [    5.414436]  __ip_finish_output+0x472/0x7a0
      [    5.414692]  ip_finish_output+0x5c/0x190
      [    5.414940]  ip_output+0x2d8/0x3c0
      [    5.415150]  ? ip_mc_finish_output+0x320/0x320
      [    5.415429]  __ip_queue_xmit+0x753/0x1760
      [    5.415664]  ip_queue_xmit+0x47/0x60
      [    5.415874]  __tcp_transmit_skb+0x1ef9/0x34c0
      [    5.416129]  tcp_connect+0x1f5e/0x4cb0
      [    5.416347]  tcp_v4_connect+0xc8d/0x18c0
      [    5.416577]  __inet_stream_connect+0x1ae/0xb40
      [    5.416836]  ? local_bh_enable+0x11/0x20
      [    5.417066]  ? lock_sock_nested+0x175/0x1d0
      [    5.417309]  inet_stream_connect+0x5d/0x90
      [    5.417548]  ? __inet_stream_connect+0xb40/0xb40
      [    5.417817]  __sys_connect+0x260/0x2b0
      [    5.418037]  __x64_sys_connect+0x76/0x80
      [    5.418267]  do_syscall_64+0x31/0x50
      [    5.418477]  entry_SYSCALL_64_after_hwframe+0x61/0xc6
      [    5.418770] RIP: 0033:0x473bb7
      [    5.418952] Code: 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00
      00 00 90 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2a 00 00
      00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 18 89 54 24 0c 48 89 34
      24 89
      [    5.420046] RSP: 002b:00007fffd20eb0f8 EFLAGS: 00000246 ORIG_RAX:
      000000000000002a
      [    5.420472] RAX: ffffffffffffffda RBX: 00007fffd20eb578 RCX: 0000000000473bb7
      [    5.420872] RDX: 0000000000000010 RSI: 00007fffd20eb110 RDI: 0000000000000007
      [    5.421271] RBP: 00007fffd20eb150 R08: 0000000000000001 R09: 0000000000000004
      [    5.421671] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
      [    5.422071] R13: 00007fffd20eb568 R14: 00000000004fc740 R15: 0000000000000002
      [    5.422471]
      [    5.422562] Allocated by task 299:
      [    5.422782]  __kasan_kmalloc+0x12d/0x160
      [    5.423007]  kasan_kmalloc+0x5/0x10
      [    5.423208]  kmem_cache_alloc_trace+0x201/0x2e0
      [    5.423492]  tcf_proto_create+0x65/0x290
      [    5.423721]  tc_new_tfilter+0x137e/0x1830
      [    5.423957]  rtnetlink_rcv_msg+0x730/0x9f0
      [    5.424197]  netlink_rcv_skb+0x166/0x300
      [    5.424428]  rtnetlink_rcv+0x11/0x20
      [    5.424639]  netlink_unicast+0x673/0x860
      [    5.424870]  netlink_sendmsg+0x6af/0x9f0
      [    5.425100]  __sys_sendto+0x58d/0x5a0
      [    5.425315]  __x64_sys_sendto+0xda/0xf0
      [    5.425539]  do_syscall_64+0x31/0x50
      [    5.425764]  entry_SYSCALL_64_after_hwframe+0x61/0xc6
      [    5.426065]
      [    5.426157] The buggy address belongs to the object at ffff88800e312200
      [    5.426157]  which belongs to the cache kmalloc-128 of size 128
      [    5.426955] The buggy address is located 42 bytes to the right of
      [    5.426955]  128-byte region [ffff88800e312200, ffff88800e312280)
      [    5.427688] The buggy address belongs to the page:
      [    5.427992] page:000000009875fabc refcount:1 mapcount:0
      mapping:0000000000000000 index:0x0 pfn:0xe312
      [    5.428562] flags: 0x100000000000200(slab)
      [    5.428812] raw: 0100000000000200 dead000000000100 dead000000000122
      ffff888007843680
      [    5.429325] raw: 0000000000000000 0000000000100010 00000001ffffffff
      ffff88800e312401
      [    5.429875] page dumped because: kasan: bad access detected
      [    5.430214] page->mem_cgroup:ffff88800e312401
      [    5.430471]
      [    5.430564] Memory state around the buggy address:
      [    5.430846]  ffff88800e312180: fc fc fc fc fc fc fc fc fc fc fc fc
      fc fc fc fc
      [    5.431267]  ffff88800e312200: 00 00 00 00 00 00 00 00 00 00 00 00
      00 00 00 fc
      [    5.431705] >ffff88800e312280: fc fc fc fc fc fc fc fc fc fc fc fc
      fc fc fc fc
      [    5.432123]                                   ^
      [    5.432391]  ffff88800e312300: 00 00 00 00 00 00 00 00 00 00 00 00
      00 00 00 fc
      [    5.432810]  ffff88800e312380: fc fc fc fc fc fc fc fc fc fc fc fc
      fc fc fc fc
      [    5.433229] ==================================================================
      [    5.433648] Disabling lock debugging due to kernel taint
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Reported-by: default avatarKyle Zeng <zengyhkyle@gmail.com>
      Signed-off-by: default avatarJamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      caa4b35b
    • Jamal Hadi Salim's avatar
      net: sched: atm: dont intepret cls results when asked to drop · a2965c7b
      Jamal Hadi Salim authored
      If asked to drop a packet via TC_ACT_SHOT it is unsafe to assume
      res.class contains a valid pointer
      Fixes: b0188d4d ("[NET_SCHED]: sch_atm: Lindent")
      Signed-off-by: default avatarJamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a2965c7b
  2. 01 Jan, 2023 13 commits
  3. 30 Dec, 2022 18 commits
  4. 28 Dec, 2022 7 commits
    • Eli Cohen's avatar
      net/mlx5: Lag, fix failure to cancel delayed bond work · 4d1c1379
      Eli Cohen authored
      Commit 0d4e8ed1 ("net/mlx5: Lag, avoid lockdep warnings")
      accidentally removed a call to cancel delayed bond work thus it may
      cause queued delay to expire and fall on an already destroyed work
      queue.
      
      Fix by restoring the call cancel_delayed_work_sync() before
      destroying the workqueue.
      
      This prevents call trace such as this:
      
      [  329.230417] BUG: kernel NULL pointer dereference, address: 0000000000000000
       [  329.231444] #PF: supervisor write access in kernel mode
       [  329.232233] #PF: error_code(0x0002) - not-present page
       [  329.233007] PGD 0 P4D 0
       [  329.233476] Oops: 0002 [#1] SMP
       [  329.234012] CPU: 5 PID: 145 Comm: kworker/u20:4 Tainted: G OE      6.0.0-rc5_mlnx #1
       [  329.235282] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
       [  329.236868] Workqueue: mlx5_cmd_0000:08:00.1 cmd_work_handler [mlx5_core]
       [  329.237886] RIP: 0010:_raw_spin_lock+0xc/0x20
       [  329.238585] Code: f0 0f b1 17 75 02 f3 c3 89 c6 e9 6f 3c 5f ff 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 31 c0 ba 01 00 00 00 <f0> 0f b1 17 75 02 f3 c3 89 c6 e9 45 3c 5f ff 0f 1f 44 00 00 0f 1f
       [  329.241156] RSP: 0018:ffffc900001b0e98 EFLAGS: 00010046
       [  329.241940] RAX: 0000000000000000 RBX: ffffffff82374ae0 RCX: 0000000000000000
       [  329.242954] RDX: 0000000000000001 RSI: 0000000000000014 RDI: 0000000000000000
       [  329.243974] RBP: ffff888106ccf000 R08: ffff8881004000c8 R09: ffff888100400000
       [  329.244990] R10: 0000000000000000 R11: ffffffff826669f8 R12: 0000000000002000
       [  329.246009] R13: 0000000000000005 R14: ffff888100aa7ce0 R15: ffff88852ca80000
       [  329.247030] FS:  0000000000000000(0000) GS:ffff88852ca80000(0000) knlGS:0000000000000000
       [  329.248260] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       [  329.249111] CR2: 0000000000000000 CR3: 000000016d675001 CR4: 0000000000770ee0
       [  329.250133] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       [  329.251152] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
       [  329.252176] PKRU: 55555554
      
      Fixes: 0d4e8ed1 ("net/mlx5: Lag, avoid lockdep warnings")
      Signed-off-by: default avatarEli Cohen <elic@nvidia.com>
      Reviewed-by: default avatarMaor Dickman <maord@nvidia.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@nvidia.com>
      4d1c1379
    • Maor Dickman's avatar
      net/mlx5e: Set geneve_tlv_option_0_exist when matching on geneve option · e54638a8
      Maor Dickman authored
      The cited patch added support of matching on geneve option by setting
      geneve_tlv_option_0_data mask and key but didn't set geneve_tlv_option_0_exist
      bit which is required on some HWs when matching geneve_tlv_option_0_data parameter,
      this may cause in some cases for packets to wrongly match on rules with different
      geneve option.
      
      Example of such case is packet with geneve_tlv_object class=789 and data=456
      will wrongly match on rule with match geneve_tlv_object class=123 and data=456.
      
      Fix it by setting geneve_tlv_option_0_exist bit when supported by the HW when matching
      on geneve_tlv_option_0_data parameter.
      
      Fixes: 9272e3df ("net/mlx5e: Geneve, Add support for encap/decap flows offload")
      Signed-off-by: default avatarMaor Dickman <maord@nvidia.com>
      Reviewed-by: default avatarRoi Dayan <roid@nvidia.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@nvidia.com>
      e54638a8
    • Adham Faris's avatar
      net/mlx5e: Fix hw mtu initializing at XDP SQ allocation · 1e267ab8
      Adham Faris authored
      Current xdp xmit functions logic (mlx5e_xmit_xdp_frame_mpwqe or
      mlx5e_xmit_xdp_frame), validates xdp packet length by comparing it to
      hw mtu (configured at xdp sq allocation) before xmiting it. This check
      does not account for ethernet fcs length (calculated and filled by the
      nic). Hence, when we try sending packets with length > (hw-mtu -
      ethernet-fcs-size), the device port drops it and tx_errors_phy is
      incremented. Desired behavior is to catch these packets and drop them
      by the driver.
      
      Fix this behavior in XDP SQ allocation function (mlx5e_alloc_xdpsq) by
      subtracting ethernet FCS header size (4 Bytes) from current hw mtu
      value, since ethernet FCS is calculated and written to ethernet frames
      by the nic.
      
      Fixes: d8bec2b2 ("net/mlx5e: Support bpf_xdp_adjust_head()")
      Signed-off-by: default avatarAdham Faris <afaris@nvidia.com>
      Reviewed-by: default avatarTariq Toukan <tariqt@nvidia.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@nvidia.com>
      1e267ab8
    • Chris Mi's avatar
      net/mlx5e: Always clear dest encap in neigh-update-del · 2951b2e1
      Chris Mi authored
      The cited commit introduced a bug for multiple encapsulations flow.
      If one dest encap becomes invalid, the flow is set slow path flag.
      But when other dests encap become invalid, they are not cleared due
      to slow path flag of the flow. When neigh-update-add is running, it
      will use invalid encap.
      
      Fix it by checking slow path flag after clearing dest encap.
      
      Fixes: 9a5f9cc7 ("net/mlx5e: Fix possible use-after-free deleting fdb rule")
      Signed-off-by: default avatarChris Mi <cmi@nvidia.com>
      Reviewed-by: default avatarRoi Dayan <roid@nvidia.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@nvidia.com>
      2951b2e1
    • Chris Mi's avatar
      net/mlx5e: CT: Fix ct debugfs folder name · 849190e3
      Chris Mi authored
      Need to use sprintf to build a string instead of sscanf. Otherwise
      dirname is null and both "ct_nic" and "ct_fdb" won't be created.
      But its redundant anyway as driver could be in switchdev mode but
      still add nic rules. So use "ct" as folder name.
      
      Fixes: 77422a8f ("net/mlx5e: CT: Add ct driver counters")
      Signed-off-by: default avatarChris Mi <cmi@nvidia.com>
      Reviewed-by: default avatarRoi Dayan <roid@nvidia.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@nvidia.com>
      849190e3
    • Tariq Toukan's avatar
      net/mlx5e: Fix RX reporter for XSK RQs · f8c18a57
      Tariq Toukan authored
      RX reporter mistakenly reads from the regular (inactive) RQ
      when XSK RQ is active. Fix it here.
      
      Fixes: 3db4c85c ("net/mlx5e: xsk: Use queue indices starting from 0 for XSK queues")
      Signed-off-by: default avatarTariq Toukan <tariqt@nvidia.com>
      Reviewed-by: default avatarGal Pressman <gal@nvidia.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@nvidia.com>
      f8c18a57
    • Dragos Tatulea's avatar
      net/mlx5e: IPoIB, Don't allow CQE compression to be turned on by default · b12d581e
      Dragos Tatulea authored
      mlx5e_build_nic_params will turn CQE compression on if the hardware
      capability is enabled and the slow_pci_heuristic condition is detected.
      As IPoIB doesn't support CQE compression, make sure to disable the
      feature in the IPoIB profile init.
      
      Please note that the feature is not exposed to the user for IPoIB
      interfaces, so it can't be subsequently turned on.
      
      Fixes: b797a684 ("net/mlx5e: Enable CQE compression when PCI is slower than link")
      Signed-off-by: default avatarDragos Tatulea <dtatulea@nvidia.com>
      Reviewed-by: default avatarGal Pressman <gal@nvidia.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@nvidia.com>
      b12d581e