1. 12 Aug, 2022 12 commits
    • Ivan Vecera's avatar
      iavf: Fix deadlock in initialization · cbe9e511
      Ivan Vecera authored
      Fix deadlock that occurs when iavf interface is a part of failover
      configuration.
      
      1. Mutex crit_lock is taken at the beginning of iavf_watchdog_task()
      2. Function iavf_init_config_adapter() is called when adapter
         state is __IAVF_INIT_CONFIG_ADAPTER
      3. iavf_init_config_adapter() calls register_netdevice() that emits
         NETDEV_REGISTER event
      4. Notifier function failover_event() then calls
         net_failover_slave_register() that calls dev_open()
      5. dev_open() calls iavf_open() that tries to take crit_lock in
         end-less loop
      
      Stack trace:
      ...
      [  790.251876]  usleep_range_state+0x5b/0x80
      [  790.252547]  iavf_open+0x37/0x1d0 [iavf]
      [  790.253139]  __dev_open+0xcd/0x160
      [  790.253699]  dev_open+0x47/0x90
      [  790.254323]  net_failover_slave_register+0x122/0x220 [net_failover]
      [  790.255213]  failover_slave_register.part.7+0xd2/0x180 [failover]
      [  790.256050]  failover_event+0x122/0x1ab [failover]
      [  790.256821]  notifier_call_chain+0x47/0x70
      [  790.257510]  register_netdevice+0x20f/0x550
      [  790.258263]  iavf_watchdog_task+0x7c8/0xea0 [iavf]
      [  790.259009]  process_one_work+0x1a7/0x360
      [  790.259705]  worker_thread+0x30/0x390
      
      To fix the situation we should check the current adapter state after
      first unsuccessful mutex_trylock() and return with -EBUSY if it is
      __IAVF_INIT_CONFIG_ADAPTER.
      
      Fixes: 226d5285 ("iavf: fix locking of critical sections")
      Signed-off-by: default avatarIvan Vecera <ivecera@redhat.com>
      Tested-by: default avatarKonrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      cbe9e511
    • Przemyslaw Patynowski's avatar
      iavf: Fix reset error handling · 31071173
      Przemyslaw Patynowski authored
      Do not call iavf_close in iavf_reset_task error handling. Doing so can
      lead to double call of napi_disable, which can lead to deadlock there.
      Removing VF would lead to iavf_remove task being stuck, because it
      requires crit_lock, which is held by iavf_close.
      Call iavf_disable_vf if reset fail, so that driver will clean up
      remaining invalid resources.
      During rapid VF resets, HW can fail to setup VF mailbox. Wrong
      error handling can lead to iavf_remove being stuck with:
      [ 5218.999087] iavf 0000:82:01.0: Failed to init adminq: -53
      ...
      [ 5267.189211] INFO: task repro.sh:11219 blocked for more than 30 seconds.
      [ 5267.189520]       Tainted: G S          E     5.18.0-04958-ga54ce370-dirty #1
      [ 5267.189764] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [ 5267.190062] task:repro.sh        state:D stack:    0 pid:11219 ppid:  8162 flags:0x00000000
      [ 5267.190347] Call Trace:
      [ 5267.190647]  <TASK>
      [ 5267.190927]  __schedule+0x460/0x9f0
      [ 5267.191264]  schedule+0x44/0xb0
      [ 5267.191563]  schedule_preempt_disabled+0x14/0x20
      [ 5267.191890]  __mutex_lock.isra.12+0x6e3/0xac0
      [ 5267.192237]  ? iavf_remove+0xf9/0x6c0 [iavf]
      [ 5267.192565]  iavf_remove+0x12a/0x6c0 [iavf]
      [ 5267.192911]  ? _raw_spin_unlock_irqrestore+0x1e/0x40
      [ 5267.193285]  pci_device_remove+0x36/0xb0
      [ 5267.193619]  device_release_driver_internal+0xc1/0x150
      [ 5267.193974]  pci_stop_bus_device+0x69/0x90
      [ 5267.194361]  pci_stop_and_remove_bus_device+0xe/0x20
      [ 5267.194735]  pci_iov_remove_virtfn+0xba/0x120
      [ 5267.195130]  sriov_disable+0x2f/0xe0
      [ 5267.195506]  ice_free_vfs+0x7d/0x2f0 [ice]
      [ 5267.196056]  ? pci_get_device+0x4f/0x70
      [ 5267.196496]  ice_sriov_configure+0x78/0x1a0 [ice]
      [ 5267.196995]  sriov_numvfs_store+0xfe/0x140
      [ 5267.197466]  kernfs_fop_write_iter+0x12e/0x1c0
      [ 5267.197918]  new_sync_write+0x10c/0x190
      [ 5267.198404]  vfs_write+0x24e/0x2d0
      [ 5267.198886]  ksys_write+0x5c/0xd0
      [ 5267.199367]  do_syscall_64+0x3a/0x80
      [ 5267.199827]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
      [ 5267.200317] RIP: 0033:0x7f5b381205c8
      [ 5267.200814] RSP: 002b:00007fff8c7e8c78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      [ 5267.201981] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f5b381205c8
      [ 5267.202620] RDX: 0000000000000002 RSI: 00005569420ee900 RDI: 0000000000000001
      [ 5267.203426] RBP: 00005569420ee900 R08: 000000000000000a R09: 00007f5b38180820
      [ 5267.204327] R10: 000000000000000a R11: 0000000000000246 R12: 00007f5b383c06e0
      [ 5267.205193] R13: 0000000000000002 R14: 00007f5b383bb880 R15: 0000000000000002
      [ 5267.206041]  </TASK>
      [ 5267.206970] Kernel panic - not syncing: hung_task: blocked tasks
      [ 5267.207809] CPU: 48 PID: 551 Comm: khungtaskd Kdump: loaded Tainted: G S          E     5.18.0-04958-ga54ce370-dirty #1
      [ 5267.208726] Hardware name: Dell Inc. PowerEdge R730/0WCJNT, BIOS 2.11.0 11/02/2019
      [ 5267.209623] Call Trace:
      [ 5267.210569]  <TASK>
      [ 5267.211480]  dump_stack_lvl+0x33/0x42
      [ 5267.212472]  panic+0x107/0x294
      [ 5267.213467]  watchdog.cold.8+0xc/0xbb
      [ 5267.214413]  ? proc_dohung_task_timeout_secs+0x30/0x30
      [ 5267.215511]  kthread+0xf4/0x120
      [ 5267.216459]  ? kthread_complete_and_exit+0x20/0x20
      [ 5267.217505]  ret_from_fork+0x22/0x30
      [ 5267.218459]  </TASK>
      
      Fixes: f0db7892 ("i40evf: use netdev variable in reset task")
      Signed-off-by: default avatarPrzemyslaw Patynowski <przemyslawx.patynowski@intel.com>
      Signed-off-by: default avatarJedrzej Jagielski <jedrzej.jagielski@intel.com>
      Tested-by: default avatarMarek Szlosek <marek.szlosek@intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      31071173
    • Przemyslaw Patynowski's avatar
      iavf: Fix NULL pointer dereference in iavf_get_link_ksettings · 541a1af4
      Przemyslaw Patynowski authored
      Fix possible NULL pointer dereference, due to freeing of adapter->vf_res
      in iavf_init_get_resources. Previous commit introduced a regression,
      where receiving IAVF_ERR_ADMIN_QUEUE_NO_WORK from iavf_get_vf_config
      would free adapter->vf_res. However, netdev is still registered, so
      ethtool_ops can be called. Calling iavf_get_link_ksettings with no vf_res,
      will result with:
      [ 9385.242676] BUG: kernel NULL pointer dereference, address: 0000000000000008
      [ 9385.242683] #PF: supervisor read access in kernel mode
      [ 9385.242686] #PF: error_code(0x0000) - not-present page
      [ 9385.242690] PGD 0 P4D 0
      [ 9385.242696] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI
      [ 9385.242701] CPU: 6 PID: 3217 Comm: pmdalinux Kdump: loaded Tainted: G S          E     5.18.0-04958-ga54ce370-dirty #1
      [ 9385.242708] Hardware name: Dell Inc. PowerEdge R730/0WCJNT, BIOS 2.11.0 11/02/2019
      [ 9385.242710] RIP: 0010:iavf_get_link_ksettings+0x29/0xd0 [iavf]
      [ 9385.242745] Code: 00 0f 1f 44 00 00 b8 01 ef ff ff 48 c7 46 30 00 00 00 00 48 c7 46 38 00 00 00 00 c6 46 0b 00 66 89 46 08 48 8b 87 68 0e 00 00 <f6> 40 08 80 75 50 8b 87 5c 0e 00 00 83 f8 08 74 7a 76 1d 83 f8 20
      [ 9385.242749] RSP: 0018:ffffc0560ec7fbd0 EFLAGS: 00010246
      [ 9385.242755] RAX: 0000000000000000 RBX: ffffc0560ec7fc08 RCX: 0000000000000000
      [ 9385.242759] RDX: ffffffffc0ad4550 RSI: ffffc0560ec7fc08 RDI: ffffa0fc66674000
      [ 9385.242762] RBP: 00007ffd1fb2bf50 R08: b6a2d54b892363ee R09: ffffa101dc14fb00
      [ 9385.242765] R10: 0000000000000000 R11: 0000000000000004 R12: ffffa0fc66674000
      [ 9385.242768] R13: 0000000000000000 R14: ffffa0fc66674000 R15: 00000000ffffffa1
      [ 9385.242771] FS:  00007f93711a2980(0000) GS:ffffa0fad72c0000(0000) knlGS:0000000000000000
      [ 9385.242775] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 9385.242778] CR2: 0000000000000008 CR3: 0000000a8e61c003 CR4: 00000000003706e0
      [ 9385.242781] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 9385.242784] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [ 9385.242787] Call Trace:
      [ 9385.242791]  <TASK>
      [ 9385.242793]  ethtool_get_settings+0x71/0x1a0
      [ 9385.242814]  __dev_ethtool+0x426/0x2f40
      [ 9385.242823]  ? slab_post_alloc_hook+0x4f/0x280
      [ 9385.242836]  ? kmem_cache_alloc_trace+0x15d/0x2f0
      [ 9385.242841]  ? dev_ethtool+0x59/0x170
      [ 9385.242848]  dev_ethtool+0xa7/0x170
      [ 9385.242856]  dev_ioctl+0xc3/0x520
      [ 9385.242866]  sock_do_ioctl+0xa0/0xe0
      [ 9385.242877]  sock_ioctl+0x22f/0x320
      [ 9385.242885]  __x64_sys_ioctl+0x84/0xc0
      [ 9385.242896]  do_syscall_64+0x3a/0x80
      [ 9385.242904]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
      [ 9385.242918] RIP: 0033:0x7f93702396db
      [ 9385.242923] Code: 73 01 c3 48 8b 0d ad 57 38 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 7d 57 38 00 f7 d8 64 89 01 48
      [ 9385.242927] RSP: 002b:00007ffd1fb2bf18 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
      [ 9385.242932] RAX: ffffffffffffffda RBX: 000055671b1d2fe0 RCX: 00007f93702396db
      [ 9385.242935] RDX: 00007ffd1fb2bf20 RSI: 0000000000008946 RDI: 0000000000000007
      [ 9385.242937] RBP: 00007ffd1fb2bf20 R08: 0000000000000003 R09: 0030763066307330
      [ 9385.242940] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffd1fb2bf80
      [ 9385.242942] R13: 0000000000000007 R14: 0000556719f6de90 R15: 00007ffd1fb2c1b0
      [ 9385.242948]  </TASK>
      [ 9385.242949] Modules linked in: iavf(E) xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nft_compat nf_nat_tftp nft_objref nf_conntrack_tftp bridge stp llc nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables rfkill nfnetlink vfat fat irdma ib_uverbs ib_core intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm iTCO_wdt iTCO_vendor_support ice irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl i40e pcspkr intel_cstate joydev mei_me intel_uncore mxm_wmi mei ipmi_ssif lpc_ich ipmi_si acpi_power_meter xfs libcrc32c mgag200 i2c_algo_bit drm_shmem_helper drm_kms_helper sd_mod t10_pi crc64_rocksoft crc64 syscopyarea sg sysfillrect sysimgblt fb_sys_fops drm ixgbe ahci libahci libata crc32c_intel mdio dca wmi dm_mirror dm_region_hash dm_log dm_mod ipmi_devintf ipmi_msghandler fuse
      [ 9385.243065]  [last unloaded: iavf]
      
      Dereference happens in if (ADV_LINK_SUPPORT(adapter)) statement
      
      Fixes: 209f2f9c ("iavf: Add support for VIRTCHNL_VF_OFFLOAD_VLAN_V2 negotiation")
      Signed-off-by: default avatarPrzemyslaw Patynowski <przemyslawx.patynowski@intel.com>
      Signed-off-by: default avatarJedrzej Jagielski <jedrzej.jagielski@intel.com>
      Tested-by: default avatarMarek Szlosek <marek.szlosek@intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      541a1af4
    • Przemyslaw Patynowski's avatar
      iavf: Fix adminq error handling · 41983161
      Przemyslaw Patynowski authored
      iavf_alloc_asq_bufs/iavf_alloc_arq_bufs allocates with dma_alloc_coherent
      memory for VF mailbox.
      Free DMA regions for both ASQ and ARQ in case error happens during
      configuration of ASQ/ARQ registers.
      Without this change it is possible to see when unloading interface:
      74626.583369: dma_debug_device_change: device driver has pending DMA allocations while released from device [count=32]
      One of leaked entries details: [device address=0x0000000b27ff9000] [size=4096 bytes] [mapped with DMA_BIDIRECTIONAL] [mapped as coherent]
      
      Fixes: d358aa9a ("i40evf: init code and hardware support")
      Signed-off-by: default avatarPrzemyslaw Patynowski <przemyslawx.patynowski@intel.com>
      Signed-off-by: default avatarJedrzej Jagielski <jedrzej.jagielski@intel.com>
      Tested-by: default avatarMarek Szlosek <marek.szlosek@intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      41983161
    • Li Qiong's avatar
      net: lan966x: fix checking for return value of platform_get_irq_byname() · 40b4ac88
      Li Qiong authored
      The platform_get_irq_byname() returns non-zero IRQ number
      or negative error number. "if (irq)" always true, chang it
      to "if (irq > 0)"
      Signed-off-by: default avatarLi Qiong <liqiong@nfschina.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      40b4ac88
    • Jason Wang's avatar
      net: cxgb3: Fix comment typo · 75d8620d
      Jason Wang authored
      The double `the' is duplicated in the comment, remove one.
      Signed-off-by: default avatarJason Wang <wangborong@cdjrlc.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      75d8620d
    • Jason Wang's avatar
      bnx2x: Fix comment typo · 0619d0fa
      Jason Wang authored
      The double `the' is duplicated in the comment, remove one.
      Signed-off-by: default avatarJason Wang <wangborong@cdjrlc.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0619d0fa
    • Jason Wang's avatar
      net: ipa: Fix comment typo · 9221b289
      Jason Wang authored
      The double `is' is duplicated in the comment, remove one.
      Signed-off-by: default avatarJason Wang <wangborong@cdjrlc.com>
      Reviewed-by: default avatarAlex Elder <elder@linaro.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9221b289
    • Michael S. Tsirkin's avatar
      virtio_net: fix endian-ness for RSS · 95bb6330
      Michael S. Tsirkin authored
      Using native endian-ness for device supplied fields is wrong
      on BE platforms. Sparse warns about this.
      
      Fixes: 91f41f01 ("drivers/net/virtio_net: Added RSS hash report.")
      Cc: "Andrew Melnychenko" <andrew@daynix.com>
      Signed-off-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      95bb6330
    • Xin Xiong's avatar
      net/sunrpc: fix potential memory leaks in rpc_sysfs_xprt_state_change() · bfc48f1b
      Xin Xiong authored
      The issue happens on some error handling paths. When the function
      fails to grab the object `xprt`, it simply returns 0, forgetting to
      decrease the reference count of another object `xps`, which is
      increased by rpc_sysfs_xprt_kobj_get_xprt_switch(), causing refcount
      leaks. Also, the function forgets to check whether `xps` is valid
      before using it, which may result in NULL-dereferencing issues.
      
      Fix it by adding proper error handling code when either `xprt` or
      `xps` is NULL.
      
      Fixes: 5b7eb784 ("SUNRPC: take a xprt offline using sysfs")
      Signed-off-by: default avatarXin Xiong <xiongx18@fudan.edu.cn>
      Signed-off-by: default avatarXin Tan <tanxin.ctf@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      bfc48f1b
    • Jilin Yuan's avatar
      skfp/h: fix repeated words in comments · 86d2155e
      Jilin Yuan authored
      Delete the redundant word 'the'.
      Signed-off-by: default avatarJilin Yuan <yuanjilin@cdjrlc.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      86d2155e
    • Mikulas Patocka's avatar
      rds: add missing barrier to release_refill · 9f414eb4
      Mikulas Patocka authored
      The functions clear_bit and set_bit do not imply a memory barrier, thus it
      may be possible that the waitqueue_active function (which does not take
      any locks) is moved before clear_bit and it could miss a wakeup event.
      
      Fix this bug by adding a memory barrier after clear_bit.
      Signed-off-by: default avatarMikulas Patocka <mpatocka@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9f414eb4
  2. 11 Aug, 2022 28 commits
    • Linus Torvalds's avatar
      Merge tag 'net-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 7ebfc85e
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Including fixes from bluetooth, bpf, can and netfilter.
      
        A little larger than usual but it's all fixes, no late features. It's
        large partially because of timing, and partially because of follow ups
        to stuff that got merged a week or so before the merge window and
        wasn't as widely tested. Maybe the Bluetooth fixes are a little
        alarming so we'll address that, but the rest seems okay and not scary.
      
        Notably we're including a fix for the netfilter Kconfig [1], your WiFi
        warning [2] and a bluetooth fix which should unblock syzbot [3].
      
        Current release - regressions:
      
         - Bluetooth:
            - don't try to cancel uninitialized works [3]
            - L2CAP: fix use-after-free caused by l2cap_chan_put
      
         - tls: rx: fix device offload after recent rework
      
         - devlink: fix UAF on failed reload and leftover locks in mlxsw
      
        Current release - new code bugs:
      
         - netfilter:
            - flowtable: fix incorrect Kconfig dependencies [1]
            - nf_tables: fix crash when nf_trace is enabled
      
         - bpf:
            - use proper target btf when exporting attach_btf_obj_id
            - arm64: fixes for bpf trampoline support
      
         - Bluetooth:
            - ISO: unlock on error path in iso_sock_setsockopt()
            - ISO: fix info leak in iso_sock_getsockopt()
            - ISO: fix iso_sock_getsockopt for BT_DEFER_SETUP
            - ISO: fix memory corruption on iso_pinfo.base
            - ISO: fix not using the correct QoS
            - hci_conn: fix updating ISO QoS PHY
      
         - phy: dp83867: fix get nvmem cell fail
      
        Previous releases - regressions:
      
         - wifi: cfg80211: fix validating BSS pointers in
           __cfg80211_connect_result [2]
      
         - atm: bring back zatm uAPI after ATM had been removed
      
         - properly fix old bug making bonding ARP monitor mode not being able
           to work with software devices with lockless Tx
      
         - tap: fix null-deref on skb->dev in dev_parse_header_protocol
      
         - revert "net: usb: ax88179_178a needs FLAG_SEND_ZLP" it helps some
           devices and breaks others
      
         - netfilter:
            - nf_tables: many fixes rejecting cross-object linking which may
              lead to UAFs
            - nf_tables: fix null deref due to zeroed list head
            - nf_tables: validate variable length element extension
      
         - bgmac: fix a BUG triggered by wrong bytes_compl
      
         - bcmgenet: indicate MAC is in charge of PHY PM
      
        Previous releases - always broken:
      
         - bpf:
            - fix bad pointer deref in bpf_sys_bpf() injected via test infra
            - disallow non-builtin bpf programs calling the prog_run command
            - don't reinit map value in prealloc_lru_pop
            - fix UAFs during the read of map iterator fd
            - fix invalidity check for values in sk local storage map
            - reject sleepable program for non-resched map iterator
      
         - mptcp:
            - move subflow cleanup in mptcp_destroy_common()
            - do not queue data on closed subflows
      
         - virtio_net: fix memory leak inside XDP_TX with mergeable
      
         - vsock: fix memory leak when multiple threads try to connect()
      
         - rework sk_user_data sharing to prevent psock leaks
      
         - geneve: fix TOS inheriting for ipv4
      
         - tunnels & drivers: do not use RT_TOS for IPv6 flowlabel
      
         - phy: c45 baset1: do not skip aneg configuration if clock role is
           not specified
      
         - rose: avoid overflow when /proc displays timer information
      
         - x25: fix call timeouts in blocking connects
      
         - can: mcp251x: fix race condition on receive interrupt
      
         - can: j1939:
            - replace user-reachable WARN_ON_ONCE() with netdev_warn_once()
            - fix memory leak of skbs in j1939_session_destroy()
      
        Misc:
      
         - docs: bpf: clarify that many things are not uAPI
      
         - seg6: initialize induction variable to first valid array index (to
           silence clang vs objtool warning)
      
         - can: ems_usb: fix clang 14's -Wunaligned-access warning"
      
      * tag 'net-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (117 commits)
        net: atm: bring back zatm uAPI
        dpaa2-eth: trace the allocated address instead of page struct
        net: add missing kdoc for struct genl_multicast_group::flags
        nfp: fix use-after-free in area_cache_get()
        MAINTAINERS: use my korg address for mt7601u
        mlxsw: minimal: Fix deadlock in ports creation
        bonding: fix reference count leak in balance-alb mode
        net: usb: qmi_wwan: Add support for Cinterion MV32
        bpf: Shut up kern_sys_bpf warning.
        net/tls: Use RCU API to access tls_ctx->netdev
        tls: rx: device: don't try to copy too much on detach
        tls: rx: device: bound the frag walk
        net_sched: cls_route: remove from list when handle is 0
        selftests: forwarding: Fix failing tests with old libnet
        net: refactor bpf_sk_reuseport_detach()
        net: fix refcount bug in sk_psock_get (2)
        selftests/bpf: Ensure sleepable program is rejected by hash map iter
        selftests/bpf: Add write tests for sk local storage map iterator
        selftests/bpf: Add tests for reading a dangling map iter fd
        bpf: Only allow sleepable program for resched-able iterator
        ...
      7ebfc85e
    • Linus Torvalds's avatar
      Merge tag 'acpi-5.20-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · e091ba5c
      Linus Torvalds authored
      Pull more ACPI updates from Rafael Wysocki:
       "These fix up direct references to the fwnode field in struct device
        and extend ACPI device properties support.
      
        Specifics:
      
         - Replace direct references to the fwnode field in struct device with
           dev_fwnode() and device_match_fwnode() (Andy Shevchenko)
      
         - Make the ACPI code handling device properties support properties
           with buffer values (Sakari Ailus)"
      
      * tag 'acpi-5.20-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        ACPI: property: Fix error handling in acpi_init_properties()
        ACPI: VIOT: Do not dereference fwnode in struct device
        ACPI: property: Read buffer properties as integers
        ACPI: property: Add support for parsing buffer property UUID
        ACPI: property: Unify integer value reading functions
        ACPI: property: Switch node property referencing from ifs to a switch
        ACPI: property: Move property ref argument parsing into a new function
        ACPI: property: Use acpi_object_type consistently in property ref parsing
        ACPI: property: Tie data nodes to acpi handles
        ACPI: property: Return type of acpi_add_nondev_subnodes() should be bool
      e091ba5c
    • Linus Torvalds's avatar
      Merge tag 'iomap-6.0-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · 8745889a
      Linus Torvalds authored
      Pull more iomap updates from Darrick Wong:
       "In the past 10 days or so I've not heard any ZOMG STOP style
        complaints about removing ->writepage support from gfs2 or zonefs, so
        here's the pull request removing them (and the underlying fs iomap
        support) from the kernel:
      
         - Remove iomap_writepage and all callers, since the mm apparently
           never called the zonefs or gfs2 writepage functions"
      
      * tag 'iomap-6.0-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        iomap: remove iomap_writepage
        zonefs: remove ->writepage
        gfs2: remove ->writepage
        gfs2: stop using generic_writepages in gfs2_ail1_start_one
      8745889a
    • Linus Torvalds's avatar
      Merge tag 'ceph-for-5.20-rc1' of https://github.com/ceph/ceph-client · 786da5da
      Linus Torvalds authored
      Pull ceph updates from Ilya Dryomov:
       "We have a good pile of various fixes and cleanups from Xiubo, Jeff,
        Luis and others, almost exclusively in the filesystem.
      
        Several patches touch files outside of our normal purview to set the
        stage for bringing in Jeff's long awaited ceph+fscrypt series in the
        near future. All of them have appropriate acks and sat in linux-next
        for a while"
      
      * tag 'ceph-for-5.20-rc1' of https://github.com/ceph/ceph-client: (27 commits)
        libceph: clean up ceph_osdc_start_request prototype
        libceph: fix ceph_pagelist_reserve() comment typo
        ceph: remove useless check for the folio
        ceph: don't truncate file in atomic_open
        ceph: make f_bsize always equal to f_frsize
        ceph: flush the dirty caps immediatelly when quota is approaching
        libceph: print fsid and epoch with osd id
        libceph: check pointer before assigned to "c->rules[]"
        ceph: don't get the inline data for new creating files
        ceph: update the auth cap when the async create req is forwarded
        ceph: make change_auth_cap_ses a global symbol
        ceph: fix incorrect old_size length in ceph_mds_request_args
        ceph: switch back to testing for NULL folio->private in ceph_dirty_folio
        ceph: call netfs_subreq_terminated with was_async == false
        ceph: convert to generic_file_llseek
        ceph: fix the incorrect comment for the ceph_mds_caps struct
        ceph: don't leak snap_rwsem in handle_cap_grant
        ceph: prevent a client from exceeding the MDS maximum xattr size
        ceph: choose auth MDS for getxattr with the Xs caps
        ceph: add session already open notify support
        ...
      786da5da
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · e18a9042
      Linus Torvalds authored
      Pull more kvm updates from Paolo Bonzini:
      
       - Xen timer fixes
      
       - Documentation formatting fixes
      
       - Make rseq selftest compatible with glibc-2.35
      
       - Fix handling of illegal LEA reg, reg
      
       - Cleanup creation of debugfs entries
      
       - Fix steal time cache handling bug
      
       - Fixes for MMIO caching
      
       - Optimize computation of number of LBRs
      
       - Fix uninitialized field in guest_maxphyaddr < host_maxphyaddr path
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (26 commits)
        KVM: x86/MMU: properly format KVM_CAP_VM_DISABLE_NX_HUGE_PAGES capability table
        Documentation: KVM: extend KVM_CAP_VM_DISABLE_NX_HUGE_PAGES heading underline
        KVM: VMX: Adjust number of LBR records for PERF_CAPABILITIES at refresh
        KVM: VMX: Use proper type-safe functions for vCPU => LBRs helpers
        KVM: x86: Refresh PMU after writes to MSR_IA32_PERF_CAPABILITIES
        KVM: selftests: Test all possible "invalid" PERF_CAPABILITIES.LBR_FMT vals
        KVM: selftests: Use getcpu() instead of sched_getcpu() in rseq_test
        KVM: selftests: Make rseq compatible with glibc-2.35
        KVM: Actually create debugfs in kvm_create_vm()
        KVM: Pass the name of the VM fd to kvm_create_vm_debugfs()
        KVM: Get an fd before creating the VM
        KVM: Shove vcpu stats_id init into kvm_vcpu_init()
        KVM: Shove vm stats_id init into kvm_create_vm()
        KVM: x86/mmu: Add sanity check that MMIO SPTE mask doesn't overlap gen
        KVM: x86/mmu: rename trace function name for asynchronous page fault
        KVM: x86/xen: Stop Xen timer before changing IRQ
        KVM: x86/xen: Initialize Xen timer only once
        KVM: SVM: Disable SEV-ES support if MMIO caching is disable
        KVM: x86/mmu: Fully re-evaluate MMIO caching when SPTE masks change
        KVM: x86: Tag kvm_mmu_x86_module_init() with __init
        ...
      e18a9042
    • Jakub Kicinski's avatar
      net: atm: bring back zatm uAPI · c2e75634
      Jakub Kicinski authored
      Jiri reports that linux-atm does not build without this header.
      Bring it back. It's completely dead code but we can't break
      the build for user space :(
      Reported-by: default avatarJiri Slaby <jirislaby@kernel.org>
      Fixes: 052e1f01 ("net: atm: remove support for ZeitNet ZN122x ATM devices")
      Link: https://lore.kernel.org/all/8576aef3-37e4-8bae-bab5-08f82a78efd3@kernel.org/
      Link: https://lore.kernel.org/r/20220810164547.484378-1-kuba@kernel.orgSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      c2e75634
    • Chen Lin's avatar
      dpaa2-eth: trace the allocated address instead of page struct · e34f4934
      Chen Lin authored
      We should trace the allocated address instead of page struct.
      
      Fixes: 27c87486 ("dpaa2-eth: Use a single page per Rx buffer")
      Signed-off-by: default avatarChen Lin <chen.lin5@zte.com.cn>
      Reviewed-by: default avatarIoana Ciornei <ioana.ciornei@nxp.com>
      Link: https://lore.kernel.org/r/20220811151651.3327-1-chen45464546@163.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      e34f4934
    • Rafael J. Wysocki's avatar
      Merge branch 'acpi-properties' · da2679f2
      Rafael J. Wysocki authored
      Merge changes adding support for device properties with buffer values
      to the ACPI device properties handling code.
      
      * acpi-properties:
        ACPI: property: Fix error handling in acpi_init_properties()
        ACPI: property: Read buffer properties as integers
        ACPI: property: Add support for parsing buffer property UUID
        ACPI: property: Unify integer value reading functions
        ACPI: property: Switch node property referencing from ifs to a switch
        ACPI: property: Move property ref argument parsing into a new function
        ACPI: property: Use acpi_object_type consistently in property ref parsing
        ACPI: property: Tie data nodes to acpi handles
        ACPI: property: Return type of acpi_add_nondev_subnodes() should be bool
      da2679f2
    • Jakub Kicinski's avatar
      net: add missing kdoc for struct genl_multicast_group::flags · 5c221f0a
      Jakub Kicinski authored
      Multicast group flags were added in commit 4d54cc32 ("mptcp: avoid
      lock_fast usage in accept path"), but it missed adding the kdoc.
      
      Mention which flags go into that field, and do the same for
      op structs.
      
      Link: https://lore.kernel.org/r/20220809232012.403730-1-kuba@kernel.orgSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      5c221f0a
    • Linus Torvalds's avatar
      Merge tag 'input-for-v5.20-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · 2ae08b36
      Linus Torvalds authored
      Pull input updates from Dmitry Torokhov:
      
       - changes to input core to properly queue synthetic events (such as
         autorepeat) and to release multitouch contacts when an input device
         is inhibited or suspended
      
       - reworked quirk handling in i8042 driver that consolidates multiple
         DMI tables into one and adds several quirks for TUXEDO line of
         laptops
      
       - update to mt6779 keypad to better reflect organization of the
         hardware
      
       - changes to mtk-pmic-keys driver preparing it to handle more variants
      
       - facelift of adp5588-keys driver
      
       - improvements to iqs7222 driver
      
       - adjustments to various DT binding documents for input devices
      
       - other assorted driver fixes.
      
      * tag 'input-for-v5.20-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (54 commits)
        Input: adc-joystick - fix ordering in adc_joystick_probe()
        dt-bindings: input: ariel-pwrbutton: use spi-peripheral-props.yaml
        Input: deactivate MT slots when inhibiting or suspending devices
        Input: properly queue synthetic events
        dt-bindings: input: iqs7222: Use central 'linux,code' definition
        Input: i8042 - add dritek quirk for Acer Aspire One AO532
        dt-bindings: input: gpio-keys: accept also interrupt-extended
        dt-bindings: input: gpio-keys: reference input.yaml and document properties
        dt-bindings: input: gpio-keys: enforce node names to match all properties
        dt-bindings: input: Convert adc-keys to DT schema
        dt-bindings: input: Centralize 'linux,input-type' definition
        dt-bindings: input: Use common 'linux,keycodes' definition
        dt-bindings: input: Centralize 'linux,code' definition
        dt-bindings: input: Increase maximum keycode value to 0x2ff
        Input: mt6779-keypad - implement row/column selection
        Input: mt6779-keypad - match hardware matrix organization
        Input: i8042 - add additional TUXEDO devices to i8042 quirk tables
        Input: goodix - switch use of acpi_gpio_get_*_resource() APIs
        Input: i8042 - add TUXEDO devices to i8042 quirk tables
        Input: i8042 - add debug output for quirks
        ...
      2ae08b36
    • Jialiang Wang's avatar
      nfp: fix use-after-free in area_cache_get() · 02e1a114
      Jialiang Wang authored
      area_cache_get() is used to distribute cache->area and set cache->id,
       and if cache->id is not 0 and cache->area->kref refcount is 0, it will
       release the cache->area by nfp_cpp_area_release(). area_cache_get()
       set cache->id before cpp->op->area_init() and nfp_cpp_area_acquire().
      
      But if area_init() or nfp_cpp_area_acquire() fails, the cache->id is
       is already set but the refcount is not increased as expected. At this
       time, calling the nfp_cpp_area_release() will cause use-after-free.
      
      To avoid the use-after-free, set cache->id after area_init() and
       nfp_cpp_area_acquire() complete successfully.
      
      Note: This vulnerability is triggerable by providing emulated device
       equipped with specified configuration.
      
       BUG: KASAN: use-after-free in nfp6000_area_init (drivers/net/ethernet/netronome/nfp/nfpcore/nfp6000_pcie.c:760)
        Write of size 4 at addr ffff888005b7f4a0 by task swapper/0/1
      
       Call Trace:
        <TASK>
       nfp6000_area_init (drivers/net/ethernet/netronome/nfp/nfpcore/nfp6000_pcie.c:760)
       area_cache_get.constprop.8 (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:884)
      
       Allocated by task 1:
       nfp_cpp_area_alloc_with_name (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:303)
       nfp_cpp_area_cache_add (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:802)
       nfp6000_init (drivers/net/ethernet/netronome/nfp/nfpcore/nfp6000_pcie.c:1230)
       nfp_cpp_from_operations (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:1215)
       nfp_pci_probe (drivers/net/ethernet/netronome/nfp/nfp_main.c:744)
      
       Freed by task 1:
       kfree (mm/slub.c:4562)
       area_cache_get.constprop.8 (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:873)
       nfp_cpp_read (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:924 drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:973)
       nfp_cpp_readl (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cpplib.c:48)
      Signed-off-by: default avatarJialiang Wang <wangjialiang0806@163.com>
      Reviewed-by: default avatarYinjun Zhang <yinjun.zhang@corigine.com>
      Acked-by: default avatarSimon Horman <simon.horman@corigine.com>
      Link: https://lore.kernel.org/r/20220810073057.4032-1-wangjialiang0806@163.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      02e1a114
    • Jakub Kicinski's avatar
      MAINTAINERS: use my korg address for mt7601u · cef8e326
      Jakub Kicinski authored
      Change my address for mt7601u to the main one.
      
      Link: https://lore.kernel.org/r/20220809233843.408004-1-kuba@kernel.orgSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      cef8e326
    • Vadim Pasternak's avatar
      mlxsw: minimal: Fix deadlock in ports creation · 4f98cb04
      Vadim Pasternak authored
      Drop devl_lock() / devl_unlock() from ports creation and removal flows
      since the devlink instance lock is now taken by mlxsw_core.
      
      Fixes: 72a4c8c9 ("mlxsw: convert driver to use unlocked devlink API during init/fini")
      Signed-off-by: default avatarVadim Pasternak <vadimp@nvidia.com>
      Signed-off-by: default avatarIdo Schimmel <idosch@nvidia.com>
      Signed-off-by: default avatarPetr Machata <petrm@nvidia.com>
      Reviewed-by: default avatarJiri Pirko <jiri@nvidia.com>
      Link: https://lore.kernel.org/r/f4afce5ab0318617f3866b85274be52542d59b32.1660211614.git.petrm@nvidia.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      4f98cb04
    • Jay Vosburgh's avatar
      bonding: fix reference count leak in balance-alb mode · 4f5d33f4
      Jay Vosburgh authored
      Commit d5410ac7 ("net:bonding:support balance-alb interface
      with vlan to bridge") introduced a reference count leak by not releasing
      the reference acquired by ip_dev_find().  Remedy this by insuring the
      reference is released.
      
      Fixes: d5410ac7 ("net:bonding:support balance-alb interface with vlan to bridge")
      Signed-off-by: default avatarJay Vosburgh <jay.vosburgh@canonical.com>
      Reviewed-by: default avatarNikolay Aleksandrov <razor@blackwall.org>
      Link: https://lore.kernel.org/r/26758.1660194413@famineSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      4f5d33f4
    • Linus Torvalds's avatar
      Revert "Makefile.extrawarn: re-enable -Wformat for clang" · 21f9c8a1
      Linus Torvalds authored
      This reverts commit 258fafcd.
      
      The clang -Wformat warning is terminally broken, and the clang people
      can't seem to get their act together.
      
      This test program causes a warning with clang:
      
      	#include <stdio.h>
      
      	int main(int argc, char **argv)
      	{
      		printf("%hhu\n", 'a');
      	}
      
      resulting in
      
        t.c:5:19: warning: format specifies type 'unsigned char' but the argument has type 'int' [-Wformat]
                printf("%hhu\n", 'a');
                        ~~~~     ^~~
                        %d
      
      and apparently clang people consider that a feature, because they don't
      want to face the reality of how either C character constants, C
      arithmetic, and C varargs functions work.
      
      The rest of the world just shakes their head at that kind of
      incompetence, and turns off -Wformat for clang again.
      
      And no, the "you should use a pointless cast to shut this up" is not a
      valid answer.  That warning should not exist in the first place, or at
      least be optinal with some "-Wformat-me-harder" kind of option.
      
      [ Admittedly, there's also very little reason to *ever* use '%hh[ud]' in
        C, but what little reason there is is entirely about 'I want to see
        only the low 8 bits of the argument'. So I would suggest nobody ever
        use that format in the first place, but if they do, the clang
        behavious is simply always wrong. Because '%hhu' takes an 'int'. It's
        that simple. ]
      Reported-by: default avatarSudip Mukherjee (Codethink) <sudipm.mukherjee@gmail.com>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      21f9c8a1
    • Slark Xiao's avatar
      net: usb: qmi_wwan: Add support for Cinterion MV32 · ae7107ba
      Slark Xiao authored
      There are 2 models for MV32 serials. MV32-W-A is designed
      based on Qualcomm SDX62 chip, and MV32-W-B is designed based
      on Qualcomm SDX65 chip. So we use 2 different PID to separate it.
      
      Test evidence as below:
      T:  Bus=03 Lev=01 Prnt=01 Port=02 Cnt=03 Dev#=  3 Spd=480 MxCh= 0
      D:  Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs=  1
      P:  Vendor=1e2d ProdID=00f3 Rev=05.04
      S:  Manufacturer=Cinterion
      S:  Product=Cinterion PID 0x00F3 USB Mobile Broadband
      S:  SerialNumber=d7b4be8d
      C:  #Ifs= 4 Cfg#= 1 Atr=a0 MxPwr=500mA
      I:  If#=0x0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan
      I:  If#=0x1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
      I:  If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
      I:  If#=0x3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option
      
      T:  Bus=03 Lev=01 Prnt=01 Port=02 Cnt=03 Dev#= 10 Spd=480 MxCh= 0
      D:  Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs=  1
      P:  Vendor=1e2d ProdID=00f4 Rev=05.04
      S:  Manufacturer=Cinterion
      S:  Product=Cinterion PID 0x00F4 USB Mobile Broadband
      S:  SerialNumber=d095087d
      C:  #Ifs= 4 Cfg#= 1 Atr=a0 MxPwr=500mA
      I:  If#=0x0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan
      I:  If#=0x1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
      I:  If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
      I:  If#=0x3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option
      Signed-off-by: default avatarSlark Xiao <slark_xiao@163.com>
      Acked-by: default avatarBjørn Mork <bjorn@mork.no>
      Link: https://lore.kernel.org/r/20220810014521.9383-1-slark_xiao@163.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      ae7107ba
    • Jakub Kicinski's avatar
    • Alexei Starovoitov's avatar
      bpf: Shut up kern_sys_bpf warning. · 4e4588f1
      Alexei Starovoitov authored
      Shut up this warning:
      kernel/bpf/syscall.c:5089:5: warning: no previous prototype for function 'kern_sys_bpf' [-Wmissing-prototypes]
      int kern_sys_bpf(int cmd, union bpf_attr *attr, unsigned int size)
      Reported-by: default avatarJakub Kicinski <kuba@kernel.org>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      4e4588f1
    • Bagas Sanjaya's avatar
      KVM: x86/MMU: properly format KVM_CAP_VM_DISABLE_NX_HUGE_PAGES capability table · 19a7cc81
      Bagas Sanjaya authored
      There is unexpected warning on KVM_CAP_VM_DISABLE_NX_HUGE_PAGES capability
      table, which cause the table to be rendered as paragraph text instead.
      
      The warning is due to missing colon at capability name and returns keyword,
      as well as improper alignment on multi-line returns field.
      
      Fix the warning by adding missing colons and aligning the field.
      
      Link: https://lore.kernel.org/lkml/20220627181937.3be67263@canb.auug.org.au/
      Fixes: 084cc29f ("KVM: x86/MMU: Allow NX huge pages to be disabled on a per-vm basis")
      Reported-by: default avatarStephen Rothwell <sfr@canb.auug.org.au>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: David Matlack <dmatlack@google.com>
      Cc: Ben Gardon <bgardon@google.com>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: kvm@vger.kernel.org
      Cc: linux-next@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarBagas Sanjaya <bagasdotme@gmail.com>
      Message-Id: <20220627095151.19339-3-bagasdotme@gmail.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      19a7cc81
    • Bagas Sanjaya's avatar
      Documentation: KVM: extend KVM_CAP_VM_DISABLE_NX_HUGE_PAGES heading underline · b4aed4d8
      Bagas Sanjaya authored
      Extend heading underline for KVM_CAP_VM_DISABLE_NX_HUGE_PAGE to match
      the heading text length.
      
      Link: https://lore.kernel.org/lkml/20220627181937.3be67263@canb.auug.org.au/
      Fixes: 084cc29f ("KVM: x86/MMU: Allow NX huge pages to be disabled on a per-vm basis")
      Reported-by: default avatarStephen Rothwell <sfr@canb.auug.org.au>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: David Matlack <dmatlack@google.com>
      Cc: Ben Gardon <bgardon@google.com>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: kvm@vger.kernel.org
      Cc: linux-next@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarBagas Sanjaya <bagasdotme@gmail.com>
      Message-Id: <20220627095151.19339-2-bagasdotme@gmail.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      b4aed4d8
    • Maxim Mikityanskiy's avatar
      net/tls: Use RCU API to access tls_ctx->netdev · 94ce3b64
      Maxim Mikityanskiy authored
      Currently, tls_device_down synchronizes with tls_device_resync_rx using
      RCU, however, the pointer to netdev is stored using WRITE_ONCE and
      loaded using READ_ONCE.
      
      Although such approach is technically correct (rcu_dereference is
      essentially a READ_ONCE, and rcu_assign_pointer uses WRITE_ONCE to store
      NULL), using special RCU helpers for pointers is more valid, as it
      includes additional checks and might change the implementation
      transparently to the callers.
      
      Mark the netdev pointer as __rcu and use the correct RCU helpers to
      access it. For non-concurrent access pass the right conditions that
      guarantee safe access (locks taken, refcount value). Also use the
      correct helper in mlx5e, where even READ_ONCE was missing.
      
      The transition to RCU exposes existing issues, fixed by this commit:
      
      1. bond_tls_device_xmit could read netdev twice, and it could become
      NULL the second time, after the NULL check passed.
      
      2. Drivers shouldn't stop processing the last packet if tls_device_down
      just set netdev to NULL, before tls_dev_del was called. This prevents a
      possible packet drop when transitioning to the fallback software mode.
      
      Fixes: 89df6a81 ("net/bonding: Implement TLS TX device offload")
      Fixes: c55dcdd4 ("net/tls: Fix use-after-free after the TLS device goes down and up")
      Signed-off-by: default avatarMaxim Mikityanskiy <maximmi@nvidia.com>
      Link: https://lore.kernel.org/r/20220810081602.1435800-1-maximmi@nvidia.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      94ce3b64
    • Jakub Kicinski's avatar
      tls: rx: device: don't try to copy too much on detach · d800a7b3
      Jakub Kicinski authored
      Another device offload bug, we use the length of the output
      skb as an indication of how much data to copy. But that skb
      is sized to offset + record length, and we start from offset.
      So we end up double-counting the offset which leads to
      skb_copy_bits() returning -EFAULT.
      Reported-by: default avatarTariq Toukan <tariqt@nvidia.com>
      Fixes: 84c61fe1 ("tls: rx: do not use the standard strparser")
      Tested-by: default avatarRan Rozenstein <ranro@nvidia.com>
      Link: https://lore.kernel.org/r/20220809175544.354343-2-kuba@kernel.orgSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      d800a7b3
    • Jakub Kicinski's avatar
      tls: rx: device: bound the frag walk · 86b259f6
      Jakub Kicinski authored
      We can't do skb_walk_frags() on the input skbs, because
      the input skbs is really just a pointer to the tcp read
      queue. We need to bound the "is decrypted" check by the
      amount of data in the message.
      
      Note that the walk in tls_device_reencrypt() is after a
      CoW so the skb there is safe to walk. Actually in the
      current implementation it can't have frags at all, but
      whatever, maybe one day it will.
      Reported-by: default avatarTariq Toukan <tariqt@nvidia.com>
      Fixes: 84c61fe1 ("tls: rx: do not use the standard strparser")
      Tested-by: default avatarRan Rozenstein <ranro@nvidia.com>
      Link: https://lore.kernel.org/r/20220809175544.354343-1-kuba@kernel.orgSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      86b259f6
    • Thadeu Lima de Souza Cascardo's avatar
      net_sched: cls_route: remove from list when handle is 0 · 9ad36309
      Thadeu Lima de Souza Cascardo authored
      When a route filter is replaced and the old filter has a 0 handle, the old
      one won't be removed from the hashtable, while it will still be freed.
      
      The test was there since before commit 1109c005 ("net: sched: RCU
      cls_route"), when a new filter was not allocated when there was an old one.
      The old filter was reused and the reinserting would only be necessary if an
      old filter was replaced. That was still wrong for the same case where the
      old handle was 0.
      
      Remove the old filter from the list independently from its handle value.
      
      This fixes CVE-2022-2588, also reported as ZDI-CAN-17440.
      Reported-by: default avatarZhenpeng Lin <zplin@u.northwestern.edu>
      Signed-off-by: default avatarThadeu Lima de Souza Cascardo <cascardo@canonical.com>
      Reviewed-by: default avatarKamal Mostafa <kamal@canonical.com>
      Cc: <stable@vger.kernel.org>
      Acked-by: default avatarJamal Hadi Salim <jhs@mojatatu.com>
      Link: https://lore.kernel.org/r/20220809170518.164662-1-cascardo@canonical.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      9ad36309
    • Ido Schimmel's avatar
      selftests: forwarding: Fix failing tests with old libnet · 8bcfb4ae
      Ido Schimmel authored
      The custom multipath hash tests use mausezahn in order to test how
      changes in various packet fields affect the packet distribution across
      the available nexthops.
      
      The tool uses the libnet library for various low-level packet
      construction and injection. The library started using the
      "SO_BINDTODEVICE" socket option for IPv6 sockets in version 1.1.6 and
      for IPv4 sockets in version 1.2.
      
      When the option is not set, packets are not routed according to the
      table associated with the VRF master device and tests fail.
      
      Fix this by prefixing the command with "ip vrf exec", which will cause
      the route lookup to occur in the VRF routing table. This makes the tests
      pass regardless of the libnet library version.
      
      Fixes: 511e8db5 ("selftests: forwarding: Add test for custom multipath hash")
      Fixes: 185b0c19 ("selftests: forwarding: Add test for custom multipath hash with IPv4 GRE")
      Fixes: b7715acb ("selftests: forwarding: Add test for custom multipath hash with IPv6 GRE")
      Reported-by: default avatarIvan Vecera <ivecera@redhat.com>
      Tested-by: default avatarIvan Vecera <ivecera@redhat.com>
      Signed-off-by: default avatarIdo Schimmel <idosch@nvidia.com>
      Reviewed-by: default avatarAmit Cohen <amcohen@nvidia.com>
      Link: https://lore.kernel.org/r/20220809113320.751413-1-idosch@nvidia.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      8bcfb4ae
    • Jakub Kicinski's avatar
      Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · fbe8870f
      Jakub Kicinski authored
      Daniel Borkmann says:
      
      ====================
      bpf 2022-08-10
      
      We've added 23 non-merge commits during the last 7 day(s) which contain
      a total of 19 files changed, 424 insertions(+), 35 deletions(-).
      
      The main changes are:
      
      1) Several fixes for BPF map iterator such as UAFs along with selftests, from Hou Tao.
      
      2) Fix BPF syscall program's {copy,strncpy}_from_bpfptr() to not fault, from Jinghao Jia.
      
      3) Reject BPF syscall programs calling BPF_PROG_RUN, from Alexei Starovoitov and YiFei Zhu.
      
      4) Fix attach_btf_obj_id info to pick proper target BTF, from Stanislav Fomichev.
      
      5) BPF design Q/A doc update to clarify what is not stable ABI, from Paul E. McKenney.
      
      6) Fix BPF map's prealloc_lru_pop to not reinitialize, from Kumar Kartikeya Dwivedi.
      
      7) Fix bpf_trampoline_put to avoid leaking ftrace hash, from Jiri Olsa.
      
      8) Fix arm64 JIT to address sparse errors around BPF trampoline, from Xu Kuohai.
      
      9) Fix arm64 JIT to use kvcalloc instead of kcalloc for internal program address
         offset buffer, from Aijun Sun.
      
      * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: (23 commits)
        selftests/bpf: Ensure sleepable program is rejected by hash map iter
        selftests/bpf: Add write tests for sk local storage map iterator
        selftests/bpf: Add tests for reading a dangling map iter fd
        bpf: Only allow sleepable program for resched-able iterator
        bpf: Check the validity of max_rdwr_access for sock local storage map iterator
        bpf: Acquire map uref in .init_seq_private for sock{map,hash} iterator
        bpf: Acquire map uref in .init_seq_private for sock local storage map iterator
        bpf: Acquire map uref in .init_seq_private for hash map iterator
        bpf: Acquire map uref in .init_seq_private for array map iterator
        bpf: Disallow bpf programs call prog_run command.
        bpf, arm64: Fix bpf trampoline instruction endianness
        selftests/bpf: Add test for prealloc_lru_pop bug
        bpf: Don't reinit map value in prealloc_lru_pop
        bpf: Allow calling bpf_prog_test kfuncs in tracing programs
        bpf, arm64: Allocate program buffer using kvcalloc instead of kcalloc
        selftests/bpf: Excercise bpf_obj_get_info_by_fd for bpf2bpf
        bpf: Use proper target btf when exporting attach_btf_obj_id
        mptcp, btf: Add struct mptcp_sock definition when CONFIG_MPTCP is disabled
        bpf: Cleanup ftrace hash in bpf_trampoline_put
        BPF: Fix potential bad pointer dereference in bpf_sys_bpf()
        ...
      ====================
      
      Link: https://lore.kernel.org/r/20220810190624.10748-1-daniel@iogearbox.netSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      fbe8870f
    • Jakub Kicinski's avatar
      Merge branch 'net-enhancements-to-sk_user_data-field' · dd48f383
      Jakub Kicinski authored
      Hawkins Jiawei says:
      
      ====================
      net: enhancements to sk_user_data field
      
      This patchset fixes refcount bug by adding SK_USER_DATA_PSOCK flag bit in
      sk_user_data field. The bug cause following info:
      
      WARNING: CPU: 1 PID: 3605 at lib/refcount.c:19 refcount_warn_saturate+0xf4/0x1e0 lib/refcount.c:19
      Modules linked in:
      CPU: 1 PID: 3605 Comm: syz-executor208 Not tainted 5.18.0-syzkaller-03023-g7e062cda #0
       <TASK>
       __refcount_add_not_zero include/linux/refcount.h:163 [inline]
       __refcount_inc_not_zero include/linux/refcount.h:227 [inline]
       refcount_inc_not_zero include/linux/refcount.h:245 [inline]
       sk_psock_get+0x3bc/0x410 include/linux/skmsg.h:439
       tls_data_ready+0x6d/0x1b0 net/tls/tls_sw.c:2091
       tcp_data_ready+0x106/0x520 net/ipv4/tcp_input.c:4983
       tcp_data_queue+0x25f2/0x4c90 net/ipv4/tcp_input.c:5057
       tcp_rcv_state_process+0x1774/0x4e80 net/ipv4/tcp_input.c:6659
       tcp_v4_do_rcv+0x339/0x980 net/ipv4/tcp_ipv4.c:1682
       sk_backlog_rcv include/net/sock.h:1061 [inline]
       __release_sock+0x134/0x3b0 net/core/sock.c:2849
       release_sock+0x54/0x1b0 net/core/sock.c:3404
       inet_shutdown+0x1e0/0x430 net/ipv4/af_inet.c:909
       __sys_shutdown_sock net/socket.c:2331 [inline]
       __sys_shutdown_sock net/socket.c:2325 [inline]
       __sys_shutdown+0xf1/0x1b0 net/socket.c:2343
       __do_sys_shutdown net/socket.c:2351 [inline]
       __se_sys_shutdown net/socket.c:2349 [inline]
       __x64_sys_shutdown+0x50/0x70 net/socket.c:2349
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x46/0xb0
       </TASK>
      
      To improve code maintainability, this patchset refactors sk_user_data
      flags code to be more generic.
      ====================
      
      Link: https://lore.kernel.org/r/cover.1659676823.git.yin31149@gmail.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      dd48f383
    • Hawkins Jiawei's avatar
      net: refactor bpf_sk_reuseport_detach() · cf8c1e96
      Hawkins Jiawei authored
      Refactor sk_user_data dereference using more generic function
      __rcu_dereference_sk_user_data_with_flags(), which improve its
      maintainability
      Suggested-by: default avatarJakub Kicinski <kuba@kernel.org>
      Signed-off-by: default avatarHawkins Jiawei <yin31149@gmail.com>
      Reviewed-by: default avatarJakub Sitnicki <jakub@cloudflare.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      cf8c1e96