1. 12 Sep, 2024 10 commits
    • Merge tag 'wq-for-6.11-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq · 5da02886
      Linus Torvalds authored
      Pull workqueue fix from Tejun Heo:
       "A fix for a NULL worker->pool deref bug which can be triggered when a
        worker is created and then destroyed immediately"
      
      * tag 'wq-for-6.11-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
        workqueue: Clear worker->pool in the worker thread context
    • Merge tag 'riscv-for-linus-6.11-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · 8581ae1e
      Linus Torvalds authored
      Pull RISC-V fixes from Palmer Dabbelt:
      
       - Two fixes for smp_processor_id() calls in preemptible sections: one
         in the perf driver, and one in the fence.i prctl.
      
      * tag 'riscv-for-linus-6.11-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
        riscv: Disable preemption while handling PR_RISCV_CTX_SW_FENCEI_OFF
        drivers: perf: Fix smp_processor_id() use in preemptible code
    • Merge tag 'net-6.11-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 5abfdfd4
      Linus Torvalds authored
      Pull networking fixes from Paolo Abeni:
       "Including fixes from netfilter.
      
        There is a recently reported BT regression with no fix yet. I do not
        think a fix will land in the next week.
      
        Current release - regressions:
      
         - core: tighten bad gso csum offset check in virtio_net_hdr
      
         - netfilter: move nf flowtable bpf initialization in
           nf_flow_table_module_init()
      
         - eth: ice: stop calling pci_disable_device() as we use pcim
      
         - eth: fou: fix null-ptr-deref in GRO.
      
        Current release - new code bugs:
      
         - hsr: prevent NULL pointer dereference in hsr_proxy_announce()
      
        Previous releases - regressions:
      
         - hsr: remove seqnr_lock
      
         - netfilter: nft_socket: fix sk refcount leaks
      
         - mptcp: pm: fix uaf in __timer_delete_sync
      
         - phy: dp83822: fix NULL pointer dereference on DP83825 devices
      
         - eth: revert "virtio_net: rx enable premapped mode by default"
      
         - eth: octeontx2-af: Modify SMQ flush sequence to drop packets
      
        Previous releases - always broken:
      
         - eth: mlx5: fix bridge mode operations when there are no VFs
      
         - eth: igb: Always call igb_xdp_ring_update_tail() under Tx lock"
      
      * tag 'net-6.11-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (36 commits)
        net: netfilter: move nf flowtable bpf initialization in nf_flow_table_module_init()
        net: tighten bad gso csum offset check in virtio_net_hdr
        netlink: specs: mptcp: fix port endianness
        net: dpaa: Pad packets to ETH_ZLEN
        mptcp: pm: Fix uaf in __timer_delete_sync
        net: libwx: fix number of Rx and Tx descriptors
        net: dsa: felix: ignore pending status of TAS module when it's disabled
        net: hsr: prevent NULL pointer dereference in hsr_proxy_announce()
        selftests: mptcp: include net_helper.sh file
        selftests: mptcp: include lib.sh file
        selftests: mptcp: join: restrict fullmesh endp on 1st sf
        netfilter: nft_socket: make cgroupsv2 matching work with namespaces
        netfilter: nft_socket: fix sk refcount leaks
        MAINTAINERS: Add ethtool pse-pd to PSE NETWORK DRIVER
        dt-bindings: net: tja11xx: fix the broken binding
        selftests: net: csum: Fix checksums for packets with non-zero padding
        net: phy: dp83822: Fix NULL pointer dereference on DP83825 devices
        virtio_net: disable premapped mode by default
        Revert "virtio_net: big mode skip the unmap check"
        Revert "virtio_net: rx remove premapped failover code"
        ...
    • Merge tag 'platform-drivers-x86-v6.11-7' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 · 42c5b519
      Linus Torvalds authored
      Pull x86 platform driver fixes from Ilpo Järvinen:
      
       - asus-wmi: Disable OOBE that interferes with backlight control
      
       - panasonic-laptop: Two fixes to SINF array handling
      
      * tag 'platform-drivers-x86-v6.11-7' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
        platform/x86: asus-wmi: Disable OOBE experience on Zenbook S 16
        platform/x86: panasonic-laptop: Allocate 1 entry extra in the sinf array
        platform/x86: panasonic-laptop: Fix SINF array out of bounds accesses
    • mm: avoid leaving partial pfn mappings around in error case · 79a61cc3
      Linus Torvalds authored
      As Jann points out, PFN mappings are special, because unlike normal
      memory mappings, there is no lifetime information associated with the
      mapping - it is just a raw mapping of PFNs with no reference counting of
      a 'struct page'.
      
      That's all very much intentional, but it does mean that it's easy to
      mess up the cleanup in case of errors.  Yes, a failed mmap() will always
      eventually clean up any partial mappings, but without any explicit
      lifetime in the page table mapping itself, it's very easy to do the
      error handling in the wrong order.
      
      In particular, it's easy to mistakenly free the physical backing store
      before the page tables are actually cleaned up and (temporarily) have
      stale dangling PTE entries.
      
      To make this situation less error-prone, just make sure that any partial
      pfn mapping is torn down early, before any other error handling.

      Reported-and-tested-by: Jann Horn <jannh@google.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Simona Vetter <simona.vetter@ffwll.ch>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
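As a rough illustration of the ordering described in 79a61cc3, the sketch below models a page table as a plain array and tears down any partial mapping inside the mapping helper itself, before the caller's own error handling runs. All names here (`map_range`, `map_range_internal`, `zap_range`, and the `fail_at` parameter used to simulate a failure) are hypothetical stand-ins for illustration and are not the kernel's functions.

```c
#include <stddef.h>

#define NO_MAPPING 0UL

/* Hypothetical: install n entries; fail_at simulates an error part-way. */
static int map_range_internal(unsigned long *table, size_t n,
                              unsigned long first_pfn, size_t fail_at)
{
    for (size_t i = 0; i < n; i++) {
        if (i == fail_at)
            return -1;          /* entries [0, i) stay populated */
        table[i] = first_pfn + i;
    }
    return 0;
}

/* Hypothetical: clear every entry in the range. */
static void zap_range(unsigned long *table, size_t n)
{
    for (size_t i = 0; i < n; i++)
        table[i] = NO_MAPPING;
}

/*
 * Wrapper: on error, remove the partial mapping first, so the caller can
 * release the backing store afterwards without leaving stale entries.
 */
static int map_range(unsigned long *table, size_t n,
                     unsigned long first_pfn, size_t fail_at)
{
    int err = map_range_internal(table, n, first_pfn, fail_at);

    if (err)
        zap_range(table, n);
    return err;
}
```

With this shape, a caller that frees its backing memory on failure never does so while entries installed by the failed call are still present.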
    • net: netfilter: move nf flowtable bpf initialization in nf_flow_table_module_init() · 3e705251
      Lorenzo Bianconi authored
      Move the nf flowtable bpf initialization into the nf_flow_table module
      load routine, since nf_flow_table_bpf is part of the nf_flow_table module
      and not of nf_flow_table_inet. This avoids the following kernel warning
      when running the reproducer below:
      
      $ modprobe nf_flow_table_inet
      $ rmmod nf_flow_table_inet
      $ modprobe nf_flow_table_inet
      modprobe: ERROR: could not insert 'nf_flow_table_inet': Invalid argument
      
      [  184.081501] ------------[ cut here ]------------
      [  184.081527] WARNING: CPU: 0 PID: 1362 at kernel/bpf/btf.c:8206 btf_populate_kfunc_set+0x23c/0x330
      [  184.081550] CPU: 0 UID: 0 PID: 1362 Comm: modprobe Kdump: loaded Not tainted 6.11.0-0.rc5.22.el10.x86_64 #1
      [  184.081553] Hardware name: Red Hat OpenStack Compute, BIOS 1.14.0-1.module+el8.4.0+8855+a9e237a9 04/01/2014
      [  184.081554] RIP: 0010:btf_populate_kfunc_set+0x23c/0x330
      [  184.081558] RSP: 0018:ff22cfb38071fc90 EFLAGS: 00010202
      [  184.081559] RAX: 0000000000000001 RBX: 0000000000000001 RCX: 0000000000000000
      [  184.081560] RDX: 000000000000006e RSI: ffffffff95c00000 RDI: ff13805543436350
      [  184.081561] RBP: ffffffffc0e22180 R08: ff13805543410808 R09: 000000000001ec00
      [  184.081562] R10: ff13805541c8113c R11: 0000000000000010 R12: ff13805541b83c00
      [  184.081563] R13: ff13805543410800 R14: 0000000000000001 R15: ffffffffc0e2259a
      [  184.081564] FS:  00007fa436c46740(0000) GS:ff1380557ba00000(0000) knlGS:0000000000000000
      [  184.081569] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  184.081570] CR2: 000055e7b3187000 CR3: 0000000100c48003 CR4: 0000000000771ef0
      [  184.081571] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  184.081572] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  184.081572] PKRU: 55555554
      [  184.081574] Call Trace:
      [  184.081575]  <TASK>
      [  184.081578]  ? show_trace_log_lvl+0x1b0/0x2f0
      [  184.081580]  ? show_trace_log_lvl+0x1b0/0x2f0
      [  184.081582]  ? __register_btf_kfunc_id_set+0x199/0x200
      [  184.081585]  ? btf_populate_kfunc_set+0x23c/0x330
      [  184.081586]  ? __warn.cold+0x93/0xed
      [  184.081590]  ? btf_populate_kfunc_set+0x23c/0x330
      [  184.081592]  ? report_bug+0xff/0x140
      [  184.081594]  ? handle_bug+0x3a/0x70
      [  184.081596]  ? exc_invalid_op+0x17/0x70
      [  184.081597]  ? asm_exc_invalid_op+0x1a/0x20
      [  184.081601]  ? btf_populate_kfunc_set+0x23c/0x330
      [  184.081602]  __register_btf_kfunc_id_set+0x199/0x200
      [  184.081605]  ? __pfx_nf_flow_inet_module_init+0x10/0x10 [nf_flow_table_inet]
      [  184.081607]  do_one_initcall+0x58/0x300
      [  184.081611]  do_init_module+0x60/0x230
      [  184.081614]  __do_sys_init_module+0x17a/0x1b0
      [  184.081617]  do_syscall_64+0x7d/0x160
      [  184.081620]  ? __count_memcg_events+0x58/0xf0
      [  184.081623]  ? handle_mm_fault+0x234/0x350
      [  184.081626]  ? do_user_addr_fault+0x347/0x640
      [  184.081630]  ? clear_bhb_loop+0x25/0x80
      [  184.081633]  ? clear_bhb_loop+0x25/0x80
      [  184.081634]  ? clear_bhb_loop+0x25/0x80
      [  184.081637]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
      [  184.081639] RIP: 0033:0x7fa43652e4ce
      [  184.081647] RSP: 002b:00007ffe8213be18 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
      [  184.081649] RAX: ffffffffffffffda RBX: 000055e7b3176c20 RCX: 00007fa43652e4ce
      [  184.081650] RDX: 000055e7737fde79 RSI: 0000000000003990 RDI: 000055e7b3185380
      [  184.081651] RBP: 000055e7737fde79 R08: 0000000000000007 R09: 000055e7b3179bd0
      [  184.081651] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000040000
      [  184.081652] R13: 000055e7b3176fa0 R14: 0000000000000000 R15: 000055e7b3179b80
      
      Fixes: 391bb659 ("netfilter: Add bpf_xdp_flow_lookup kfunc")
      Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
      Acked-by: Florian Westphal <fw@strlen.de>
      Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Link: https://patch.msgid.link/20240911-nf-flowtable-bpf-modprob-fix-v1-1-f9fc075aafc3@kernel.org
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • Merge tag 'nf-24-09-12' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf · 87009709
      Paolo Abeni authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      The following batch contains two fixes from Florian Westphal:
      
      Patch #1 fixes a sk refcount leak in nft_socket on mismatch.
      
      Patch #2 fixes cgroupsv2 matching from containers due to incorrect
               level in subtree.
      
      netfilter pull request 24-09-12
      
      * tag 'nf-24-09-12' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
        netfilter: nft_socket: make cgroupsv2 matching work with namespaces
        netfilter: nft_socket: fix sk refcount leaks
      ====================
      
      Link: https://patch.msgid.link/20240911222520.3606-1-pablo@netfilter.org
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • workqueue: Clear worker->pool in the worker thread context · 73613840
      Lai Jiangshan authored
      Marc Hartmayer reported:
              [   23.133876] Unable to handle kernel pointer dereference in virtual kernel address space
              [   23.133950] Failing address: 0000000000000000 TEID: 0000000000000483
              [   23.133954] Fault in home space mode while using kernel ASCE.
              [   23.133957] AS:000000001b8f0007 R3:0000000056cf4007 S:0000000056cf3800 P:000000000000003d
              [   23.134207] Oops: 0004 ilc:2 [#1] SMP
              (snip)
              [   23.134516] Call Trace:
              [   23.134520]  [<0000024e326caf28>] worker_thread+0x48/0x430
              [   23.134525] ([<0000024e326caf18>] worker_thread+0x38/0x430)
              [   23.134528]  [<0000024e326d3a3e>] kthread+0x11e/0x130
              [   23.134533]  [<0000024e3264b0dc>] __ret_from_fork+0x3c/0x60
              [   23.134536]  [<0000024e333fb37a>] ret_from_fork+0xa/0x38
              [   23.134552] Last Breaking-Event-Address:
              [   23.134553]  [<0000024e333f4c04>] mutex_unlock+0x24/0x30
              [   23.134562] Kernel panic - not syncing: Fatal exception: panic_on_oops
      
      Debugging and analysis show that worker_thread() dereferences a NULL
      worker->pool when a newly created worker is destroyed before it is
      first woken up: in that case worker_thread() can observe, right at its
      start, that detach_worker() has already reset worker->pool to NULL.
      
      Move the code "worker->pool = NULL;" out from detach_worker() to fix the
      problem.
      
      worker->pool was designed to be constant for regular workers and
      changeable for rescuers. To share the attach/detach code between regular
      and rescuer workers, and to keep worker->pool from being accessed
      inadvertently after the worker has been detached, worker->pool is reset
      to NULL on detach regardless of whether the worker is a rescuer.

      To keep resetting worker->pool after detach, move the code
      "worker->pool = NULL;" into the worker thread context, after the worker
      has been detached.

      The reset then runs either in the regular worker thread context after
      PF_WQ_WORKER is cleared, or in the rescuer worker thread context with
      wq_pool_attach_mutex held, so it is safe to do so.
      
      Cc: Marc Hartmayer <mhartmay@linux.ibm.com>
      Link: https://lore.kernel.org/lkml/87wmjj971b.fsf@linux.ibm.com/
      Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com>
      Fixes: f4b7b53c ("workqueue: Detach workers directly in idle_cull_fn()")
      Cc: stable@vger.kernel.org # v6.11+
      Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
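The ownership rule described in 73613840 can be modeled in a few lines of single-threaded C. This is only a sketch under simplifying assumptions: there is no real concurrency or locking, and `struct pool`, `struct worker`, `detach_worker` and `worker_thread_step` are simplified, hypothetical stand-ins rather than the kernel's definitions. The point it shows is that the detaching side no longer clears `worker->pool`; only the worker's own context does, once it knows it is exiting.

```c
#include <stdbool.h>
#include <stddef.h>

struct pool {
    int nr_workers;
};

struct worker {
    struct pool *pool;
    bool die;               /* set by whoever destroys the worker */
};

/* Runs in the destroyer's context: unlink only, leave worker->pool alone. */
static void detach_worker(struct worker *w)
{
    w->pool->nr_workers--;
    /* deliberately no "w->pool = NULL" here */
}

/*
 * Runs in the worker's own context. Returns false when the worker exits.
 * Reading w->pool on entry is safe because only this context resets it.
 */
static bool worker_thread_step(struct worker *w)
{
    struct pool *pool = w->pool;

    if (pool == NULL)
        return false;       /* already exited earlier */
    if (w->die) {
        w->pool = NULL;     /* reset in the worker thread context */
        return false;
    }
    return true;
}
```

In this model a worker that is destroyed before it ever runs still finds a valid `pool` pointer when it first executes, which is the situation the original report tripped over.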
    • net: tighten bad gso csum offset check in virtio_net_hdr · 6513eb3d
      Willem de Bruijn authored
      The referenced commit drops bad input, but has false positives.
      Tighten the check to avoid these.
      
      The check detects illegal checksum offload requests, which produce
      csum_start/csum_off beyond end of packet after segmentation.
      
      But it is based on two incorrect assumptions:
      
      1. virtio_net_hdr_to_skb with VIRTIO_NET_HDR_GSO_TCP[46] implies GSO.
      True in callers that inject into the tx path, such as tap.
      But false in callers that inject into rx, like virtio-net.
      Here, the flags indicate GRO, and CHECKSUM_UNNECESSARY or
      CHECKSUM_NONE without VIRTIO_NET_HDR_F_NEEDS_CSUM is normal.
      
      2. TSO requires checksum offload, i.e., ip_summed == CHECKSUM_PARTIAL.
      False, as tcp[46]_gso_segment will fix up csum_start and offset for
      all other ip_summed by calling __tcp_v4_send_check.
      
      Because of 2, we can limit the scope of the fix to virtio_net_hdr
      that do try to set these fields, with a bogus value.
      
      Link: https://lore.kernel.org/netdev/20240909094527.GA3048202@port70.net/
      Fixes: 89add400 ("net: drop bad gso csum_start and offset in virtio_net_hdr")
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Acked-by: Jason Wang <jasowang@redhat.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Cc: stable@vger.kernel.org
      Link: https://patch.msgid.link/20240910213553.839926-1-willemdebruijn.kernel@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
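The narrowed scope described in 6513eb3d can be sketched as a small predicate. This is an illustration of the idea in the message, not the kernel code: it assumes the checksum offset is validated only when the header actually requested checksum offload (the equivalent of ip_summed == CHECKSUM_PARTIAL), and accepts every other case. The enum, struct and function names are hypothetical; the value 16 is the byte offset of the checksum field within the TCP header.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the skb checksum states. */
enum csum_state { CSUM_NONE, CSUM_UNNECESSARY, CSUM_PARTIAL };

struct pkt_meta {
    enum csum_state ip_summed;
    uint16_t csum_offset;   /* offset of the checksum field from csum_start */
};

/* Byte offset of the checksum field inside the TCP header. */
#define TCP_CSUM_OFFSET 16

/*
 * Returns true if a TCP GSO/GRO packet passes the check. Packets that did
 * not request checksum offload are accepted, which avoids rejecting rx-path
 * packets marked CSUM_UNNECESSARY or CSUM_NONE.
 */
static bool tcp_gso_csum_ok(const struct pkt_meta *p)
{
    if (p->ip_summed != CSUM_PARTIAL)
        return true;
    return p->csum_offset == TCP_CSUM_OFFSET;
}
```

Only a header that sets the offload fields to a bogus value is rejected, for example `CSUM_PARTIAL` with `csum_offset` of 0.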
    • netlink: specs: mptcp: fix port endianness · 09a45a55
      Asbjørn Sloth Tønnesen authored
      The MPTCP port attribute is in host endianness, but was documented
      as big-endian in the ynl specification.
      
      Below are two examples from net/mptcp/pm_netlink.c showing that the
      attribute is converted to/from host endianness for use with netlink.
      
      Import from netlink:
        addr->port = htons(nla_get_u16(tb[MPTCP_PM_ADDR_ATTR_PORT]))
      
      Export to netlink:
        nla_put_u16(skb, MPTCP_PM_ADDR_ATTR_PORT, ntohs(addr->port))
      
      Where addr->port is defined as __be16.
      
      No functional change intended.
      
      Fixes: bc8aeb20 ("Documentation: netlink: add a YAML spec for mptcp")
      Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net>
      Reviewed-by: Davide Caratti <dcaratti@redhat.com>
      Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
      Link: https://patch.msgid.link/20240911091003.1112179-1-ast@fiberby.net
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
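The conversion pattern quoted in 09a45a55 can be tried in userspace with the standard `htons()`/`ntohs()` helpers. The sketch below is a minimal model, assuming a hypothetical `struct addr_entry` whose port field is stored in network byte order (the role `__be16` plays in the kernel); the attribute value that crosses the API boundary stays in host byte order, which is what the corrected spec documents.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's address entry. */
struct addr_entry {
    uint16_t port_be;       /* network byte order, like __be16 */
};

/* Import: host-endian attribute value -> network-order field. */
static void addr_set_port(struct addr_entry *addr, uint16_t attr_host)
{
    addr->port_be = htons(attr_host);
}

/* Export: network-order field -> host-endian attribute value. */
static uint16_t addr_get_port(const struct addr_entry *addr)
{
    return ntohs(addr->port_be);
}
```

A round trip through both helpers returns the original host-order value on any architecture, while the stored field always has its most significant byte first in memory.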
  2. 11 Sep, 2024 24 commits
  3. 10 Sep, 2024 6 commits