1. 03 May, 2024 12 commits
    • arm64: dts: mediatek: mt8183-pico6: Fix bluetooth node · cd17bcbd
      Chen-Yu Tsai authored
      Bluetooth is not a random device connected to the MMC/SD controller. It
      is function 2 of the SDIO device.
      
      Fix the address of the bluetooth node. Also fix the node name and drop
      the label.
      
      Fixes: 055ef10c ("arm64: dts: mt8183: Add jacuzzi pico/pico6 board")
      Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
      Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    • Bluetooth: qca: fix info leak when fetching board id · 0adcf6be
      Johan Hovold authored
      Add the missing sanity check when fetching the board id to avoid leaking
      slab data when later requesting the firmware.
      
      Fixes: a7f8dedb ("Bluetooth: qca: add support for QCA2066")
      Cc: stable@vger.kernel.org	# 6.7
      Cc: Tim Jiang <quic_tjiang@quicinc.com>
      Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
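The kind of sanity check described in the commit above can be sketched in plain C. This is a minimal userspace illustration, not the driver's code: the reply layout, the `REPLY_HDR_LEN` and `BOARD_ID_LEN` constants, the byte order, and the `parse_board_id()` helper are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reply layout: two header bytes followed by a 2-byte board id. */
#define REPLY_HDR_LEN  2
#define BOARD_ID_LEN   2

/*
 * Returns 0 and stores the board id on success, -1 if the reply is too
 * short to contain one. Without the length check, a short reply would
 * make the caller read bytes the device never wrote.
 */
static int parse_board_id(const uint8_t *reply, size_t len, uint16_t *board_id)
{
    if (reply == NULL || board_id == NULL)
        return -1;
    if (len < REPLY_HDR_LEN + BOARD_ID_LEN)
        return -1;

    /* Big-endian order is an assumption of this sketch. */
    *board_id = (uint16_t)((reply[REPLY_HDR_LEN] << 8) |
                           reply[REPLY_HDR_LEN + 1]);
    return 0;
}
```

The point of the pattern is that the length is validated before any payload byte is read, so a malformed reply is rejected instead of being interpreted.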
    • Bluetooth: qca: fix info leak when fetching fw build id · cda0d6a1
      Johan Hovold authored
      Add the missing sanity checks and move the 255-byte build-id buffer off
      the stack to avoid leaking stack data through debugfs in case the
      build-info reply is malformed.
      
      Fixes: c0187b0b ("Bluetooth: btqca: Add support to read FW build version for WCN3991 BTSoC")
      Cc: stable@vger.kernel.org	# 5.12
      Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    • Bluetooth: qca: generalise device address check · dd336649
      Johan Hovold authored
      The default device address apparently comes from the NVM configuration
      file and can differ quite a bit between controllers.
      
      Store the default address when parsing the configuration file and use it
      to determine whether the controller has been provisioned with an
      address.
      
      This makes sure that devices without a unique address start as
      unconfigured unless a valid address has been provided in the devicetree.
      
      Fixes: 32868e12 ("Bluetooth: qca: fix invalid device address check")
      Cc: stable@vger.kernel.org      # 6.5
      Cc: Doug Anderson <dianders@chromium.org>
      Cc: Janaki Ramaiah Thota <quic_janathot@quicinc.com>
      Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
      Tested-by: Douglas Anderson <dianders@chromium.org>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
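The idea of comparing the controller's address against a default remembered from the configuration file can be sketched as follows. This is an illustrative model only: `struct ctrl_state`, its field names, and `needs_address()` are hypothetical and do not come from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ADDR_LEN 6

/* Hypothetical per-controller state; field names are illustrative only. */
struct ctrl_state {
    uint8_t default_addr[ADDR_LEN]; /* stored while parsing the NVM file */
    uint8_t current_addr[ADDR_LEN]; /* address reported by the controller */
};

/* A controller still using the NVM default has not been provisioned. */
static bool needs_address(const struct ctrl_state *c)
{
    return memcmp(c->current_addr, c->default_addr, ADDR_LEN) == 0;
}
```

Because the default is taken from the file rather than hard-coded, the same comparison would cover controllers whose defaults differ, which is the motivation the commit message gives.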
    • Bluetooth: qca: fix NVM configuration parsing · a112d3c7
      Johan Hovold authored
      The NVM configuration files used by WCN3988 and WCN3990/1/8 have two
      sets of configuration tags that are enclosed by a type-length header of
      type four which the current parser fails to account for.
      
      Instead the driver happily parses random data as if it were valid tags,
      something which can lead to the configuration data being corrupted if it
      ever encounters the words 0x0011 or 0x001b.
      
      As is clear from commit b6388254 ("Bluetooth: btqca: Fix the NVM
      baudrate tag offcet for wcn3991") the intention has always been to
      process the configuration data also for WCN3991 and WCN3998 which
      encodes the baud rate at a different offset.
      
      Fix the parser so that it can handle the WCN3xxx configuration files,
      which have an enclosing type-length header of type four and two sets of
      TLV tags enclosed by type-length headers of type two and three,
      respectively.
      
      Note that only the first set, which contains the tags the driver is
      currently looking for, will be parsed for now.
      
      With the parser fixed, the software in-band sleep bit will now be set
      for WCN3991 and WCN3998 (as it is for later controllers) and the default
      baud rate 3200000 may be updated by the driver also for WCN3xxx
      controllers.
      
      Notably the deep-sleep feature bit is already set by default in all
      configuration files in linux-firmware.
      
      Fixes: 4219d468 ("Bluetooth: btqca: Add wcn3990 firmware download support.")
      Cc: stable@vger.kernel.org	# 4.19
      Cc: Matthias Kaehlcke <mka@chromium.org>
      Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
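The nesting described in the commit above (an enclosing header of type four around tag sets of type two and three) can be modelled with a small parser. This is a sketch under stated assumptions, not the btqca parser: the 4-byte header layout (one type byte plus a 24-bit little-endian length) and the helper names `read_hdr()` and `find_first_tag_set()` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HDR_LEN 4

/* Assumed header: one type byte followed by a 24-bit little-endian length. */
static int read_hdr(const uint8_t *buf, size_t len,
                    uint8_t *type, size_t *body_len)
{
    if (len < HDR_LEN)
        return -1;
    *type = buf[0];
    *body_len = (size_t)buf[1] | ((size_t)buf[2] << 8) | ((size_t)buf[3] << 16);
    if (*body_len > len - HDR_LEN)
        return -1;              /* declared length exceeds the buffer */
    return 0;
}

/*
 * Locate the first tag set (type 2). If the data starts with an enclosing
 * type-4 header, step inside it first. Returns 0 on success, -1 otherwise.
 */
static int find_first_tag_set(const uint8_t *buf, size_t len,
                              const uint8_t **tags, size_t *tags_len)
{
    uint8_t type;
    size_t body_len;

    if (read_hdr(buf, len, &type, &body_len))
        return -1;

    if (type == 4) {
        buf += HDR_LEN;
        len = body_len;
        if (read_hdr(buf, len, &type, &body_len))
            return -1;
    }

    if (type != 2)
        return -1;

    *tags = buf + HDR_LEN;
    *tags_len = body_len;
    return 0;
}
```

Only the first tag set is located here, mirroring the note that the second set is not parsed for now. Each declared length is checked against the bytes actually available before the parser descends.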
    • Bluetooth: qca: add missing firmware sanity checks · 2e4edfa1
      Johan Hovold authored
      Add the missing sanity checks when parsing the firmware files before
      downloading them to avoid accessing and corrupting memory beyond the
      vmalloced buffer.
      
      Fixes: 83e81961 ("Bluetooth: btqca: Introduce generic QCA ROME support")
      Cc: stable@vger.kernel.org	# 4.10
      Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    • Bluetooth: msft: fix slab-use-after-free in msft_do_close() · 10f9f426
      Sungwoo Kim authored
      Tie the msft->data lifetime to hdev by freeing it in
      hci_release_dev() to fix the following case:
      
      [use]
      msft_do_close()
        msft = hdev->msft_data;
        if (!msft)                      ...(1) <- passed.
          return;
        mutex_lock(&msft->filter_lock); ...(4) <- used after freed.
      
      [free]
      msft_unregister()
        msft = hdev->msft_data;
        hdev->msft_data = NULL;         ...(2)
        kfree(msft);                    ...(3) <- msft is freed.
      
      ==================================================================
      BUG: KASAN: slab-use-after-free in __mutex_lock_common
      kernel/locking/mutex.c:587 [inline]
      BUG: KASAN: slab-use-after-free in __mutex_lock+0x8f/0xc30
      kernel/locking/mutex.c:752
      Read of size 8 at addr ffff888106cbbca8 by task kworker/u5:2/309
      
      Fixes: bf6a4e30 ("Bluetooth: disable advertisement filters during suspend")
      Signed-off-by: Sungwoo Kim <iam@sung-woo.kim>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    • Bluetooth: L2CAP: Fix slab-use-after-free in l2cap_connect() · 4d7b41c0
      Sungwoo Kim authored
      Extend a critical section to prevent chan from being freed early.
      Also make the l2cap_connect() return type void. Nothing is using the
      returned value but it is ugly to return a potentially freed pointer.
      Making it void will help with backports because earlier kernels did use
      the return value. Now the compile will break for kernels where this
      patch is not a complete fix.
      
      Call stack summary:
      
      [use]
      l2cap_bredr_sig_cmd
        l2cap_connect
        ┌ mutex_lock(&conn->chan_lock);
        │ chan = pchan->ops->new_connection(pchan); <- alloc chan
        │ __l2cap_chan_add(conn, chan);
        │   l2cap_chan_hold(chan);
        │   list_add(&chan->list, &conn->chan_l);   ... (1)
        └ mutex_unlock(&conn->chan_lock);
          chan->conf_state              ... (4) <- use after free
      
      [free]
      l2cap_conn_del
      ┌ mutex_lock(&conn->chan_lock);
      │ foreach chan in conn->chan_l:            ... (2)
      │   l2cap_chan_put(chan);
      │     l2cap_chan_destroy
      │       kfree(chan)               ... (3) <- chan freed
      └ mutex_unlock(&conn->chan_lock);
      
      ==================================================================
      BUG: KASAN: slab-use-after-free in instrument_atomic_read
      include/linux/instrumented.h:68 [inline]
      BUG: KASAN: slab-use-after-free in _test_bit
      include/asm-generic/bitops/instrumented-non-atomic.h:141 [inline]
      BUG: KASAN: slab-use-after-free in l2cap_connect+0xa67/0x11a0
      net/bluetooth/l2cap_core.c:4260
      Read of size 8 at addr ffff88810bf040a0 by task kworker/u3:1/311
      
      Fixes: 73ffa904 ("Bluetooth: Move conf_{req,rsp} stuff to struct l2cap_chan")
      Signed-off-by: Sungwoo Kim <iam@sung-woo.kim>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    • Bluetooth: qca: fix wcn3991 device address check · 66c39332
      Johan Hovold authored
      Qualcomm Bluetooth controllers may not have been provisioned with a
      valid device address and instead end up using the default address
      00:00:00:00:5a:ad.
      
      This address is now used to determine if a controller has a valid
      address or if one needs to be provided through devicetree or by user
      space before the controller can be used.
      
      It turns out that the WCN3991 controllers used in Chromium Trogdor
      machines use a different default address, 39:98:00:00:5a:ad, which also
      needs to be marked as invalid so that the correct address is fetched
      from the devicetree.
      
      Qualcomm has unfortunately not yet provided any answers as to whether
      the 39:98 encodes a hardware id and if there are other variants of the
      default address that need to be handled by the driver.
      
      For now, add the Trogdor WCN3991 default address to the device address
      check to avoid having these controllers start with the default address
      instead of their assigned addresses.
      
      Fixes: 32868e12 ("Bluetooth: qca: fix invalid device address check")
      Cc: stable@vger.kernel.org      # 6.5
      Cc: Doug Anderson <dianders@chromium.org>
      Cc: Janaki Ramaiah Thota <quic_janathot@quicinc.com>
      Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
      Tested-by: Douglas Anderson <dianders@chromium.org>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    • Bluetooth: Fix use-after-free bugs caused by sco_sock_timeout · 483bc081
      Duoming Zhou authored
      When an SCO connection is established and the SCO socket is then
      released, timeout_work is scheduled to check whether the SCO
      disconnection has timed out. The sock is deallocated later, but it
      is dereferenced again in sco_sock_timeout, which results in a
      use-after-free. The root cause is shown below:
      
          Cleanup Thread               |      Worker Thread
      sco_sock_release                 |
        sco_sock_close                 |
          __sco_sock_close             |
            sco_sock_set_timer         |
              schedule_delayed_work    |
        sco_sock_kill                  |    (wait a time)
          sock_put(sk) //FREE          |  sco_sock_timeout
                                       |    sock_hold(sk) //USE
      
      The KASAN report triggered by POC is shown below:
      
      [   95.890016] ==================================================================
      [   95.890496] BUG: KASAN: slab-use-after-free in sco_sock_timeout+0x5e/0x1c0
      [   95.890755] Write of size 4 at addr ffff88800c388080 by task kworker/0:0/7
      ...
      [   95.890755] Workqueue: events sco_sock_timeout
      [   95.890755] Call Trace:
      [   95.890755]  <TASK>
      [   95.890755]  dump_stack_lvl+0x45/0x110
      [   95.890755]  print_address_description+0x78/0x390
      [   95.890755]  print_report+0x11b/0x250
      [   95.890755]  ? __virt_addr_valid+0xbe/0xf0
      [   95.890755]  ? sco_sock_timeout+0x5e/0x1c0
      [   95.890755]  kasan_report+0x139/0x170
      [   95.890755]  ? update_load_avg+0xe5/0x9f0
      [   95.890755]  ? sco_sock_timeout+0x5e/0x1c0
      [   95.890755]  kasan_check_range+0x2c3/0x2e0
      [   95.890755]  sco_sock_timeout+0x5e/0x1c0
      [   95.890755]  process_one_work+0x561/0xc50
      [   95.890755]  worker_thread+0xab2/0x13c0
      [   95.890755]  ? pr_cont_work+0x490/0x490
      [   95.890755]  kthread+0x279/0x300
      [   95.890755]  ? pr_cont_work+0x490/0x490
      [   95.890755]  ? kthread_blkcg+0xa0/0xa0
      [   95.890755]  ret_from_fork+0x34/0x60
      [   95.890755]  ? kthread_blkcg+0xa0/0xa0
      [   95.890755]  ret_from_fork_asm+0x11/0x20
      [   95.890755]  </TASK>
      [   95.890755]
      [   95.890755] Allocated by task 506:
      [   95.890755]  kasan_save_track+0x3f/0x70
      [   95.890755]  __kasan_kmalloc+0x86/0x90
      [   95.890755]  __kmalloc+0x17f/0x360
      [   95.890755]  sk_prot_alloc+0xe1/0x1a0
      [   95.890755]  sk_alloc+0x31/0x4e0
      [   95.890755]  bt_sock_alloc+0x2b/0x2a0
      [   95.890755]  sco_sock_create+0xad/0x320
      [   95.890755]  bt_sock_create+0x145/0x320
      [   95.890755]  __sock_create+0x2e1/0x650
      [   95.890755]  __sys_socket+0xd0/0x280
      [   95.890755]  __x64_sys_socket+0x75/0x80
      [   95.890755]  do_syscall_64+0xc4/0x1b0
      [   95.890755]  entry_SYSCALL_64_after_hwframe+0x67/0x6f
      [   95.890755]
      [   95.890755] Freed by task 506:
      [   95.890755]  kasan_save_track+0x3f/0x70
      [   95.890755]  kasan_save_free_info+0x40/0x50
      [   95.890755]  poison_slab_object+0x118/0x180
      [   95.890755]  __kasan_slab_free+0x12/0x30
      [   95.890755]  kfree+0xb2/0x240
      [   95.890755]  __sk_destruct+0x317/0x410
      [   95.890755]  sco_sock_release+0x232/0x280
      [   95.890755]  sock_close+0xb2/0x210
      [   95.890755]  __fput+0x37f/0x770
      [   95.890755]  task_work_run+0x1ae/0x210
      [   95.890755]  get_signal+0xe17/0xf70
      [   95.890755]  arch_do_signal_or_restart+0x3f/0x520
      [   95.890755]  syscall_exit_to_user_mode+0x55/0x120
      [   95.890755]  do_syscall_64+0xd1/0x1b0
      [   95.890755]  entry_SYSCALL_64_after_hwframe+0x67/0x6f
      [   95.890755]
      [   95.890755] The buggy address belongs to the object at ffff88800c388000
      [   95.890755]  which belongs to the cache kmalloc-1k of size 1024
      [   95.890755] The buggy address is located 128 bytes inside of
      [   95.890755]  freed 1024-byte region [ffff88800c388000, ffff88800c388400)
      [   95.890755]
      [   95.890755] The buggy address belongs to the physical page:
      [   95.890755] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88800c38a800 pfn:0xc388
      [   95.890755] head: order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0
      [   95.890755] anon flags: 0x100000000000840(slab|head|node=0|zone=1)
      [   95.890755] page_type: 0xffffffff()
      [   95.890755] raw: 0100000000000840 ffff888006842dc0 0000000000000000 0000000000000001
      [   95.890755] raw: ffff88800c38a800 000000000010000a 00000001ffffffff 0000000000000000
      [   95.890755] head: 0100000000000840 ffff888006842dc0 0000000000000000 0000000000000001
      [   95.890755] head: ffff88800c38a800 000000000010000a 00000001ffffffff 0000000000000000
      [   95.890755] head: 0100000000000003 ffffea000030e201 ffffea000030e248 00000000ffffffff
      [   95.890755] head: 0000000800000000 0000000000000000 00000000ffffffff 0000000000000000
      [   95.890755] page dumped because: kasan: bad access detected
      [   95.890755]
      [   95.890755] Memory state around the buggy address:
      [   95.890755]  ffff88800c387f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [   95.890755]  ffff88800c388000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [   95.890755] >ffff88800c388080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [   95.890755]                    ^
      [   95.890755]  ffff88800c388100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [   95.890755]  ffff88800c388180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [   95.890755] ==================================================================
      
      Fix this problem by adding a check, protected by sco_conn_lock, of
      whether conn->hcon is NULL, because conn->hcon is set to NULL when
      the sock is released.
      
      Fixes: ba316be1 ("Bluetooth: schedule SCO timeouts with delayed_work")
      Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    • tcp: Use refcount_inc_not_zero() in tcp_twsk_unique(). · f2db7230
      Kuniyuki Iwashima authored
      Anderson Nascimento reported a use-after-free splat in tcp_twsk_unique()
      with nice analysis.
      
      Since commit ec94c269 ("tcp/dccp: avoid one atomic operation for
      timewait hashdance"), inet_twsk_hashdance() sets TIME-WAIT socket's
      sk_refcnt after putting it into ehash and releasing the bucket lock.
      
      Thus, there is a small race window where other threads could try to
      reuse the port during connect() and call sock_hold() in tcp_twsk_unique()
      for the TIME-WAIT socket with zero refcnt.
      
      If that happens, the refcnt taken by tcp_twsk_unique() is overwritten
      and sock_put() will cause underflow, triggering a real use-after-free
      somewhere else.
      
      To avoid the use-after-free, we need to use refcount_inc_not_zero() in
      tcp_twsk_unique() and give up on reusing the port if it returns false.
      
      [0]:
      refcount_t: addition on 0; use-after-free.
      WARNING: CPU: 0 PID: 1039313 at lib/refcount.c:25 refcount_warn_saturate+0xe5/0x110
      CPU: 0 PID: 1039313 Comm: trigger Not tainted 6.8.6-200.fc39.x86_64 #1
      Hardware name: VMware, Inc. VMware20,1/440BX Desktop Reference Platform, BIOS VMW201.00V.21805430.B64.2305221830 05/22/2023
      RIP: 0010:refcount_warn_saturate+0xe5/0x110
      Code: 42 8e ff 0f 0b c3 cc cc cc cc 80 3d aa 13 ea 01 00 0f 85 5e ff ff ff 48 c7 c7 f8 8e b7 82 c6 05 96 13 ea 01 01 e8 7b 42 8e ff <0f> 0b c3 cc cc cc cc 48 c7 c7 50 8f b7 82 c6 05 7a 13 ea 01 01 e8
      RSP: 0018:ffffc90006b43b60 EFLAGS: 00010282
      RAX: 0000000000000000 RBX: ffff888009bb3ef0 RCX: 0000000000000027
      RDX: ffff88807be218c8 RSI: 0000000000000001 RDI: ffff88807be218c0
      RBP: 0000000000069d70 R08: 0000000000000000 R09: ffffc90006b439f0
      R10: ffffc90006b439e8 R11: 0000000000000003 R12: ffff8880029ede84
      R13: 0000000000004e20 R14: ffffffff84356dc0 R15: ffff888009bb3ef0
      FS:  00007f62c10926c0(0000) GS:ffff88807be00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000020ccb000 CR3: 000000004628c005 CR4: 0000000000f70ef0
      PKRU: 55555554
      Call Trace:
       <TASK>
       ? refcount_warn_saturate+0xe5/0x110
       ? __warn+0x81/0x130
       ? refcount_warn_saturate+0xe5/0x110
       ? report_bug+0x171/0x1a0
       ? refcount_warn_saturate+0xe5/0x110
       ? handle_bug+0x3c/0x80
       ? exc_invalid_op+0x17/0x70
       ? asm_exc_invalid_op+0x1a/0x20
       ? refcount_warn_saturate+0xe5/0x110
       tcp_twsk_unique+0x186/0x190
       __inet_check_established+0x176/0x2d0
       __inet_hash_connect+0x74/0x7d0
       ? __pfx___inet_check_established+0x10/0x10
       tcp_v4_connect+0x278/0x530
       __inet_stream_connect+0x10f/0x3d0
       inet_stream_connect+0x3a/0x60
       __sys_connect+0xa8/0xd0
       __x64_sys_connect+0x18/0x20
       do_syscall_64+0x83/0x170
       entry_SYSCALL_64_after_hwframe+0x78/0x80
      RIP: 0033:0x7f62c11a885d
      Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a3 45 0c 00 f7 d8 64 89 01 48
      RSP: 002b:00007f62c1091e58 EFLAGS: 00000296 ORIG_RAX: 000000000000002a
      RAX: ffffffffffffffda RBX: 0000000020ccb004 RCX: 00007f62c11a885d
      RDX: 0000000000000010 RSI: 0000000020ccb000 RDI: 0000000000000003
      RBP: 00007f62c1091e90 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000296 R12: 00007f62c10926c0
      R13: ffffffffffffff88 R14: 0000000000000000 R15: 00007ffe237885b0
       </TASK>
      
      Fixes: ec94c269 ("tcp/dccp: avoid one atomic operation for timewait hashdance")
      Reported-by: Anderson Nascimento <anderson@allelesecurity.com>
      Closes: https://lore.kernel.org/netdev/37a477a6-d39e-486b-9577-3463f655a6b7@allelesecurity.com/
      Suggested-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20240501213145.62261-1-kuniyu@amazon.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
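The difference between an unconditional hold and an "increment unless zero" operation can be shown with C11 atomics. This is a userspace sketch of the general technique, not the kernel's refcount_t implementation; `ref_inc_not_zero()` is a name chosen for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Take a reference only if the object is already live (count non-zero).
 * A count of zero means the object is either not yet published with a
 * valid count or is already on its way to being freed, so the caller
 * must give up instead of resurrecting it.
 */
static bool ref_inc_not_zero(atomic_int *ref)
{
    int old = atomic_load(ref);

    while (old != 0) {
        /* On failure, old is refreshed with the current value. */
        if (atomic_compare_exchange_weak(ref, &old, old + 1))
            return true;
    }
    return false;
}
```

A caller in the position of tcp_twsk_unique() would treat a false return as "do not reuse this socket", which matches the behaviour the commit message describes.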
    • tcp: defer shutdown(SEND_SHUTDOWN) for TCP_SYN_RECV sockets · 94062790
      Eric Dumazet authored
      The TCP_SYN_RECV state is really special: it is only used by
      cross-syn connections, mostly used by fuzzers.
      
      In the following crash [1], syzbot managed to trigger a divide
      by zero in tcp_rcv_space_adjust()
      
      A socket makes the following state transitions,
      without ever calling tcp_init_transfer(),
      meaning tcp_init_buffer_space() is also not called.
      
               TCP_CLOSE
      connect()
               TCP_SYN_SENT
               TCP_SYN_RECV
      shutdown() -> tcp_shutdown(sk, SEND_SHUTDOWN)
               TCP_FIN_WAIT1
      
      To fix this issue, change tcp_shutdown() to not
      perform a TCP_SYN_RECV -> TCP_FIN_WAIT1 transition,
      which makes no sense anyway.
      
      When tcp_rcv_state_process() later changes the socket state
      from TCP_SYN_RECV to TCP_ESTABLISHED, it looks at
      sk->sk_shutdown to finally enter the TCP_FIN_WAIT1 state
      and send a FIN packet from a sane socket state.
      
      This means tcp_send_fin() can now be called from BH
      context, and must use GFP_ATOMIC allocations.
      
      [1]
      divide error: 0000 [#1] PREEMPT SMP KASAN NOPTI
      CPU: 1 PID: 5084 Comm: syz-executor358 Not tainted 6.9.0-rc6-syzkaller-00022-g98369dcc #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
       RIP: 0010:tcp_rcv_space_adjust+0x2df/0x890 net/ipv4/tcp_input.c:767
      Code: e3 04 4c 01 eb 48 8b 44 24 38 0f b6 04 10 84 c0 49 89 d5 0f 85 a5 03 00 00 41 8b 8e c8 09 00 00 89 e8 29 c8 48 0f af c3 31 d2 <48> f7 f1 48 8d 1c 43 49 8d 96 76 08 00 00 48 89 d0 48 c1 e8 03 48
      RSP: 0018:ffffc900031ef3f0 EFLAGS: 00010246
      RAX: 0c677a10441f8f42 RBX: 000000004fb95e7e RCX: 0000000000000000
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
      RBP: 0000000027d4b11f R08: ffffffff89e535a4 R09: 1ffffffff25e6ab7
      R10: dffffc0000000000 R11: ffffffff8135e920 R12: ffff88802a9f8d30
      R13: dffffc0000000000 R14: ffff88802a9f8d00 R15: 1ffff1100553f2da
      FS:  00005555775c0380(0000) GS:ffff8880b9500000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f1155bf2304 CR3: 000000002b9f2000 CR4: 0000000000350ef0
      Call Trace:
       <TASK>
        tcp_recvmsg_locked+0x106d/0x25a0 net/ipv4/tcp.c:2513
        tcp_recvmsg+0x25d/0x920 net/ipv4/tcp.c:2578
        inet6_recvmsg+0x16a/0x730 net/ipv6/af_inet6.c:680
        sock_recvmsg_nosec net/socket.c:1046 [inline]
        sock_recvmsg+0x109/0x280 net/socket.c:1068
        ____sys_recvmsg+0x1db/0x470 net/socket.c:2803
        ___sys_recvmsg net/socket.c:2845 [inline]
        do_recvmmsg+0x474/0xae0 net/socket.c:2939
        __sys_recvmmsg net/socket.c:3018 [inline]
        __do_sys_recvmmsg net/socket.c:3041 [inline]
        __se_sys_recvmmsg net/socket.c:3034 [inline]
        __x64_sys_recvmmsg+0x199/0x250 net/socket.c:3034
        do_syscall_x64 arch/x86/entry/common.c:52 [inline]
        do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f
      RIP: 0033:0x7faeb6363db9
      Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 c1 17 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007ffcc1997168 EFLAGS: 00000246 ORIG_RAX: 000000000000012b
      RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007faeb6363db9
      RDX: 0000000000000001 RSI: 0000000020000bc0 RDI: 0000000000000005
      RBP: 0000000000000000 R08: 0000000000000000 R09: 000000000000001c
      R10: 0000000000000122 R11: 0000000000000246 R12: 0000000000000000
      R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000001
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Link: https://lore.kernel.org/r/20240501125448.896529-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
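The deferred transition described in the commit above can be modelled as a tiny state machine. This is a deliberately simplified sketch: the `sock_model` structure and the `model_shutdown()` and `model_handshake_done()` helpers are hypothetical, only the states named in the commit message are modelled, and the real tcp_shutdown() handles more states than shown here.

```c
#include <assert.h>
#include <stdbool.h>

enum state { ST_CLOSE, ST_SYN_SENT, ST_SYN_RECV, ST_ESTABLISHED, ST_FIN_WAIT1 };

struct sock_model {
    enum state state;
    bool send_shutdown;   /* models SEND_SHUTDOWN in sk->sk_shutdown */
    bool buffers_ready;   /* models tcp_init_buffer_space() having run */
};

/* shutdown(): record the request; only leave ESTABLISHED immediately. */
static void model_shutdown(struct sock_model *sk)
{
    sk->send_shutdown = true;
    if (sk->state == ST_ESTABLISHED)
        sk->state = ST_FIN_WAIT1;
    /* In ST_SYN_RECV only the flag is recorded; the FIN is deferred. */
}

/* Handshake completion: initialise buffers, then honour a deferred shutdown. */
static void model_handshake_done(struct sock_model *sk)
{
    if (sk->state != ST_SYN_RECV)
        return;
    sk->state = ST_ESTABLISHED;
    sk->buffers_ready = true;
    if (sk->send_shutdown)
        sk->state = ST_FIN_WAIT1;
}
```

In this model a socket can only reach ST_FIN_WAIT1 after its buffers have been initialised, which is the property whose absence led to the divide-by-zero in the report.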
  2. 02 May, 2024 11 commits
    • Merge tag 'net-6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 545c4944
      Linus Torvalds authored
      Pull networking fixes from Paolo Abeni:
       "Including fixes from bpf.
      
        Relatively calm week, likely due to public holiday in most places. No
        known outstanding regressions.
      
        Current release - regressions:
      
         - rxrpc: fix wrong alignmask in __page_frag_alloc_align()
      
         - eth: e1000e: change usleep_range to udelay in PHY mdic access
      
        Previous releases - regressions:
      
         - gro: fix udp bad offset in socket lookup
      
         - bpf: fix incorrect runtime stat for arm64
      
         - tipc: fix UAF in error path
      
         - netfs: fix a potential infinite loop in extract_user_to_sg()
      
         - eth: ice: ensure the copied buf is NUL terminated
      
         - eth: qeth: fix kernel panic after setting hsuid
      
        Previous releases - always broken:
      
         - bpf:
             - verifier: prevent userspace memory access
             - xdp: use flags field to disambiguate broadcast redirect
      
         - bridge: fix multicast-to-unicast with fraglist GSO
      
         - mptcp: ensure snd_nxt is properly initialized on connect
      
         - nsh: fix outer header access in nsh_gso_segment().
      
         - eth: bcmgenet: fix racing registers access
      
         - eth: vxlan: fix stats counters.
      
        Misc:
      
         - a bunch of MAINTAINERS file updates"
      
      * tag 'net-6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (45 commits)
        MAINTAINERS: mark MYRICOM MYRI-10G as Orphan
        MAINTAINERS: remove Ariel Elior
        net: gro: add flush check in udp_gro_receive_segment
        net: gro: fix udp bad offset in socket lookup by adding {inner_}network_offset to napi_gro_cb
        ipv4: Fix uninit-value access in __ip_make_skb()
        s390/qeth: Fix kernel panic after setting hsuid
        vxlan: Pull inner IP header in vxlan_rcv().
        tipc: fix a possible memleak in tipc_buf_append
        tipc: fix UAF in error path
        rxrpc: Clients must accept conn from any address
        net: core: reject skb_copy(_expand) for fraglist GSO skbs
        net: bridge: fix multicast-to-unicast with fraglist GSO
        mptcp: ensure snd_nxt is properly initialized on connect
        e1000e: change usleep_range to udelay in PHY mdic access
        net: dsa: mv88e6xxx: Fix number of databases for 88E6141 / 88E6341
        cxgb4: Properly lock TX queue for the selftest.
        rxrpc: Fix using alignmask being zero for __page_frag_alloc_align()
        vxlan: Add missing VNI filter counter update in arp_reduce().
        vxlan: Fix racy device stats updates.
        net: qede: use return from qede_parse_actions()
        ...
    • MAINTAINERS: mark MYRICOM MYRI-10G as Orphan · 78cfe547
      Jakub Kicinski authored
      Chris's email address bounces and lore hasn't seen an email
      from anyone with his name for almost a decade.
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/20240430233532.1356982-1-kuba@kernel.org
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • MAINTAINERS: remove Ariel Elior · c9ccbcd9
      Jakub Kicinski authored
      aelior@marvell.com bounces, we haven't seen Ariel on lore
      since March 2022.
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Link: https://lore.kernel.org/r/20240430233305.1356105-1-kuba@kernel.org
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • Merge branch 'net-gro-add-flush-flush_id-checks-and-fix-wrong-offset-in-udp' · a257f093
      Paolo Abeni authored
      Richard Gobert says:
      
      ====================
      net: gro: add flush/flush_id checks and fix wrong offset in udp
      
      This series fixes a bug in the complete phase of UDP in GRO, in which
      socket lookup fails due to using network_header when parsing encapsulated
      packets. The fix is to add network_offset and inner_network_offset to
      napi_gro_cb and use these offsets for socket lookup.
      
      In addition p->flush/flush_id should be checked in all UDP flows. The
      same logic from tcp_gro_receive is applied for all flows in
      udp_gro_receive_segment. This prevents packets with mismatching network
      headers (flush/flush_id turned on) from merging in UDP GRO.
      
      The original series includes a change to vxlan test which adds the local
      parameter to prevent similar future bugs. I plan to submit it separately to
      net-next.
      
      This series is part of a previously submitted series to net-next:
      https://lore.kernel.org/all/20240408141720.98832-1-richardbgobert@gmail.com/
      
      v3 -> v4:
       - Store network offsets, and use them only in udp_gro_complete flows
       - Correct commit hash used in Fixes tag
       - v3:
       https://lore.kernel.org/netdev/20240424163045.123528-1-richardbgobert@gmail.com/
      
      v2 -> v3:
       - Add network_offsets and fix udp bug in a single commit to make backporting easier
       - Write to inner_network_offset in {inet,ipv6}_gro_receive
       - Use network_offsets union in tcp[46]_gro_complete as well
       - v2:
       https://lore.kernel.org/netdev/20240419153542.121087-1-richardbgobert@gmail.com/
      
      v1 -> v2:
       - Use network_offsets instead of p_poff param as suggested by Willem
       - Check flush before postpull, and for all UDP GRO flows
       - v1:
       https://lore.kernel.org/netdev/20240412152120.115067-1-richardbgobert@gmail.com/
      ====================
      
      Link: https://lore.kernel.org/r/20240430143555.126083-1-richardbgobert@gmail.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: gro: add flush check in udp_gro_receive_segment · 5babae77
      Richard Gobert authored
      The GRO-GSO path is supposed to be transparent, and as such L3 flush checks
      are relevant to all UDP flows merging in GRO. This patch uses the same logic
      and code from tcp_gro_receive, terminating the merge if flush is nonzero.
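
As a rough illustration of the check described above, the following user-space sketch models two GRO control-block fields and refuses to merge when either is set. The struct, its reduced field set, and the function name are hypothetical simplifications, not the kernel code; the real check takes further conditions into account.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the per-packet GRO state. */
struct gro_state_model {
    uint16_t flush;     /* nonzero in this model: L3 headers did not match */
    uint16_t flush_id;  /* nonzero in this model: the IP ID check failed */
};

/* Return true only when the held packet p may absorb another segment. */
static bool model_can_merge(const struct gro_state_model *p)
{
    unsigned int flush = p->flush;

    /* Fold both L3 results into one value, then stop merging if it is set. */
    flush |= p->flush_id;
    return flush == 0;
}
```

The point of the sketch is only that the L3 result is consulted before any UDP-level merge decision, for every UDP flow.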
      
      Fixes: e20cf8d3 ("udp: implement GRO for plain UDP sockets.")
      Signed-off-by: Richard Gobert <richardbgobert@gmail.com>
      Reviewed-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      5babae77
    • net: gro: fix udp bad offset in socket lookup by adding {inner_}network_offset to napi_gro_cb · 5ef31ea5
      Richard Gobert authored
      Commits a6024562 ("udp: Add GRO functions to UDP socket") and 57c67ff4 ("udp:
      additional GRO support") introduce incorrect usage of {ip,ipv6}_hdr in the
      complete phase of gro. The functions always return skb->network_header,
      which in the case of encapsulated packets at the gro complete phase, is
      always set to the innermost L3 of the packet. That means that calling
      {ip,ipv6}_hdr for skbs which completed the GRO receive phase (both in
      gro_list and *_gro_complete) when parsing an encapsulated packet's _outer_
      L3/L4 may return an unexpected value.
      
      This incorrect usage leads to a bug in GRO's UDP socket lookup.
      udp{4,6}_lib_lookup_skb functions use ip_hdr/ipv6_hdr respectively. These
      *_hdr functions return network_header which will point to the innermost L3,
      resulting in the wrong offset being used in __udp{4,6}_lib_lookup with
      encapsulated packets.
      
      This patch adds network_offset and inner_network_offset to napi_gro_cb, and
      makes sure both are set correctly.
      
      To fix the issue, network_offsets union is used inside napi_gro_cb, in
      which both the outer and the inner network offsets are saved.
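
To make the idea concrete, here is a minimal user-space sketch of a union that stores both offsets and of helpers that pick the outer or inner L3 header explicitly. The struct name, the exact field layout, and the helper names are assumptions made for illustration; only the names network_offset, inner_network_offset, and network_offsets come from the commit message.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the part of napi_gro_cb discussed above. */
struct gro_cb_model {
    union {
        struct {
            uint16_t network_offset;        /* outermost L3 header */
            uint16_t inner_network_offset;  /* innermost L3 header */
        };
        uint32_t network_offsets;           /* both offsets, cleared at once */
    };
};

/* Outer L3 header: what a UDP socket lookup on a tunnel endpoint needs. */
static const uint8_t *model_outer_l3(const uint8_t *data,
                                     const struct gro_cb_model *cb)
{
    return data + cb->network_offset;
}

/* Inner L3 header: what the encapsulated flow needs. */
static const uint8_t *model_inner_l3(const uint8_t *data,
                                     const struct gro_cb_model *cb)
{
    return data + cb->inner_network_offset;
}
```

With the two offsets kept apart, a lookup for the outer UDP socket no longer depends on where a single shared header pointer happens to point after the inner headers have been parsed.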
      
      Reproduction example:
      
      Endpoint configuration example (fou + local address bind)
      
          # ip fou add port 6666 ipproto 4
          # ip link add name tun1 type ipip remote 2.2.2.1 local 2.2.2.2 encap fou encap-dport 5555 encap-sport 6666 mode ipip
          # ip link set tun1 up
          # ip a add 1.1.1.2/24 dev tun1
      
      Netperf TCP_STREAM result on net-next before patch is applied:
      
      net-next main, GRO enabled:
          $ netperf -H 1.1.1.2 -t TCP_STREAM -l 5
          Recv   Send    Send
          Socket Socket  Message  Elapsed
          Size   Size    Size     Time     Throughput
          bytes  bytes   bytes    secs.    10^6bits/sec
      
          131072  16384  16384    5.28        2.37
      
      net-next main, GRO disabled:
          $ netperf -H 1.1.1.2 -t TCP_STREAM -l 5
          Recv   Send    Send
          Socket Socket  Message  Elapsed
          Size   Size    Size     Time     Throughput
          bytes  bytes   bytes    secs.    10^6bits/sec
      
          131072  16384  16384    5.01     2745.06
      
      patch applied, GRO enabled:
          $ netperf -H 1.1.1.2 -t TCP_STREAM -l 5
          Recv   Send    Send
          Socket Socket  Message  Elapsed
          Size   Size    Size     Time     Throughput
          bytes  bytes   bytes    secs.    10^6bits/sec
      
          131072  16384  16384    5.01     2877.38
      
      Fixes: a6024562 ("udp: Add GRO functions to UDP socket")
      Signed-off-by: Richard Gobert <richardbgobert@gmail.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      5ef31ea5
    • ipv4: Fix uninit-value access in __ip_make_skb() · fc1092f5
      Shigeru Yoshida authored
      KMSAN reported uninit-value access in __ip_make_skb() [1].  __ip_make_skb()
      tests HDRINCL to know if the skb has icmphdr. However, HDRINCL can cause a
      race condition. If calling setsockopt(2) with IP_HDRINCL changes HDRINCL
      while __ip_make_skb() is running, the function will access icmphdr in the
      skb even if it is not included. This causes the issue reported by KMSAN.
      
      Check FLOWI_FLAG_KNOWN_NH on fl4->flowi4_flags instead of testing HDRINCL
      on the socket.
      
      Also, fl4->fl4_icmp_type and fl4->fl4_icmp_code are not initialized. These
      are members of a union in struct flowi4 and are implicitly initialized by
      flowi4_init_output(), but we should not rely on a specific union layout.
      
      Initialize these explicitly in raw_sendmsg().
      
      [1]
      BUG: KMSAN: uninit-value in __ip_make_skb+0x2b74/0x2d20 net/ipv4/ip_output.c:1481
       __ip_make_skb+0x2b74/0x2d20 net/ipv4/ip_output.c:1481
       ip_finish_skb include/net/ip.h:243 [inline]
       ip_push_pending_frames+0x4c/0x5c0 net/ipv4/ip_output.c:1508
       raw_sendmsg+0x2381/0x2690 net/ipv4/raw.c:654
       inet_sendmsg+0x27b/0x2a0 net/ipv4/af_inet.c:851
       sock_sendmsg_nosec net/socket.c:730 [inline]
       __sock_sendmsg+0x274/0x3c0 net/socket.c:745
       __sys_sendto+0x62c/0x7b0 net/socket.c:2191
       __do_sys_sendto net/socket.c:2203 [inline]
       __se_sys_sendto net/socket.c:2199 [inline]
       __x64_sys_sendto+0x130/0x200 net/socket.c:2199
       do_syscall_64+0xd8/0x1f0 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x6d/0x75
      
      Uninit was created at:
       slab_post_alloc_hook mm/slub.c:3804 [inline]
       slab_alloc_node mm/slub.c:3845 [inline]
       kmem_cache_alloc_node+0x5f6/0xc50 mm/slub.c:3888
       kmalloc_reserve+0x13c/0x4a0 net/core/skbuff.c:577
       __alloc_skb+0x35a/0x7c0 net/core/skbuff.c:668
       alloc_skb include/linux/skbuff.h:1318 [inline]
       __ip_append_data+0x49ab/0x68c0 net/ipv4/ip_output.c:1128
       ip_append_data+0x1e7/0x260 net/ipv4/ip_output.c:1365
       raw_sendmsg+0x22b1/0x2690 net/ipv4/raw.c:648
       inet_sendmsg+0x27b/0x2a0 net/ipv4/af_inet.c:851
       sock_sendmsg_nosec net/socket.c:730 [inline]
       __sock_sendmsg+0x274/0x3c0 net/socket.c:745
       __sys_sendto+0x62c/0x7b0 net/socket.c:2191
       __do_sys_sendto net/socket.c:2203 [inline]
       __se_sys_sendto net/socket.c:2199 [inline]
       __x64_sys_sendto+0x130/0x200 net/socket.c:2199
       do_syscall_64+0xd8/0x1f0 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x6d/0x75
      
      CPU: 1 PID: 15709 Comm: syz-executor.7 Not tainted 6.8.0-11567-gb3603fcb #25
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-1.fc39 04/01/2014
      
      Fixes: 99e5acae ("ipv4: Fix potential uninit variable access bug in __ip_make_skb()")
      Reported-by: syzkaller <syzkaller@googlegroups.com>
      Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
      Link: https://lore.kernel.org/r/20240430123945.2057348-1-syoshida@redhat.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      fc1092f5
    • s390/qeth: Fix kernel panic after setting hsuid · 8a2e4d37
      Alexandra Winter authored
      Symptom:
      When the hsuid attribute is set for the first time on an IQD Layer3
      device while the corresponding network interface is already UP,
      the kernel will try to execute a napi function pointer that is NULL.
      
      Example:
      ---------------------------------------------------------------------------
      [ 2057.572696] illegal operation: 0001 ilc:1 [#1] SMP
      [ 2057.572702] Modules linked in: af_iucv qeth_l3 zfcp scsi_transport_fc sunrpc nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6
      nft_reject nft_ct nf_tables_set nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables libcrc32c nfnetlink ghash_s390 prng xts aes_s390 des_s390
      des_generic sha3_512_s390 sha3_256_s390 sha512_s390 vfio_ccw vfio_mdev mdev vfio_iommu_type1 eadm_sch vfio ext4 mbcache jbd2 qeth_l2 bridge stp llc dasd_eckd_mod qeth dasd_mod
      qdio ccwgroup pkey zcrypt
      [ 2057.572739] CPU: 6 PID: 60182 Comm: stress_client Kdump: loaded Not tainted 4.18.0-541.el8.s390x #1
      [ 2057.572742] Hardware name: IBM 3931 A01 704 (LPAR)
      [ 2057.572744] Krnl PSW : 0704f00180000000 0000000000000002 (0x2)
      [ 2057.572748]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:3 PM:0 RI:0 EA:3
      [ 2057.572751] Krnl GPRS: 0000000000000004 0000000000000000 00000000a3b008d8 0000000000000000
      [ 2057.572754]            00000000a3b008d8 cb923a29c779abc5 0000000000000000 00000000814cfd80
      [ 2057.572756]            000000000000012c 0000000000000000 00000000a3b008d8 00000000a3b008d8
      [ 2057.572758]            00000000bab6d500 00000000814cfd80 0000000091317e46 00000000814cfc68
      [ 2057.572762] Krnl Code:#0000000000000000: 0000                illegal
                               >0000000000000002: 0000                illegal
                                0000000000000004: 0000                illegal
                                0000000000000006: 0000                illegal
                                0000000000000008: 0000                illegal
                                000000000000000a: 0000                illegal
                                000000000000000c: 0000                illegal
                                000000000000000e: 0000                illegal
      [ 2057.572800] Call Trace:
      [ 2057.572801] ([<00000000ec639700>] 0xec639700)
      [ 2057.572803]  [<00000000913183e2>] net_rx_action+0x2ba/0x398
      [ 2057.572809]  [<0000000091515f76>] __do_softirq+0x11e/0x3a0
      [ 2057.572813]  [<0000000090ce160c>] do_softirq_own_stack+0x3c/0x58
      [ 2057.572817] ([<0000000090d2cbd6>] do_softirq.part.1+0x56/0x60)
      [ 2057.572822]  [<0000000090d2cc60>] __local_bh_enable_ip+0x80/0x98
      [ 2057.572825]  [<0000000091314706>] __dev_queue_xmit+0x2be/0xd70
      [ 2057.572827]  [<000003ff803dd6d6>] afiucv_hs_send+0x24e/0x300 [af_iucv]
      [ 2057.572830]  [<000003ff803dd88a>] iucv_send_ctrl+0x102/0x138 [af_iucv]
      [ 2057.572833]  [<000003ff803de72a>] iucv_sock_connect+0x37a/0x468 [af_iucv]
      [ 2057.572835]  [<00000000912e7e90>] __sys_connect+0xa0/0xd8
      [ 2057.572839]  [<00000000912e9580>] sys_socketcall+0x228/0x348
      [ 2057.572841]  [<0000000091514e1a>] system_call+0x2a6/0x2c8
      [ 2057.572843] Last Breaking-Event-Address:
      [ 2057.572844]  [<0000000091317e44>] __napi_poll+0x4c/0x1d8
      [ 2057.572846]
      [ 2057.572847] Kernel panic - not syncing: Fatal exception in interrupt
      -------------------------------------------------------------------------------------------
      
      Analysis:
      There is one napi structure per out_q: card->qdio.out_qs[i].napi
      The napi.poll functions are set during qeth_open().
      
      Since
      commit 1cfef80d ("s390/qeth: Don't call dev_close/dev_open (DOWN/UP)")
      qeth_set_offline()/qeth_set_online() no longer call dev_close()/
      dev_open(). So if qeth_free_qdio_queues() cleared
      card->qdio.out_qs[i].napi.poll while the network interface was UP and the
      card was offline, they are not set again.
      
      Reproduction:
      chzdev -e $devno layer2=0
      ip link set dev $network_interface up
      echo 0 > /sys/bus/ccwgroup/devices/0.0.$devno/online
      echo foo > /sys/bus/ccwgroup/devices/0.0.$devno/hsuid
      echo 1 > /sys/bus/ccwgroup/devices/0.0.$devno/online
      -> Crash (can be enforced e.g. by af_iucv connect(), ip link down/up, ...)
      
      Note that a Completion Queue (CQ) is only enabled or disabled when hsuid
      is set for the first time or when it is removed.
      
      Workarounds:
      - Set hsuid before setting the device online for the first time
      or
      - Use chzdev -d $devno; chzdev $devno hsuid=xxx; chzdev -e $devno;
      to set hsuid on an existing device. (this will remove and recreate the
      network interface)
      
      Fix:
      There is no need to free the output queues when a completion queue is
      added or removed.
      card->qdio.state now indicates whether the inbound buffer pool and the
      outbound queues are allocated.
      card->qdio.c_q indicates whether a CQ is allocated.
      
      Fixes: 1cfef80d ("s390/qeth: Don't call dev_close/dev_open (DOWN/UP)")
      Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/20240430091004.2265683-1-wintera@linux.ibm.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      8a2e4d37
    • vxlan: Pull inner IP header in vxlan_rcv(). · f7789419
      Guillaume Nault authored
      Ensure the inner IP header is part of skb's linear data before reading
      its ECN bits. Otherwise we might read garbage.
      One symptom is the system erroneously logging errors like
      "vxlan: non-ECT from xxx.xxx.xxx.xxx with TOS=xxxx".
      
      Similar bugs have been fixed in geneve, ip_tunnel and ip6_tunnel (see
      commit 1ca1ba46 ("geneve: make sure to pull inner header in
      geneve_rx()") for example). So let's reuse the same code structure for
      consistency. Maybe we can add a common helper in the future.
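
The "pull before read" rule can be sketched in user space as follows. This is a toy model under stated assumptions: pkt_model, model_may_pull, and model_read_ecn are hypothetical names, the packet is reduced to one linear buffer plus one fragment, and a 20-byte IPv4 header without options is assumed. It is not the vxlan code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_BUF 128

/* Toy packet: a linear area plus one non-linear fragment. */
struct pkt_model {
    uint8_t linear[MODEL_BUF];
    size_t linear_len;      /* bytes currently valid in linear[] */
    const uint8_t *frag;    /* remaining non-linear data */
    size_t frag_len;
};

/* Make sure the first len bytes are in the linear area, copying if needed. */
static bool model_may_pull(struct pkt_model *p, size_t len)
{
    size_t need;

    if (len <= p->linear_len)
        return true;
    need = len - p->linear_len;
    if (len > MODEL_BUF || need > p->frag_len)
        return false;
    memcpy(p->linear + p->linear_len, p->frag, need);
    p->linear_len += need;
    p->frag += need;
    p->frag_len -= need;
    return true;
}

/* Read the ECN bits of an IPv4 header at ip_off, only after pulling it. */
static bool model_read_ecn(struct pkt_model *p, size_t ip_off, uint8_t *ecn)
{
    if (!model_may_pull(p, ip_off + 20))
        return false;       /* header not available: do not read garbage */
    *ecn = p->linear[ip_off + 1] & 0x03;
    return true;
}
```

Without the pull, the read at ip_off + 1 would touch bytes of the linear area that were never filled in, which is the kind of garbage read the commit describes.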
      
      Fixes: d342894c ("vxlan: virtual extensible lan")
      Signed-off-by: Guillaume Nault <gnault@redhat.com>
      Reviewed-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
      Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
      Link: https://lore.kernel.org/r/1239c8db54efec341dd6455c77e0380f58923a3c.1714495737.git.gnault@redhat.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      f7789419
    • tipc: fix a possible memleak in tipc_buf_append · 97bf6f81
      Xin Long authored
      __skb_linearize() doesn't free the skb when it fails, so move
      '*buf = NULL' after __skb_linearize(), so that the skb can be
      freed on the err path.
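
The ordering issue can be shown with a small user-space model. Everything here is hypothetical (buf_model, model_linearize, model_append, and the freed_count counter); it only mirrors the rule that the caller's pointer is cleared after the fallible step, so the error path still has something to free.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Toy buffer; linearize_ok lets a caller force the fallible step to fail. */
struct buf_model {
    bool linearize_ok;
};

static int freed_count;

static void model_free(struct buf_model *b)
{
    if (b) {
        free(b);
        freed_count++;
    }
}

/* Like the step described above, this never frees b when it fails. */
static bool model_linearize(struct buf_model *b)
{
    return b->linearize_ok;
}

/* Take ownership of *buf only once the fallible step has succeeded. */
static bool model_append(struct buf_model **headbuf, struct buf_model **buf)
{
    struct buf_model *frag = *buf;

    if (!frag)
        goto err;
    if (!model_linearize(frag))
        goto err;           /* *buf is still set, so err can free it */
    *buf = NULL;            /* ownership moves to *headbuf from here on */
    *headbuf = frag;
    return true;
err:
    model_free(*buf);
    *buf = NULL;
    return false;
}
```

Had `*buf = NULL` been placed before the `model_linearize()` call, the error path would have been handed a NULL pointer and the buffer would have leaked.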
      
      Fixes: b7df21cf ("tipc: skb_linearize the head skb when reassembling msgs")
      Reported-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Reviewed-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
      Link: https://lore.kernel.org/r/90710748c29a1521efac4f75ea01b3b7e61414cf.1714485818.git.lucien.xin@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      97bf6f81
    • tipc: fix UAF in error path · 080cbb89
      Paolo Abeni authored
      Sam Page (sam4k) working with Trend Micro Zero Day Initiative reported
      a UAF in the tipc_buf_append() error path:
      
      BUG: KASAN: slab-use-after-free in kfree_skb_list_reason+0x47e/0x4c0
      linux/net/core/skbuff.c:1183
      Read of size 8 at addr ffff88804d2a7c80 by task poc/8034
      
      CPU: 1 PID: 8034 Comm: poc Not tainted 6.8.2 #1
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
      1.16.0-debian-1.16.0-5 04/01/2014
      Call Trace:
       <IRQ>
       __dump_stack linux/lib/dump_stack.c:88
       dump_stack_lvl+0xd9/0x1b0 linux/lib/dump_stack.c:106
       print_address_description linux/mm/kasan/report.c:377
       print_report+0xc4/0x620 linux/mm/kasan/report.c:488
       kasan_report+0xda/0x110 linux/mm/kasan/report.c:601
       kfree_skb_list_reason+0x47e/0x4c0 linux/net/core/skbuff.c:1183
       skb_release_data+0x5af/0x880 linux/net/core/skbuff.c:1026
       skb_release_all linux/net/core/skbuff.c:1094
       __kfree_skb linux/net/core/skbuff.c:1108
       kfree_skb_reason+0x12d/0x210 linux/net/core/skbuff.c:1144
       kfree_skb linux/./include/linux/skbuff.h:1244
       tipc_buf_append+0x425/0xb50 linux/net/tipc/msg.c:186
       tipc_link_input+0x224/0x7c0 linux/net/tipc/link.c:1324
       tipc_link_rcv+0x76e/0x2d70 linux/net/tipc/link.c:1824
       tipc_rcv+0x45f/0x10f0 linux/net/tipc/node.c:2159
       tipc_udp_recv+0x73b/0x8f0 linux/net/tipc/udp_media.c:390
       udp_queue_rcv_one_skb+0xad2/0x1850 linux/net/ipv4/udp.c:2108
       udp_queue_rcv_skb+0x131/0xb00 linux/net/ipv4/udp.c:2186
       udp_unicast_rcv_skb+0x165/0x3b0 linux/net/ipv4/udp.c:2346
       __udp4_lib_rcv+0x2594/0x3400 linux/net/ipv4/udp.c:2422
       ip_protocol_deliver_rcu+0x30c/0x4e0 linux/net/ipv4/ip_input.c:205
       ip_local_deliver_finish+0x2e4/0x520 linux/net/ipv4/ip_input.c:233
       NF_HOOK linux/./include/linux/netfilter.h:314
       NF_HOOK linux/./include/linux/netfilter.h:308
       ip_local_deliver+0x18e/0x1f0 linux/net/ipv4/ip_input.c:254
       dst_input linux/./include/net/dst.h:461
       ip_rcv_finish linux/net/ipv4/ip_input.c:449
       NF_HOOK linux/./include/linux/netfilter.h:314
       NF_HOOK linux/./include/linux/netfilter.h:308
       ip_rcv+0x2c5/0x5d0 linux/net/ipv4/ip_input.c:569
       __netif_receive_skb_one_core+0x199/0x1e0 linux/net/core/dev.c:5534
       __netif_receive_skb+0x1f/0x1c0 linux/net/core/dev.c:5648
       process_backlog+0x101/0x6b0 linux/net/core/dev.c:5976
       __napi_poll.constprop.0+0xba/0x550 linux/net/core/dev.c:6576
       napi_poll linux/net/core/dev.c:6645
       net_rx_action+0x95a/0xe90 linux/net/core/dev.c:6781
       __do_softirq+0x21f/0x8e7 linux/kernel/softirq.c:553
       do_softirq linux/kernel/softirq.c:454
       do_softirq+0xb2/0xf0 linux/kernel/softirq.c:441
       </IRQ>
       <TASK>
       __local_bh_enable_ip+0x100/0x120 linux/kernel/softirq.c:381
       local_bh_enable linux/./include/linux/bottom_half.h:33
       rcu_read_unlock_bh linux/./include/linux/rcupdate.h:851
       __dev_queue_xmit+0x871/0x3ee0 linux/net/core/dev.c:4378
       dev_queue_xmit linux/./include/linux/netdevice.h:3169
       neigh_hh_output linux/./include/net/neighbour.h:526
       neigh_output linux/./include/net/neighbour.h:540
       ip_finish_output2+0x169f/0x2550 linux/net/ipv4/ip_output.c:235
       __ip_finish_output linux/net/ipv4/ip_output.c:313
       __ip_finish_output+0x49e/0x950 linux/net/ipv4/ip_output.c:295
       ip_finish_output+0x31/0x310 linux/net/ipv4/ip_output.c:323
       NF_HOOK_COND linux/./include/linux/netfilter.h:303
       ip_output+0x13b/0x2a0 linux/net/ipv4/ip_output.c:433
       dst_output linux/./include/net/dst.h:451
       ip_local_out linux/net/ipv4/ip_output.c:129
       ip_send_skb+0x3e5/0x560 linux/net/ipv4/ip_output.c:1492
       udp_send_skb+0x73f/0x1530 linux/net/ipv4/udp.c:963
       udp_sendmsg+0x1a36/0x2b40 linux/net/ipv4/udp.c:1250
       inet_sendmsg+0x105/0x140 linux/net/ipv4/af_inet.c:850
       sock_sendmsg_nosec linux/net/socket.c:730
       __sock_sendmsg linux/net/socket.c:745
       __sys_sendto+0x42c/0x4e0 linux/net/socket.c:2191
       __do_sys_sendto linux/net/socket.c:2203
       __se_sys_sendto linux/net/socket.c:2199
       __x64_sys_sendto+0xe0/0x1c0 linux/net/socket.c:2199
       do_syscall_x64 linux/arch/x86/entry/common.c:52
       do_syscall_64+0xd8/0x270 linux/arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x6f/0x77 linux/arch/x86/entry/entry_64.S:120
      RIP: 0033:0x7f3434974f29
      Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48
      89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d
      01 f0 ff ff 73 01 c3 48 8b 0d 37 8f 0d 00 f7 d8 64 89 01 48
      RSP: 002b:00007fff9154f2b8 EFLAGS: 00000212 ORIG_RAX: 000000000000002c
      RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f3434974f29
      RDX: 00000000000032c8 RSI: 00007fff9154f300 RDI: 0000000000000003
      RBP: 00007fff915532e0 R08: 00007fff91553360 R09: 0000000000000010
      R10: 0000000000000000 R11: 0000000000000212 R12: 000055ed86d261d0
      R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
       </TASK>
      
      In the critical scenario, either the relevant skb is freed or its
      ownership is transferred into a frag_list. In both cases, the cleanup
      code must not free it again: we need to clear the skb reference earlier.
      
      Fixes: 1149557d ("tipc: eliminate unnecessary linearization of incoming buffers")
      Cc: stable@vger.kernel.org
      Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-23852
      Acked-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/752f1ccf762223d109845365d07f55414058e5a3.1714484273.git.pabeni@redhat.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      080cbb89
  3. 01 May, 2024 9 commits
  4. 30 Apr, 2024 5 commits
  5. 29 Apr, 2024 3 commits
    • Merge tag 'wq-for-6.9-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq · 98369dcc
      Linus Torvalds authored
      Pull workqueue fixes from Tejun Heo:
       "Two doc update patches and the following three fixes:
      
         - On single node systems, the default pool is used but the
           node_nr_active for the default pool was set to min_active. This
           effectively limited the max concurrency of unbound pools on single
           node systems to 8 causing performance regressions on some
           workloads. Fixed by setting the default pool's node_nr_active to
           max_active.
      
         - wq_update_node_max_active() could trigger divide-by-zero if the
           intersection between the allowed CPUs for an unbound workqueue and
           online CPUs becomes empty.
      
         - When kick_pool() was trying to repatriate a worker to a CPU in its
           pod by setting task->wake_cpu, it didn't consider whether the CPU
           being selected is online or not which obviously can lead to
           suboptimal behaviors. On s390, this triggered a crash in arch code.
           The workqueue patch removes the gross misbehavior but doesn't fix
           the crash completely as there's a race window in which CPUs can go
           down after wake_cpu is set. Need to decide whether the fix should
           be on the core or arch side"
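
The divide-by-zero case in the second item can be illustrated with a guarded proportional split. This is a sketch only: model_node_max_active is a hypothetical helper, and both the rounding and the choice to fall back to min_active when no CPU is available are assumptions of the sketch rather than a description of the actual patch.

```c
/*
 * Share of max_active given to one node, proportional to its CPU count.
 * Hypothetical helper for illustration only.
 */
static int model_node_max_active(int max_active, int min_active,
                                 int node_cpus, int total_cpus)
{
    int share;

    /* Empty intersection of allowed and online CPUs: skip the division. */
    if (total_cpus == 0)
        return min_active;

    /* Round up, then keep the result within [min_active, max_active]. */
    share = (max_active * node_cpus + total_cpus - 1) / total_cpus;
    if (share < min_active)
        return min_active;
    if (share > max_active)
        return max_active;
    return share;
}
```

The guard has to come before the division; clamping the result afterwards would be too late to avoid the fault.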
      
      * tag 'wq-for-6.9-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
        workqueue: Fix divide error in wq_update_node_max_active()
        workqueue: The default node_nr_active should have its max set to max_active
        workqueue: Fix selection of wake_cpu in kick_pool()
        docs/zh_CN: core-api: Update translation of workqueue.rst to 6.9-rc1
        Documentation/core-api: Update events_freezable_power references.
      98369dcc
    • Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · d03d4188
      Linus Torvalds authored
      Pull SCSI fix from James Bottomley:
       "Minor core fix to prevent the sd driver printing the stream count
        every time we rescan and instead print only if it's changed"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: sd: Only print updates to permanent stream count
      d03d4188
    • Merge tag 'nfsd-6.9-6' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux · a91bae87
      Linus Torvalds authored
      Pull nfsd fix from Chuck Lever:
      
       - Avoid freeing unallocated memory (v6.7 regression)
      
      * tag 'nfsd-6.9-6' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
        NFSD: Fix nfsd4_encode_fattr4() crasher
      a91bae87