1. 19 Jun, 2024 10 commits
  2. 18 Jun, 2024 16 commits
    • selftests: openvswitch: Use bash as interpreter · e2b447c9
      Simon Horman authored
      openvswitch.sh makes use of substitutions of the form ${ns:0:1}, to
      obtain the first character of $ns. Empirically, this works with bash
      but not dash. When run with dash, these evaluate to an empty string and
      an error is printed to stderr.
      
       # dash -c 'ns=client; echo "${ns:0:1}"' 2>error
       # cat error
       dash: 1: Bad substitution
       # bash -c 'ns=client; echo "${ns:0:1}"' 2>error
       c
       # cat error
      
      This leads to tests that neither pass nor fail.
      For example:
      
       TEST: arp_ping                                                      [START]
       adding sandbox 'test_arp_ping'
       Adding DP/Bridge IF: sbx:test_arp_ping dp:arpping {, , }
       create namespaces
       ./openvswitch.sh: 282: eval: Bad substitution
       TEST: ct_connect_v4                                                 [START]
       adding sandbox 'test_ct_connect_v4'
       Adding DP/Bridge IF: sbx:test_ct_connect_v4 dp:ct4 {, , }
       ./openvswitch.sh: 322: eval: Bad substitution
       create namespaces
      
      Resolve this by making openvswitch.sh a bash script.
      
      Fixes: 918423fd ("selftests: openvswitch: add an initial flow programming case")
      Signed-off-by: Simon Horman <horms@kernel.org>
      Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
      Link: https://lore.kernel.org/r/20240617-ovs-selftest-bash-v1-1-7ae6ccd3617b@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • ptp: fix integer overflow in max_vclocks_store · 81d23d2a
      Dan Carpenter authored
      On 32-bit systems, the "4 * max" multiplication can overflow.  Use kcalloc()
      to do the allocation to prevent this.
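
A minimal userspace sketch of the overflow described above, assuming a 32-bit size type. `checked_array_bytes()` and `unchecked_array_bytes()` are hypothetical helpers written for illustration; they only mimic the kind of check an overflow-aware array allocator performs and are not the kernel's kcalloc().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: computes n * size in a 32-bit size type (as on a
 * 32-bit system) and reports overflow instead of silently wrapping. */
static bool checked_array_bytes(uint32_t n, uint32_t size, uint32_t *bytes)
{
	assert(bytes != NULL);
	if (size != 0 && n > UINT32_MAX / size)
		return false;		/* n * size would not fit in 32 bits */
	*bytes = n * size;
	return true;
}

/* The unchecked form: the product wraps modulo 2^32, so a huge request
 * can turn into a tiny allocation size. */
static uint32_t unchecked_array_bytes(uint32_t n, uint32_t size)
{
	return (uint32_t)((uint64_t)n * size);
}
```

With n = 0x40000000 and an element size of 4, the unchecked product wraps to 0, while the checked helper refuses the request.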
      
      Fixes: 44c494c8 ("ptp: track available ptp vclocks information")
      Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
      Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Reviewed-by: Heng Qi <hengqi@linux.alibaba.com>
      Link: https://lore.kernel.org/r/ee8110ed-6619-4bd7-9024-28c1f2ac24f4@moroto.mountain
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • sched: act_ct: add netns into the key of tcf_ct_flow_table · 88c67aeb
      Xin Long authored
      zones_ht is a global hashtable for flow_table with zone as key. However,
      it does not consider netns when getting a flow_table from zones_ht in
      tcf_ct_init(), and it means an act_ct action in netns A may get a
      flow_table that belongs to netns B if it has the same zone value.
      
      In Shuang's test with the TOPO:
      
        tcf2_c <---> tcf2_sw1 <---> tcf2_sw2 <---> tcf2_s
      
      tcf2_sw1 and tcf2_sw2 saw the same flow and used the same flow table,
      which caused their ct entries to enter unexpected states and left the
      TCP connection unable to terminate normally.
      
      This patch fixes the issue simply by adding netns into the key of
      tcf_ct_flow_table so that an act_ct action gets a flow_table that
      belongs to its own netns in tcf_ct_init().
      
      Note that, to keep the code simple, we don't use tcf_ct_flow_table.nf_ft.net,
      as the ct_ft is initialized after it is inserted into the hashtable in
      tcf_ct_flow_table_get(), and using it would also require implementing
      several functions in rhashtable_params, including hashfn, obj_hashfn and
      obj_cmpfn.
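
A minimal sketch of the keying idea, assuming a small linear table in place of the kernel's rhashtable. All names here (`net_sketch`, `ft_key`, `ft_get()`) are hypothetical and chosen for illustration; the point is only that two lookups with the same zone but different namespaces must not resolve to the same entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct net_sketch { int id; };		/* stand-in for a network namespace */

struct ft_key {				/* hypothetical composite key */
	const struct net_sketch *net;
	uint16_t zone;
};

struct ft_entry {
	struct ft_key key;
	int in_use;
};

#define FT_SLOTS 8
struct ft_table { struct ft_entry slots[FT_SLOTS]; };

static int ft_key_equal(const struct ft_key *a, const struct ft_key *b)
{
	/* Both the namespace and the zone must match. */
	return a->net == b->net && a->zone == b->zone;
}

/* Returns the entry for (net, zone), creating it if absent; NULL if full. */
static struct ft_entry *ft_get(struct ft_table *t,
			       const struct net_sketch *net, uint16_t zone)
{
	struct ft_key key = { .net = net, .zone = zone };
	struct ft_entry *free_slot = NULL;

	assert(t != NULL);
	for (size_t i = 0; i < FT_SLOTS; i++) {
		struct ft_entry *e = &t->slots[i];

		if (e->in_use && ft_key_equal(&e->key, &key))
			return e;
		if (!e->in_use && !free_slot)
			free_slot = e;
	}
	if (free_slot) {
		free_slot->key = key;
		free_slot->in_use = 1;
	}
	return free_slot;
}
```

If the comparison used the zone alone, the second namespace in the example would receive the first namespace's entry, which is the sharing the commit message describes.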
      
      Fixes: 64ff70b8 ("net/sched: act_ct: Offload established connections to flow table")
      Reported-by: Shuang Li <shuali@redhat.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/1db5b6cc6902c5fc6f8c6cbd85494a2008087be5.1718488050.git.lucien.xin@gmail.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • tipc: force a dst refcount before doing decryption · 2ebe8f84
      Xin Long authored
      As it says in commit 3bc07321 ("xfrm: Force a dst refcount before
      entering the xfrm type handlers"):
      
      "Crypto requests might return asynchronous. In this case we leave the
       rcu protected region, so force a refcount on the skb's destination
       entry before we enter the xfrm type input/output handlers."
      
      The TIPC decryption path has the same problem, and skb_dst_force()
      should be called before doing decryption to avoid a possible crash.
      
      Shuang reported this issue after the following warning was triggered:
      
        [] WARNING: include/net/dst.h:337 tipc_sk_rcv+0x1055/0x1ea0 [tipc]
        [] Kdump: loaded Tainted: G W --------- - - 4.18.0-496.el8.x86_64+debug
        [] Workqueue: crypto cryptd_queue_worker
        [] RIP: 0010:tipc_sk_rcv+0x1055/0x1ea0 [tipc]
        [] Call Trace:
        [] tipc_sk_mcast_rcv+0x548/0xea0 [tipc]
        [] tipc_rcv+0xcf5/0x1060 [tipc]
        [] tipc_aead_decrypt_done+0x215/0x2e0 [tipc]
        [] cryptd_aead_crypt+0xdb/0x190
        [] cryptd_queue_worker+0xed/0x190
        [] process_one_work+0x93d/0x17e0
      
      Fixes: fc1b6d6d ("tipc: introduce TIPC encryption & authentication")
      Reported-by: Shuang Li <shuali@redhat.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Link: https://lore.kernel.org/r/fbe3195fad6997a4eec62d9bf076b2ad03ac336b.1718476040.git.lucien.xin@gmail.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net/sched: act_api: fix possible infinite loop in tcf_idr_check_alloc() · d8643198
      David Ruth authored
      syzbot found hanging tasks waiting on rtnl_lock [1]
      
      A reproducer is available in the syzbot bug.
      
      When a request to add multiple actions with the same index is sent, the
      second request will block forever on the first request. This holds
      rtnl_lock, and causes tasks to hang.
      
      Return -EAGAIN to prevent infinite looping, while keeping documented
      behavior.
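
A minimal sketch of the control-flow change, assuming a single slot that can be free, reserved by an in-flight request, or ready. `check_alloc()` and the `slot` type are hypothetical stand-ins written for illustration, not the kernel function; the return values 0 and 1 are assumptions of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum slot_state { SLOT_FREE, SLOT_RESERVED, SLOT_READY };

struct slot {
	enum slot_state state;
	int refs;
};

/* Returns 0 if the slot was reserved for the caller, 1 if a ready entry
 * already exists (a reference is taken), or -EAGAIN if another request
 * has reserved the index but has not finished inserting yet. */
static int check_alloc(struct slot *s)
{
	assert(s != NULL);
	switch (s->state) {
	case SLOT_FREE:
		s->state = SLOT_RESERVED;
		return 0;
	case SLOT_READY:
		s->refs++;
		return 1;
	case SLOT_RESERVED:
	default:
		/* Spinning here until the slot becomes ready never ends if
		 * the reservation belongs to the caller's own earlier
		 * action, so report the condition instead. */
		return -EAGAIN;
	}
}
```

A second request for the same index, issued while the first reservation is still pending, now gets an error back rather than waiting while a lock is held.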
      
      [1]
      
      INFO: task kworker/1:0:5088 blocked for more than 143 seconds.
      Not tainted 6.9.0-rc4-syzkaller-00173-g3cdb4559 #0
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      task:kworker/1:0 state:D stack:23744 pid:5088 tgid:5088 ppid:2 flags:0x00004000
      Workqueue: events_power_efficient reg_check_chans_work
      Call Trace:
      <TASK>
      context_switch kernel/sched/core.c:5409 [inline]
      __schedule+0xf15/0x5d00 kernel/sched/core.c:6746
      __schedule_loop kernel/sched/core.c:6823 [inline]
      schedule+0xe7/0x350 kernel/sched/core.c:6838
      schedule_preempt_disabled+0x13/0x30 kernel/sched/core.c:6895
      __mutex_lock_common kernel/locking/mutex.c:684 [inline]
      __mutex_lock+0x5b8/0x9c0 kernel/locking/mutex.c:752
      wiphy_lock include/net/cfg80211.h:5953 [inline]
      reg_leave_invalid_chans net/wireless/reg.c:2466 [inline]
      reg_check_chans_work+0x10a/0x10e0 net/wireless/reg.c:2481
      
      Fixes: 0190c1d4 ("net: sched: atomically check-allocate action")
      Reported-by: syzbot+b87c222546179f4513a7@syzkaller.appspotmail.com
      Closes: https://syzkaller.appspot.com/bug?extid=b87c222546179f4513a7
      Signed-off-by: David Ruth <druth@chromium.org>
      Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Link: https://lore.kernel.org/r/20240614190326.1349786-1-druth@chromium.org
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • Merge branch 'net-lan743x-fixes-for-multiple-wol-related-issues' · 0271d289
      Paolo Abeni authored
      Raju Lakkaraju says:
      
      ====================
      net: lan743x: Fixes for multiple WOL related issues
      
      This patch series implements the following fixes:
      1. Disable WOL upon resume in order to restore full data path operation
      2. Support WOL at both the PHY and MAC appropriately
      3. Remove interrupt mask clearing from config_init
      
      Patch-3 was sent separately earlier. Review comments in link:
      https://lore.kernel.org/lkml/4a565d54-f468-4e32-8a2c-102c1203f72c@lunn.ch/T/
      ====================
      
      Link: https://lore.kernel.org/r/20240614171157.190871-1-Raju.Lakkaraju@microchip.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: phy: mxl-gpy: Remove interrupt mask clearing from config_init · c44d3ffd
      Raju Lakkaraju authored
      When the system resumes from sleep, the phy_init_hw() function invokes
      config_init(), which clears all interrupt masks and causes wake events to be
      lost in subsequent wake sequences. Remove interrupt mask clearing from
      config_init() and preserve relevant masks in config_intr().
      
      Fixes: 7d901a1e ("net: phy: add Maxlinear GPY115/21x/24x driver")
      Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
      Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microchip.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: lan743x: Support WOL at both the PHY and MAC appropriately · 8c248cd8
      Raju Lakkaraju authored
      Prevent the MAC from requesting options that the PHY does not support.
      Whenever a WOL option is supported by both, the PHY is given priority
      since that usually leads to better power savings.
      
      Fixes: e9e13b6a ("lan743x: fix for potential NULL pointer dereference with bare card")
      Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
      Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microchip.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: lan743x: disable WOL upon resume to restore full data path operation · 77253639
      Raju Lakkaraju authored
      When Wake-on-LAN (WoL) is active and the system is in suspend mode, triggering
      a system event can wake the system from sleep, which may block the data path.
      To restore normal data path functionality after waking, disable all wake-up
      events. Furthermore, clear all Write 1 to Clear (W1C) status bits by writing
      1's to them.
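
A minimal model of Write 1 to Clear semantics, assuming a simulated 32-bit status register held in memory. The `w1c_reg` type and both helpers are hypothetical; real hardware is accessed through the driver's register accessors, which are not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simulated write-1-to-clear status register (hypothetical model). */
struct w1c_reg { uint32_t value; };

/* A write clears every bit written as 1 and leaves bits written as 0
 * untouched. */
static void w1c_write(struct w1c_reg *r, uint32_t val)
{
	assert(r != NULL);
	r->value &= ~val;
}

/* Acknowledge everything pending by writing back what was read. */
static uint32_t w1c_ack_all(struct w1c_reg *r)
{
	uint32_t pending;

	assert(r != NULL);
	pending = r->value;
	w1c_write(r, pending);
	return pending;
}
```

Writing 0 to such a register is a no-op, which is why the status bits have to be cleared by writing 1's.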
      
      Fixes: 4d94282a ("lan743x: Add power management support")
      Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
      Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microchip.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • qca_spi: Make interrupt remembering atomic · 2d719827
      Stefan Wahren authored
      The whole mechanism to remember occurred SPI interrupts is not atomic,
      which could lead to unexpected behavior. So fix this by using atomic bit
      operations instead.
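
A minimal userspace sketch of remembering an event with atomic bit operations, assuming C11 atomics as a stand-in for the kernel's bit helpers. `INTR_PENDING_BIT`, `flag_set()` and `flag_test_and_clear()` are hypothetical names; the driver's actual flag layout may differ.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define INTR_PENDING_BIT 0u	/* hypothetical flag bit */

/* Called from the interrupt side: record that an event occurred. */
static void flag_set(atomic_uint *flags, unsigned int bit)
{
	assert(flags != NULL);
	atomic_fetch_or(flags, 1u << bit);
}

/* Called from the worker side: atomically clear the bit and report
 * whether it was set, so an event arriving in between is not lost. */
static bool flag_test_and_clear(atomic_uint *flags, unsigned int bit)
{
	unsigned int mask = 1u << bit;

	assert(flags != NULL);
	return (atomic_fetch_and(flags, ~mask) & mask) != 0;
}
```

Because the test and the clear happen in one read-modify-write, an interrupt that fires between them cannot be overwritten, unlike a separate read followed by a plain store.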
      
      Fixes: 291ab06e ("net: qualcomm: new Ethernet over SPI driver for QCA7000")
      Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
      Link: https://lore.kernel.org/r/20240614145030.7781-1-wahrenst@gmx.net
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • netns: Make get_net_ns() handle zero refcount net · ff960f9d
      Yue Haibing authored
      Syzkaller hit a warning:
      refcount_t: addition on 0; use-after-free.
      WARNING: CPU: 3 PID: 7890 at lib/refcount.c:25 refcount_warn_saturate+0xdf/0x1d0
      Modules linked in:
      CPU: 3 PID: 7890 Comm: tun Not tainted 6.10.0-rc3-00100-gcaa4f9578aba-dirty #310
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
      RIP: 0010:refcount_warn_saturate+0xdf/0x1d0
      Code: 41 49 04 31 ff 89 de e8 9f 1e cd fe 84 db 75 9c e8 76 26 cd fe c6 05 b6 41 49 04 01 90 48 c7 c7 b8 8e 25 86 e8 d2 05 b5 fe 90 <0f> 0b 90 90 e9 79 ff ff ff e8 53 26 cd fe 0f b6 1
      RSP: 0018:ffff8881067b7da0 EFLAGS: 00010286
      RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff811c72ac
      RDX: ffff8881026a2140 RSI: ffffffff811c72b5 RDI: 0000000000000001
      RBP: ffff8881067b7db0 R08: 0000000000000000 R09: 205b5d3730353139
      R10: 0000000000000000 R11: 205d303938375420 R12: ffff8881086500c4
      R13: ffff8881086500c4 R14: ffff8881086500b0 R15: ffff888108650040
      FS:  00007f5b2961a4c0(0000) GS:ffff88823bd00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 000055d7ed36fd18 CR3: 00000001482f6000 CR4: 00000000000006f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       <TASK>
       ? show_regs+0xa3/0xc0
       ? __warn+0xa5/0x1c0
       ? refcount_warn_saturate+0xdf/0x1d0
       ? report_bug+0x1fc/0x2d0
       ? refcount_warn_saturate+0xdf/0x1d0
       ? handle_bug+0xa1/0x110
       ? exc_invalid_op+0x3c/0xb0
       ? asm_exc_invalid_op+0x1f/0x30
       ? __warn_printk+0xcc/0x140
       ? __warn_printk+0xd5/0x140
       ? refcount_warn_saturate+0xdf/0x1d0
       get_net_ns+0xa4/0xc0
       ? __pfx_get_net_ns+0x10/0x10
       open_related_ns+0x5a/0x130
       __tun_chr_ioctl+0x1616/0x2370
       ? __sanitizer_cov_trace_switch+0x58/0xa0
       ? __sanitizer_cov_trace_const_cmp2+0x1c/0x30
       ? __pfx_tun_chr_ioctl+0x10/0x10
       tun_chr_ioctl+0x2f/0x40
       __x64_sys_ioctl+0x11b/0x160
       x64_sys_call+0x1211/0x20d0
       do_syscall_64+0x9e/0x1d0
       entry_SYSCALL_64_after_hwframe+0x77/0x7f
      RIP: 0033:0x7f5b28f165d7
      Code: b3 66 90 48 8b 05 b1 48 2d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 81 48 2d 00 8
      RSP: 002b:00007ffc2b59c5e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
      RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f5b28f165d7
      RDX: 0000000000000000 RSI: 00000000000054e3 RDI: 0000000000000003
      RBP: 00007ffc2b59c650 R08: 00007f5b291ed8c0 R09: 00007f5b2961a4c0
      R10: 0000000029690010 R11: 0000000000000246 R12: 0000000000400730
      R13: 00007ffc2b59cf40 R14: 0000000000000000 R15: 0000000000000000
       </TASK>
      Kernel panic - not syncing: kernel: panic_on_warn set ...
      
      This is triggered as follows:
                ns0                                    ns1
      tun_set_iff() //dev is tun0
         tun->dev = dev
      //ip link set tun0 netns ns1
                                             put_net() //ref is 0
      __tun_chr_ioctl() //TUNGETDEVNETNS
         net = dev_net(tun->dev);
         open_related_ns(&net->ns, get_net_ns); //ns1
           get_net_ns()
              get_net() //addition on 0
      
      To fix this, use maybe_get_net() in get_net_ns() in case net's ref is zero.
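
A minimal userspace sketch of the "take a reference only if the count is non-zero" pattern, assuming C11 atomics. The `obj` type and `obj_get_unless_zero()` are hypothetical and only illustrate the idea; they are not the kernel's refcount implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct obj { atomic_int refs; };

/* Takes a reference only if the count is still non-zero. An object whose
 * count has already dropped to zero is being torn down, so incrementing
 * it would resurrect freed state. */
static bool obj_get_unless_zero(struct obj *o)
{
	int old;

	assert(o != NULL);
	old = atomic_load(&o->refs);
	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&o->refs, &old, old + 1))
			return true;
	}
	return false;
}
```

A caller that gets `false` back has to treat the object as gone and return an error, rather than continue with a pointer it no longer holds a reference on.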
      
      Fixes: 0c3e0e3b ("tun: Add ioctl() TUNGETDEVNETNS cmd to allow obtaining real net ns of tun device")
      Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
      Link: https://lore.kernel.org/r/20240614131302.2698509-1-yuehaibing@huawei.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • xfrm6: check ip6_dst_idev() return value in xfrm6_get_saddr() · d4640105
      Eric Dumazet authored
      ip6_dst_idev() can return NULL, xfrm6_get_saddr() must act accordingly.
      
      syzbot reported:
      
      Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN PTI
      KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
      CPU: 1 PID: 12 Comm: kworker/u8:1 Not tainted 6.10.0-rc2-syzkaller-00383-gb8481381 #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024
      Workqueue: wg-kex-wg1 wg_packet_handshake_send_worker
       RIP: 0010:xfrm6_get_saddr+0x93/0x130 net/ipv6/xfrm6_policy.c:64
      Code: df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 97 00 00 00 4c 8b ab d8 00 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 ea 48 c1 ea 03 <80> 3c 02 00 0f 85 86 00 00 00 4d 8b 6d 00 e8 ca 13 47 01 48 b8 00
      RSP: 0018:ffffc90000117378 EFLAGS: 00010246
      RAX: dffffc0000000000 RBX: ffff88807b079dc0 RCX: ffffffff89a0d6d7
      RDX: 0000000000000000 RSI: ffffffff89a0d6e9 RDI: ffff88807b079e98
      RBP: ffff88807ad73248 R08: 0000000000000007 R09: fffffffffffff000
      R10: ffff88807b079dc0 R11: 0000000000000007 R12: ffffc90000117480
      R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
      FS:  0000000000000000(0000) GS:ffff8880b9300000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f4586d00440 CR3: 0000000079042000 CR4: 00000000003506f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       <TASK>
        xfrm_get_saddr net/xfrm/xfrm_policy.c:2452 [inline]
        xfrm_tmpl_resolve_one net/xfrm/xfrm_policy.c:2481 [inline]
        xfrm_tmpl_resolve+0xa26/0xf10 net/xfrm/xfrm_policy.c:2541
        xfrm_resolve_and_create_bundle+0x140/0x2570 net/xfrm/xfrm_policy.c:2835
        xfrm_bundle_lookup net/xfrm/xfrm_policy.c:3070 [inline]
        xfrm_lookup_with_ifid+0x4d1/0x1e60 net/xfrm/xfrm_policy.c:3201
        xfrm_lookup net/xfrm/xfrm_policy.c:3298 [inline]
        xfrm_lookup_route+0x3b/0x200 net/xfrm/xfrm_policy.c:3309
        ip6_dst_lookup_flow+0x15c/0x1d0 net/ipv6/ip6_output.c:1256
        send6+0x611/0xd20 drivers/net/wireguard/socket.c:139
        wg_socket_send_skb_to_peer+0xf9/0x220 drivers/net/wireguard/socket.c:178
        wg_socket_send_buffer_to_peer+0x12b/0x190 drivers/net/wireguard/socket.c:200
        wg_packet_send_handshake_initiation+0x227/0x360 drivers/net/wireguard/send.c:40
        wg_packet_handshake_send_worker+0x1c/0x30 drivers/net/wireguard/send.c:51
        process_one_work+0x9fb/0x1b60 kernel/workqueue.c:3231
        process_scheduled_works kernel/workqueue.c:3312 [inline]
        worker_thread+0x6c8/0xf70 kernel/workqueue.c:3393
        kthread+0x2c1/0x3a0 kernel/kthread.c:389
        ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
        ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Link: https://lore.kernel.org/r/20240615154231.234442-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • ipv6: prevent possible NULL dereference in rt6_probe() · b86762db
      Eric Dumazet authored
      syzbot caught a NULL dereference in rt6_probe() [1]
      
      Bail out if  __in6_dev_get() returns NULL.
      
      [1]
      Oops: general protection fault, probably for non-canonical address 0xdffffc00000000cb: 0000 [#1] PREEMPT SMP KASAN PTI
      KASAN: null-ptr-deref in range [0x0000000000000658-0x000000000000065f]
      CPU: 1 PID: 22444 Comm: syz-executor.0 Not tainted 6.10.0-rc2-syzkaller-00383-gb8481381 #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024
       RIP: 0010:rt6_probe net/ipv6/route.c:656 [inline]
       RIP: 0010:find_match+0x8c4/0xf50 net/ipv6/route.c:758
      Code: 14 fd f7 48 8b 85 38 ff ff ff 48 c7 45 b0 00 00 00 00 48 8d b8 5c 06 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 19
      RSP: 0018:ffffc900034af070 EFLAGS: 00010203
      RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffc90004521000
      RDX: 00000000000000cb RSI: ffffffff8990d0cd RDI: 000000000000065c
      RBP: ffffc900034af150 R08: 0000000000000005 R09: 0000000000000000
      R10: 0000000000000001 R11: 0000000000000002 R12: 000000000000000a
      R13: 1ffff92000695e18 R14: ffff8880244a1d20 R15: 0000000000000000
      FS:  00007f4844a5a6c0(0000) GS:ffff8880b9300000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000001b31b27000 CR3: 000000002d42c000 CR4: 00000000003506f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       <TASK>
        rt6_nh_find_match+0xfa/0x1a0 net/ipv6/route.c:784
        nexthop_for_each_fib6_nh+0x26d/0x4a0 net/ipv4/nexthop.c:1496
        __find_rr_leaf+0x6e7/0xe00 net/ipv6/route.c:825
        find_rr_leaf net/ipv6/route.c:853 [inline]
        rt6_select net/ipv6/route.c:897 [inline]
        fib6_table_lookup+0x57e/0xa30 net/ipv6/route.c:2195
        ip6_pol_route+0x1cd/0x1150 net/ipv6/route.c:2231
        pol_lookup_func include/net/ip6_fib.h:616 [inline]
        fib6_rule_lookup+0x386/0x720 net/ipv6/fib6_rules.c:121
        ip6_route_output_flags_noref net/ipv6/route.c:2639 [inline]
        ip6_route_output_flags+0x1d0/0x640 net/ipv6/route.c:2651
        ip6_dst_lookup_tail.constprop.0+0x961/0x1760 net/ipv6/ip6_output.c:1147
        ip6_dst_lookup_flow+0x99/0x1d0 net/ipv6/ip6_output.c:1250
        rawv6_sendmsg+0xdab/0x4340 net/ipv6/raw.c:898
        inet_sendmsg+0x119/0x140 net/ipv4/af_inet.c:853
        sock_sendmsg_nosec net/socket.c:730 [inline]
        __sock_sendmsg net/socket.c:745 [inline]
        sock_write_iter+0x4b8/0x5c0 net/socket.c:1160
        new_sync_write fs/read_write.c:497 [inline]
        vfs_write+0x6b6/0x1140 fs/read_write.c:590
        ksys_write+0x1f8/0x260 fs/read_write.c:643
        do_syscall_x64 arch/x86/entry/common.c:52 [inline]
        do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f
      
      Fixes: 52e16356 ("[IPV6]: ROUTE: Add router_probe_interval sysctl.")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Link: https://lore.kernel.org/r/20240615151454.166404-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • ipv6: prevent possible NULL deref in fib6_nh_init() · 2eab4543
      Eric Dumazet authored
      syzbot reminds us that in6_dev_get() can return NULL.
      
      fib6_nh_init()
          ip6_validate_gw(  &idev  )
              ip6_route_check_nh(  idev  )
                  *idev = in6_dev_get(dev); // can be NULL
      
      Oops: general protection fault, probably for non-canonical address 0xdffffc00000000bc: 0000 [#1] PREEMPT SMP KASAN PTI
      KASAN: null-ptr-deref in range [0x00000000000005e0-0x00000000000005e7]
      CPU: 0 PID: 11237 Comm: syz-executor.3 Not tainted 6.10.0-rc2-syzkaller-00249-gbe27b896 #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024
       RIP: 0010:fib6_nh_init+0x640/0x2160 net/ipv6/route.c:3606
      Code: 00 00 fc ff df 4c 8b 64 24 58 48 8b 44 24 28 4c 8b 74 24 30 48 89 c1 48 89 44 24 28 48 8d 98 e0 05 00 00 48 89 d8 48 c1 e8 03 <42> 0f b6 04 38 84 c0 0f 85 b3 17 00 00 8b 1b 31 ff 89 de e8 b8 8b
      RSP: 0018:ffffc900032775a0 EFLAGS: 00010202
      RAX: 00000000000000bc RBX: 00000000000005e0 RCX: 0000000000000000
      RDX: 0000000000000010 RSI: ffffc90003277a54 RDI: ffff88802b3a08d8
      RBP: ffffc900032778b0 R08: 00000000000002fc R09: 0000000000000000
      R10: 00000000000002fc R11: 0000000000000000 R12: ffff88802b3a08b8
      R13: 1ffff9200064eec8 R14: ffffc90003277a00 R15: dffffc0000000000
      FS:  00007f940feb06c0(0000) GS:ffff8880b9400000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000000000 CR3: 00000000245e8000 CR4: 00000000003506f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       <TASK>
        ip6_route_info_create+0x99e/0x12b0 net/ipv6/route.c:3809
        ip6_route_add+0x28/0x160 net/ipv6/route.c:3853
        ipv6_route_ioctl+0x588/0x870 net/ipv6/route.c:4483
        inet6_ioctl+0x21a/0x280 net/ipv6/af_inet6.c:579
        sock_do_ioctl+0x158/0x460 net/socket.c:1222
        sock_ioctl+0x629/0x8e0 net/socket.c:1341
        vfs_ioctl fs/ioctl.c:51 [inline]
        __do_sys_ioctl fs/ioctl.c:907 [inline]
        __se_sys_ioctl+0xfc/0x170 fs/ioctl.c:893
        do_syscall_x64 arch/x86/entry/common.c:52 [inline]
        do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f
      RIP: 0033:0x7f940f07cea9
      
      Fixes: 428604fb ("ipv6: do not set routes if disable_ipv6 has been enabled")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Lorenzo Bianconi <lorenzo@kernel.org>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Link: https://lore.kernel.org/r/20240614082002.26407-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: mptcp: userspace_pm: fixed subtest names · e874557f
      Matthieu Baerts (NGI0) authored
      It is important to have fixed (sub)test names in TAP, because these
      names are used to identify them. If they are not fixed, tracking cannot
      be done.
      
      Some subtests from the userspace_pm selftest were using random numbers
      in their names: the client and server address IDs from $RANDOM, and the
      client port number randomly picked by the kernel when creating the
      connection. These values have been replaced by 'client' and 'server'
      words: that's even more helpful than showing random numbers. Note that
      the address IDs are incremented and decremented in the test: +1 or -1
      are then displayed in these cases.
      
      To avoid losing info that can be useful for debugging in case of issues,
      these random numbers are now displayed at the beginning of the test.
      
      Fixes: f589234e ("selftests: mptcp: userspace_pm: format subtests results in TAP")
      Cc: stable@vger.kernel.org
      Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/20240614-upstream-net-20240614-selftests-mptcp-uspace-pm-fixed-test-names-v1-1-460ad3edb429@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • tcp: clear tp->retrans_stamp in tcp_rcv_fastopen_synack() · 9e046bb1
      Eric Dumazet authored
      Some applications were reporting ETIMEDOUT errors on apparently
      good looking flows, according to packet dumps.
      
      We were able to root cause the issue to an accidental setting
      of tp->retrans_stamp in the following scenario:
      
      - client sends TFO SYN with data.
      - server has TFO disabled, ACKs only SYN but not payload.
      - client receives SYNACK covering only SYN.
      - tcp_ack() eats SYN and sets tp->retrans_stamp to 0.
      - tcp_rcv_fastopen_synack() calls tcp_xmit_retransmit_queue()
        to retransmit TFO payload w/o SYN, sets tp->retrans_stamp to "now",
        but we are not in any loss recovery state.
      - TFO payload is ACKed.
      - we are not in any loss recovery state, and don't see any dupacks,
        so we don't get to any code path that clears tp->retrans_stamp.
      - tp->retrans_stamp stays non-zero for the lifetime of the connection.
      - after first RTO, tcp_clamp_rto_to_user_timeout() clamps second RTO
        to 1 jiffy due to bogus tp->retrans_stamp.
      - on clamped RTO with non-zero icsk_retransmits, retransmits_timed_out()
        sets start_ts from tp->retrans_stamp from TFO payload retransmit
        hours/days ago, and computes bogus long elapsed time for loss recovery,
        and suffers ETIMEDOUT early.
      
      Fixes: a7abf3cd ("tcp: consider using standard rtx logic in tcp_rcv_fastopen_synack()")
      CC: stable@vger.kernel.org
      Co-developed-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Co-developed-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20240614130615.396837-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  3. 17 Jun, 2024 1 commit
    • netrom: Fix a memory leak in nr_heartbeat_expiry() · 0b913024
      Gavrilov Ilia authored
      syzbot reported a memory leak in nr_create() [0].
      
      Commit 409db27e ("netrom: Fix use-after-free of a listening socket.")
      added sock_hold() to the nr_heartbeat_expiry() function, where
      a) a socket has a SOCK_DESTROY flag or
      b) a listening socket has a SOCK_DEAD flag.
      
      But in the case "a," when the SOCK_DESTROY flag is set, the file descriptor
      has already been closed and the nr_release() function has been called.
      So it makes no sense to hold the reference count because no one will
      call another nr_destroy_socket() and put it as in the case "b."
      
      nr_connect
        nr_establish_data_link
          nr_start_heartbeat
      
      nr_release
        switch (nr->state)
        case NR_STATE_3
          nr->state = NR_STATE_2
          sock_set_flag(sk, SOCK_DESTROY);
      
                              nr_rx_frame
                                nr_process_rx_frame
                                  switch (nr->state)
                                  case NR_STATE_2
                                    nr_state2_machine()
                                      nr_disconnect()
                                        nr_sk(sk)->state = NR_STATE_0
                                        sock_set_flag(sk, SOCK_DEAD)
      
                              nr_heartbeat_expiry
                                switch (nr->state)
                                case NR_STATE_0
                                  if (sock_flag(sk, SOCK_DESTROY) ||
                                     (sk->sk_state == TCP_LISTEN
                                       && sock_flag(sk, SOCK_DEAD)))
                                     sock_hold()  // ( !!! )
                                     nr_destroy_socket()
      
      To fix the memory leak, let's call sock_hold() only for a listening socket.
      
      Found by InfoTeCS on behalf of Linux Verification Center
      (linuxtesting.org) with Syzkaller.
      
      [0]: https://syzkaller.appspot.com/bug?extid=d327a1f3b12e1e206c16
      
      Reported-by: syzbot+d327a1f3b12e1e206c16@syzkaller.appspotmail.com
      Closes: https://syzkaller.appspot.com/bug?extid=d327a1f3b12e1e206c16
      Fixes: 409db27e ("netrom: Fix use-after-free of a listening socket.")
      Signed-off-by: Gavrilov Ilia <Ilia.Gavrilov@infotecs.ru>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 15 Jun, 2024 3 commits
  5. 14 Jun, 2024 8 commits
  6. 13 Jun, 2024 2 commits