1. 14 Nov, 2023 16 commits
    • netfilter: ipset: fix race condition between swap/destroy and kernel side add/del/test · 28628fa9
      Jozsef Kadlecsik authored
      Linkui Xiao reported that there's a race condition when ipset swap and destroy is
      called, which can lead to crash in add/del/test element operations. Swap then
      destroy are usual operations to replace a set with another one in a production
      system. The issue can in some cases be reproduced with the script:
      
      ipset create hash_ip1 hash:net family inet hashsize 1024 maxelem 1048576
      ipset add hash_ip1 172.20.0.0/16
      ipset add hash_ip1 192.168.0.0/16
      iptables -A INPUT -m set --match-set hash_ip1 src -j ACCEPT
      while [ 1 ]
      do
              # ... Ongoing traffic...
              ipset create hash_ip2 hash:net family inet hashsize 1024 maxelem 1048576
              ipset add hash_ip2 172.20.0.0/16
              ipset swap hash_ip1 hash_ip2
              ipset destroy hash_ip2
              sleep 0.05
      done
      
      In the race case, the possible order of operations is:
      
      	CPU0			CPU1
      	ip_set_test
      				ipset swap hash_ip1 hash_ip2
      				ipset destroy hash_ip2
      	hash_net_kadt
      
      Swap replaces hash_ip1 with hash_ip2 and then destroy removes hash_ip2 which
      is the original hash_ip1. ip_set_test was called on hash_ip1 and because destroy
      removed it, hash_net_kadt crashes.
      
      The fix is to force ip_set_swap() to wait for all readers to finish accessing the
      old set pointers by calling synchronize_rcu().
      
      The first version of the patch was written by Linkui Xiao <xiaolinkui@kylinos.cn>.
      
      v2: synchronize_rcu() is moved into ip_set_swap() in order not to burden
          ip_set_destroy() unnecessarily when all sets are destroyed.
      v3: Florian Westphal pointed out that all netfilter hooks run with rcu_read_lock() held
          and em_ipset.c wraps the entire ip_set_test() in rcu read lock/unlock pair.
          So there's no need to extend the rcu read locked area in ipset itself.
      
      Closes: https://lore.kernel.org/all/69e7963b-e7f8-3ad0-210-7b86eebf7f78@netfilter.org/
      Reported-by: Linkui Xiao <xiaolinkui@kylinos.cn>
      Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      28628fa9
    • netfilter: nf_tables: bogus ENOENT when destroying element which does not exist · a7d5a955
      Pablo Neira Ayuso authored
      The destroy element command bogusly reports ENOENT when a set element
      does not exist. ENOENT errors are skipped; however, err is still set
      and propagated to userspace.
      
       # nft destroy element ip raw BLACKLIST { 1.2.3.4 }
       Error: Could not process rule: No such file or directory
       destroy element ip raw BLACKLIST { 1.2.3.4 }
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      
      Fixes: f80a612d ("netfilter: nf_tables: add support to destroy operation")
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      a7d5a955
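The error-propagation pattern described in this commit can be modelled in plain userspace C. This is a minimal sketch and not the nf_tables code: `lookup_and_remove()`, `destroy_elements()` and the `ignore_missing` flag are hypothetical names chosen for illustration. It shows why skipping an ENOENT result is not enough on its own: the error variable also has to be cleared, otherwise the stale value is returned once the loop ends.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for removing one element: negative ids do not exist. */
static int lookup_and_remove(int id)
{
	return id < 0 ? -ENOENT : 0;
}

/*
 * Remove every element in ids[]. With ignore_missing set (the "destroy"
 * semantics), a missing element is skipped and err is reset, so a stale
 * -ENOENT is not handed back to the caller.
 */
static int destroy_elements(const int *ids, size_t n, int ignore_missing)
{
	int err = 0;

	for (size_t i = 0; i < n; i++) {
		err = lookup_and_remove(ids[i]);
		if (err == -ENOENT && ignore_missing) {
			err = 0;	/* without this reset, -ENOENT leaks out */
			continue;
		}
		if (err < 0)
			break;
	}
	return err;
}
```

Calling `destroy_elements()` on a list whose last entry is missing returns 0 when `ignore_missing` is set and `-ENOENT` otherwise.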
    • netfilter: nf_tables: fix pointer math issue in nft_byteorder_eval() · c301f098
      Dan Carpenter authored
      The problem is in nft_byteorder_eval() where we are iterating through a
      loop and writing to dst[0], dst[1], dst[2] and so on...  On each
      iteration we are writing 8 bytes.  But dst[] is an array of u32 so each
      element only has space for 4 bytes.  That means that every iteration
      overwrites part of the previous element.
      
      I spotted this bug while reviewing commit caf3ef74 ("netfilter:
      nf_tables: prevent OOB access in nft_byteorder_eval") which is a related
      issue.  I think that the reason we have not detected this bug in testing
      is that most of the time we only write one element.
      
      Fixes: ce1e7989 ("netfilter: nft_byteorder: provide 64bit le/be conversion")
      Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      c301f098
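The overlap described in this commit can be reproduced in userspace. The sketch below is an illustration only and is not the nft_byteorder code: `store64_overlapping()` and `store64_strided()` are hypothetical helpers. Both copy 64-bit values into an array of 32-bit slots; the first indexes the destination by 32-bit element, so each write clobbers half of the previous value, while the second advances by two slots per value.

```c
#include <stdint.h>
#include <string.h>

/*
 * Buggy indexing: value i is written at dst[i], but each dst[] slot holds
 * only 4 bytes, so value i+1 overwrites the second half of value i.
 * The caller must provide at least n + 1 slots to stay in bounds.
 */
static void store64_overlapping(uint32_t *dst, const uint64_t *src, size_t n)
{
	for (size_t i = 0; i < n; i++)
		memcpy(&dst[i], &src[i], sizeof(uint64_t));
}

/*
 * Corrected indexing: value i occupies slots 2*i and 2*i + 1.
 * The caller must provide at least 2 * n slots.
 */
static void store64_strided(uint32_t *dst, const uint64_t *src, size_t n)
{
	for (size_t i = 0; i < n; i++)
		memcpy(&dst[2 * i], &src[i], sizeof(uint64_t));
}
```

With a single value both helpers produce the same result, which is consistent with the commit's remark that the bug goes unnoticed when only one element is written.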
    • netfilter: nf_conntrack_bridge: initialize err to 0 · a44af08e
      Linkui Xiao authored
      K2CI reported a problem:
      
      	consume_skb(skb);
      	return err;
      [nf_br_ip_fragment() error]  uninitialized symbol 'err'.
      
      err is not initialized. Because returning 0 is expected, initialize
      err to 0.
      
      Fixes: 3c171f49 ("netfilter: bridge: add connection tracking system")
      Reported-by: k2ci <kernel-bot@kylinos.cn>
      Signed-off-by: Linkui Xiao <xiaolinkui@kylinos.cn>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      a44af08e
    • netfilter: nft_set_rbtree: Remove unused variable nft_net · 67059b61
      Yang Li authored
      The code that uses nft_net has been removed, and the nft_pernet function
      is merely obtaining a reference to shared data through the net pointer.
      The content of the net pointer is not modified or changed, so both of
      them should be removed.
      
      Silence the warning:
      net/netfilter/nft_set_rbtree.c:627:26: warning: variable ‘nft_net’ set but not used

      Reported-by: Abaci Robot <abaci@linux.alibaba.com>
      Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=7103
      Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      67059b61
    • af_unix: fix use-after-free in unix_stream_read_actor() · 4b7b4926
      Eric Dumazet authored
      syzbot reported the following crash [1]
      
      After releasing unix socket lock, u->oob_skb can be changed
      by another thread. We must temporarily increase skb refcount
      to make sure this other thread will not free the skb under us.
      
      [1]
      
      BUG: KASAN: slab-use-after-free in unix_stream_read_actor+0xa7/0xc0 net/unix/af_unix.c:2866
      Read of size 4 at addr ffff88801f3b9cc4 by task syz-executor107/5297
      
      CPU: 1 PID: 5297 Comm: syz-executor107 Not tainted 6.6.0-syzkaller-15910-gb8e3a87a #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/09/2023
      Call Trace:
      <TASK>
      __dump_stack lib/dump_stack.c:88 [inline]
      dump_stack_lvl+0xd9/0x1b0 lib/dump_stack.c:106
      print_address_description mm/kasan/report.c:364 [inline]
      print_report+0xc4/0x620 mm/kasan/report.c:475
      kasan_report+0xda/0x110 mm/kasan/report.c:588
      unix_stream_read_actor+0xa7/0xc0 net/unix/af_unix.c:2866
      unix_stream_recv_urg net/unix/af_unix.c:2587 [inline]
      unix_stream_read_generic+0x19a5/0x2480 net/unix/af_unix.c:2666
      unix_stream_recvmsg+0x189/0x1b0 net/unix/af_unix.c:2903
      sock_recvmsg_nosec net/socket.c:1044 [inline]
      sock_recvmsg+0xe2/0x170 net/socket.c:1066
      ____sys_recvmsg+0x21f/0x5c0 net/socket.c:2803
      ___sys_recvmsg+0x115/0x1a0 net/socket.c:2845
      __sys_recvmsg+0x114/0x1e0 net/socket.c:2875
      do_syscall_x64 arch/x86/entry/common.c:51 [inline]
      do_syscall_64+0x3f/0x110 arch/x86/entry/common.c:82
      entry_SYSCALL_64_after_hwframe+0x63/0x6b
      RIP: 0033:0x7fc67492c559
      Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 51 18 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007fc6748ab228 EFLAGS: 00000246 ORIG_RAX: 000000000000002f
      RAX: ffffffffffffffda RBX: 000000000000001c RCX: 00007fc67492c559
      RDX: 0000000040010083 RSI: 0000000020000140 RDI: 0000000000000004
      RBP: 00007fc6749b6348 R08: 00007fc6748ab6c0 R09: 00007fc6748ab6c0
      R10: 0000000000000000 R11: 0000000000000246 R12: 00007fc6749b6340
      R13: 00007fc6749b634c R14: 00007ffe9fac52a0 R15: 00007ffe9fac5388
      </TASK>
      
      Allocated by task 5295:
      kasan_save_stack+0x33/0x50 mm/kasan/common.c:45
      kasan_set_track+0x25/0x30 mm/kasan/common.c:52
      __kasan_slab_alloc+0x81/0x90 mm/kasan/common.c:328
      kasan_slab_alloc include/linux/kasan.h:188 [inline]
      slab_post_alloc_hook mm/slab.h:763 [inline]
      slab_alloc_node mm/slub.c:3478 [inline]
      kmem_cache_alloc_node+0x180/0x3c0 mm/slub.c:3523
      __alloc_skb+0x287/0x330 net/core/skbuff.c:641
      alloc_skb include/linux/skbuff.h:1286 [inline]
      alloc_skb_with_frags+0xe4/0x710 net/core/skbuff.c:6331
      sock_alloc_send_pskb+0x7e4/0x970 net/core/sock.c:2780
      sock_alloc_send_skb include/net/sock.h:1884 [inline]
      queue_oob net/unix/af_unix.c:2147 [inline]
      unix_stream_sendmsg+0xb5f/0x10a0 net/unix/af_unix.c:2301
      sock_sendmsg_nosec net/socket.c:730 [inline]
      __sock_sendmsg+0xd5/0x180 net/socket.c:745
      ____sys_sendmsg+0x6ac/0x940 net/socket.c:2584
      ___sys_sendmsg+0x135/0x1d0 net/socket.c:2638
      __sys_sendmsg+0x117/0x1e0 net/socket.c:2667
      do_syscall_x64 arch/x86/entry/common.c:51 [inline]
      do_syscall_64+0x3f/0x110 arch/x86/entry/common.c:82
      entry_SYSCALL_64_after_hwframe+0x63/0x6b
      
      Freed by task 5295:
      kasan_save_stack+0x33/0x50 mm/kasan/common.c:45
      kasan_set_track+0x25/0x30 mm/kasan/common.c:52
      kasan_save_free_info+0x2b/0x40 mm/kasan/generic.c:522
      ____kasan_slab_free mm/kasan/common.c:236 [inline]
      ____kasan_slab_free+0x15b/0x1b0 mm/kasan/common.c:200
      kasan_slab_free include/linux/kasan.h:164 [inline]
      slab_free_hook mm/slub.c:1800 [inline]
      slab_free_freelist_hook+0x114/0x1e0 mm/slub.c:1826
      slab_free mm/slub.c:3809 [inline]
      kmem_cache_free+0xf8/0x340 mm/slub.c:3831
      kfree_skbmem+0xef/0x1b0 net/core/skbuff.c:1015
      __kfree_skb net/core/skbuff.c:1073 [inline]
      consume_skb net/core/skbuff.c:1288 [inline]
      consume_skb+0xdf/0x170 net/core/skbuff.c:1282
      queue_oob net/unix/af_unix.c:2178 [inline]
      unix_stream_sendmsg+0xd49/0x10a0 net/unix/af_unix.c:2301
      sock_sendmsg_nosec net/socket.c:730 [inline]
      __sock_sendmsg+0xd5/0x180 net/socket.c:745
      ____sys_sendmsg+0x6ac/0x940 net/socket.c:2584
      ___sys_sendmsg+0x135/0x1d0 net/socket.c:2638
      __sys_sendmsg+0x117/0x1e0 net/socket.c:2667
      do_syscall_x64 arch/x86/entry/common.c:51 [inline]
      do_syscall_64+0x3f/0x110 arch/x86/entry/common.c:82
      entry_SYSCALL_64_after_hwframe+0x63/0x6b
      
      The buggy address belongs to the object at ffff88801f3b9c80
      which belongs to the cache skbuff_head_cache of size 240
      The buggy address is located 68 bytes inside of
      freed 240-byte region [ffff88801f3b9c80, ffff88801f3b9d70)
      
      The buggy address belongs to the physical page:
      page:ffffea00007cee40 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1f3b9
      flags: 0xfff00000000800(slab|node=0|zone=1|lastcpupid=0x7ff)
      page_type: 0xffffffff()
      raw: 00fff00000000800 ffff888142a60640 dead000000000122 0000000000000000
      raw: 0000000000000000 00000000000c000c 00000001ffffffff 0000000000000000
      page dumped because: kasan: bad access detected
      page_owner tracks the page as allocated
      page last allocated via order 0, migratetype Unmovable, gfp_mask 0x12cc0(GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY), pid 5299, tgid 5283 (syz-executor107), ts 103803840339, free_ts 103600093431
      set_page_owner include/linux/page_owner.h:31 [inline]
      post_alloc_hook+0x2cf/0x340 mm/page_alloc.c:1537
      prep_new_page mm/page_alloc.c:1544 [inline]
      get_page_from_freelist+0xa25/0x36c0 mm/page_alloc.c:3312
      __alloc_pages+0x1d0/0x4a0 mm/page_alloc.c:4568
      alloc_pages_mpol+0x258/0x5f0 mm/mempolicy.c:2133
      alloc_slab_page mm/slub.c:1870 [inline]
      allocate_slab+0x251/0x380 mm/slub.c:2017
      new_slab mm/slub.c:2070 [inline]
      ___slab_alloc+0x8c7/0x1580 mm/slub.c:3223
      __slab_alloc.constprop.0+0x56/0xa0 mm/slub.c:3322
      __slab_alloc_node mm/slub.c:3375 [inline]
      slab_alloc_node mm/slub.c:3468 [inline]
      kmem_cache_alloc_node+0x132/0x3c0 mm/slub.c:3523
      __alloc_skb+0x287/0x330 net/core/skbuff.c:641
      alloc_skb include/linux/skbuff.h:1286 [inline]
      alloc_skb_with_frags+0xe4/0x710 net/core/skbuff.c:6331
      sock_alloc_send_pskb+0x7e4/0x970 net/core/sock.c:2780
      sock_alloc_send_skb include/net/sock.h:1884 [inline]
      queue_oob net/unix/af_unix.c:2147 [inline]
      unix_stream_sendmsg+0xb5f/0x10a0 net/unix/af_unix.c:2301
      sock_sendmsg_nosec net/socket.c:730 [inline]
      __sock_sendmsg+0xd5/0x180 net/socket.c:745
      ____sys_sendmsg+0x6ac/0x940 net/socket.c:2584
      ___sys_sendmsg+0x135/0x1d0 net/socket.c:2638
      __sys_sendmsg+0x117/0x1e0 net/socket.c:2667
      page last free stack trace:
      reset_page_owner include/linux/page_owner.h:24 [inline]
      free_pages_prepare mm/page_alloc.c:1137 [inline]
      free_unref_page_prepare+0x4f8/0xa90 mm/page_alloc.c:2347
      free_unref_page+0x33/0x3b0 mm/page_alloc.c:2487
      __unfreeze_partials+0x21d/0x240 mm/slub.c:2655
      qlink_free mm/kasan/quarantine.c:168 [inline]
      qlist_free_all+0x6a/0x170 mm/kasan/quarantine.c:187
      kasan_quarantine_reduce+0x18e/0x1d0 mm/kasan/quarantine.c:294
      __kasan_slab_alloc+0x65/0x90 mm/kasan/common.c:305
      kasan_slab_alloc include/linux/kasan.h:188 [inline]
      slab_post_alloc_hook mm/slab.h:763 [inline]
      slab_alloc_node mm/slub.c:3478 [inline]
      slab_alloc mm/slub.c:3486 [inline]
      __kmem_cache_alloc_lru mm/slub.c:3493 [inline]
      kmem_cache_alloc+0x15d/0x380 mm/slub.c:3502
      vm_area_dup+0x21/0x2f0 kernel/fork.c:500
      __split_vma+0x17d/0x1070 mm/mmap.c:2365
      split_vma mm/mmap.c:2437 [inline]
      vma_modify+0x25d/0x450 mm/mmap.c:2472
      vma_modify_flags include/linux/mm.h:3271 [inline]
      mprotect_fixup+0x228/0xc80 mm/mprotect.c:635
      do_mprotect_pkey+0x852/0xd60 mm/mprotect.c:809
      __do_sys_mprotect mm/mprotect.c:830 [inline]
      __se_sys_mprotect mm/mprotect.c:827 [inline]
      __x64_sys_mprotect+0x78/0xb0 mm/mprotect.c:827
      do_syscall_x64 arch/x86/entry/common.c:51 [inline]
      do_syscall_64+0x3f/0x110 arch/x86/entry/common.c:82
      entry_SYSCALL_64_after_hwframe+0x63/0x6b
      
      Memory state around the buggy address:
      ffff88801f3b9b80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      ffff88801f3b9c00: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc
      >ffff88801f3b9c80: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      ^
      ffff88801f3b9d00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fc fc
      ffff88801f3b9d80: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
      
      Fixes: 876c14ad ("af_unix: fix holding spinlock in oob handling")
      Reported-and-tested-by: syzbot+7a2d546fa43e49315ed3@syzkaller.appspotmail.com
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Rao Shoaib <rao.shoaib@oracle.com>
      Reviewed-by: Rao shoaib <rao.shoaib@oracle.com>
      Link: https://lore.kernel.org/r/20231113134938.168151-1-edumazet@google.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      4b7b4926
    • Merge branch 'r8169-fix-dash-devices-network-lost-issue' · 48c205c6
      Jakub Kicinski authored
      ChunHao Lin says:
      
      ====================
      r8169: fix DASH devices network lost issue
      
      This series fixes a network loss issue on systems that support
      DASH. It has been tested on rtl8168ep and rtl8168fp.
      ====================
      
      Link: https://lore.kernel.org/r/20231109173400.4573-1-hau@realtek.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      48c205c6
    • r8169: fix network lost after resume on DASH systems · 868c3b95
      ChunHao Lin authored
      Devices that support DASH may be reset or powered off during suspend,
      so the driver needs to handle DASH during system suspend and resume.
      Otherwise the DASH firmware will influence device behavior and cause
      network loss.
      
      Fixes: b646d900 ("r8169: magic.")
      Cc: stable@vger.kernel.org
      Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: ChunHao Lin <hau@realtek.com>
      Link: https://lore.kernel.org/r/20231109173400.4573-3-hau@realtek.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      868c3b95
    • r8169: add handling DASH when DASH is disabled · 0ab0c45d
      ChunHao Lin authored
      For devices that support DASH, even when DASH is disabled, there may
      still exist a default firmware that will influence device behavior.
      So the driver needs to handle DASH for devices that support DASH, no
      matter what the DASH status is.
      
      This patch also prepares for "fix network lost after resume on DASH
      systems".
      
      Fixes: ee7a1beb ("r8169:call "rtl8168_driver_start" "rtl8168_driver_stop" only when hardware dash function is enabled")
      Cc: stable@vger.kernel.org
      Signed-off-by: ChunHao Lin <hau@realtek.com>
      Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
      Link: https://lore.kernel.org/r/20231109173400.4573-2-hau@realtek.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      0ab0c45d
    • Merge branch 'fix-large-frames-in-the-gemini-ethernet-driver' · 334e90b8
      Jakub Kicinski authored
      Linus Walleij says:
      
      ====================
      Fix large frames in the Gemini ethernet driver
      
      This is the result of a bug hunt for a problem with the
      RTL8366RB DSA switch leading me wrong all over the place.
      
      I am indebted to Vladimir Oltean who as usual pointed
      out where the real problem was, many thanks!
      
      Trying to actually use big ("jumbo") frames on this
      hardware uncovered the real bugs. Then I tested it on
      the DSA switch and it indeed fixes the issue.
      
      To make sure it also works fine with big frames on
      non-DSA devices I also copied a large video file over
      scp to a device with maximum frame size, the data
      was transported in large TCP packets ending up in
      0x7ff sized frames using software checksumming at
      ~2.0 MB/s.
      
      If I lower the MTU to the standard 1500 bytes so
      that hardware checksumming is used, the scp transfer
      of the same file is slightly slower, ~1.8-1.9 MB/s.
      
      Despite this not being the best test it shows that
      we can now stress the hardware with large frames
      and that software checksum works fine.
      
      v3: https://lore.kernel.org/r/20231107-gemini-largeframe-fix-v3-0-e3803c080b75@linaro.org
      v2: https://lore.kernel.org/r/20231105-gemini-largeframe-fix-v2-0-cd3a5aa6c496@linaro.org
      v1: https://lore.kernel.org/r/20231104-gemini-largeframe-fix-v1-0-9c5513f22f33@linaro.org
      ====================
      
      Link: https://lore.kernel.org/r/20231109-gemini-largeframe-fix-v4-0-6e611528db08@linaro.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      334e90b8
    • net: ethernet: cortina: Fix MTU max setting · dc6c0bfb
      Linus Walleij authored
      The RX max frame size is over 10000 for the Gemini ethernet,
      but the TX max frame size is actually just 2047 (0x7ff after
      checking the datasheet). Reflect this in what we offer to Linux,
      cap the MTU at the TX max frame minus ethernet headers.
      
      We delete the code disabling the hardware checksum for large
      MTUs as netdev->mtu can no longer be larger than
      netdev->max_mtu meaning the if()-clause in gmac_fix_features()
      is never true.
      
      Fixes: 4d5ae32f ("net: ethernet: Add a driver for Gemini gigabit ethernet")
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Link: https://lore.kernel.org/r/20231109-gemini-largeframe-fix-v4-3-6e611528db08@linaro.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      dc6c0bfb
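The MTU cap in this commit is simple arithmetic: the largest payload is the TX frame limit minus the link-layer header. The sketch below works that out in userspace C. The 0x7ff frame limit comes from the commit message; the header sizes (14 bytes for a plain Ethernet header, 4 more for an 802.1Q VLAN tag) are standard values, but which of them the driver actually subtracts is an assumption here, and `max_mtu()` and `clamp_mtu()` are hypothetical helpers.

```c
#define TX_MAX_FRAME	0x7ff	/* 2047 bytes, per the commit message */
#define ETH_HDR_LEN	14	/* destination MAC + source MAC + EtherType */
#define VLAN_TAG_LEN	4	/* optional 802.1Q tag (assumed, see lead-in) */

/* Largest MTU that still fits in one TX frame with the given header size. */
static int max_mtu(int tx_max_frame, int hdr_len)
{
	return tx_max_frame - hdr_len;
}

/* Clamp a requested MTU to the advertised maximum. */
static int clamp_mtu(int requested, int limit)
{
	return requested > limit ? limit : requested;
}
```

Under these assumptions the cap is 2033 bytes for untagged frames and 2029 bytes if room for a VLAN tag is reserved. Once the MTU can no longer exceed that cap, a check for "MTU larger than the maximum" can never fire, which matches the commit's reason for deleting the code in gmac_fix_features().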
    • net: ethernet: cortina: Handle large frames · d4d0c5b4
      Linus Walleij authored
      The Gemini ethernet controller provides hardware checksumming
      for frames up to 1514 bytes including ethernet headers but not
      FCS.
      
      If we start sending bigger frames (after first bumping up the MTU
      on both interfaces sending and receiving the frames), truncated
      packets start to appear on the target such as in this tcpdump
      resulting from ping -s 1474:
      
      23:34:17.241983 14:d6:4d:a8:3c:4f (oui Unknown) > bc:ae:c5:6b:a8:3d (oui Unknown),
      ethertype IPv4 (0x0800), length 1514: truncated-ip - 2 bytes missing!
      (tos 0x0, ttl 64, id 32653, offset 0, flags [DF], proto ICMP (1), length 1502)
      OpenWrt.lan > Fecusia: ICMP echo request, id 1672, seq 50, length 1482
      
      If we bypass the hardware checksumming and provide a software
      fallback, everything starts working fine up to the max TX MTU
      of 2047 bytes, for example ping -s2000 192.168.1.2:
      
      00:44:29.587598 bc:ae:c5:6b:a8:3d (oui Unknown) > 14:d6:4d:a8:3c:4f (oui Unknown),
      ethertype IPv4 (0x0800), length 2042:
      (tos 0x0, ttl 64, id 51828, offset 0, flags [none], proto ICMP (1), length 2028)
      Fecusia > OpenWrt.lan: ICMP echo reply, id 1683, seq 4, length 2008
      
      The bit that enables bypassing the hardware checksum (like the other
      "TSS" bits) is undocumented in the hardware reference manual.
      The entire hardware checksum unit appears undocumented. The
      conclusion that we need to use the "bypass" bit was found by
      trial-and-error.
      
      Since no hardware checksum will happen, we slot in a software
      checksum fallback.
      
      Check for the condition where we need to compute checksum on the
      skb with either hardware or software using == CHECKSUM_PARTIAL instead
      of != CHECKSUM_NONE which is an incomplete check according to
      <linux/skbuff.h>.
      
      On the D-Link DIR-685 router this fixes a bug on the conduit
      interface to the RTL8366RB DSA switch: as the switch needs to add
      space for its tag it increases the MTU on the conduit interface
      to 1504 and that means that when the router sends packets
      of 1500 bytes these get an extra 4 bytes of DSA tag and the
      transfer fails because of the erroneous hardware checksumming,
      affecting such basic functionality as the LuCI web interface.
      
      Fixes: 4d5ae32f ("net: ethernet: Add a driver for Gemini gigabit ethernet")
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Link: https://lore.kernel.org/r/20231109-gemini-largeframe-fix-v4-2-6e611528db08@linaro.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      d4d0c5b4
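The checksum decision described in this commit, and the frame sizes seen in the tcpdump output, can be sketched in userspace C. This is an illustration and not the driver code: the enum values only mirror the names of the kernel's CHECKSUM_* states, and `pick_csum()` and `icmp_echo_frame_len()` are hypothetical helpers. The 1514-byte hardware limit comes from the commit message; the 8-byte ICMP, 20-byte IPv4 (no options) and 14-byte Ethernet header sizes are standard values.

```c
#define HW_CSUM_MAX_FRAME 1514	/* incl. Ethernet header, excl. FCS (commit message) */

/* Userspace mirror of the skb checksum states named in <linux/skbuff.h>. */
enum csum_state { CSUM_NONE, CSUM_UNNECESSARY, CSUM_COMPLETE, CSUM_PARTIAL };

enum tx_csum_action { TX_CSUM_SKIP, TX_CSUM_HW, TX_CSUM_SW };

/*
 * Only CSUM_PARTIAL means the stack wants the checksum filled in on
 * transmit. Testing "!= CSUM_NONE" would also match CSUM_UNNECESSARY
 * and CSUM_COMPLETE, which is why the commit calls that check incomplete.
 */
static enum tx_csum_action pick_csum(int frame_len, enum csum_state state)
{
	if (state != CSUM_PARTIAL)
		return TX_CSUM_SKIP;
	return frame_len > HW_CSUM_MAX_FRAME ? TX_CSUM_SW : TX_CSUM_HW;
}

/* Frame length on the wire (without FCS) for "ping -s payload" over IPv4. */
static int icmp_echo_frame_len(int payload)
{
	return payload + 8 /* ICMP */ + 20 /* IPv4 */ + 14 /* Ethernet */;
}
```

For `ping -s 1474` this gives 1516 bytes, two more than the hardware limit, which lines up with the "2 bytes missing" in the first capture. For `ping -s2000` it gives 2042 bytes, the length shown in the second capture.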
    • net: ethernet: cortina: Fix max RX frame define · 510e35fb
      Linus Walleij authored
      Enumerator 3 is 1548 bytes according to the datasheet,
      not 1542.
      
      Fixes: 4d5ae32f ("net: ethernet: Add a driver for Gemini gigabit ethernet")
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Link: https://lore.kernel.org/r/20231109-gemini-largeframe-fix-v4-1-6e611528db08@linaro.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      510e35fb
    • bonding: stop the device in bond_setup_by_slave() · 3cffa2dd
      Eric Dumazet authored
      Commit 9eed321c ("net: lapbether: only support ethernet devices")
      has been able to keep syzbot away from net/lapb, until today.
      
      In the following splat [1], the issue is that a lapbether device has
      been created on a bonding device without members. Then adding a non
      ARPHRD_ETHER member forced the bonding master to change its type.
      
      The fix is to make sure we call dev_close() in bond_setup_by_slave()
      so that the potential linked lapbether devices (or any other devices
      having assumptions on the physical device) are removed.
      
      A similar bug has been addressed in commit 40baec22
      ("bonding: fix panic on non-ARPHRD_ETHER enslave failure")
      
      [1]
      skbuff: skb_under_panic: text:ffff800089508810 len:44 put:40 head:ffff0000c78e7c00 data:ffff0000c78e7bea tail:0x16 end:0x140 dev:bond0
      kernel BUG at net/core/skbuff.c:192 !
      Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
      Modules linked in:
      CPU: 0 PID: 6007 Comm: syz-executor383 Not tainted 6.6.0-rc3-syzkaller-gbf6547d8715b #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/04/2023
      pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
      pc : skb_panic net/core/skbuff.c:188 [inline]
      pc : skb_under_panic+0x13c/0x140 net/core/skbuff.c:202
      lr : skb_panic net/core/skbuff.c:188 [inline]
      lr : skb_under_panic+0x13c/0x140 net/core/skbuff.c:202
      sp : ffff800096a06aa0
      x29: ffff800096a06ab0 x28: ffff800096a06ba0 x27: dfff800000000000
      x26: ffff0000ce9b9b50 x25: 0000000000000016 x24: ffff0000c78e7bea
      x23: ffff0000c78e7c00 x22: 000000000000002c x21: 0000000000000140
      x20: 0000000000000028 x19: ffff800089508810 x18: ffff800096a06100
      x17: 0000000000000000 x16: ffff80008a629a3c x15: 0000000000000001
      x14: 1fffe00036837a32 x13: 0000000000000000 x12: 0000000000000000
      x11: 0000000000000201 x10: 0000000000000000 x9 : cb50b496c519aa00
      x8 : cb50b496c519aa00 x7 : 0000000000000001 x6 : 0000000000000001
      x5 : ffff800096a063b8 x4 : ffff80008e280f80 x3 : ffff8000805ad11c
      x2 : 0000000000000001 x1 : 0000000100000201 x0 : 0000000000000086
      Call trace:
      skb_panic net/core/skbuff.c:188 [inline]
      skb_under_panic+0x13c/0x140 net/core/skbuff.c:202
      skb_push+0xf0/0x108 net/core/skbuff.c:2446
      ip6gre_header+0xbc/0x738 net/ipv6/ip6_gre.c:1384
      dev_hard_header include/linux/netdevice.h:3136 [inline]
      lapbeth_data_transmit+0x1c4/0x298 drivers/net/wan/lapbether.c:257
      lapb_data_transmit+0x8c/0xb0 net/lapb/lapb_iface.c:447
      lapb_transmit_buffer+0x178/0x204 net/lapb/lapb_out.c:149
      lapb_send_control+0x220/0x320 net/lapb/lapb_subr.c:251
      __lapb_disconnect_request+0x9c/0x17c net/lapb/lapb_iface.c:326
      lapb_device_event+0x288/0x4e0 net/lapb/lapb_iface.c:492
      notifier_call_chain+0x1a4/0x510 kernel/notifier.c:93
      raw_notifier_call_chain+0x3c/0x50 kernel/notifier.c:461
      call_netdevice_notifiers_info net/core/dev.c:1970 [inline]
      call_netdevice_notifiers_extack net/core/dev.c:2008 [inline]
      call_netdevice_notifiers net/core/dev.c:2022 [inline]
      __dev_close_many+0x1b8/0x3c4 net/core/dev.c:1508
      dev_close_many+0x1e0/0x470 net/core/dev.c:1559
      dev_close+0x174/0x250 net/core/dev.c:1585
      lapbeth_device_event+0x2e4/0x958 drivers/net/wan/lapbether.c:466
      notifier_call_chain+0x1a4/0x510 kernel/notifier.c:93
      raw_notifier_call_chain+0x3c/0x50 kernel/notifier.c:461
      call_netdevice_notifiers_info net/core/dev.c:1970 [inline]
      call_netdevice_notifiers_extack net/core/dev.c:2008 [inline]
      call_netdevice_notifiers net/core/dev.c:2022 [inline]
      __dev_close_many+0x1b8/0x3c4 net/core/dev.c:1508
      dev_close_many+0x1e0/0x470 net/core/dev.c:1559
      dev_close+0x174/0x250 net/core/dev.c:1585
      bond_enslave+0x2298/0x30cc drivers/net/bonding/bond_main.c:2332
      bond_do_ioctl+0x268/0xc64 drivers/net/bonding/bond_main.c:4539
      dev_ifsioc+0x754/0x9ac
      dev_ioctl+0x4d8/0xd34 net/core/dev_ioctl.c:786
      sock_do_ioctl+0x1d4/0x2d0 net/socket.c:1217
      sock_ioctl+0x4e8/0x834 net/socket.c:1322
      vfs_ioctl fs/ioctl.c:51 [inline]
      __do_sys_ioctl fs/ioctl.c:871 [inline]
      __se_sys_ioctl fs/ioctl.c:857 [inline]
      __arm64_sys_ioctl+0x14c/0x1c8 fs/ioctl.c:857
      __invoke_syscall arch/arm64/kernel/syscall.c:37 [inline]
      invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:51
      el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:136
      do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:155
      el0_svc+0x58/0x16c arch/arm64/kernel/entry-common.c:678
      el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:696
      el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:591
      Code: aa1803e6 aa1903e7 a90023f5 94785b8b (d4210000)
      
      Fixes: 872254dd ("net/bonding: Enable bonding to enslave non ARPHRD_ETHER")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
      Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
      Link: https://lore.kernel.org/r/20231109180102.4085183-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      3cffa2dd
    • ptp: annotate data-race around q->head and q->tail · 73bde5a3
      Eric Dumazet authored
      As I was working on a syzbot report, I found that KCSAN would
      probably complain that reading q->head or q->tail without
      barriers could lead to invalid results.
      
      Add corresponding READ_ONCE() and WRITE_ONCE() to avoid
      load-store tearing.
      
      Fixes: d94ba80e ("ptp: Added a brand new class driver for ptp clocks.")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Richard Cochran <richardcochran@gmail.com>
      Link: https://lore.kernel.org/r/20231109174859.3995880-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      73bde5a3
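The annotation described in this commit can be illustrated with a small userspace ring buffer. This is a sketch under stated assumptions and not the ptp code: `struct event_queue`, `MAX_EVENTS` and the helper names are hypothetical, and the `READ_ONCE()`/`WRITE_ONCE()` macros here are simplified volatile-cast versions that rely on the GCC/Clang `__typeof__` extension. They force a single, untorn access to each index but provide no ordering; a queue that is really shared between threads still needs locking or barriers.

```c
/* Simplified userspace versions of the kernel macros (GCC/Clang only). */
#define READ_ONCE(x)		(*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v)	(*(volatile __typeof__(x) *)&(x) = (v))

#define MAX_EVENTS 128		/* hypothetical queue capacity */

struct event_queue {
	int head;		/* next slot to read */
	int tail;		/* next slot to write */
	int buf[MAX_EVENTS];
};

/* Number of queued events; each index is loaded exactly once. */
static int queue_cnt(const struct event_queue *q)
{
	int cnt = READ_ONCE(q->tail) - READ_ONCE(q->head);

	return cnt < 0 ? MAX_EVENTS + cnt : cnt;
}

/* Append one event; returns -1 if the queue is full (one slot stays empty). */
static int enqueue(struct event_queue *q, int v)
{
	int tail = READ_ONCE(q->tail);

	if (queue_cnt(q) == MAX_EVENTS - 1)
		return -1;
	q->buf[tail] = v;
	WRITE_ONCE(q->tail, (tail + 1) % MAX_EVENTS);
	return 0;
}

/* Remove the oldest event; returns -1 if the queue is empty. */
static int dequeue(struct event_queue *q, int *v)
{
	int head = READ_ONCE(q->head);

	if (queue_cnt(q) == 0)
		return -1;
	*v = q->buf[head];
	WRITE_ONCE(q->head, (head + 1) % MAX_EVENTS);
	return 0;
}
```

The wrap-around branch in `queue_cnt()` is the reason a torn or repeated load matters: if head and tail were each read more than once, the subtraction and the sign test could see different values.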
    • Revert "ptp: Fixes a null pointer dereference in ptp_ioctl" · 4b3812d9
      Jakub Kicinski authored
      This reverts commit 8a4f030d.
      
      Richard says:
      
        The test itself is harmless, but keeping it will make people think,
        "oh this pointer can be invalid."
      
        In fact the core stack ensures that ioctl() can't be invoked after
        release(), otherwise Bad Stuff happens.
      
      Fixes: 8a4f030d ("ptp: Fixes a null pointer dereference in ptp_ioctl")
      Link: https://lore.kernel.org/all/ZVAf_qdRfDAQYUt-@hoboy.vegasvil.org/
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      4b3812d9
  2. 13 Nov, 2023 14 commits
    • ppp: limit MRU to 64K · c0a2a1b0
      Willem de Bruijn authored
      ppp_sync_ioctl allows setting device MRU, but does not sanity check
      this input.
      
      Limit to a sane upper bound of 64KB.
      
      No implementation I could find generates larger than 64KB frames.
      RFC 2823 mentions an upper bound of PPP over SDL of 64KB based on the
      16-bit length field. Other protocols will be smaller, such as PPPoE
      (9KB jumbo frame) and PPPoA (18190 maximum CPCS-SDU size, RFC 2364).
      PPTP and L2TP encapsulate in IP.
      
      Syzbot managed to trigger alloc warning in __alloc_pages:
      
      	if (WARN_ON_ONCE_GFP(order > MAX_ORDER, gfp))
      
          WARNING: CPU: 1 PID: 37 at mm/page_alloc.c:4544 __alloc_pages+0x3ab/0x4a0 mm/page_alloc.c:4544
      
          __alloc_skb+0x12b/0x330 net/core/skbuff.c:651
          __netdev_alloc_skb+0x72/0x3f0 net/core/skbuff.c:715
          netdev_alloc_skb include/linux/skbuff.h:3225 [inline]
          dev_alloc_skb include/linux/skbuff.h:3238 [inline]
          ppp_sync_input drivers/net/ppp/ppp_synctty.c:669 [inline]
          ppp_sync_receive+0xff/0x680 drivers/net/ppp/ppp_synctty.c:334
          tty_ldisc_receive_buf+0x14c/0x180 drivers/tty/tty_buffer.c:390
          tty_port_default_receive_buf+0x70/0xb0 drivers/tty/tty_port.c:37
          receive_buf drivers/tty/tty_buffer.c:444 [inline]
          flush_to_ldisc+0x261/0x780 drivers/tty/tty_buffer.c:494
          process_one_work+0x884/0x15c0 kernel/workqueue.c:2630
      
      With call
      
          ioctl$PPPIOCSMRU1(r1, 0x40047452, &(0x7f0000000100)=0x5e6417a8)
      
      Similar code exists in other drivers that implement ppp_channel_ops
      ioctl PPPIOCSMRU. Those might also be in scope. Notably excluded from
      this are pppol2tp_ioctl and pppoe_ioctl.
      
      This code goes back to the start of git history.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Reported-by: syzbot+6177e1f90d92583bcc58@syzkaller.appspotmail.com
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c0a2a1b0
    • Sven Auhagen's avatar
      net: mvneta: fix calls to page_pool_get_stats · ca8add92
      Sven Auhagen authored
      Calling page_pool_get_stats in the mvneta driver without checks
      leads to kernel crashes.
      First the page pool is only available if the bm is not used.
      The page pool is also not allocated when the port is stopped.
      It can also be not allocated in case of errors.
      
      The current implementation leads to the following crash calling
      ethstats on a port that is down or when calling it at the wrong moment:
      
      Unable to handle kernel NULL pointer dereference at virtual address 00000070
      [00000070] *pgd=00000000
      Internal error: Oops: 5 [#1] SMP ARM
      Hardware name: Marvell Armada 380/385 (Device Tree)
      PC is at page_pool_get_stats+0x18/0x1cc
      LR is at mvneta_ethtool_get_stats+0xa0/0xe0 [mvneta]
      pc : [<c0b413cc>]    lr : [<bf0a98d8>]    psr: a0000013
      sp : f1439d48  ip : f1439dc0  fp : 0000001d
      r10: 00000100  r9 : c4816b80  r8 : f0d75150
      r7 : bf0b400c  r6 : c238f000  r5 : 00000000  r4 : f1439d68
      r3 : c2091040  r2 : ffffffd8  r1 : f1439d68  r0 : 00000000
      Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
      Control: 10c5387d  Table: 066b004a  DAC: 00000051
      Register r0 information: NULL pointer
      Register r1 information: 2-page vmalloc region starting at 0xf1438000 allocated at kernel_clone+0x9c/0x390
      Register r2 information: non-paged memory
      Register r3 information: slab kmalloc-2k start c2091000 pointer offset 64 size 2048
      Register r4 information: 2-page vmalloc region starting at 0xf1438000 allocated at kernel_clone+0x9c/0x390
      Register r5 information: NULL pointer
      Register r6 information: slab kmalloc-cg-4k start c238f000 pointer offset 0 size 4096
      Register r7 information: 15-page vmalloc region starting at 0xbf0a8000 allocated at load_module+0xa30/0x219c
      Register r8 information: 1-page vmalloc region starting at 0xf0d75000 allocated at ethtool_get_stats+0x138/0x208
      Register r9 information: slab task_struct start c4816b80 pointer offset 0
      Register r10 information: non-paged memory
      Register r11 information: non-paged memory
      Register r12 information: 2-page vmalloc region starting at 0xf1438000 allocated at kernel_clone+0x9c/0x390
      Process snmpd (pid: 733, stack limit = 0x38de3a88)
      Stack: (0xf1439d48 to 0xf143a000)
      9d40:                   000000c0 00000001 c238f000 bf0b400c f0d75150 c4816b80
      9d60: 00000100 bf0a98d8 00000000 00000000 00000000 00000000 00000000 00000000
      9d80: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      9da0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      9dc0: 00000dc0 5335509c 00000035 c238f000 bf0b2214 01067f50 f0d75000 c0b9b9c8
      9de0: 0000001d 00000035 c2212094 5335509c c4816b80 c238f000 c5ad6e00 01067f50
      9e00: c1b0be80 c4816b80 00014813 c0b9d7f0 00000000 00000000 0000001d 0000001d
      9e20: 00000000 00001200 00000000 00000000 c216ed90 c73943b8 00000000 00000000
      9e40: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      9e60: 00000000 c0ad9034 00000000 00000000 00000000 00000000 00000000 00000000
      9e80: 00000000 00000000 00000000 5335509c c1b0be80 f1439ee4 00008946 c1b0be80
      9ea0: 01067f50 f1439ee3 00000000 00000046 b6d77ae0 c0b383f0 00008946 becc83e8
      9ec0: c1b0be80 00000051 0000000b c68ca480 c7172d00 c0ad8ff0 f1439ee3 cf600e40
      9ee0: 01600e40 32687465 00000000 00000000 00000000 01067f50 00000000 00000000
      9f00: 00000000 5335509c 00008946 00008946 00000000 c68ca480 becc83e8 c05e2de0
      9f20: f1439fb0 c03002f0 00000006 5ac3c35a c4816b80 00000006 b6d77ae0 c030caf0
      9f40: c4817350 00000014 f1439e1c 0000000c 00000000 00000051 01000000 00000014
      9f60: 00003fec f1439edc 00000001 c0372abc b6d77ae0 c0372abc cf600e40 5335509c
      9f80: c21e6800 01015c9c 0000000b 00008946 00000036 c03002f0 c4816b80 00000036
      9fa0: b6d77ae0 c03000c0 01015c9c 0000000b 0000000b 00008946 becc83e8 00000000
      9fc0: 01015c9c 0000000b 00008946 00000036 00000035 010678a0 b6d797ec b6d77ae0
      9fe0: b6dbf738 becc838c b6d186d7 b6baa858 40000030 0000000b 00000000 00000000
       page_pool_get_stats from mvneta_ethtool_get_stats+0xa0/0xe0 [mvneta]
       mvneta_ethtool_get_stats [mvneta] from ethtool_get_stats+0x154/0x208
       ethtool_get_stats from dev_ethtool+0xf48/0x2480
       dev_ethtool from dev_ioctl+0x538/0x63c
       dev_ioctl from sock_ioctl+0x49c/0x53c
       sock_ioctl from sys_ioctl+0x134/0xbd8
       sys_ioctl from ret_fast_syscall+0x0/0x1c
      Exception stack(0xf1439fa8 to 0xf1439ff0)
      9fa0:                   01015c9c 0000000b 0000000b 00008946 becc83e8 00000000
      9fc0: 01015c9c 0000000b 00008946 00000036 00000035 010678a0 b6d797ec b6d77ae0
      9fe0: b6dbf738 becc838c b6d186d7 b6baa858
      Code: e28dd004 e1a05000 e2514000 0a00006a (e5902070)
      
      This commit adds the proper checks before calling page_pool_get_stats.
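
The kind of guard described above can be sketched in userspace as follows. All types and names here (`struct pool`, `struct rxq`, `struct port`, `port_pool_stats()`) are hypothetical stand-ins, not the mvneta structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct pool { unsigned long alloc_fast; }; /* stand-in for a page pool */
struct rxq  { struct pool *pool; };        /* NULL while stopped or after an error */
struct port {
    bool uses_bm;                          /* buffer manager in use: no page pools */
    size_t nrxq;
    struct rxq rxqs[4];
};

/* Sum one counter over all receive queues, skipping missing pools. */
static unsigned long port_pool_stats(const struct port *pp)
{
    unsigned long total = 0;

    assert(pp != NULL && pp->nrxq <= 4);
    if (pp->uses_bm)
        return 0;

    for (size_t i = 0; i < pp->nrxq; i++) {
        const struct pool *pool = pp->rxqs[i].pool;

        if (pool) /* never dereference an unallocated pool */
            total += pool->alloc_fast;
    }
    return total;
}
```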
      
      Fixes: b3fc7922 ("net: mvneta: add support for page_pool_get_stats")
      Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de>
      Reported-by: Paulo Da Silva <Paulo.DaSilva@kyberna.com>
      Acked-by: Lorenzo Bianconi <lorenzo@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ca8add92
    • Shigeru Yoshida's avatar
      tipc: Fix kernel-infoleak due to uninitialized TLV value · fb317eb2
      Shigeru Yoshida authored
      KMSAN reported the following kernel-infoleak issue:
      
      =====================================================
      BUG: KMSAN: kernel-infoleak in instrument_copy_to_user include/linux/instrumented.h:114 [inline]
      BUG: KMSAN: kernel-infoleak in copy_to_user_iter lib/iov_iter.c:24 [inline]
      BUG: KMSAN: kernel-infoleak in iterate_ubuf include/linux/iov_iter.h:29 [inline]
      BUG: KMSAN: kernel-infoleak in iterate_and_advance2 include/linux/iov_iter.h:245 [inline]
      BUG: KMSAN: kernel-infoleak in iterate_and_advance include/linux/iov_iter.h:271 [inline]
      BUG: KMSAN: kernel-infoleak in _copy_to_iter+0x4ec/0x2bc0 lib/iov_iter.c:186
       instrument_copy_to_user include/linux/instrumented.h:114 [inline]
       copy_to_user_iter lib/iov_iter.c:24 [inline]
       iterate_ubuf include/linux/iov_iter.h:29 [inline]
       iterate_and_advance2 include/linux/iov_iter.h:245 [inline]
       iterate_and_advance include/linux/iov_iter.h:271 [inline]
       _copy_to_iter+0x4ec/0x2bc0 lib/iov_iter.c:186
       copy_to_iter include/linux/uio.h:197 [inline]
       simple_copy_to_iter net/core/datagram.c:532 [inline]
       __skb_datagram_iter.5+0x148/0xe30 net/core/datagram.c:420
       skb_copy_datagram_iter+0x52/0x210 net/core/datagram.c:546
       skb_copy_datagram_msg include/linux/skbuff.h:3960 [inline]
       netlink_recvmsg+0x43d/0x1630 net/netlink/af_netlink.c:1967
       sock_recvmsg_nosec net/socket.c:1044 [inline]
       sock_recvmsg net/socket.c:1066 [inline]
       __sys_recvfrom+0x476/0x860 net/socket.c:2246
       __do_sys_recvfrom net/socket.c:2264 [inline]
       __se_sys_recvfrom net/socket.c:2260 [inline]
       __x64_sys_recvfrom+0x130/0x200 net/socket.c:2260
       do_syscall_x64 arch/x86/entry/common.c:51 [inline]
       do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82
       entry_SYSCALL_64_after_hwframe+0x63/0x6b
      
      Uninit was created at:
       slab_post_alloc_hook+0x103/0x9e0 mm/slab.h:768
       slab_alloc_node mm/slub.c:3478 [inline]
       kmem_cache_alloc_node+0x5f7/0xb50 mm/slub.c:3523
       kmalloc_reserve+0x13c/0x4a0 net/core/skbuff.c:560
       __alloc_skb+0x2fd/0x770 net/core/skbuff.c:651
       alloc_skb include/linux/skbuff.h:1286 [inline]
       tipc_tlv_alloc net/tipc/netlink_compat.c:156 [inline]
       tipc_get_err_tlv+0x90/0x5d0 net/tipc/netlink_compat.c:170
       tipc_nl_compat_recv+0x1042/0x15d0 net/tipc/netlink_compat.c:1324
       genl_family_rcv_msg_doit net/netlink/genetlink.c:972 [inline]
       genl_family_rcv_msg net/netlink/genetlink.c:1052 [inline]
       genl_rcv_msg+0x1220/0x12c0 net/netlink/genetlink.c:1067
       netlink_rcv_skb+0x4a4/0x6a0 net/netlink/af_netlink.c:2545
       genl_rcv+0x41/0x60 net/netlink/genetlink.c:1076
       netlink_unicast_kernel net/netlink/af_netlink.c:1342 [inline]
       netlink_unicast+0xf4b/0x1230 net/netlink/af_netlink.c:1368
       netlink_sendmsg+0x1242/0x1420 net/netlink/af_netlink.c:1910
       sock_sendmsg_nosec net/socket.c:730 [inline]
       __sock_sendmsg net/socket.c:745 [inline]
       ____sys_sendmsg+0x997/0xd60 net/socket.c:2588
       ___sys_sendmsg+0x271/0x3b0 net/socket.c:2642
       __sys_sendmsg net/socket.c:2671 [inline]
       __do_sys_sendmsg net/socket.c:2680 [inline]
       __se_sys_sendmsg net/socket.c:2678 [inline]
       __x64_sys_sendmsg+0x2fa/0x4a0 net/socket.c:2678
       do_syscall_x64 arch/x86/entry/common.c:51 [inline]
       do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82
       entry_SYSCALL_64_after_hwframe+0x63/0x6b
      
      Bytes 34-35 of 36 are uninitialized
      Memory access of size 36 starts at ffff88802d464a00
      Data copied to user address 00007ff55033c0a0
      
      CPU: 0 PID: 30322 Comm: syz-executor.0 Not tainted 6.6.0-14500-g1c410411 #10
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014
      =====================================================
      
      tipc_add_tlv() puts a TLV descriptor and value onto `skb`. The space it
      reserves is calculated with the TLV_SPACE() macro, which adds the size of
      struct tlv_desc and the length of the TLV value passed as an argument, and
      aligns the result to a multiple of TLV_ALIGNTO, i.e., a multiple of 4 bytes.
      
      If the size of struct tlv_desc plus the length of TLV value is not aligned,
      the current implementation leaves the remaining bytes uninitialized. This
      is the cause of the above kernel-infoleak issue.
      
      This patch resolves this issue by clearing data up to an aligned size.
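
The alignment arithmetic and the padding it leaves can be illustrated with the userspace sketch below. The macros are local re-definitions that follow the description above, the descriptor layout is an assumption of the sketch, byte-order conversion is omitted, and `tlv_put()` is a hypothetical helper rather than the kernel function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TLV_ALIGNTO     ((size_t)4)
#define TLV_ALIGN(len)  (((len) + (TLV_ALIGNTO - 1)) & ~(TLV_ALIGNTO - 1))
#define TLV_LENGTH(len) (sizeof(struct tlv_desc) + (len))
#define TLV_SPACE(len)  (TLV_ALIGN(TLV_LENGTH(len)))

struct tlv_desc {
    uint16_t tlv_len;  /* descriptor plus value, without padding */
    uint16_t tlv_type;
};

/* Write one TLV into buf and return the aligned space it occupies.
 * The whole aligned area is cleared first, so padding bytes are zero. */
static size_t tlv_put(uint8_t *buf, uint16_t type, const void *data, uint16_t len)
{
    struct tlv_desc desc = { .tlv_len = (uint16_t)TLV_LENGTH(len), .tlv_type = type };
    size_t space = TLV_SPACE(len);

    assert(buf != NULL && (data != NULL || len == 0));
    memset(buf, 0, space);
    memcpy(buf, &desc, sizeof(desc));
    if (len)
        memcpy(buf + sizeof(desc), data, len);
    return space;
}
```

For a 2-byte value, the descriptor plus value is 6 bytes and the reserved space is 8, so the last 2 bytes are padding that would otherwise keep whatever the allocator left there.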
      
      Fixes: d0796d1e ("tipc: convert legacy nl bearer dump to nl compat")
      Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fb317eb2
    • Willem de Bruijn's avatar
      net: gso_test: support CONFIG_MAX_SKB_FRAGS up to 45 · e6daf129
      Willem de Bruijn authored
      The test allocs a single page to hold all the frag_list skbs. This
      is insufficient on kernels with CONFIG_MAX_SKB_FRAGS=45, due to the
      increased skb_shared_info frags[] array length.
      
              gso_test_func: ASSERTION FAILED at net/core/gso_test.c:210
              Expected alloc_size <= ((1UL) << 12), but
                  alloc_size == 5075 (0x13d3)
                  ((1UL) << 12) == 4096 (0x1000)
      
      Simplify the logic. Just allocate a page for each frag_list skb.
      
      Fixes: 4688ecb1 ("net: expand skb_segment unit test with frag_list coverage")
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e6daf129
    • Marek Behún's avatar
      net: mdio: fix typo in header · 438cbcdf
      Marek Behún authored
      The closing quotation mark in
        "EEE "link partner ability 1
      should be at the end of the register name
        "EEE link partner ability 1"
      Signed-off-by: Marek Behún <kabel@kernel.org>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      438cbcdf
    • MD Danish Anwar's avatar
      MAINTAINERS: add entry for TI ICSSG Ethernet driver · 6979a51e
      MD Danish Anwar authored
      Add record for TI Industrial Communication Subsystem - Gigabit (ICSSG)
      Ethernet driver.
      
      Also add Roger and myself as maintainers.
      Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6979a51e
    • David S. Miller's avatar
      Merge branch 'hns3-fixes' · 5d64075c
      David S. Miller authored
      Jijie Shao says:
      
      ====================
      There are some bugfixes for the HNS3 ethernet driver
      
      ---
      ChangeLog:
      v1 -> v2:
        - net: hns3: fix add VLAN fail issue, net: hns3: fix VF reset fail issue
          are modified as suggested by Paolo
        v1: https://lore.kernel.org/all/20231028025917.314305-1-shaojijie@huawei.com/
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5d64075c
    • Jijie Shao's avatar
      net: hns3: fix VF wrong speed and duplex issue · dff655e8
      Jijie Shao authored
      If the PF is down, firmware returns a 10 Mbit/s rate and half-duplex mode
      when the PF queries the port information from firmware.
      
      After the IMP reset command is executed, the PF status changes to down,
      and the PF queries the link status and updates the port information
      from firmware in a periodic scheduled task.
      
      However, there is a low probability that the port information is updated
      while the PF is down, and the PF link status then changes to up.
      In this case, the PF synchronizes an incorrect rate and duplex mode to the VF.
      
      This patch fixes it by updating the port information before
      the PF synchronizes the rate and duplex mode to the VF
      when the PF changes to up.
      
      Fixes: 18b6e31f ("net: hns3: PF add support for pushing link status to VFs")
      Signed-off-by: Jijie Shao <shaojijie@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      dff655e8
    • Jijie Shao's avatar
      net: hns3: fix VF reset fail issue · 65e98bb5
      Jijie Shao authored
      Currently the reset process in hns3 and the firmware watchdog init process
      are asynchronous. The driver assumes that firmware watchdog initialization
      is completed before the VF clears the interrupt source. However, firmware
      initialization may not complete that early, so the VF receives multiple
      reset interrupts and fails to reset.
      
      So add a delay before the VF clears the interrupt source; a 5 ms delay
      is enough to avoid the second reset interrupt.
      
      Fixes: 427900d2 ("net: hns3: fix the timing issue of VF clearing interrupt sources")
      Signed-off-by: Jijie Shao <shaojijie@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      65e98bb5
    • Yonglong Liu's avatar
      net: hns3: fix variable may not initialized problem in hns3_init_mac_addr() · dbd2f3b2
      Yonglong Liu authored
      When a VF calls hns3_init_mac_addr(), get_mac_addr() may
      fail, leaving the value of mac_addr_temp uninitialized.
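
One common way to avoid this class of bug is to give the temporary a defined value and check the query result before using it. The sketch below is a userspace illustration under that assumption; `query_mac()` and `init_mac()` are hypothetical helpers, not the driver functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define ETH_ALEN 6

/* Hypothetical query: on failure the output buffer is left untouched. */
static int query_mac(bool fw_ok, unsigned char mac[ETH_ALEN])
{
    static const unsigned char fw_mac[ETH_ALEN] = { 0x02, 0, 0, 0, 0, 0x01 };

    if (!fw_ok)
        return -1;
    memcpy(mac, fw_mac, ETH_ALEN);
    return 0;
}

/* Copy out an address only if the query succeeded and returned a non-zero one. */
static bool init_mac(bool fw_ok, unsigned char out[ETH_ALEN])
{
    static const unsigned char zero[ETH_ALEN];
    unsigned char tmp[ETH_ALEN] = { 0 }; /* defined even when the query fails */

    assert(out != NULL);
    if (query_mac(fw_ok, tmp) || memcmp(tmp, zero, ETH_ALEN) == 0)
        return false;

    memcpy(out, tmp, ETH_ALEN);
    return true;
}
```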
      
      Fixes: 76ad4f0e ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
      Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
      Signed-off-by: Jijie Shao <shaojijie@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      dbd2f3b2
    • Yonglong Liu's avatar
      net: hns3: fix out-of-bounds access may occur when coalesce info is read via debugfs · 53aba458
      Yonglong Liu authored
      The hns3 driver defines an array of strings to show the coalesce
      info, but if the kernel adds a new mode or a new state,
      an out-of-bounds access may occur when the coalesce info is read via
      debugfs. This patch fixes the problem.
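
A bounds-checked lookup of this kind might look like the following sketch. The table contents and the names `mode_str` and `mode_name()` are hypothetical and only illustrate the pattern.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical name table, indexed by a mode value owned by other code. */
static const char * const mode_str[] = { "EQE", "CQE" };

/* Map a mode to its name; values beyond the table get a fallback string
 * instead of an out-of-bounds read. */
static const char *mode_name(unsigned int mode)
{
    const char *name = mode < ARRAY_SIZE(mode_str) ? mode_str[mode] : "UNKNOWN";

    assert(name != NULL);
    return name;
}
```

If a new mode is added elsewhere without extending the table, the lookup degrades to "UNKNOWN" rather than reading past the end of the array.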
      
      Fixes: c99fead7 ("net: hns3: add debugfs support for interrupt coalesce")
      Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
      Signed-off-by: Jijie Shao <shaojijie@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      53aba458
    • Jian Shen's avatar
      net: hns3: fix incorrect capability bit display for copper port · 75b247b5
      Jian Shen authored
      Currently, the FEC capability bit is set by default for device version V2.
      This is incorrect for a copper port. Even though it does not make the NIC
      work abnormally, the capability information displayed in debugfs may
      confuse users. So clear it when the driver gets the port type information.
      
      Fixes: 433ccce8 ("net: hns3: use FEC capability queried from firmware")
      Signed-off-by: Jian Shen <shenjian15@huawei.com>
      Signed-off-by: Jijie Shao <shaojijie@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      75b247b5
    • Yonglong Liu's avatar
      net: hns3: add barrier in vf mailbox reply process · ac92c0a9
      Yonglong Liu authored
      In hclgevf_mbx_handler() and hclgevf_get_mbx_resp() functions,
      there is a typical store-store and load-load scenario between
      received_resp and additional_info. This patch adds barriers
      to fix the problem.
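
The ordering requirement can be illustrated in userspace with C11 atomics, where a release store and an acquire load play roughly the role of paired write and read barriers. This is a sketch of the general pattern, not the driver code; `struct mbx_resp`, `resp_publish()` and `resp_consume()` are hypothetical names that reuse the two field names from the description.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct mbx_resp {
    uint32_t additional_info;  /* payload, written before the flag */
    atomic_bool received_resp; /* flag that publishes the payload */
};

/* Writer: store the payload, then set the flag with release ordering so
 * the payload store cannot be reordered after it (store-store). */
static void resp_publish(struct mbx_resp *r, uint32_t info)
{
    r->additional_info = info;
    atomic_store_explicit(&r->received_resp, true, memory_order_release);
}

/* Reader: check the flag with acquire ordering so the payload load cannot
 * be reordered before it (load-load). */
static bool resp_consume(struct mbx_resp *r, uint32_t *info)
{
    assert(info != NULL);
    if (!atomic_load_explicit(&r->received_resp, memory_order_acquire))
        return false;
    *info = r->additional_info;
    return true;
}
```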
      
      Fixes: 4671042f ("net: hns3: add match_id to check mailbox response from PF to VF")
      Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
      Signed-off-by: Jijie Shao <shaojijie@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ac92c0a9
    • Jian Shen's avatar
      net: hns3: fix add VLAN fail issue · 472a2ff6
      Jian Shen authored
      hclge_sync_vlan_filter() is called in a periodic task and
      tries to remove the VLANs recorded in vlan_del_fail_bmap. It can
      run concurrently with a VLAN add operation from the user.
      So if the user fails to delete a VLAN id and adds it
      again soon afterwards, it may be removed by the periodic task,
      which may leave the software configuration
      inconsistent with the hardware. So add mutex handling
      to avoid this.
      
           user                        hns3 driver
      
                                                 periodic task
                                                      │
        add vlan 10 ───── hns3_vlan_rx_add_vid        │
             │             (suppose success)          │
             │                                        │
        del vlan 10 ─────  hns3_vlan_rx_kill_vid      │
             │           (suppose fail,add to         │
             │             vlan_del_fail_bmap)        │
             │                                        │
        add vlan 10 ───── hns3_vlan_rx_add_vid        │
                           (suppose success)          │
                                             foreach vlan_del_fail_bmp
                                                  del vlan 10
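
The sequence in the diagram can be modelled in userspace as below. This is only a sketch of one way to serialize the two paths: it assumes that a fresh add cancels a pending delete retry, which the description does not spell out, and every name in it (`vlan_lock`, `hw_vlan`, `del_fail`, `vlan_add()`, `vlan_del()`, `vlan_sync()`) is hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define VLAN_N 4096

static pthread_mutex_t vlan_lock = PTHREAD_MUTEX_INITIALIZER;
static bool hw_vlan[VLAN_N];  /* model of the hardware filter table */
static bool del_fail[VLAN_N]; /* ids whose delete failed and await a retry */

/* User path: add a VLAN id and cancel any stale retry for it. */
static void vlan_add(unsigned int vid)
{
    assert(vid < VLAN_N);
    pthread_mutex_lock(&vlan_lock);
    hw_vlan[vid] = true;
    del_fail[vid] = false;
    pthread_mutex_unlock(&vlan_lock);
}

/* User path: delete a VLAN id; hw_ok models whether the hardware accepted it. */
static void vlan_del(unsigned int vid, bool hw_ok)
{
    assert(vid < VLAN_N);
    pthread_mutex_lock(&vlan_lock);
    if (hw_ok)
        hw_vlan[vid] = false;
    else
        del_fail[vid] = true;
    pthread_mutex_unlock(&vlan_lock);
}

/* Periodic task: retry only the deletes that are still pending. */
static void vlan_sync(void)
{
    pthread_mutex_lock(&vlan_lock);
    for (unsigned int vid = 0; vid < VLAN_N; vid++) {
        if (del_fail[vid]) {
            hw_vlan[vid] = false;
            del_fail[vid] = false;
        }
    }
    pthread_mutex_unlock(&vlan_lock);
}
```

Replaying the diagram (add, failed delete, add again, periodic sync) leaves VLAN 10 in the table, because the second add and the retry never interleave and the add clears the pending bit.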
      
      Fixes: fe4144d4 ("net: hns3: sync VLAN filter entries when kill VLAN ID failed")
      Signed-off-by: Jian Shen <shenjian15@huawei.com>
      Signed-off-by: Jijie Shao <shaojijie@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      472a2ff6
  3. 11 Nov, 2023 2 commits
  4. 10 Nov, 2023 7 commits
    • Shigeru Yoshida's avatar
      tty: Fix uninit-value access in ppp_sync_receive() · 71963985
      Shigeru Yoshida authored
      KMSAN reported the following uninit-value access issue:
      
      =====================================================
      BUG: KMSAN: uninit-value in ppp_sync_input drivers/net/ppp/ppp_synctty.c:690 [inline]
      BUG: KMSAN: uninit-value in ppp_sync_receive+0xdc9/0xe70 drivers/net/ppp/ppp_synctty.c:334
       ppp_sync_input drivers/net/ppp/ppp_synctty.c:690 [inline]
       ppp_sync_receive+0xdc9/0xe70 drivers/net/ppp/ppp_synctty.c:334
       tiocsti+0x328/0x450 drivers/tty/tty_io.c:2295
       tty_ioctl+0x808/0x1920 drivers/tty/tty_io.c:2694
       vfs_ioctl fs/ioctl.c:51 [inline]
       __do_sys_ioctl fs/ioctl.c:871 [inline]
       __se_sys_ioctl+0x211/0x400 fs/ioctl.c:857
       __x64_sys_ioctl+0x97/0xe0 fs/ioctl.c:857
       do_syscall_x64 arch/x86/entry/common.c:51 [inline]
       do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82
       entry_SYSCALL_64_after_hwframe+0x63/0x6b
      
      Uninit was created at:
       __alloc_pages+0x75d/0xe80 mm/page_alloc.c:4591
       __alloc_pages_node include/linux/gfp.h:238 [inline]
       alloc_pages_node include/linux/gfp.h:261 [inline]
       __page_frag_cache_refill+0x9a/0x2c0 mm/page_alloc.c:4691
       page_frag_alloc_align+0x91/0x5d0 mm/page_alloc.c:4722
       page_frag_alloc include/linux/gfp.h:322 [inline]
       __netdev_alloc_skb+0x215/0x6d0 net/core/skbuff.c:728
       netdev_alloc_skb include/linux/skbuff.h:3225 [inline]
       dev_alloc_skb include/linux/skbuff.h:3238 [inline]
       ppp_sync_input drivers/net/ppp/ppp_synctty.c:669 [inline]
       ppp_sync_receive+0x237/0xe70 drivers/net/ppp/ppp_synctty.c:334
       tiocsti+0x328/0x450 drivers/tty/tty_io.c:2295
       tty_ioctl+0x808/0x1920 drivers/tty/tty_io.c:2694
       vfs_ioctl fs/ioctl.c:51 [inline]
       __do_sys_ioctl fs/ioctl.c:871 [inline]
       __se_sys_ioctl+0x211/0x400 fs/ioctl.c:857
       __x64_sys_ioctl+0x97/0xe0 fs/ioctl.c:857
       do_syscall_x64 arch/x86/entry/common.c:51 [inline]
       do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82
       entry_SYSCALL_64_after_hwframe+0x63/0x6b
      
      CPU: 0 PID: 12950 Comm: syz-executor.1 Not tainted 6.6.0-14500-g1c410411 #10
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014
      =====================================================
      
      ppp_sync_input() checks whether the first 2 bytes of the data are
      PPP_ALLSTATIONS and PPP_UI. However, if the data length is 1 and the first
      byte is PPP_ALLSTATIONS, an access to an uninitialized value occurs when
      checking PPP_UI. This patch resolves this issue by checking the data length.
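
A length-guarded version of such a check can be sketched in userspace as follows. The two constants carry the usual PPP address and control values, and `has_addr_ctrl()` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PPP_ALLSTATIONS 0xff /* all-stations address byte */
#define PPP_UI          0x03 /* unnumbered information control byte */

/* True if the frame starts with the address/control pair. The length test
 * comes first, so p[1] is never read from a 1-byte frame. */
static bool has_addr_ctrl(const unsigned char *p, size_t count)
{
    assert(p != NULL);
    return count >= 2 && p[0] == PPP_ALLSTATIONS && p[1] == PPP_UI;
}
```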
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      71963985
    • Eric Dumazet's avatar
      ipvlan: add ipvlan_route_v6_outbound() helper · 18f03942
      Eric Dumazet authored
      Inspired by syzbot reports using a stack of multiple ipvlan devices.
      
      Reduce stack size needed in ipvlan_process_v6_outbound() by moving
      the flowi6 struct used for the route lookup into a non-inlined
      helper. ipvlan_route_v6_outbound() needs 120 bytes on the stack,
      which are reclaimed as soon as it returns.
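
The general technique can be sketched in userspace with the GCC/Clang `noinline` attribute. The names below (`struct flow_key`, `route_lookup()`, `process_outbound()`) are hypothetical, and the recursion only stands in for a transmit path that re-enters itself through stacked devices.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a large on-stack lookup key. */
struct flow_key { unsigned char bytes[88]; };

/* noinline keeps the key in this helper's own frame, which is released
 * when the helper returns, before any nested transmit starts. */
static __attribute__((noinline)) int route_lookup(int daddr)
{
    struct flow_key key;

    memset(&key, 0, sizeof(key));
    memcpy(key.bytes, &daddr, sizeof(daddr));
    return daddr != 0 ? 0 : -1; /* toy rule: address 0 has no route */
}

/* Caller: its frame no longer holds the key while it recurses. */
static int process_outbound(int daddr, int depth)
{
    int err = route_lookup(daddr);

    assert(depth >= 0);
    if (err)
        return err;
    return depth > 0 ? process_outbound(daddr, depth - 1) : 0;
}
```

If the lookup were inlined, every nested level would carry its own copy of the key for the whole duration of the nested call.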
      
      Also make sure ipvlan_process_v4_outbound() is not inlined.
      
      We might also have to lower MAX_NEST_DEV, because only syzbot uses
      setups with more than four stacked devices.
      
      BUG: TASK stack guard page was hit at ffffc9000e803ff8 (stack is ffffc9000e804000..ffffc9000e808000)
      stack guard page: 0000 [#1] SMP KASAN
      CPU: 0 PID: 13442 Comm: syz-executor.4 Not tainted 6.1.52-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/09/2023
      RIP: 0010:kasan_check_range+0x4/0x2a0 mm/kasan/generic.c:188
      Code: 48 01 c6 48 89 c7 e8 db 4e c1 03 31 c0 5d c3 cc 0f 0b eb 02 0f 0b b8 ea ff ff ff 5d c3 cc 00 00 cc cc 00 00 cc cc 55 48 89 e5 <41> 57 41 56 41 55 41 54 53 b0 01 48 85 f6 0f 84 a4 01 00 00 48 89
      RSP: 0018:ffffc9000e804000 EFLAGS: 00010246
      RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff817e5bf2
      RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffffffff887c6568
      RBP: ffffc9000e804000 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: dffffc0000000001 R12: 1ffff92001d0080c
      R13: dffffc0000000000 R14: ffffffff87e6b100 R15: 0000000000000000
      FS: 00007fd0c55826c0(0000) GS:ffff8881f6800000(0000) knlGS:0000000000000000
      CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: ffffc9000e803ff8 CR3: 0000000170ef7000 CR4: 00000000003506f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
      <#DF>
      </#DF>
      <TASK>
      [<ffffffff81f281d1>] __kasan_check_read+0x11/0x20 mm/kasan/shadow.c:31
      [<ffffffff817e5bf2>] instrument_atomic_read include/linux/instrumented.h:72 [inline]
      [<ffffffff817e5bf2>] _test_bit include/asm-generic/bitops/instrumented-non-atomic.h:141 [inline]
      [<ffffffff817e5bf2>] cpumask_test_cpu include/linux/cpumask.h:506 [inline]
      [<ffffffff817e5bf2>] cpu_online include/linux/cpumask.h:1092 [inline]
      [<ffffffff817e5bf2>] trace_lock_acquire include/trace/events/lock.h:24 [inline]
      [<ffffffff817e5bf2>] lock_acquire+0xe2/0x590 kernel/locking/lockdep.c:5632
      [<ffffffff8563221e>] rcu_lock_acquire+0x2e/0x40 include/linux/rcupdate.h:306
      [<ffffffff8561464d>] rcu_read_lock include/linux/rcupdate.h:747 [inline]
      [<ffffffff8561464d>] ip6_pol_route+0x15d/0x1440 net/ipv6/route.c:2221
      [<ffffffff85618120>] ip6_pol_route_output+0x50/0x80 net/ipv6/route.c:2606
      [<ffffffff856f65b5>] pol_lookup_func include/net/ip6_fib.h:584 [inline]
      [<ffffffff856f65b5>] fib6_rule_lookup+0x265/0x620 net/ipv6/fib6_rules.c:116
      [<ffffffff85618009>] ip6_route_output_flags_noref+0x2d9/0x3a0 net/ipv6/route.c:2638
      [<ffffffff8561821a>] ip6_route_output_flags+0xca/0x340 net/ipv6/route.c:2651
      [<ffffffff838bd5a3>] ip6_route_output include/net/ip6_route.h:100 [inline]
      [<ffffffff838bd5a3>] ipvlan_process_v6_outbound drivers/net/ipvlan/ipvlan_core.c:473 [inline]
      [<ffffffff838bd5a3>] ipvlan_process_outbound drivers/net/ipvlan/ipvlan_core.c:529 [inline]
      [<ffffffff838bd5a3>] ipvlan_xmit_mode_l3 drivers/net/ipvlan/ipvlan_core.c:602 [inline]
      [<ffffffff838bd5a3>] ipvlan_queue_xmit+0xc33/0x1be0 drivers/net/ipvlan/ipvlan_core.c:677
      [<ffffffff838c2909>] ipvlan_start_xmit+0x49/0x100 drivers/net/ipvlan/ipvlan_main.c:229
      [<ffffffff84d03900>] netdev_start_xmit include/linux/netdevice.h:4966 [inline]
      [<ffffffff84d03900>] xmit_one net/core/dev.c:3644 [inline]
      [<ffffffff84d03900>] dev_hard_start_xmit+0x320/0x980 net/core/dev.c:3660
      [<ffffffff84d080e2>] __dev_queue_xmit+0x16b2/0x3370 net/core/dev.c:4324
      [<ffffffff855ce4cd>] dev_queue_xmit include/linux/netdevice.h:3067 [inline]
      [<ffffffff855ce4cd>] neigh_hh_output include/net/neighbour.h:529 [inline]
      [<ffffffff855ce4cd>] neigh_output include/net/neighbour.h:543 [inline]
      [<ffffffff855ce4cd>] ip6_finish_output2+0x160d/0x1ae0 net/ipv6/ip6_output.c:139
      [<ffffffff855b8616>] __ip6_finish_output net/ipv6/ip6_output.c:200 [inline]
      [<ffffffff855b8616>] ip6_finish_output+0x6c6/0xb10 net/ipv6/ip6_output.c:211
      [<ffffffff855b7e3c>] NF_HOOK_COND include/linux/netfilter.h:298 [inline]
      [<ffffffff855b7e3c>] ip6_output+0x2bc/0x3d0 net/ipv6/ip6_output.c:232
      [<ffffffff8575d27f>] dst_output include/net/dst.h:444 [inline]
      [<ffffffff8575d27f>] ip6_local_out+0x10f/0x140 net/ipv6/output_core.c:161
      [<ffffffff838bdae4>] ipvlan_process_v6_outbound drivers/net/ipvlan/ipvlan_core.c:483 [inline]
      [<ffffffff838bdae4>] ipvlan_process_outbound drivers/net/ipvlan/ipvlan_core.c:529 [inline]
      [<ffffffff838bdae4>] ipvlan_xmit_mode_l3 drivers/net/ipvlan/ipvlan_core.c:602 [inline]
      [<ffffffff838bdae4>] ipvlan_queue_xmit+0x1174/0x1be0 drivers/net/ipvlan/ipvlan_core.c:677
      [<ffffffff838c2909>] ipvlan_start_xmit+0x49/0x100 drivers/net/ipvlan/ipvlan_main.c:229
      [<ffffffff84d03900>] netdev_start_xmit include/linux/netdevice.h:4966 [inline]
      [<ffffffff84d03900>] xmit_one net/core/dev.c:3644 [inline]
      [<ffffffff84d03900>] dev_hard_start_xmit+0x320/0x980 net/core/dev.c:3660
      [<ffffffff84d080e2>] __dev_queue_xmit+0x16b2/0x3370 net/core/dev.c:4324
      [<ffffffff855ce4cd>] dev_queue_xmit include/linux/netdevice.h:3067 [inline]
      [<ffffffff855ce4cd>] neigh_hh_output include/net/neighbour.h:529 [inline]
      [<ffffffff855ce4cd>] neigh_output include/net/neighbour.h:543 [inline]
      [<ffffffff855ce4cd>] ip6_finish_output2+0x160d/0x1ae0 net/ipv6/ip6_output.c:139
      [<ffffffff855b8616>] __ip6_finish_output net/ipv6/ip6_output.c:200 [inline]
      [<ffffffff855b8616>] ip6_finish_output+0x6c6/0xb10 net/ipv6/ip6_output.c:211
      [<ffffffff855b7e3c>] NF_HOOK_COND include/linux/netfilter.h:298 [inline]
      [<ffffffff855b7e3c>] ip6_output+0x2bc/0x3d0 net/ipv6/ip6_output.c:232
      [<ffffffff8575d27f>] dst_output include/net/dst.h:444 [inline]
      [<ffffffff8575d27f>] ip6_local_out+0x10f/0x140 net/ipv6/output_core.c:161
      [<ffffffff838bdae4>] ipvlan_process_v6_outbound drivers/net/ipvlan/ipvlan_core.c:483 [inline]
      [<ffffffff838bdae4>] ipvlan_process_outbound drivers/net/ipvlan/ipvlan_core.c:529 [inline]
      [<ffffffff838bdae4>] ipvlan_xmit_mode_l3 drivers/net/ipvlan/ipvlan_core.c:602 [inline]
      [<ffffffff838bdae4>] ipvlan_queue_xmit+0x1174/0x1be0 drivers/net/ipvlan/ipvlan_core.c:677
      [<ffffffff838c2909>] ipvlan_start_xmit+0x49/0x100 drivers/net/ipvlan/ipvlan_main.c:229
      [<ffffffff84d03900>] netdev_start_xmit include/linux/netdevice.h:4966 [inline]
      [<ffffffff84d03900>] xmit_one net/core/dev.c:3644 [inline]
      [<ffffffff84d03900>] dev_hard_start_xmit+0x320/0x980 net/core/dev.c:3660
      [<ffffffff84d080e2>] __dev_queue_xmit+0x16b2/0x3370 net/core/dev.c:4324
      [<ffffffff855ce4cd>] dev_queue_xmit include/linux/netdevice.h:3067 [inline]
      [<ffffffff855ce4cd>] neigh_hh_output include/net/neighbour.h:529 [inline]
      [<ffffffff855ce4cd>] neigh_output include/net/neighbour.h:543 [inline]
      [<ffffffff855ce4cd>] ip6_finish_output2+0x160d/0x1ae0 net/ipv6/ip6_output.c:139
      [<ffffffff855b8616>] __ip6_finish_output net/ipv6/ip6_output.c:200 [inline]
      [<ffffffff855b8616>] ip6_finish_output+0x6c6/0xb10 net/ipv6/ip6_output.c:211
      [<ffffffff855b7e3c>] NF_HOOK_COND include/linux/netfilter.h:298 [inline]
      [<ffffffff855b7e3c>] ip6_output+0x2bc/0x3d0 net/ipv6/ip6_output.c:232
      [<ffffffff8575d27f>] dst_output include/net/dst.h:444 [inline]
      [<ffffffff8575d27f>] ip6_local_out+0x10f/0x140 net/ipv6/output_core.c:161
      [<ffffffff838bdae4>] ipvlan_process_v6_outbound drivers/net/ipvlan/ipvlan_core.c:483 [inline]
      [<ffffffff838bdae4>] ipvlan_process_outbound drivers/net/ipvlan/ipvlan_core.c:529 [inline]
      [<ffffffff838bdae4>] ipvlan_xmit_mode_l3 drivers/net/ipvlan/ipvlan_core.c:602 [inline]
      [<ffffffff838bdae4>] ipvlan_queue_xmit+0x1174/0x1be0 drivers/net/ipvlan/ipvlan_core.c:677
      [<ffffffff838c2909>] ipvlan_start_xmit+0x49/0x100 drivers/net/ipvlan/ipvlan_main.c:229
      [<ffffffff84d03900>] netdev_start_xmit include/linux/netdevice.h:4966 [inline]
      [<ffffffff84d03900>] xmit_one net/core/dev.c:3644 [inline]
      [<ffffffff84d03900>] dev_hard_start_xmit+0x320/0x980 net/core/dev.c:3660
      [<ffffffff84d080e2>] __dev_queue_xmit+0x16b2/0x3370 net/core/dev.c:4324
      [<ffffffff855ce4cd>] dev_queue_xmit include/linux/netdevice.h:3067 [inline]
      [<ffffffff855ce4cd>] neigh_hh_output include/net/neighbour.h:529 [inline]
      [<ffffffff855ce4cd>] neigh_output include/net/neighbour.h:543 [inline]
      [<ffffffff855ce4cd>] ip6_finish_output2+0x160d/0x1ae0 net/ipv6/ip6_output.c:139
      [<ffffffff855b8616>] __ip6_finish_output net/ipv6/ip6_output.c:200 [inline]
      [<ffffffff855b8616>] ip6_finish_output+0x6c6/0xb10 net/ipv6/ip6_output.c:211
      [<ffffffff855b7e3c>] NF_HOOK_COND include/linux/netfilter.h:298 [inline]
      [<ffffffff855b7e3c>] ip6_output+0x2bc/0x3d0 net/ipv6/ip6_output.c:232
      [<ffffffff8575d27f>] dst_output include/net/dst.h:444 [inline]
      [<ffffffff8575d27f>] ip6_local_out+0x10f/0x140 net/ipv6/output_core.c:161
      [<ffffffff838bdae4>] ipvlan_process_v6_outbound drivers/net/ipvlan/ipvlan_core.c:483 [inline]
      [<ffffffff838bdae4>] ipvlan_process_outbound drivers/net/ipvlan/ipvlan_core.c:529 [inline]
      [<ffffffff838bdae4>] ipvlan_xmit_mode_l3 drivers/net/ipvlan/ipvlan_core.c:602 [inline]
      [<ffffffff838bdae4>] ipvlan_queue_xmit+0x1174/0x1be0 drivers/net/ipvlan/ipvlan_core.c:677
      [<ffffffff838c2909>] ipvlan_start_xmit+0x49/0x100 drivers/net/ipvlan/ipvlan_main.c:229
      [<ffffffff84d03900>] netdev_start_xmit include/linux/netdevice.h:4966 [inline]
      [<ffffffff84d03900>] xmit_one net/core/dev.c:3644 [inline]
      [<ffffffff84d03900>] dev_hard_start_xmit+0x320/0x980 net/core/dev.c:3660
      [<ffffffff84d080e2>] __dev_queue_xmit+0x16b2/0x3370 net/core/dev.c:4324
      [<ffffffff84d4a65e>] dev_queue_xmit include/linux/netdevice.h:3067 [inline]
      [<ffffffff84d4a65e>] neigh_resolve_output+0x64e/0x750 net/core/neighbour.c:1560
      [<ffffffff855ce503>] neigh_output include/net/neighbour.h:545 [inline]
      [<ffffffff855ce503>] ip6_finish_output2+0x1643/0x1ae0 net/ipv6/ip6_output.c:139
      [<ffffffff855b8616>] __ip6_finish_output net/ipv6/ip6_output.c:200 [inline]
      [<ffffffff855b8616>] ip6_finish_output+0x6c6/0xb10 net/ipv6/ip6_output.c:211
      [<ffffffff855b7e3c>] NF_HOOK_COND include/linux/netfilter.h:298 [inline]
      [<ffffffff855b7e3c>] ip6_output+0x2bc/0x3d0 net/ipv6/ip6_output.c:232
      [<ffffffff855b9ce4>] dst_output include/net/dst.h:444 [inline]
      [<ffffffff855b9ce4>] NF_HOOK include/linux/netfilter.h:309 [inline]
      [<ffffffff855b9ce4>] ip6_xmit+0x11a4/0x1b20 net/ipv6/ip6_output.c:352
      [<ffffffff8597984e>] sctp_v6_xmit+0x9ae/0x1230 net/sctp/ipv6.c:250
      [<ffffffff8594623e>] sctp_packet_transmit+0x25de/0x2bc0 net/sctp/output.c:653
      [<ffffffff858f5142>] sctp_packet_singleton+0x202/0x310 net/sctp/outqueue.c:783
      [<ffffffff858ea411>] sctp_outq_flush_ctrl net/sctp/outqueue.c:914 [inline]
      [<ffffffff858ea411>] sctp_outq_flush+0x661/0x3d40 net/sctp/outqueue.c:1212
      [<ffffffff858f02f9>] sctp_outq_uncork+0x79/0xb0 net/sctp/outqueue.c:764
      [<ffffffff8589f060>] sctp_side_effects net/sctp/sm_sideeffect.c:1199 [inline]
      [<ffffffff8589f060>] sctp_do_sm+0x55c0/0x5c30 net/sctp/sm_sideeffect.c:1170
      [<ffffffff85941567>] sctp_primitive_ASSOCIATE+0x97/0xc0 net/sctp/primitive.c:73
      [<ffffffff859408b2>] sctp_sendmsg_to_asoc+0xf62/0x17b0 net/sctp/socket.c:1839
      [<ffffffff85910b5e>] sctp_sendmsg+0x212e/0x33b0 net/sctp/socket.c:2029
      [<ffffffff8544d559>] inet_sendmsg+0x149/0x310 net/ipv4/af_inet.c:849
      [<ffffffff84c6c4d2>] sock_sendmsg_nosec net/socket.c:716 [inline]
      [<ffffffff84c6c4d2>] sock_sendmsg net/socket.c:736 [inline]
      [<ffffffff84c6c4d2>] ____sys_sendmsg+0x572/0x8c0 net/socket.c:2504
      [<ffffffff84c6ca91>] ___sys_sendmsg net/socket.c:2558 [inline]
      [<ffffffff84c6ca91>] __sys_sendmsg+0x271/0x360 net/socket.c:2587
      [<ffffffff84c6cbff>] __do_sys_sendmsg net/socket.c:2596 [inline]
      [<ffffffff84c6cbff>] __se_sys_sendmsg net/socket.c:2594 [inline]
      [<ffffffff84c6cbff>] __x64_sys_sendmsg+0x7f/0x90 net/socket.c:2594
      [<ffffffff85b32553>] do_syscall_x64 arch/x86/entry/common.c:51 [inline]
      [<ffffffff85b32553>] do_syscall_64+0x53/0x80 arch/x86/entry/common.c:84
      [<ffffffff85c00087>] entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      Fixes: 2ad7bf36 ("ipvlan: Initial check-in of the IPVLAN driver.")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Mahesh Bandewar <maheshb@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Reviewed-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      18f03942
    • MAINTAINERS: net: Update reviewers for TI's Ethernet drivers · cbe9e68e
      Ravi Gunasekaran authored
      Grygorii is no longer associated with TI and messages addressed to
      him bounce.
      
      Add Siddharth, Roger and myself as reviewers.

      Signed-off-by: Ravi Gunasekaran <r-gunasekaran@ti.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: set SOCK_RCU_FREE before inserting socket into hashtable · 871019b2
      Stanislav Fomichev authored
      We've started to see the following kernel traces:
      
       WARNING: CPU: 83 PID: 0 at net/core/filter.c:6641 sk_lookup+0x1bd/0x1d0
      
       Call Trace:
        <IRQ>
        __bpf_skc_lookup+0x10d/0x120
        bpf_sk_lookup+0x48/0xd0
        bpf_sk_lookup_tcp+0x19/0x20
        bpf_prog_<redacted>+0x37c/0x16a3
        cls_bpf_classify+0x205/0x2e0
        tcf_classify+0x92/0x160
        __netif_receive_skb_core+0xe52/0xf10
        __netif_receive_skb_list_core+0x96/0x2b0
        napi_complete_done+0x7b5/0xb70
        <redacted>_poll+0x94/0xb0
        net_rx_action+0x163/0x1d70
        __do_softirq+0xdc/0x32e
        asm_call_irq_on_stack+0x12/0x20
        </IRQ>
        do_softirq_own_stack+0x36/0x50
        do_softirq+0x44/0x70
      
      __inet_hash can race with lockless (rcu) readers on the other cpus:
      
        __inet_hash
          __sk_nulls_add_node_rcu
          <- (bpf triggers here)
          sock_set_flag(SOCK_RCU_FREE)
      
      Move the sock_set_flag(SOCK_RCU_FREE) call up so that it runs before
      the socket is inserted into the hashtables. Note that the race is
      harmless: the bpf callers already handle this situation (where the
      listener socket doesn't have SOCK_RCU_FREE set) correctly, so the only
      annoyance is a WARN_ONCE.
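
The ordering issue can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel code: `struct fake_sock`, `table_slot`, `reader_would_warn()`, `publish_then_flag()` and `flag_then_publish()` are hypothetical names, and a single atomic pointer stands in for the hash table that lockless readers scan. The reader is called inline at the point where a concurrent lookup could run, so the result is deterministic.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a socket; not the kernel's struct sock. */
struct fake_sock {
    bool rcu_free; /* models the SOCK_RCU_FREE flag */
};

/* Models the hash table slot that lockless readers look at. */
static _Atomic(struct fake_sock *) table_slot;

/* Models the lookup path: true if a visible socket lacks the flag. */
static bool reader_would_warn(void)
{
    struct fake_sock *sk =
        atomic_load_explicit(&table_slot, memory_order_acquire);

    return sk != NULL && !sk->rcu_free;
}

/* Old order: publish first, set the flag second. Returns what a
 * reader running between the two steps would observe. */
bool publish_then_flag(struct fake_sock *sk)
{
    bool warned;

    assert(sk != NULL);
    atomic_store_explicit(&table_slot, NULL, memory_order_relaxed);

    atomic_store_explicit(&table_slot, sk, memory_order_release);
    warned = reader_would_warn(); /* reader hits the race window */
    sk->rcu_free = true;
    return warned;
}

/* New order: set the flag first, then publish. A reader either sees
 * nothing or sees a socket that already carries the flag. */
bool flag_then_publish(struct fake_sock *sk)
{
    bool warned;

    assert(sk != NULL);
    atomic_store_explicit(&table_slot, NULL, memory_order_relaxed);

    sk->rcu_free = true;
    warned = reader_would_warn(); /* slot still empty here */
    atomic_store_explicit(&table_slot, sk, memory_order_release);
    return warned || reader_would_warn();
}
```

In this model, calling `publish_then_flag()` on a fresh socket reports the warning condition, while `flag_then_publish()` does not. The release store paired with the acquire load is what makes the earlier flag write visible to a reader that sees the pointer.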
      
      More details from Eric regarding SOCK_RCU_FREE timeline:
      
      Commit 3b24d854 ("tcp/dccp: do not touch listener sk_refcnt under
      synflood") added SOCK_RCU_FREE. At that time, the precise location of
      sock_set_flag(sk, SOCK_RCU_FREE) did not matter, because the thread calling
      __inet_hash() owns a reference on sk. SOCK_RCU_FREE was only tested
      at dismantle time.
      
      Commit 6acc9b43 ("bpf: Add helper to retrieve socket in BPF")
      started checking SOCK_RCU_FREE _after_ the lookup to infer whether
      the refcount has been taken care of.
      
      Fixes: 6acc9b43 ("bpf: Add helper to retrieve socket in BPF")
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Stanislav Fomichev <sdf@google.com>
      Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ptp: Fixes a null pointer dereference in ptp_ioctl · 8a4f030d
      Yuran Pereira authored
      Syzkaller found a null pointer dereference in ptp_ioctl
      originating from the lack of a null check for tsevq.
      
      ```
      general protection fault, probably for non-canonical
      	address 0xdffffc000000020b: 0000 [#1] PREEMPT SMP KASAN
      KASAN: probably user-memory-access in range
      	[0x0000000000001058-0x000000000000105f]
      CPU: 0 PID: 5053 Comm: syz-executor353 Not tainted
      	6.6.0-syzkaller-10396-g4652b8e4 #0
      Hardware name: Google Google Compute Engine/Google Compute Engine,
      	BIOS Google 10/09/2023
      RIP: 0010:ptp_ioctl+0xcb7/0x1d10 drivers/ptp/ptp_chardev.c:476
      ...
      Call Trace:
       <TASK>
       posix_clock_ioctl+0xf8/0x160 kernel/time/posix-clock.c:86
       vfs_ioctl fs/ioctl.c:51 [inline]
       __do_sys_ioctl fs/ioctl.c:871 [inline]
       __se_sys_ioctl fs/ioctl.c:857 [inline]
       __x64_sys_ioctl+0x18f/0x210 fs/ioctl.c:857
       do_syscall_x64 arch/x86/entry/common.c:51 [inline]
       do_syscall_64+0x3f/0x110 arch/x86/entry/common.c:82
       entry_SYSCALL_64_after_hwframe+0x63/0x6b
      ```
      
      This patch fixes the issue by adding a check for tsevq and
      ensuring ptp_ioctl returns with an error if tsevq is null.
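
The shape of such a guard can be sketched in userspace C. Everything here is an assumption made for illustration: `struct fake_queue`, `struct fake_ctx` and `fake_ioctl()` are hypothetical stand-ins rather than the driver's real types, and `-EINVAL` is an assumed error code, since the message above only says that an error is returned.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-reader event queue. */
struct fake_queue {
    int mask;
};

/* Hypothetical stand-in for the per-open context holding the queue. */
struct fake_ctx {
    struct fake_queue *tsevq; /* may be NULL if setup did not complete */
};

/* Hypothetical handler: bail out before the first dereference. */
int fake_ioctl(struct fake_ctx *ctx, int new_mask)
{
    struct fake_queue *tsevq;

    assert(ctx != NULL);
    tsevq = ctx->tsevq;
    if (!tsevq)
        return -EINVAL; /* assumed error code, see lead-in */

    tsevq->mask = new_mask; /* safe: tsevq was checked above */
    return 0;
}
```

Placing the check immediately after the pointer is fetched keeps every later use in the handler covered by a single test.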
      
      Reported-by: syzbot+8a78ecea7ac1a2ea26e5@syzkaller.appspotmail.com
      Closes: https://syzkaller.appspot.com/bug?extid=8a78ecea7ac1a2ea26e5
      Fixes: c5a445b1 ("ptp: support event queue reader channel masks")
      Signed-off-by: Yuran Pereira <yuran.pereira@hotmail.com>
      Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'net-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 89cdf9d5
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Including fixes from netfilter and bpf.
      
        Current release - regressions:
      
         - sched: fix SKB_NOT_DROPPED_YET splat under debug config
      
        Current release - new code bugs:
      
         - tcp:
             - fix usec timestamps with TCP fastopen
             - fix possible out-of-bounds reads in tcp_hash_fail()
             - fix SYN option room calculation for TCP-AO
      
         - tcp_sigpool: fix some off by one bugs
      
         - bpf: fix compilation error without CGROUPS
      
         - ptp:
             - ptp_read() should not release queue
             - fix tsevqs corruption
      
        Previous releases - regressions:
      
         - llc: verify mac len before reading mac header
      
        Previous releases - always broken:
      
         - bpf:
             - fix check_stack_write_fixed_off() to correctly spill imm
             - fix precision tracking for BPF_ALU | BPF_TO_BE | BPF_END
             - check map->usercnt after timer->timer is assigned
      
         - dsa: lan9303: consequently nested-lock physical MDIO
      
         - dccp/tcp: call security_inet_conn_request() after setting IP addr
      
         - tg3: fix the TX ring stall due to incorrect full ring handling
      
         - phylink: initialize carrier state at creation
      
         - ice: fix direction of VF rules in switchdev mode
      
        Misc:
      
         - fill in a bunch of missing MODULE_DESCRIPTION()s, more to come"
      
      * tag 'net-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (84 commits)
        net: ti: icss-iep: fix setting counter value
        ptp: fix corrupted list in ptp_open
        ptp: ptp_read should not release queue
        net_sched: sch_fq: better validate TCA_FQ_WEIGHTS and TCA_FQ_PRIOMAP
        net: kcm: fill in MODULE_DESCRIPTION()
        net/sched: act_ct: Always fill offloading tuple iifidx
        netfilter: nat: fix ipv6 nat redirect with mapped and scoped addresses
        netfilter: xt_recent: fix (increase) ipv6 literal buffer length
        ipvs: add missing module descriptions
        netfilter: nf_tables: remove catchall element in GC sync path
        netfilter: add missing module descriptions
        drivers/net/ppp: use standard array-copy-function
        net: enetc: shorten enetc_setup_xdp_prog() error message to fit NETLINK_MAX_FMTMSG_LEN
        virtio/vsock: Fix uninit-value in virtio_transport_recv_pkt()
        r8169: respect userspace disabling IFF_MULTICAST
        selftests/bpf: get trusted cgrp from bpf_iter__cgroup directly
        bpf: Let verifier consider {task,cgroup} is trusted in bpf_iter_reg
        net: phylink: initialize carrier state at creation
        test/vsock: add dobule bind connect test
        test/vsock: refactor vsock_accept
        ...
    • Merge tag 'v6.7-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 3b220413
      Linus Torvalds authored
      Pull crypto fixes from Herbert Xu:
       "This fixes a regression in ahash and hides the Kconfig sub-options for
        the jitter RNG"
      
      * tag 'v6.7-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
        crypto: ahash - Set using_shash for cloned ahash wrapper over shash
        crypto: jitterentropy - Hide esoteric Kconfig options under FIPS and EXPERT
  5. 09 Nov, 2023 1 commit
    • Merge tag 'input-for-v6.7-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · a12deb44
      Linus Torvalds authored
      Pull input updates from Dmitry Torokhov:
      
       - a number of input drivers have been converted to use facilities
         provided by the device core to instantiate driver-specific attributes
         instead of using devm_device_add_group() and similar APIs

       - platform input devices have been converted to use the remove()
         callback returning void
      
       - a fix for use-after-free when tearing down a Synaptics RMI device
      
       - a few flexible arrays in input structures have been annotated with
         __counted_by to help hardening efforts
      
       - handling of vddio supply in cyttsp5 driver
      
       - other miscellaneous fixups
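
As an illustration of the `__counted_by` item above, here is a minimal userspace sketch of annotating a flexible array with the member that holds its element count. The names are assumptions: `COUNTED_BY` is a hypothetical local macro standing in for the kernel's `__counted_by`, and `struct sample_buf` is not one of the input structures. The macro expands to nothing on compilers that lack the `counted_by` attribute.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical macro: use the attribute only where the compiler has it. */
#if defined(__has_attribute)
#  if __has_attribute(counted_by)
#    define COUNTED_BY(member) __attribute__((counted_by(member)))
#  endif
#endif
#ifndef COUNTED_BY
#  define COUNTED_BY(member)
#endif

/* Hypothetical structure with an annotated flexible array member. */
struct sample_buf {
    size_t count;                  /* number of elements in items[] */
    int items[] COUNTED_BY(count); /* bounds are tied to count */
};

/* Allocate a zeroed buffer; count is set before items[] is touched. */
struct sample_buf *sample_buf_alloc(size_t count)
{
    struct sample_buf *buf;

    if (count > (SIZE_MAX - sizeof(*buf)) / sizeof(buf->items[0]))
        return NULL; /* size computation would overflow */

    buf = calloc(1, sizeof(*buf) + count * sizeof(buf->items[0]));
    if (!buf)
        return NULL;

    buf->count = count;
    return buf;
}

/* Store a value; returns 0 on success or -1 if idx is out of range. */
int sample_buf_set(struct sample_buf *buf, size_t idx, int value)
{
    assert(buf != NULL);
    if (idx >= buf->count)
        return -1;

    buf->items[idx] = value;
    return 0;
}
```

The annotation lets bounds-checking instrumentation relate accesses to `items[]` to `count`, which is why the count is assigned before any element is accessed.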
      
      * tag 'input-for-v6.7-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (86 commits)
        Input: walkera0701 - use module_parport_driver macro to simplify the code
        Input: synaptics-rmi4 - fix use after free in rmi_unregister_function()
        dt-bindings: input: fsl,scu-key: Document wakeup-source
        Input: cyttsp5 - add handling for vddio regulator
        dt-bindings: input: cyttsp5: document vddio-supply
        Input: tegra-kbc - use device_get_match_data()
        Input: Annotate struct ff_device with __counted_by
        Input: axp20x-pek - avoid needless newline removal
        Input: mt - annotate struct input_mt with __counted_by
        Input: leds - annotate struct input_leds with __counted_by
        Input: evdev - annotate struct evdev_client with __counted_by
        Input: synaptics-rmi4 - replace deprecated strncpy
        Input: wm97xx-core - convert to platform remove callback returning void
        Input: wm831x-ts - convert to platform remove callback returning void
        Input: ti_am335x_tsc - convert to platform remove callback returning void
        Input: sun4i-ts - convert to platform remove callback returning void
        Input: stmpe-ts - convert to platform remove callback returning void
        Input: pcap_ts - convert to platform remove callback returning void
        Input: mc13783_ts - convert to platform remove callback returning void
        Input: mainstone-wm97xx - convert to platform remove callback returning void
        ...