1. 02 Sep, 2022 4 commits
    • Eric Dumazet's avatar
      tcp: TX zerocopy should not sense pfmemalloc status · 32614006
      Eric Dumazet authored
      We got a recent syzbot report [1] showing a possible misuse
      of pfmemalloc page status in TCP zerocopy paths.
      
      Indeed, for pages coming from user space or other layers,
      using page_is_pfmemalloc() is moot, and possibly could give
      false positives.
      
      There has been attempts to make page_is_pfmemalloc() more robust,
      but not using it in the first place in this context is probably better,
      removing cpu cycles.
      
      Note to stable teams :
      
      You need to backport 84ce071e ("net: introduce
      __skb_fill_page_desc_noacc") as a prereq.
      
      Race is more probable after commit c07aea3e
      ("mm: add a signature in struct page") because page_is_pfmemalloc()
      is now using low order bit from page->lru.next, which can change
      more often than page->index.
      
      Low order bit should never be set for lru.next (when used as an anchor
      in LRU list), so KCSAN report is mostly a false positive.
      
      Backporting to older kernel versions seems not necessary.
      
      [1]
      BUG: KCSAN: data-race in lru_add_fn / tcp_build_frag
      
      write to 0xffffea0004a1d2c8 of 8 bytes by task 18600 on cpu 0:
      __list_add include/linux/list.h:73 [inline]
      list_add include/linux/list.h:88 [inline]
      lruvec_add_folio include/linux/mm_inline.h:105 [inline]
      lru_add_fn+0x440/0x520 mm/swap.c:228
      folio_batch_move_lru+0x1e1/0x2a0 mm/swap.c:246
      folio_batch_add_and_move mm/swap.c:263 [inline]
      folio_add_lru+0xf1/0x140 mm/swap.c:490
      filemap_add_folio+0xf8/0x150 mm/filemap.c:948
      __filemap_get_folio+0x510/0x6d0 mm/filemap.c:1981
      pagecache_get_page+0x26/0x190 mm/folio-compat.c:104
      grab_cache_page_write_begin+0x2a/0x30 mm/folio-compat.c:116
      ext4_da_write_begin+0x2dd/0x5f0 fs/ext4/inode.c:2988
      generic_perform_write+0x1d4/0x3f0 mm/filemap.c:3738
      ext4_buffered_write_iter+0x235/0x3e0 fs/ext4/file.c:270
      ext4_file_write_iter+0x2e3/0x1210
      call_write_iter include/linux/fs.h:2187 [inline]
      new_sync_write fs/read_write.c:491 [inline]
      vfs_write+0x468/0x760 fs/read_write.c:578
      ksys_write+0xe8/0x1a0 fs/read_write.c:631
      __do_sys_write fs/read_write.c:643 [inline]
      __se_sys_write fs/read_write.c:640 [inline]
      __x64_sys_write+0x3e/0x50 fs/read_write.c:640
      do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      read to 0xffffea0004a1d2c8 of 8 bytes by task 18611 on cpu 1:
      page_is_pfmemalloc include/linux/mm.h:1740 [inline]
      __skb_fill_page_desc include/linux/skbuff.h:2422 [inline]
      skb_fill_page_desc include/linux/skbuff.h:2443 [inline]
      tcp_build_frag+0x613/0xb20 net/ipv4/tcp.c:1018
      do_tcp_sendpages+0x3e8/0xaf0 net/ipv4/tcp.c:1075
      tcp_sendpage_locked net/ipv4/tcp.c:1140 [inline]
      tcp_sendpage+0x89/0xb0 net/ipv4/tcp.c:1150
      inet_sendpage+0x7f/0xc0 net/ipv4/af_inet.c:833
      kernel_sendpage+0x184/0x300 net/socket.c:3561
      sock_sendpage+0x5a/0x70 net/socket.c:1054
      pipe_to_sendpage+0x128/0x160 fs/splice.c:361
      splice_from_pipe_feed fs/splice.c:415 [inline]
      __splice_from_pipe+0x222/0x4d0 fs/splice.c:559
      splice_from_pipe fs/splice.c:594 [inline]
      generic_splice_sendpage+0x89/0xc0 fs/splice.c:743
      do_splice_from fs/splice.c:764 [inline]
      direct_splice_actor+0x80/0xa0 fs/splice.c:931
      splice_direct_to_actor+0x305/0x620 fs/splice.c:886
      do_splice_direct+0xfb/0x180 fs/splice.c:974
      do_sendfile+0x3bf/0x910 fs/read_write.c:1249
      __do_sys_sendfile64 fs/read_write.c:1317 [inline]
      __se_sys_sendfile64 fs/read_write.c:1303 [inline]
      __x64_sys_sendfile64+0x10c/0x150 fs/read_write.c:1303
      do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      value changed: 0x0000000000000000 -> 0xffffea0004a1d288
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 1 PID: 18611 Comm: syz-executor.4 Not tainted 6.0.0-rc2-syzkaller-00248-ge022620b-dirty #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/22/2022
      
      Fixes: c07aea3e ("mm: add a signature in struct page")
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Reviewed-by: default avatarShakeel Butt <shakeelb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      32614006
    • Dan Carpenter's avatar
      tipc: fix shift wrapping bug in map_get() · e2b224ab
      Dan Carpenter authored
      There is a shift wrapping bug in this code so anything thing above
      31 will return false.
      
      Fixes: 35c55c98 ("tipc: add neighbor monitoring framework")
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e2b224ab
    • Toke Høiland-Jørgensen's avatar
      sch_sfb: Don't assume the skb is still around after enqueueing to child · 9efd2329
      Toke Høiland-Jørgensen authored
      The sch_sfb enqueue() routine assumes the skb is still alive after it has
      been enqueued into a child qdisc, using the data in the skb cb field in the
      increment_qlen() routine after enqueue. However, the skb may in fact have
      been freed, causing a use-after-free in this case. In particular, this
      happens if sch_cake is used as a child of sfb, and the GSO splitting mode
      of CAKE is enabled (in which case the skb will be split into segments and
      the original skb freed).
      
      Fix this by copying the sfb cb data to the stack before enqueueing the skb,
      and using this stack copy in increment_qlen() instead of the skb pointer
      itself.
      
      Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-18231
      Fixes: e13e02a3 ("net_sched: SFB flow scheduler")
      Signed-off-by: default avatarToke Høiland-Jørgensen <toke@toke.dk>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9efd2329
    • Heiner Kallweit's avatar
      Revert "net: phy: meson-gxl: improve link-up behavior" · 7fdc7766
      Heiner Kallweit authored
      This reverts commit 2c87c6f9.
      Meanwhile it turned out that the following commit is the proper
      workaround for the issue that 2c87c6f9 tries to address.
      a3a57bf0 ("net: stmmac: work around sporadic tx issue on link-up")
      It's nor clear why the to be reverted commit helped for one user,
      for others it didn't make a difference.
      
      Fixes: 2c87c6f9 ("net: phy: meson-gxl: improve link-up behavior")
      Signed-off-by: default avatarHeiner Kallweit <hkallweit1@gmail.com>
      Link: https://lore.kernel.org/r/8deeeddc-6b71-129b-1918-495a12dc11e3@gmail.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      7fdc7766
  2. 01 Sep, 2022 11 commits
  3. 31 Aug, 2022 19 commits
  4. 30 Aug, 2022 2 commits
  5. 29 Aug, 2022 4 commits
    • Bart Van Assche's avatar
      tracing: Define the is_signed_type() macro once · dcf8e563
      Bart Van Assche authored
      There are two definitions of the is_signed_type() macro: one in
      <linux/overflow.h> and a second definition in <linux/trace_events.h>.
      
      As suggested by Linus, move the definition of the is_signed_type() macro
      into the <linux/compiler.h> header file.  Change the definition of the
      is_signed_type() macro to make sure that it does not trigger any sparse
      warnings with future versions of sparse for bitwise types.
      
      Link: https://lore.kernel.org/all/CAHk-=whjH6p+qzwUdx5SOVVHjS3WvzJQr6mDUwhEyTf6pJWzaQ@mail.gmail.com/
      Link: https://lore.kernel.org/all/CAHk-=wjQGnVfb4jehFR0XyZikdQvCZouE96xR_nnf5kqaM5qqQ@mail.gmail.com/
      Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Acked-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarBart Van Assche <bvanassche@acm.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      dcf8e563
    • Linus Torvalds's avatar
      Merge tag 'docs-6.0-fixes' of git://git.lwn.net/linux · d68d289f
      Linus Torvalds authored
      Pull documentation fixes from Jonathan Corbet:
       "A handful of fixes for documentation and the docs build system"
      
      * tag 'docs-6.0-fixes' of git://git.lwn.net/linux:
        docs/conf.py: add function attribute '__fix_address' to conf.py
        Docs/admin-guide/mm/damon/usage: fix the example code snip
        docs: Update version number from 5.x to 6.x in README.rst
        docs/ja_JP/SubmittingPatches: Remove reference to submitting-drivers.rst
        docs: kerneldoc-preamble: Test xeCJK.sty before loading
      d68d289f
    • David S. Miller's avatar
      Merge branch 'u64_stats-fixups' · cb10b0f9
      David S. Miller authored
      Sebastian Andrzej Siewior says:
      
      ====================
      net: u64_stats fixups for 32bit.
      
      while looking at the u64-stats patch
      	https://lore.kernel.org/all/20220817162703.728679-10-bigeasy@linutronix.de
      
      I noticed that u64_stats_fetch_begin() is used. That suspicious thing
      about it is that network processing, including stats update, is
      performed in NAPI and so I would expect to see
      u64_stats_fetch_begin_irq() in order to avoid updates from NAPI during
      the read. This is only needed on 32bit-UP where the seqcount is not
      used. This is address in 2/2. The remaining user take some kind of
      precaution and may use u64_stats_fetch_begin().
      
      I updated the previously mentioned patch to get rid of
      u64_stats_fetch_begin_irq(). If this is not considered stable patch
      worthy then it can be ignored and considred fixed by the other series
      which removes the special 32bit cases.
      
      The xrs700x driver reads and writes the counter from preemptible context
      so the only missing piece here is at least disable preemption on the
      writer side to avoid preemption while the writer is in progress. The
      possible reader would spin then until the writer completes its write
      critical section which is considered bad. This is addressed in 1/2 by
      using u64_stats_update_begin_irqsave() and so disable interrupts during
      the write critical section.
      The other closet resemblance I found is mdio_bus.c::mdiobus_stats_acct()
      where preemtion is disabled unconditionally. This is something I want to
      avoid since it also affects 64bit.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      cb10b0f9
    • Sebastian Andrzej Siewior's avatar
      net: Use u64_stats_fetch_begin_irq() for stats fetch. · 278d3ba6
      Sebastian Andrzej Siewior authored
      On 32bit-UP u64_stats_fetch_begin() disables only preemption. If the
      reader is in preemptible context and the writer side
      (u64_stats_update_begin*()) runs in an interrupt context (IRQ or
      softirq) then the writer can update the stats during the read operation.
      This update remains undetected.
      
      Use u64_stats_fetch_begin_irq() to ensure the stats fetch on 32bit-UP
      are not interrupted by a writer. 32bit-SMP remains unaffected by this
      change.
      
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Catherine Sullivan <csully@google.com>
      Cc: David Awogbemila <awogbemila@google.com>
      Cc: Dimitris Michailidis <dmichail@fungible.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Hans Ulli Kroll <ulli.kroll@googlemail.com>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: Jeroen de Borst <jeroendb@google.com>
      Cc: Johannes Berg <johannes@sipsolutions.net>
      Cc: Linus Walleij <linus.walleij@linaro.org>
      Cc: Paolo Abeni <pabeni@redhat.com>
      Cc: Simon Horman <simon.horman@corigine.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-wireless@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Cc: oss-drivers@corigine.com
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Reviewed-by: default avatarSimon Horman <simon.horman@corigine.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      278d3ba6