1. 26 Sep, 2024 4 commits
    • David Howells's avatar
      netfs: Fix write oops in generic/346 (9p) and generic/074 (cifs) · df9b4556
      David Howells authored
      In netfslib, a buffered writeback operation has a 'write queue' of folios
      that are being written, held in a linear sequence of folio_queue structs.
      The 'issuer' adds new folio_queues on the leading edge of the queue and
      populates each one progressively; the 'collector' pops them off the
      trailing edge and discards them and the folios they point to as they are
      consumed.
      
      The queue is required to always retain at least one folio_queue structure.
      This allows the queue to be accessed without locking and with just a bit of
      barriering.
      
      When a new subrequest is prepared, its ->io_iter iterator is pointed at the
      current end of the write queue and then the iterator is extended as more
      data is added to the queue until the subrequest is committed.
      
      Now, the problem is that the folio_queue at the leading edge of the write
      queue when a subrequest is prepared might have been entirely consumed - but
      not yet removed from the queue as it is the only remaining one and is
      preventing the queue from collapsing.
      
      So, what happens is that subreq->io_iter is pointed at the spent
      folio_queue, then a new folio_queue is added, and, at that point, the
      collector is at entirely at liberty to immediately delete the spent
      folio_queue.
      
      This leaves the subreq->io_iter pointing at a freed object.  If the system
      is lucky, iterate_folioq() sees ->io_iter, sees the as-yet uncorrupted
      freed object and advances to the next folio_queue in the queue.
      
      In the case seen, however, the freed object gets recycled and put back onto
      the queue at the tail and filled to the end.  This confuses
      iterate_folioq() and it tries to step ->next, which may be NULL - resulting
      in an oops.
      
      Fix this by the following means:
      
       (1) When preparing a write subrequest, make sure there's a folio_queue
           struct with space in it at the leading edge of the queue.  A function
           to make space is split out of the function to append a folio so that
           it can be called for this purpose.
      
       (2) If the request struct iterator is pointing to a completely spent
           folio_queue when we make space, then advance the iterator to the newly
           allocated folio_queue.  The subrequest's iterator will then be set
           from this.
      
      The oops could be triggered using the generic/346 xfstest with a filesystem
      on9P over TCP with cache=loose.  The oops looked something like:
      
       BUG: kernel NULL pointer dereference, address: 0000000000000008
       #PF: supervisor read access in kernel mode
       #PF: error_code(0x0000) - not-present page
       ...
       RIP: 0010:_copy_from_iter+0x2db/0x530
       ...
       Call Trace:
        <TASK>
       ...
        p9pdu_vwritef+0x3d8/0x5d0
        p9_client_prepare_req+0xa8/0x140
        p9_client_rpc+0x81/0x280
        p9_client_write+0xcf/0x1c0
        v9fs_issue_write+0x87/0xc0
        netfs_advance_write+0xa0/0xb0
        netfs_write_folio.isra.0+0x42d/0x500
        netfs_writepages+0x15a/0x1f0
        do_writepages+0xd1/0x220
        filemap_fdatawrite_wbc+0x5c/0x80
        v9fs_mmap_vm_close+0x7d/0xb0
        remove_vma+0x35/0x70
        vms_complete_munmap_vmas+0x11a/0x170
        do_vmi_align_munmap+0x17d/0x1c0
        do_vmi_munmap+0x13e/0x150
        __vm_munmap+0x92/0xd0
        __x64_sys_munmap+0x17/0x20
        do_syscall_64+0x80/0xe0
        entry_SYSCALL_64_after_hwframe+0x71/0x79
      
      This also fixed a similar-looking issue with cifs and generic/074.
      
      Fixes: cd0277ed ("netfs: Use new folio_queue data type and iterator instead of xarray iter")
      Reported-by: default avatarkernel test robot <oliver.sang@intel.com>
      Closes: https://lore.kernel.org/oe-lkp/202409180928.f20b5a08-oliver.sang@intel.com
      Closes: https://lore.kernel.org/oe-lkp/202409131438.3f225fbf-oliver.sang@intel.comSigned-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Tested-by: default avatarkernel test robot <oliver.sang@intel.com>
      cc: Eric Van Hensbergen <ericvh@kernel.org>
      cc: Latchesar Ionkov <lucho@ionkov.net>
      cc: Dominique Martinet <asmadeus@codewreck.org>
      cc: Christian Schoenebeck <linux_oss@crudebyte.com>
      cc: Paulo Alcantara <pc@manguebit.com>
      cc: Jeff Layton <jlayton@kernel.org>
      cc: v9fs@lists.linux.dev
      cc: linux-cifs@vger.kernel.org
      cc: netfs@lists.linux.dev
      cc: linux-fsdevel@vger.kernel.org
      Signed-off-by: default avatarSteve French <stfrench@microsoft.com>
      df9b4556
    • Linus Torvalds's avatar
      Merge tag 'v6.12-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6 · ac34bb40
      Linus Torvalds authored
      Pull smb client fixes from Steve French:
       "Most are from the recent SMB3.1.1 test event, and also an important
        netfs fix for a cifs mtime write regression
      
         - fix mode reported by stat of readonly directories and files
      
         - DFS (global namespace) related fixes
      
         - fixes for special file support via reparse points
      
         - mount improvement and reconnect fix
      
         - fix for noisy log message on umount
      
         - two netfs related fixes, one fixing a recent regression, and add
           new write tracepoint"
      
      * tag 'v6.12-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6:
        netfs, cifs: Fix mtime/ctime update for mmapped writes
        cifs: update internal version number
        smb: client: print failed session logoffs with FYI
        cifs: Fix reversion of the iter in cifs_readv_receive().
        smb3: fix incorrect mode displayed for read-only files
        smb: client: fix parsing of device numbers
        smb: client: set correct device number on nfs reparse points
        smb: client: propagate error from cifs_construct_tcon()
        smb: client: fix DFS failover in multiuser mounts
        cifs: Make the write_{enter,done,err} tracepoints display netfs info
        smb: client: fix DFS interlink failover
        smb: client: improve purging of cached referrals
        smb: client: avoid unnecessary reconnects when refreshing referrals
      ac34bb40
    • Linus Torvalds's avatar
      Merge tag 'probes-v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace · 5159938e
      Linus Torvalds authored
      Pull probes updates from Masami Hiramatsu:
      
       - uprobes: make trace_uprobe->nhit counter a per-CPU one
      
         This makes uprobe event's hit counter per-CPU for improving
         scalability on multi-core environment
      
       - kprobes: Remove obsoleted declaration for init_test_probes
      
         Remove unused init_test_probes() from header
      
       - Raw tracepoint probe supports raw tracepoint events on modules:
           - add a function for iterating over all tracepoints in all modules
           - add a function for iterating over tracepoints in a module
           - support raw tracepoint events on modules
           - support raw tracepoints on future loaded modules
           - add a test for tracepoint events on modules"
      
      * tag 'probes-v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
        sefltests/tracing: Add a test for tracepoint events on modules
        tracing/fprobe: Support raw tracepoints on future loaded modules
        tracing/fprobe: Support raw tracepoint events on modules
        tracepoint: Support iterating tracepoints in a loading module
        tracepoint: Support iterating over tracepoints on modules
        kprobes: Remove obsoleted declaration for init_test_probes
        uprobes: turn trace_uprobe's nhit counter to be per-CPU one
      5159938e
    • Linus Torvalds's avatar
      Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost · 0181f8c8
      Linus Torvalds authored
      Pull virtio updates from Michael Tsirkin:
       "Several new features here:
      
         - virtio-balloon supports new stats
      
         - vdpa supports setting mac address
      
         - vdpa/mlx5 suspend/resume as well as MKEY ops are now faster
      
         - virtio_fs supports new sysfs entries for queue info
      
         - virtio/vsock performance has been improved
      
        And fixes, cleanups all over the place"
      
      * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (34 commits)
        vsock/virtio: avoid queuing packets when intermediate queue is empty
        vsock/virtio: refactor virtio_transport_send_pkt_work
        fw_cfg: Constify struct kobj_type
        vdpa/mlx5: Postpone MR deletion
        vdpa/mlx5: Introduce init/destroy for MR resources
        vdpa/mlx5: Rename mr_mtx -> lock
        vdpa/mlx5: Extract mr members in own resource struct
        vdpa/mlx5: Rename function
        vdpa/mlx5: Delete direct MKEYs in parallel
        vdpa/mlx5: Create direct MKEYs in parallel
        MAINTAINERS: add virtio-vsock driver in the VIRTIO CORE section
        virtio_fs: add sysfs entries for queue information
        virtio_fs: introduce virtio_fs_put_locked helper
        vdpa: Remove unused declarations
        vdpa/mlx5: Parallelize VQ suspend/resume for CVQ MQ command
        vdpa/mlx5: Small improvement for change_num_qps()
        vdpa/mlx5: Keep notifiers during suspend but ignore
        vdpa/mlx5: Parallelize device resume
        vdpa/mlx5: Parallelize device suspend
        vdpa/mlx5: Use async API for vq modify commands
        ...
      0181f8c8
  2. 25 Sep, 2024 36 commits