1. 29 Jun, 2021 26 commits
  2. 27 Jun, 2021 2 commits
    • Linux 5.13 · 62fb9874
      Linus Torvalds authored
    • Revert "signal: Allow tasks to cache one sigqueue struct" · b4b27b9e
      Linus Torvalds authored
      This reverts commits 4bad58eb (and
      399f8dd9, which tried to fix it).
      
      I do not believe these are correct, and I'm about to release 5.13, so am
      reverting them out of an abundance of caution.
      
      The locking is odd, and appears broken.
      
      On the allocation side (in __sigqueue_alloc()), the locking is somewhat
      straightforward: it depends on sighand->siglock.  Since one caller
      doesn't hold that lock, it further then tests 'sigqueue_flags' to avoid
      the case with no locks held.
      
      On the freeing side (in sigqueue_cache_or_free()), there is no locking
      at all, and the logic instead depends on 'current' being a single
      thread, and not able to race with itself.
      
      To make things more exciting, there's also the data race between freeing
      a signal and allocating one, which is handled by using WRITE_ONCE() and
      READ_ONCE(), and being mutually exclusive wrt the initial state (ie
      freeing will only free if the old state was NULL, while allocating will
      obviously only use the value if it was non-NULL, so only one or the
      other will actually act on the value).
      
      However, while the free->alloc paths do seem mutually exclusive thanks
      to just the data value dependency, it's not clear what the memory
      ordering constraints are on it.  Could writes from the previous
      allocation possibly be delayed and seen by the new allocation later,
      causing logical inconsistencies?
      
      So it's all very exciting and unusual.
      
      And in particular, it seems that the freeing side is incorrect in
      depending on "current" being single-threaded.  Yes, 'current' is a
      single thread, but in the presence of asynchronous events even a single
      thread can have data races.
      
      And such asynchronous events can and do happen, with interrupts causing
      signals to be flushed and thus free'd (for example - sending a
      SIGCONT/SIGSTOP can happen from interrupt context, and can flush
      previously queued process control signals).
      
      So regardless of all the other questions about the memory ordering and
      locking for this new cached allocation, the sigqueue_cache_or_free()
      assumptions seem to be fundamentally incorrect.
      
      It may be that people will show me the errors of my ways, and tell me
      why this is all safe after all.  We can reinstate it if so.  But my
      current belief is that the WRITE_ONCE() that sets the cached entry needs
      to be a smp_store_release(), and the READ_ONCE() that finds a cached
      entry needs to be a smp_load_acquire() to handle memory ordering
      correctly.
      
      And the sequence in sigqueue_cache_or_free() would need to either use a
      lock or at least be interrupt-safe some way (perhaps by using something
      like the percpu 'cmpxchg': it doesn't need to be SMP-safe, but like the
      percpu operations it needs to be interrupt-safe).
      
      Fixes: 399f8dd9 ("signal: Prevent sigqueue caching after task got released")
      Fixes: 4bad58eb ("signal: Allow tasks to cache one sigqueue struct")
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Christian Brauner <christian.brauner@ubuntu.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
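Illustrative note on the memory-ordering point in the revert above: a minimal userspace sketch, using C11 atomics rather than kernel primitives, of a one-slot cache whose publishing store has release ordering and whose consuming load has acquire ordering. `struct entry`, `cache_slot`, `cache_or_free()` and `alloc_entry()` are hypothetical names and this is not the reverted kernel code. The compare-exchange used here is a full SMP-safe atomic, which is stronger than the interrupt-safe percpu-style operation the commit message says would suffice.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a queued-signal object; not the kernel's struct. */
struct entry {
    int payload;
};

/* One-slot cache. NULL means the slot is empty. */
static _Atomic(struct entry *) cache_slot;

/*
 * Free side: park the entry in the slot if the slot is empty, otherwise
 * free it. The compare-exchange makes "test for NULL" and "store" a single
 * atomic step, so an asynchronous caller cannot slip in between them.
 * Release ordering on success publishes earlier writes to *e together
 * with the pointer. Returns 1 if the entry was cached, 0 if it was freed.
 */
static int cache_or_free(struct entry *e)
{
    struct entry *expected = NULL;

    if (atomic_compare_exchange_strong_explicit(&cache_slot, &expected, e,
                                                memory_order_release,
                                                memory_order_relaxed))
        return 1;

    free(e);
    return 0;
}

/*
 * Alloc side: take the cached entry if there is one, else allocate.
 * The exchange empties the slot atomically; acquire ordering pairs with
 * the release above, so the taker sees the writes made before caching.
 */
static struct entry *alloc_entry(void)
{
    struct entry *e = atomic_exchange_explicit(&cache_slot, NULL,
                                               memory_order_acquire);

    if (e == NULL)
        e = malloc(sizeof(*e));
    return e;
}
```

Calling `cache_or_free()` on a fresh entry parks it in the empty slot, and the next `alloc_entry()` hands the same pointer back. A second free while the slot is occupied falls through to `free()`.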
  3. 26 Jun, 2021 2 commits
  4. 25 Jun, 2021 10 commits
    • Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · e2f527b5
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "Two small fixes, both in upper layer drivers (scsi disk and cdrom).
      
        The sd one is fixing a commit changing revalidation that came from the
        block tree a while ago (5.10) and the sr one adds handling of a
        condition we didn't previously handle for manually removed media"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: sd: Call sd_revalidate_disk() for ioctl(BLKRRPART)
        scsi: sr: Return appropriate error code when disk is ejected
    • Merge branch 'akpm' (patches from Andrew) · 7ce32ac6
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "24 patches, based on 4a09d388.
      
        Subsystems affected by this patch series: mm (thp, vmalloc, hugetlb,
        memory-failure, and pagealloc), nilfs2, kthread, MAINTAINERS, and
        mailmap"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (24 commits)
        mailmap: add Marek's other e-mail address and identity without diacritics
        MAINTAINERS: fix Marek's identity again
        mm/page_alloc: do bulk array bounds check after checking populated elements
        mm/page_alloc: __alloc_pages_bulk(): do bounds check before accessing array
        mm/hwpoison: do not lock page again when me_huge_page() successfully recovers
        mm,hwpoison: return -EHWPOISON to denote that the page has already been poisoned
        mm/memory-failure: use a mutex to avoid memory_failure() races
        mm, futex: fix shared futex pgoff on shmem huge page
        kthread: prevent deadlock when kthread_mod_delayed_work() races with kthread_cancel_delayed_work_sync()
        kthread_worker: split code for canceling the delayed work timer
        mm/vmalloc: unbreak kasan vmalloc support
        KVM: s390: prepare for hugepage vmalloc
        mm/vmalloc: add vmalloc_no_huge
        nilfs2: fix memory leak in nilfs_sysfs_delete_device_group
        mm/thp: another PVMW_SYNC fix in page_vma_mapped_walk()
        mm/thp: fix page_vma_mapped_walk() if THP mapped by ptes
        mm: page_vma_mapped_walk(): get vma_address_end() earlier
        mm: page_vma_mapped_walk(): use goto instead of while (1)
        mm: page_vma_mapped_walk(): add a level of indentation
        mm: page_vma_mapped_walk(): crossing page table boundary
        ...
    • userfaultfd: uapi: fix UFFDIO_CONTINUE ioctl request definition · 808e9df4
      Gleb Fotengauer-Malinovskiy authored
      This ioctl request reads from uffdio_continue structure written by
      userspace which justifies _IOC_WRITE flag.  It also writes back to that
      structure which justifies _IOC_READ flag.
      
      See NOTEs in include/uapi/asm-generic/ioctl.h for more information.
      
      Fixes: f6191471 ("userfaultfd: add UFFDIO_CONTINUE ioctl")
      Signed-off-by: Gleb Fotengauer-Malinovskiy <glebfm@altlinux.org>
      Acked-by: Peter Xu <peterx@redhat.com>
      Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
      Reviewed-by: Dmitry V. Levin <ldv@altlinux.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
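Illustrative note on the direction flags discussed in the commit above: a minimal sketch, assuming a Linux userspace build with the uapi headers available, of how the `_IOR()` and `_IOWR()` encodings differ. `struct demo_arg`, `DEMO_MAGIC`, `DEMO_NR` and the two helper functions are hypothetical and are not the userfaultfd definitions. The macros `_IOR`, `_IOWR`, `_IOC_DIR`, `_IOC_SIZE`, `_IOC_READ` and `_IOC_WRITE` come from the kernel's ioctl header.

```c
#include <assert.h>
#include <linux/ioctl.h>

/* Hypothetical argument structure: input fields plus a result field. */
struct demo_arg {
    unsigned long long start;   /* filled in by userspace */
    unsigned long long len;     /* filled in by userspace */
    long long result;           /* written back by the kernel */
};

/* Hypothetical magic and request number, chosen only for this sketch. */
#define DEMO_MAGIC 'D'
#define DEMO_NR    0x01

/* Encoding that only declares "kernel writes, userspace reads". */
#define DEMO_REQ_RO _IOR(DEMO_MAGIC, DEMO_NR, struct demo_arg)
/* Encoding that declares both directions. */
#define DEMO_REQ_RW _IOWR(DEMO_MAGIC, DEMO_NR, struct demo_arg)

/* _IOC_WRITE: userspace writes the argument and the kernel reads it. */
static int userspace_writes_arg(unsigned int req)
{
    return (_IOC_DIR(req) & _IOC_WRITE) != 0;
}

/* _IOC_READ: the kernel writes the argument and userspace reads it. */
static int kernel_writes_back(unsigned int req)
{
    return (_IOC_DIR(req) & _IOC_READ) != 0;
}
```

A request whose structure is both consumed and updated by the kernel, as the commit message describes, matches the `DEMO_REQ_RW` shape: both helpers report true for it, while `DEMO_REQ_RO` reports only the write-back direction.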
    • Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 55fcd449
      Linus Torvalds authored
      Pull i2c fixes from Wolfram Sang:
       "Three more driver bugfixes and an annotation fix for the core"
      
      * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: robotfuzz-osif: fix control-request directions
        i2c: dev: Add __user annotation
        i2c: cp2615: check for allocation failure in cp2615_i2c_recv()
        i2c: i801: Ensure that SMBHSTSTS_INUSE_STS is cleared when leaving i801_access
    • Merge tag 'devprop-5.13-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 7764c62f
      Linus Torvalds authored
      Pull device properties framework fix from Rafael Wysocki:
       "Fix a NULL pointer dereference introduced by a recent commit and
        occurring when device_remove_software_node() is used with a device
        that has never been registered (Heikki Krogerus)"
      
      * tag 'devprop-5.13-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        software node: Handle software node injection to an existing device properly
    • Merge tag 'for-linus-5.13b-rc8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · b960e014
      Linus Torvalds authored
      Pull xen fix from Juergen Gross:
       "A fix for a regression introduced in 5.12: when migrating an irq
        related to a Xen user event to another cpu, a race might result
        in a WARN() triggering"
      
      * tag 'for-linus-5.13b-rc8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        xen/events: reset active flag for lateeoi events later
    • Merge tag 'for-linus-urgent' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 616a99dd
      Linus Torvalds authored
      Pull kvm fixes from Paolo Bonzini:
       "A selftests fix for ARM, and the fix for page reference count
        underflow. This is a very small fix that was provided by Nick Piggin
        and tested by myself"
      
      * tag 'for-linus-urgent' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: do not allow mapping valid but non-reference-counted pages
        KVM: selftests: Fix mapping length truncation in m{,un}map()
    • Merge tag 'x86_urgent_for_v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 94ca94bb
      Linus Torvalds authored
      Pull x86 fixes from Borislav Petkov:
       "Two more urgent FPU fixes:
      
         - prevent unprivileged userspace from reinitializing supervisor
           states
      
         - prepare init_fpstate, which is the buffer used when initializing
           FPU state, properly in case the skip-writing-state-components
           XSAVE* variants are used"
      
      * tag 'x86_urgent_for_v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/fpu: Make init_fpstate correct with optimized XSAVE
        x86/fpu: Preserve supervisor states in sanitize_restored_user_xstate()
    • Merge tag 'ceph-for-5.13-rc8' of https://github.com/ceph/ceph-client · edf54d9d
      Linus Torvalds authored
      Pull ceph fixes from Ilya Dryomov:
       "Two regression fixes from the merge window: one in the auth code
        affecting old clusters and one in the filesystem for proper
        propagation of MDS request errors.
      
        Also included a locking fix for async creates, marked for stable"
      
      * tag 'ceph-for-5.13-rc8' of https://github.com/ceph/ceph-client:
        libceph: set global_id as soon as we get an auth ticket
        libceph: don't pass result into ac->ops->handle_reply()
        ceph: fix error handling in ceph_atomic_open and ceph_lookup
        ceph: must hold snap_rwsem when filling inode for async create
    • Merge tag 'netfs-fixes-20210621' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs · 9e736cf7
      Linus Torvalds authored
      Pull netfs fixes from David Howells:
       "This contains patches to fix netfs_write_begin() and afs_write_end()
        in the following ways:
      
        (1) In netfs_write_begin(), extract the decision about whether to skip
            a page out to its own helper and have that clear around the region
            to be written, but not clear that region. This requires the
            filesystem to patch it up afterwards if the hole doesn't get
            completely filled.
      
        (2) Use offset_in_thp() in (1) rather than manually calculating the
            offset into the page.
      
        (3) Due to (1), afs_write_end() now needs to handle short data write
            into the page by generic_perform_write(). I've adopted an
            analogous approach to ceph of just returning 0 in this case and
            letting the caller go round again.
      
        It also adds a note that (in the future) the len parameter may extend
        beyond the page allocated. This is because the page allocation is
        deferred to write_begin() and that gets to decide what size of THP to
        allocate."
      
      Jeff Layton points out:
       "The netfs fix in particular fixes a data corruption bug in cephfs"
      
      * tag 'netfs-fixes-20210621' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
        netfs: fix test for whether we can skip read when writing beyond EOF
        afs: Fix afs_write_end() to handle short writes