1. 11 Feb, 2021 2 commits
    • PCI: Revoke mappings like devmem · 636b21b5
      Daniel Vetter authored
      Since 3234ac66 ("/dev/mem: Revoke mappings when a driver claims
      the region") /dev/mem zaps PTEs when the kernel requests exclusive
      access to an iomem region. And with CONFIG_IO_STRICT_DEVMEM, this is
      the default for all driver uses.
      
      Except there are two more ways to access PCI BARs: sysfs and proc mmap
      support. Let's plug that hole.
      
      For revoke_devmem() to work we need to link our vma into the same
      address_space, with consistent vma->vm_pgoff. ->pgoff is already
      adjusted, because that's how (io_)remap_pfn_range works, but for the
      mapping we need to adjust vma->vm_file->f_mapping. The cleanest way is
      to adjust this at ->open time:
      
      - for sysfs this is easy, now that binary attributes support this. We
        just set bin_attr->mapping when mmap is supported
      - for procfs it's a bit more tricky, since procfs PCI access has only
        one file per device, and access to a specific resource first needs
        to be set up with some ioctl calls. But mmap is only supported for
        the same resources as sysfs exposes with mmap support, and otherwise
        rejected, so we can set the mapping unconditionally at open time
        without harm.
      
      A special consideration is for arch_can_pci_mmap_io() - we need to
      make sure that the ->f_mapping doesn't alias between ioport and iomem
      space. There are only 2 ways in-tree to support mmap of ioports: generic
      PCI mmap (ARCH_GENERIC_PCI_MMAP_RESOURCE), and sparc as the single
      architecture hand-rolling. Both approaches support ioport mmap through a
      special PFN range and not through magic PTE attributes. Aliasing is
      therefore not a problem.
      
      The only difference in access checks left is that sysfs PCI mmap does
      not check for CAP_RAWIO. I'm not really sure whether that should be
      added or not.
      Acked-by: Bjorn Helgaas <bhelgaas@google.com>
      Reviewed-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: Jérôme Glisse <jglisse@redhat.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: linux-mm@kvack.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-samsung-soc@vger.kernel.org
      Cc: linux-media@vger.kernel.org
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: linux-pci@vger.kernel.org
      Link: https://patchwork.freedesktop.org/patch/msgid/20210204165831.2703772-3-daniel.vetter@ffwll.ch
    • PCI: Also set up legacy files only after sysfs init · efd532a6
      Daniel Vetter authored
      We are already doing this for all the regular sysfs files on PCI
      devices, but not yet on the legacy io files on the PCI buses. Thus far
      no problem, but in the next patch I want to wire up iomem revoke
      support. That needs the vfs up and running already to make sure that
      iomem_get_mapping() works.
      
      Wire it up exactly like the existing code in
      pci_create_sysfs_dev_files(). Note that pci_remove_legacy_files()
      doesn't need a check since the one for pci_bus->legacy_io is
      sufficient.
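
      Illustration only, not part of the patch: a minimal userspace model
      of the ordering guard described above. All model_* names are
      hypothetical stand-ins; the real code lives in the PCI sysfs layer
      and differs in detail. Creation is a no-op until the init step has
      run, and the init step then walks every bus that already exists.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; every model_* name is hypothetical. */
struct model_bus {
    bool legacy_files_created;
};

/* Set once the (modelled) sysfs/VFS infrastructure is ready. */
static bool model_sysfs_initialized;

/* Called when a bus is discovered; silently skipped if sysfs is not up yet. */
void model_create_legacy_files(struct model_bus *bus)
{
    if (!model_sysfs_initialized)
        return;
    bus->legacy_files_created = true;
}

/* Late init: flip the flag, then catch up on every bus found earlier. */
void model_sysfs_init(struct model_bus *buses, size_t n)
{
    model_sysfs_initialized = true;
    for (size_t i = 0; i < n; i++)
        model_create_legacy_files(&buses[i]);
}
```

      A bus discovered before init gets its files during the catch-up
      walk; a bus discovered afterwards gets them immediately.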
      
      An alternative solution would be to implement a callback in sysfs to
      set up the address space from iomem_get_mapping() when userspace calls
      mmap(). This also works, but Greg didn't really like that just to work
      around an ordering issue when the kernel loads initially.
      
      v2: Improve commit message (Bjorn)
      Acked-by: Bjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: Jérôme Glisse <jglisse@redhat.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: linux-mm@kvack.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-samsung-soc@vger.kernel.org
      Cc: linux-media@vger.kernel.org
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: linux-pci@vger.kernel.org
      Link: https://patchwork.freedesktop.org/patch/msgid/20210205133632.2827730-1-daniel.vetter@ffwll.ch
  2. 12 Jan, 2021 11 commits
    • sysfs: Support zapping of binary attr mmaps · 74b30195
      Daniel Vetter authored
      We want to be able to revoke pci mmaps so that the same access rules
      apply as for /dev/mem. Revoke support for devmem was added in
      3234ac66 ("/dev/mem: Revoke mappings when a driver claims the
      region").
      
      The simplest way to achieve this is by having the same filp->f_mapping
      for all mappings, so that unmap_mapping_range can find them all, no
      matter through which file they've been created. Since this must be set
      at open time we need sysfs support for this.
      
      Add an optional mapping parameter to bin_attr, which is only consulted
      when there's also an mmap callback, since adjusting ->f_mapping makes
      no sense without mmap support.
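
      Illustration only, not part of the patch: a minimal userspace model
      of why a shared f_mapping makes revocation work. The model_* types
      and functions are hypothetical stand-ins for struct file, struct
      address_space and unmap_mapping_range(). As a simplification, the
      model zaps a whole vma as soon as it overlaps the range.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; every model_* name is hypothetical. */
#define MODEL_MAX_VMAS 8

struct model_vma {
    unsigned long pgoff;    /* first page offset covered by the mapping */
    unsigned long npages;   /* length of the mapping in pages */
    bool mapped;            /* false once the PTEs have been zapped */
};

struct model_mapping {
    struct model_vma *vmas[MODEL_MAX_VMAS];
    size_t count;
};

struct model_file {
    struct model_mapping *f_mapping;   /* chosen once, at open time */
};

/* Open: point the file at the shared mapping instead of a private one. */
void model_open(struct model_file *f, struct model_mapping *shared)
{
    f->f_mapping = shared;
}

/* mmap: the vma is linked into whatever mapping the file points at. */
bool model_mmap(struct model_file *f, struct model_vma *vma,
                unsigned long pgoff, unsigned long npages)
{
    struct model_mapping *m = f->f_mapping;

    if (m->count == MODEL_MAX_VMAS)
        return false;
    vma->pgoff = pgoff;
    vma->npages = npages;
    vma->mapped = true;
    m->vmas[m->count++] = vma;
    return true;
}

/* Zap every vma overlapping [pgoff, pgoff + npages); returns how many. */
size_t model_unmap_mapping_range(struct model_mapping *m,
                                 unsigned long pgoff, unsigned long npages)
{
    size_t zapped = 0;

    for (size_t i = 0; i < m->count; i++) {
        struct model_vma *v = m->vmas[i];
        bool overlaps = v->pgoff < pgoff + npages &&
                        pgoff < v->pgoff + v->npages;

        if (overlaps && v->mapped) {
            v->mapped = false;
            zapped++;
        }
    }
    return zapped;
}
```

      Because both files share one mapping, a single revoke call finds
      the vmas created through either of them.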
      Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: Jérôme Glisse <jglisse@redhat.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: linux-mm@kvack.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-samsung-soc@vger.kernel.org
      Cc: linux-media@vger.kernel.org
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: linux-pci@vger.kernel.org
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "Rafael J. Wysocki" <rafael@kernel.org>
      Cc: Christian Brauner <christian.brauner@ubuntu.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Sourabh Jain <sourabhjain@linux.ibm.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
      Cc: Nayna Jain <nayna@linux.ibm.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201127164131.2244124-12-daniel.vetter@ffwll.ch
    • resource: Move devmem revoke code to resource framework · 71a1d8ed
      Daniel Vetter authored
      We want all iomem mmaps to consistently revoke ptes when the kernel
      takes over and CONFIG_IO_STRICT_DEVMEM is enabled. This includes the
      pci bar mmaps available through procfs and sysfs, which currently do
      not revoke mappings.
      
      To prepare for this, move the code from the /dev/mem driver to
      kernel/resource.c.
      
      During review Jason spotted that barriers are used somewhat
      inconsistently. Fix that up while we shuffle this code, since it
      doesn't have an actual impact at runtime. Otherwise no semantic and
      behavioural changes intended, just code extraction and adjusting
      comments and names.
      Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: Jérôme Glisse <jglisse@redhat.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: linux-mm@kvack.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-samsung-soc@vger.kernel.org
      Cc: linux-media@vger.kernel.org
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201127164131.2244124-11-daniel.vetter@ffwll.ch
    • /dev/mem: Only set filp->f_mapping · 0fb1b1ed
      Daniel Vetter authored
      When we care about pagecache maintenance, we need to make sure that
      both f_mapping and i_mapping point at the right mapping.
      
      But for iomem mappings we only care about the virtual/pte side of
      things, so f_mapping is enough. Also setting inode->i_mapping was
      confusing me as a driver maintainer, since in e.g. drivers/gpu we
      don't do that. Per Dan this seems to be copypasta from places which do
      care about pagecache consistency, but not needed. Hence remove it for
      slightly less confusion.
      Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: Jérôme Glisse <jglisse@redhat.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: linux-mm@kvack.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-samsung-soc@vger.kernel.org
      Cc: linux-media@vger.kernel.org
      Reviewed-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201127164131.2244124-10-daniel.vetter@ffwll.ch
    • PCI: Obey iomem restrictions for procfs mmap · dc217d2c
      Daniel Vetter authored
      There are three ways to access PCI BARs from userspace: /dev/mem, sysfs
      files, and the old proc interface. Two check against
      iomem_is_exclusive(); proc never did. And with CONFIG_IO_STRICT_DEVMEM,
      this starts to matter, since we don't want random userspace having
      access to PCI BARs while a driver is loaded and using it.
      
      Fix this by adding the same iomem_is_exclusive() check we already have
      on the sysfs side in pci_mmap_resource().
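
      Illustration only, not part of the patch: a minimal userspace model
      of the added check. The model_* names are hypothetical, the
      claimed_exclusive flag collapses the kernel's resource flags and
      CONFIG_IO_STRICT_DEVMEM into one boolean, and the -EINVAL return
      value is an assumption of this sketch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; every model_* name is hypothetical. */
struct model_resource {
    unsigned long start;
    unsigned long end;          /* inclusive */
    bool is_mem;                /* memory BAR, as opposed to an I/O port BAR */
    bool claimed_exclusive;     /* a driver holds the region exclusively */
};

/* True if addr lies in a memory region that a driver claimed exclusively. */
bool model_iomem_is_exclusive(const struct model_resource *table, size_t n,
                              unsigned long addr)
{
    for (size_t i = 0; i < n; i++) {
        const struct model_resource *r = &table[i];

        if (r->is_mem && r->claimed_exclusive &&
            addr >= r->start && addr <= r->end)
            return true;
    }
    return false;
}

/* The gate the procfs mmap path gains: refuse memory BARs that are in use. */
int model_proc_mmap_check(const struct model_resource *table, size_t n,
                          const struct model_resource *res)
{
    if (res->is_mem && model_iomem_is_exclusive(table, n, res->start))
        return -EINVAL;
    return 0;
}
```

      In this model a claimed memory BAR is refused, while an unclaimed
      memory BAR and an I/O port BAR both pass the check.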
      Acked-by: Bjorn Helgaas <bhelgaas@google.com>
      References: 90a545e9 ("restrict /dev/mem to idle io memory ranges")
      Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: Jérôme Glisse <jglisse@redhat.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: linux-mm@kvack.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-samsung-soc@vger.kernel.org
      Cc: linux-media@vger.kernel.org
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: linux-pci@vger.kernel.org
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201127164131.2244124-9-daniel.vetter@ffwll.ch
    • mm: Close race in generic_access_phys · 96667f8a
      Daniel Vetter authored
      Way back it was a reasonable assumption that iomem mappings never
      change the pfn range they point at. But this has changed:
      
      - gpu drivers dynamically manage their memory nowadays, invalidating
        ptes with unmap_mapping_range when buffers get moved
      
      - contiguous dma allocations have moved from dedicated carveouts to
        cma regions. This means if we miss the unmap the pfn might contain
        pagecache or anon memory (well anything allocated with __GFP_MOVABLE)
      
      - even /dev/mem now invalidates mappings when the kernel requests that
        iomem region when CONFIG_IO_STRICT_DEVMEM is set, see 3234ac66
        ("/dev/mem: Revoke mappings when a driver claims the region")
      
      Accessing pfns obtained from ptes without holding all the locks is
      therefore no longer a good idea. Fix this.
      
      Since ioremap might need to manipulate pagetables too we need to drop
      the pt lock and have a retry loop if we raced.
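
      Illustration only, not part of the patch: a minimal userspace model
      of the drop-lock-and-retry pattern. All model_* names are
      hypothetical; the lock is modelled as a balance counter and the
      hook stands in for work (such as ioremap) that must run unlocked.

```c
#include <stdbool.h>

/* Userspace model only; every model_* name is hypothetical. */
struct model_pte_slot {
    unsigned long pte;   /* current translation; may change while unlocked */
    int lock_depth;      /* lock/unlock balance, 0 when the lock is free */
};

/* Hook that runs while the lock is dropped. */
typedef void (*model_unlocked_fn)(struct model_pte_slot *slot, void *ctx);

/*
 * Snapshot the pte under the lock, drop the lock for the unlocked work,
 * retake it and re-check. If the pte changed in between, start over.
 * Returns the pte value that was validated; *retries counts lost races.
 */
unsigned long model_access_phys(struct model_pte_slot *slot,
                                model_unlocked_fn unlocked, void *ctx,
                                int *retries)
{
    *retries = 0;
    for (;;) {
        unsigned long snapshot;

        slot->lock_depth++;          /* take the page table lock */
        snapshot = slot->pte;
        slot->lock_depth--;          /* drop it before the unlocked work */

        unlocked(slot, ctx);

        slot->lock_depth++;          /* retake and validate the snapshot */
        if (slot->pte == snapshot) {
            /* the actual access would happen here, still under the lock */
            slot->lock_depth--;
            return snapshot;
        }
        slot->lock_depth--;
        (*retries)++;
    }
}

/* Example hook: changes the pte while *ctx (an int counter) is positive. */
void model_race_n_times(struct model_pte_slot *slot, void *ctx)
{
    int *remaining = (int *)ctx;

    if (*remaining > 0) {
        (*remaining)--;
        slot->pte++;
    }
}
```

      With the example hook racing twice, the loop retries twice and
      only then returns a pte value that was stable across the unlocked
      window.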
      
      While at it, also add kerneldoc and improve the comment for the
      vm_ops->access function. It's for accessing, not for moving the
      memory from iomem to system memory, as the old comment seemed to
      suggest.
      
      References: 28b2ee20 ("access_process_vm device memory infrastructure")
      Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Dave Airlie <airlied@linux.ie>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: Jérôme Glisse <jglisse@redhat.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: linux-mm@kvack.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-samsung-soc@vger.kernel.org
      Cc: linux-media@vger.kernel.org
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201127164131.2244124-8-daniel.vetter@ffwll.ch
    • media: videobuf2: Move frame_vector into media subsystem · eb83b8e3
      Daniel Vetter authored
      It's the only user. This also garbage collects the CONFIG_FRAME_VECTOR
      symbol from all over the tree (well just one place, somehow omap media
      driver still had this in its Kconfig, despite not using it).
      Reviewed-by: John Hubbard <jhubbard@nvidia.com>
      Acked-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
      Acked-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
      Acked-by: Tomasz Figa <tfiga@chromium.org>
      Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Pawel Osciak <pawel@osciak.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Kyungmin Park <kyungmin.park@samsung.com>
      Cc: Tomasz Figa <tfiga@chromium.org>
      Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: Jérôme Glisse <jglisse@redhat.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: linux-mm@kvack.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-samsung-soc@vger.kernel.org
      Cc: linux-media@vger.kernel.org
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201127164131.2244124-7-daniel.vetter@ffwll.ch
    • mm/frame-vector: Use FOLL_LONGTERM · 04769cb1
      Daniel Vetter authored
      This is used by media/videobuf2 for persistent dma mappings, not just
      for a single dma operation and then freed again, so needs
      FOLL_LONGTERM.
      
      Unfortunately current pup_locked doesn't support FOLL_LONGTERM due to
      locking issues. Rework the code to pull the pup path out from the
      mmap_sem critical section as suggested by Jason.
      
      By relying entirely on the vma checks in pin_user_pages and follow_pfn
      (for vm_flags and vma_is_fsdax) we can also streamline the code a lot.
      
      Note that pin_user_pages_fast is a safe replacement despite the
      seeming lack of checking for vma->vm_flags & (VM_IO | VM_PFNMAP). Such
      ptes are marked with pte_mkspecial (which pup_fast rejects in the
      fastpath), and only architectures supporting that support the
      pin_user_pages_fast fastpath.
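
      Illustration only, not part of the patch: a minimal userspace model
      of the argument above. All model_* names are hypothetical. The
      lockless fast path never pins a special pte, so an iomem mapping
      always reaches the slow path, which can see the vma and reject it.
      Returning a partial count or -EFAULT is an assumption of this
      sketch about how the fallback reports failure.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; every model_* name is hypothetical. */
struct model_pte {
    bool present;
    bool special;   /* stands in for a pte created with pte_mkspecial() */
};

/* Lockless fast path: stop at the first pte that is absent or special. */
size_t model_gup_fast(const struct model_pte *ptes, size_t n)
{
    size_t pinned = 0;

    while (pinned < n && ptes[pinned].present && !ptes[pinned].special)
        pinned++;
    return pinned;
}

/* Fast path first; the slow path sees the vma and rejects VM_IO/VM_PFNMAP. */
long model_pin_user_pages_fast(const struct model_pte *ptes, size_t n,
                               bool vma_io_or_pfnmap)
{
    size_t pinned = model_gup_fast(ptes, n);

    if (pinned == n)
        return (long)n;
    if (vma_io_or_pfnmap)
        return pinned ? (long)pinned : -EFAULT;
    return (long)n;   /* model: the slow path faults in and pins the rest */
}
```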
      Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Pawel Osciak <pawel@osciak.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Kyungmin Park <kyungmin.park@samsung.com>
      Cc: Tomasz Figa <tfiga@chromium.org>
      Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: Jérôme Glisse <jglisse@redhat.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: linux-mm@kvack.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-samsung-soc@vger.kernel.org
      Cc: linux-media@vger.kernel.org
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201127164131.2244124-6-daniel.vetter@ffwll.ch
    • misc/habana: Use FOLL_LONGTERM for userptr · d88a0c16
      Daniel Vetter authored
      These are persistent, not just for the duration of a dma operation.
      Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
      Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: Jérôme Glisse <jglisse@redhat.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: linux-mm@kvack.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-samsung-soc@vger.kernel.org
      Cc: linux-media@vger.kernel.org
      Cc: Oded Gabbay <oded.gabbay@gmail.com>
      Cc: Omer Shpigelman <oshpigelman@habana.ai>
      Cc: Ofir Bitton <obitton@habana.ai>
      Cc: Tomer Tayar <ttayar@habana.ai>
      Cc: Moti Haimovski <mhaimovski@habana.ai>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Pawel Piskorski <ppiskorski@habana.ai>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201127164131.2244124-5-daniel.vetter@ffwll.ch
    • misc/habana: Stop using frame_vector helpers · d4cb1925
      Daniel Vetter authored
      All we need is a pages array, which pin_user_pages_fast can give us
      directly. Plus this avoids the entire raw pfn side of get_vaddr_frames.
      
      Note that pin_user_pages_fast is a safe replacement despite the
      seeming lack of checking for vma->vm_flags & (VM_IO | VM_PFNMAP). Such
      ptes are marked with pte_mkspecial (which pup_fast rejects in the
      fastpath), and only architectures supporting that support the
      pin_user_pages_fast fastpath.
      Reviewed-by: John Hubbard <jhubbard@nvidia.com>
      Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
      Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: Jérôme Glisse <jglisse@redhat.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: linux-mm@kvack.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-samsung-soc@vger.kernel.org
      Cc: linux-media@vger.kernel.org
      Cc: Oded Gabbay <oded.gabbay@gmail.com>
      Cc: Omer Shpigelman <oshpigelman@habana.ai>
      Cc: Ofir Bitton <obitton@habana.ai>
      Cc: Tomer Tayar <ttayar@habana.ai>
      Cc: Moti Haimovski <mhaimovski@habana.ai>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Pawel Piskorski <ppiskorski@habana.ai>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201127164131.2244124-4-daniel.vetter@ffwll.ch
    • drm/exynos: Use FOLL_LONGTERM for g2d cmdlists · 9fcac0f1
      Daniel Vetter authored
      The exynos g2d interface is very unusual, but it looks like the
      userptr objects are persistent. Hence they need FOLL_LONGTERM.
      Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Inki Dae <inki.dae@samsung.com>
      Cc: Joonyoung Shim <jy0922.shim@samsung.com>
      Cc: Seung-Woo Kim <sw0312.kim@samsung.com>
      Cc: Kyungmin Park <kyungmin.park@samsung.com>
      Cc: Kukjin Kim <kgene@kernel.org>
      Cc: Krzysztof Kozlowski <krzk@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: Jérôme Glisse <jglisse@redhat.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: linux-mm@kvack.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-samsung-soc@vger.kernel.org
      Cc: linux-media@vger.kernel.org
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201127164131.2244124-3-daniel.vetter@ffwll.ch
    • drm/exynos: Stop using frame_vector helpers · 2c8c08f3
      Daniel Vetter authored
      All we need is a pages array, which pin_user_pages_fast can give us
      directly. Plus this avoids the entire raw pfn side of get_vaddr_frames.
      
      Note that pin_user_pages_fast is a safe replacement despite the
      seeming lack of checking for vma->vm_flags & (VM_IO | VM_PFNMAP). Such
      ptes are marked with pte_mkspecial (which pup_fast rejects in the
      fastpath), and only architectures supporting that support the
      pin_user_pages_fast fastpath.
      Reviewed-by: John Hubbard <jhubbard@nvidia.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Inki Dae <inki.dae@samsung.com>
      Cc: Joonyoung Shim <jy0922.shim@samsung.com>
      Cc: Seung-Woo Kim <sw0312.kim@samsung.com>
      Cc: Kyungmin Park <kyungmin.park@samsung.com>
      Cc: Kukjin Kim <kgene@kernel.org>
      Cc: Krzysztof Kozlowski <krzk@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Jérôme Glisse <jglisse@redhat.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: linux-mm@kvack.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-samsung-soc@vger.kernel.org
      Cc: linux-media@vger.kernel.org
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201127164131.2244124-2-daniel.vetter@ffwll.ch
  3. 10 Jan, 2021 12 commits
    • Linux 5.11-rc3 · 7c53f6b6
      Linus Torvalds authored
    • Merge tag 'kbuild-fixes-v5.11' of... · 20210a98
      Linus Torvalds authored
      Merge tag 'kbuild-fixes-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
      
      Pull Kbuild fixes from Masahiro Yamada:
      
       - Search for <ncurses.h> in the default header path of HOSTCC
      
       - Tweak the option order to be kind to old BSD awk
      
       - Remove 'kvmconfig' and 'xenconfig' shorthands
      
       - Fix documentation
      
      * tag 'kbuild-fixes-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
        Documentation: kbuild: Fix section reference
        kconfig: remove 'kvmconfig' and 'xenconfig' shorthands
        lib/raid6: Let $(UNROLL) rules work with macOS userland
        kconfig: Support building mconf with vendor sysroot ncurses
        kconfig: config script: add a little user help
        MAINTAINERS: adjust GCC PLUGINS after gcc-plugin.sh removal
    • Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 688daed2
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "This is two driver fixes (megaraid_sas and hisi_sas).
      
        The megaraid one is a revert of a previous revert of a cpu hotplug fix
        which exposed a bug in the block layer which has been fixed in this
        merge window.
      
        The hisi_sas performance enhancement comes from switching to interrupt
        managed completion queues, which depended on the addition of
        devm_platform_get_irqs_affinity() which is now upstream via the irq
        tree in the last merge window"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: hisi_sas: Expose HW queues for v2 hw
        Revert "Revert "scsi: megaraid_sas: Added support for shared host tagset for cpuhotplug""
    • Merge tag 'block-5.11-2021-01-10' of git://git.kernel.dk/linux-block · ed41fd07
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
      
       - Missing CRC32 selections (Arnd)
      
       - Fix for a merge window regression with bdev inode init (Christoph)
      
       - bcache fixes
      
       - rnbd fixes
      
       - NVMe pull request from Christoph:
          - fix a race in the nvme-tcp send code (Sagi Grimberg)
          - fix a list corruption in an nvme-rdma error path (Israel Rukshin)
          - avoid a possible double fetch in nvme-pci (Lalithambika Krishnakumar)
          - add the subsystem NQN quirk for a Samsung driver (Gopal Tiwari)
          - fix two compiler warnings in nvme-fcloop (James Smart)
          - don't call sleeping functions from irq context in nvme-fc (James Smart)
          - remove an unused argument (Max Gurtovoy)
          - remove unused exports (Minwoo Im)
      
       - Use-after-free fix for partition iteration (Ming)
      
       - Missing blk-mq debugfs flag annotation (John)
      
       - Bdev freeze regression fix (Satya)
      
       - blk-iocost NULL pointer deref fix (Tejun)
      
      * tag 'block-5.11-2021-01-10' of git://git.kernel.dk/linux-block: (26 commits)
        bcache: set bcache device into read-only mode for BCH_FEATURE_INCOMPAT_OBSO_LARGE_BUCKET
        bcache: introduce BCH_FEATURE_INCOMPAT_LOG_LARGE_BUCKET_SIZE for large bucket
        bcache: check unsupported feature sets for bcache register
        bcache: fix typo from SUUP to SUPP in features.h
        bcache: set pdev_set_uuid before scond loop iteration
        blk-mq-debugfs: Add decode for BLK_MQ_F_TAG_HCTX_SHARED
        block/rnbd-clt: avoid module unload race with close confirmation
        block/rnbd: Adding name to the Contributors List
        block/rnbd-clt: Fix sg table use after free
        block/rnbd-srv: Fix use after free in rnbd_srv_sess_dev_force_close
        block/rnbd: Select SG_POOL for RNBD_CLIENT
        block: pre-initialize struct block_device in bdev_alloc_inode
        fs: Fix freeze_bdev()/thaw_bdev() accounting of bd_fsfreeze_sb
        nvme: remove the unused status argument from nvme_trace_bio_complete
        nvmet-rdma: Fix list_del corruption on queue establishment failure
        nvme: unexport functions with no external caller
        nvme: avoid possible double fetch in handling CQE
        nvme-tcp: Fix possible race of io_work and direct send
        nvme-pci: mark Samsung PM1725a as IGNORE_DEV_SUBNQN
        nvme-fcloop: Fix sscanf type and list_first_entry_or_null warnings
        ...
    • Merge tag 'io_uring-5.11-2021-01-10' of git://git.kernel.dk/linux-block · d430adfe
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
       "A bit larger than I had hoped at this point, but it's all changes that
        will be directed towards stable anyway. In detail:
      
         - Fix a merge window regression on error return (Matthew)
      
         - Remove useless variable declaration/assignment (Ye Bin)
      
         - IOPOLL fixes (Pavel)
      
         - Exit and cancelation fixes (Pavel)
      
         - fasync lockdep complaint fix (Pavel)
      
         - Ensure SQPOLL is synchronized with creator life time (Pavel)"
      
      * tag 'io_uring-5.11-2021-01-10' of git://git.kernel.dk/linux-block:
        io_uring: stop SQPOLL submit on creator's death
        io_uring: add warn_once for io_uring_flush()
        io_uring: inline io_uring_attempt_task_drop()
        io_uring: io_rw_reissue lockdep annotations
        io_uring: synchronise ev_posted() with waitqueues
        io_uring: dont kill fasync under completion_lock
        io_uring: trigger eventfd for IOPOLL
        io_uring: Fix return value from alloc_fixed_file_ref_node
        io_uring: Delete useless variable ‘id’ in io_prep_async_work
        io_uring: cancel more aggressively in exit_work
        io_uring: drop file refs after task cancel
        io_uring: patch up IOPOLL overflow_flush sync
        io_uring: synchronise IOPOLL on task_submit fail
    • Merge tag 'usb-5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · 28318f53
      Linus Torvalds authored
      Pull USB fixes from Greg KH:
       "Here are a number of small USB driver fixes for 5.11-rc3.
      
        Included in here are:
      
         - USB gadget driver fixes for reported issues
      
         - new usb-serial driver ids
      
         - dma from stack bugfixes
      
         - typec bugfixes
      
         - dwc3 bugfixes
      
         - xhci driver bugfixes
      
         - other small misc usb driver bugfixes
      
        All of these have been in linux-next with no reported issues"
      
      * tag 'usb-5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (35 commits)
        usb: dwc3: gadget: Clear wait flag on dequeue
        usb: typec: Send uevent for num_altmodes update
        usb: typec: Fix copy paste error for NVIDIA alt-mode description
        usb: gadget: enable super speed plus
        kcov, usb: hide in_serving_softirq checks in __usb_hcd_giveback_urb
        usb: uas: Add PNY USB Portable SSD to unusual_uas
        usb: gadget: configfs: Preserve function ordering after bind failure
        usb: gadget: select CONFIG_CRC32
        usb: gadget: core: change the comment for usb_gadget_connect
        usb: gadget: configfs: Fix use-after-free issue with udc_name
        usb: dwc3: gadget: Restart DWC3 gadget when enabling pullup
        usb: usbip: vhci_hcd: protect shift size
        USB: usblp: fix DMA to stack
        USB: serial: iuu_phoenix: fix DMA from stack
        USB: serial: option: add LongSung M5710 module support
        USB: serial: option: add Quectel EM160R-GL
        USB: Gadget: dummy-hcd: Fix shift-out-of-bounds bug
        usb: gadget: f_uac2: reset wMaxPacketSize
        usb: dwc3: ulpi: Fix USB2.0 HS/FS/LS PHY suspend regression
        usb: dwc3: ulpi: Replace CPU-based busyloop with Protocol-based one
        ...
      28318f53
    • Linus Torvalds's avatar
      Merge tag 'staging-5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging · 4ad9a28f
      Linus Torvalds authored
      Pull staging driver fixes from Greg KH:
       "Here are some small staging driver fixes for 5.11-rc3. Nothing major,
        just resolving some reported issues:
      
         - cleanup some remaining mentions of the ION drivers that were
           removed in 5.11-rc1
      
         - comedi driver bugfix
      
         - two error path memory leak fixes
      
        All have been in linux-next for a while with no reported issues"
      
      * tag 'staging-5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
        staging: ION: remove some references to CONFIG_ION
        staging: mt7621-dma: Fix a resource leak in an error handling path
        Staging: comedi: Return -EFAULT if copy_to_user() fails
        staging: spmi: hisi-spmi-controller: Fix some error handling paths
      4ad9a28f
    • Linus Torvalds's avatar
      Merge tag 'char-misc-5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · e07cd2f3
      Linus Torvalds authored
      Pull char/misc driver fixes from Greg KH:
       "Here are some small char and misc driver fixes for 5.11-rc3.
      
        The majority here are fixes for the habanalabs drivers, but also in
        here are:
      
         - crypto driver fix
      
         - pvpanic driver fix
      
         - updated font file
      
         - interconnect driver fixes
      
        All of these have been in linux-next for a while with no reported
        issues"
      
      * tag 'char-misc-5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (26 commits)
        Fonts: font_ter16x32: Update font with new upstream Terminus release
        misc: pvpanic: Check devm_ioport_map() for NULL
        speakup: Add github repository URL and bug tracker
        MAINTAINERS: Update Georgi's email address
        crypto: asym_tpm: correct zero out potential secrets
        habanalabs: Fix memleak in hl_device_reset
        interconnect: imx8mq: Use icc_sync_state
        interconnect: imx: Remove a useless test
        interconnect: imx: Add a missing of_node_put after of_device_is_available
        interconnect: qcom: fix rpmh link failures
        habanalabs: fix order of status check
        habanalabs: register to pci shutdown callback
        habanalabs: add validation cs counter, fix misplaced counters
        habanalabs/gaudi: retry loading TPC f/w on -EINTR
        habanalabs: adjust pci controller init to new firmware
        habanalabs: update comment in hl_boot_if.h
        habanalabs/gaudi: enhance reset message
        habanalabs: full FW hard reset support
        habanalabs/gaudi: disable CGM at HW initialization
        habanalabs: Revise comment to align with mirror list name
        ...
      e07cd2f3
    • Viresh Kumar's avatar
      Documentation: kbuild: Fix section reference · 5625dcfb
      Viresh Kumar authored
      Section 3.11 was incorrectly called 3.9, fix it.
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
      5625dcfb
    • Linus Torvalds's avatar
      Merge tag 'arc-5.11-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc · 0653161f
      Linus Torvalds authored
      Pull ARC fixes from Vineet Gupta:
      
       - Address the 2nd boot failure due to snafu in signal handling code
         (first was generic console ttynull issue)
      
       - misc other fixes
      
      * tag 'arc-5.11-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
        ARC: [hsdk]: Enable FPU_SAVE_RESTORE
        ARC: unbork 5.11 bootup: fix snafu in _TIF_NOTIFY_SIGNAL handling
        include/soc: remove headers for EZChip NPS
        arch/arc: add copy_user_page() to <asm/page.h> to fix build error on ARC
      0653161f
    • Linus Torvalds's avatar
      Merge tag 'powerpc-5.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · b3cd1a16
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
      
       - A fix for machine check handling with VMAP stack on 32-bit.
      
       - A clang build fix.
      
      Thanks to Christophe Leroy and Nathan Chancellor.
      
      * tag 'powerpc-5.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc: Handle .text.{hot,unlikely}.* in linker script
        powerpc/32s: Fix RTAS machine check with VMAP stack
      b3cd1a16
    • Linus Torvalds's avatar
      Merge tag 'x86_urgent_for_v5.11_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · a440e4d7
      Linus Torvalds authored
      Pull x86 fixes from Borislav Petkov:
       "As expected, fixes started trickling in after the holidays so here is
        the accumulated pile of x86 fixes for 5.11:
      
         - A fix for fanotify_mark() missing the conversion of x86_32 native
           syscalls which take 64-bit arguments to the compat handlers due to
           the former having a general compat handler. (Brian Gerst)
      
         - Add a forgotten pmd page destructor call to pud_free_pmd_page()
           where a pmd page is freed. (Dan Williams)
      
         - Make the handling of IN/OUT insns with a u8 immediate port operand
           more precise for SEV-ES guests by using only the single port byte
           and not the whole s32 value of the insn decoder. (Peter Gonda)
      
         - Correct a straddling end range check before returning the proper
           MTRR type, when the end address is the same as top of memory.
           (Ying-Tsun Huang)
      
         - Change PQR_ASSOC MSR update scheme when moving a task to a resctrl
           resource group to avoid significant performance overhead with some
           resctrl workloads. (Fenghua Yu)
      
         - Avoid the actual task move overhead when the task is already in the
           resource group. (Fenghua Yu)"
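
The SEV-ES IN/OUT item above can be illustrated with a minimal sketch. It assumes the decoder keeps the 8-bit immediate in a signed 32-bit field, modeled here as sign extension, so the bytes above the low one may be non-zero. All names are hypothetical and this is not the kernel's code; it only shows why truncating to a u8 recovers the intended port.

```c
#include <stdint.h>

/* Hypothetical model of a decoder field: an 8-bit immediate kept in a
 * signed 32-bit value, sign-extended from the raw byte (an assumption). */
static int32_t sketch_decode_imm8(uint8_t raw)
{
    return raw >= 0x80 ? (int32_t)raw - 256 : (int32_t)raw;
}

/* Imprecise variant: treats the whole 32-bit value as the port. */
static uint32_t sketch_port_whole(int32_t imm)
{
    return (uint32_t)imm;
}

/* Precise variant: only the low byte is the port number. */
static uint8_t sketch_port_byte(int32_t imm)
{
    return (uint8_t)imm;
}
```

With a raw immediate of 0x80, the whole-value variant yields a value with the upper bytes set, while the byte variant yields 0x80.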
      
      * tag 'x86_urgent_for_v5.11_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/resctrl: Don't move a task to the same resource group
        x86/resctrl: Use an IPI instead of task_work_add() to update PQR_ASSOC MSR
        x86/mtrr: Correct the range check before performing MTRR type lookups
        x86/sev-es: Fix SEV-ES OUT/IN immediate opcode vc handling
        x86/mm: Fix leak of pmd ptlock
        fanotify: Fix sys_fanotify_mark() on native x86-32
      a440e4d7
  4. 09 Jan, 2021 15 commits
    • Linus Torvalds's avatar
      Merge tag 'hwmon-for-v5.11-rc3' of... · 2ff90100
      Linus Torvalds authored
      Merge tag 'hwmon-for-v5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
      
      Pull hwmon fixes from Guenter Roeck:
      
       - Fix possible KASAN issue in amd_energy driver
      
       - Avoid configuration problem in pwm-fan driver
      
       - Fix kernel-doc warning in sbtsi_temp documentation
      
      * tag 'hwmon-for-v5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
        hwmon: (amd_energy) fix allocation of hwmon_channel_info config
        hwmon: (pwm-fan) Ensure that calculation doesn't discard big period values
        hwmon: (sbtsi_temp) Fix Documenation kernel-doc warning
      2ff90100
    • Linus Torvalds's avatar
      Merge tag 'dmaengine-fix-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine · f408126b
      Linus Torvalds authored
      Pull dmaengine fixes from Vinod Koul:
       "A bunch of dmaengine driver fixes for:
      
         - coverity discovered issues for xilinx driver
      
         - qcom, gpi driver fix for undefined behaviour and one-off cleanup
      
         - update Peter's email for TI DMA drivers
      
         - one-off for idxd driver
      
         - resource leak fix for mediatek and milbeaut drivers"
      
      * tag 'dmaengine-fix-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine:
        dmaengine: stm32-mdma: fix STM32_MDMA_VERY_HIGH_PRIORITY value
        dmaengine: xilinx_dma: fix mixed_enum_type coverity warning
        dmaengine: xilinx_dma: fix incompatible param warning in _child_probe()
        dmaengine: xilinx_dma: check dma_async_device_register return value
        dmaengine: qcom: fix gpi undefined behavior
        dt-bindings: dma: ti: Update maintainer and author information
        MAINTAINERS: Add entry for Texas Instruments DMA drivers
        qcom: bam_dma: Delete useless kfree code
        dmaengine: dw-edma: Fix use after free in dw_edma_alloc_chunk()
        dmaengine: milbeaut-xdmac: Fix a resource leak in the error handling path of the probe function
        dmaengine: mediatek: mtk-hsdma: Fix a resource leak in the error handling path of the probe function
        dmaengine: qcom: gpi: Fixes a format mismatch
        dmaengine: idxd: off by one in cleanup code
        dmaengine: ti: k3-udma: Fix pktdma rchan TPL level setup
      f408126b
    • Linus Torvalds's avatar
      Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · caab3147
      Linus Torvalds authored
      Pull i2c fixes from Wolfram Sang:
       "Three driver bugfixes for I2C. Business as usual"
      
      * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: mediatek: Fix apdma and i2c hand-shake timeout
        i2c: i801: Fix the i2c-mux gpiod_lookup_table not being properly terminated
        i2c: sprd: use a specific timeout to avoid system hang up issue
      caab3147
    • Darrick J. Wong's avatar
      maintainers: update my email address · 6bae85bd
      Darrick J. Wong authored
      Change my email contact ahead of a likely painful eleven-month migration
      to a certain cobalt enterprisey groupware cloud product that will totally
      break my workflow.  Some day I may get used to having my email
      sequestered behind both claret and cerulean oauth2+sms 2fa layers, but
      for now I'll stick with keying in one password to receive an email vs.
      the required four.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6bae85bd
    • Pavel Begunkov's avatar
      io_uring: stop SQPOLL submit on creator's death · d9d05217
      Pavel Begunkov authored
      When the creator of an SQPOLL io_uring dies (i.e. sqo_task), we don't want
      its internals like ->files and ->mm to be poked by the SQPOLL task; it
      has never been nice and recently got racy. That can happen when the
      owner undergoes destruction and the SQPOLL task tries to submit new
      requests in parallel, and so calls io_sq_thread_acquire*().
      
      This patch halts SQPOLL submissions when sqo_task dies by introducing a
      sqo_dead flag. Once set, the SQPOLL task must not do any submission,
      which is synchronised by uring_lock as well as the new flag.
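
The flag-under-lock pattern described here can be sketched in user space as follows. This is a minimal illustration under stated assumptions, not the kernel implementation: the struct, the function names, and the use of a pthread mutex in place of uring_lock are all hypothetical.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the ring context; not the kernel's struct. */
struct ring_sketch {
    pthread_mutex_t lock;    /* plays the role of uring_lock */
    bool sqo_dead;           /* set once the creator task is going away */
    unsigned int submitted;  /* counts accepted submissions */
};

/* Creator side: mark the ring dead while holding the lock. */
static void sketch_disable_sqo_submit(struct ring_sketch *r)
{
    pthread_mutex_lock(&r->lock);
    r->sqo_dead = true;
    pthread_mutex_unlock(&r->lock);
}

/* Poller side: check the flag under the same lock before submitting,
 * so a submission can never start after the flag has been set. */
static bool sketch_sq_submit(struct ring_sketch *r)
{
    bool ok;

    pthread_mutex_lock(&r->lock);
    ok = !r->sqo_dead;
    if (ok)
        r->submitted++;
    pthread_mutex_unlock(&r->lock);
    return ok;
}
```

Because both sides take the same lock, a submitter either finishes before the flag is set or observes the flag and backs off.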
      
      The tricky part is to make sure that disabling always happens. That
      means either the ring is discovered by the creator's do_exit() -> cancel,
      or, if the final close() happens before that, it is done by the creator.
      The latter is guaranteed by the fact that for SQPOLL the creator task,
      and only it, holds exactly one file note, so the note either pins the
      ring up to do_exit() or is removed by the creator on the final put in
      flush (see comments in uring_flush() around file->f_count == 2).
      
      One more place that can trigger io_sq_thread_acquire_*() is
      __io_req_task_submit(). Shoot off requests on sqo_dead there, even
      though actually we don't need to. That's because cancellation of
      sqo_task should wait for the request before going any further.
      
      note 1: io_disable_sqo_submit() does io_ring_set_wakeup_flag() so the
      caller would enter the ring to get an error, but it still doesn't
      guarantee that the flag won't be cleared.
      
      note 2: if the final __userspace__ close does not come from the creator
      task, the file note will pin the ring until the task dies.
      
      Fixes: b1b6b5a3 ("kernel/io_uring: cancel io_uring before task works")
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      d9d05217
    • Pavel Begunkov's avatar
      io_uring: add warn_once for io_uring_flush() · 6b5733eb
      Pavel Begunkov authored
      files_cancel() should cancel all relevant requests and drop file notes,
      so we should never have file notes after that, including on-exit fput
      and flush. Add a WARN_ONCE to be sure.
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      6b5733eb
    • Pavel Begunkov's avatar
      io_uring: inline io_uring_attempt_task_drop() · 4f793dc4
      Pavel Begunkov authored
      A simple preparation change inlining io_uring_attempt_task_drop() into
      io_uring_flush().
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      4f793dc4
    • Pavel Begunkov's avatar
      io_uring: io_rw_reissue lockdep annotations · 55e6ac1e
      Pavel Begunkov authored
      We expect io_rw_reissue() to take place only during submission with
      uring_lock held. Add a lockdep annotation to check that invariant.
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      55e6ac1e
    • Coly Li's avatar
      bcache: set bcache device into read-only mode for BCH_FEATURE_INCOMPAT_OBSO_LARGE_BUCKET · 5342fd42
      Coly Li authored
      If BCH_FEATURE_INCOMPAT_OBSO_LARGE_BUCKET is set in the incompat feature
      set, the cache device was created with the obsolete layout that uses
      obso_bucket_size_hi. bcache no longer supports this feature bit; a new
      BCH_FEATURE_INCOMPAT_LOG_LARGE_BUCKET_SIZE incompat feature bit is added
      for a better layout to support large bucket sizes.
      
      For legacy compatibility, if a cache device was created with the
      obsolete BCH_FEATURE_INCOMPAT_OBSO_LARGE_BUCKET feature bit, all bcache
      devices attached to this cache set should be set to read-only. The
      dirty data can then be written back to the backing device before the
      cache device is re-created with the
      BCH_FEATURE_INCOMPAT_LOG_LARGE_BUCKET_SIZE feature bit by the latest
      bcache-tools.
      
      This patch checks the BCH_FEATURE_INCOMPAT_OBSO_LARGE_BUCKET feature bit
      when running a cache set and when attaching a bcache device to the cache
      set. If this bit is set,
      - When running a cache set, print an error kernel message to indicate
        that all subsequently attached bcache devices will be read-only.
      - When attaching a bcache device, print an error kernel message to
        indicate that the attached bcache device will be read-only, and ask
        users to update to the latest bcache-tools.
      
      This change only affects cache devices whose bucket size is >= 32MB. Such
      sizes are meant for zoned SSDs, and almost nobody uses such a large bucket
      size at this moment. If you don't explicitly set a large bucket size for
      a zoned SSD, this change is totally transparent to your bcache device.
      
      Fixes: ffa47032 ("bcache: add bucket_size_hi into struct cache_sb_disk for large bucket")
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      5342fd42
    • Coly Li's avatar
      bcache: introduce BCH_FEATURE_INCOMPAT_LOG_LARGE_BUCKET_SIZE for large bucket · b16671e8
      Coly Li authored
      When large bucket feature was added, BCH_FEATURE_INCOMPAT_LARGE_BUCKET
      was introduced into the incompat feature set. It used bucket_size_hi
      (which was added at the tail of struct cache_sb_disk) to extend current
      16bit bucket size to 32bit with existing bucket_size in struct
      cache_sb_disk.
      
      This is not a good idea; there are two obvious problems:
      - Bucket size is always a power of 2, so if log2(bucket size) is stored
        in the existing bucket_size of struct cache_sb_disk, it is unnecessary
        to add bucket_size_hi.
      - Macro csum_set() assumes d[SB_JOURNAL_BUCKETS] is the last member in
        struct cache_sb_disk; bucket_size_hi was added after d[], which makes
        csum_set() calculate an unexpected super block checksum.
      
      To fix the above problems, this patch introduces a new incompat feature
      bit BCH_FEATURE_INCOMPAT_LOG_LARGE_BUCKET_SIZE. When this bit is set,
      bucket_size in struct cache_sb_disk stores the order of the power-of-2
      bucket size value. When the user specifies a bucket size larger than
      32768 sectors, BCH_FEATURE_INCOMPAT_LOG_LARGE_BUCKET_SIZE is set in the
      incompat feature set, and bucket_size stores log2(bucket size) rather
      than the real bucket size value.
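
The encoding rule in this paragraph can be sketched as follows. This is a minimal illustration, not the bcache code: the struct and helper names are hypothetical, the boolean stands in for the BCH_FEATURE_INCOMPAT_LOG_LARGE_BUCKET_SIZE bit, and the bucket size is assumed to be a non-zero power of 2 as the commit message states.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant superblock fields. */
struct sb_sketch {
    uint16_t bucket_size;  /* sectors, or log2(sectors) when log_large is set */
    bool log_large;        /* models BCH_FEATURE_INCOMPAT_LOG_LARGE_BUCKET_SIZE */
};

/* Integer log2; exact for a non-zero power of 2. */
static unsigned int sketch_ilog2(uint32_t v)
{
    unsigned int order = 0;

    while (v >>= 1)
        order++;
    return order;
}

/* Store the bucket size: above 32768 sectors, keep only the order. */
static void sketch_set_bucket_size(struct sb_sketch *sb, uint32_t sectors)
{
    if (sectors > 32768) {
        sb->log_large = true;
        sb->bucket_size = (uint16_t)sketch_ilog2(sectors);
    } else {
        sb->log_large = false;
        sb->bucket_size = (uint16_t)sectors;
    }
}

/* Recover the bucket size in sectors from either encoding. */
static uint32_t sketch_get_bucket_size(const struct sb_sketch *sb)
{
    if (sb->log_large)
        return (uint32_t)1 << sb->bucket_size;
    return sb->bucket_size;
}
```

Storing the order keeps the value inside the existing 16-bit field, so no extra member is needed after d[].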
      
      The obsolete BCH_FEATURE_INCOMPAT_LARGE_BUCKET won't be used anymore;
      it is renamed to BCH_FEATURE_INCOMPAT_OBSO_LARGE_BUCKET and is still only
      recognized by the kernel driver for legacy compatibility purposes. The
      previous bucket_size_hi is renamed to obso_bucket_size_hi in struct
      cache_sb_disk and is not used in bcache-tools anymore.
      
      For cache device created with BCH_FEATURE_INCOMPAT_LARGE_BUCKET feature,
      bcache-tools and kernel driver still recognize the feature string and
      display it as "obso_large_bucket".
      
      With this change, the unnecessary extension of the bcache on-disk
      super block can be avoided, and csum_set() can generate the expected
      checksum as well.
      
      Fixes: ffa47032 ("bcache: add bucket_size_hi into struct cache_sb_disk for large bucket")
      Signed-off-by: Coly Li <colyli@suse.de>
      Cc: stable@vger.kernel.org # 5.9+
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      b16671e8
    • Coly Li's avatar
      bcache: check unsupported feature sets for bcache register · 1dfc0686
      Coly Li authored
      This patch adds a check for features that are incompatible with the
      currently supported feature sets.
      
      Now if the bcache device created by bcache-tools has features that the
      current kernel doesn't support, read_super() will fail with an error
      message. E.g. if an unsupported incompatible feature is detected,
      bcache register will fail with dmesg "bcache: register_bcache() error :
      Unsupported incompatible feature found".
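
A check of this kind can be sketched as a mask test. This is a minimal illustration under assumptions, not the bcache code: the struct, the function name, and the mask value are hypothetical, and only the error string is taken from the message above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mask of incompat feature bits this sketch "supports". */
#define SKETCH_INCOMPAT_SUPP 0x3ULL

/* Hypothetical stand-in for the superblock's incompat feature set. */
struct sketch_features {
    uint64_t incompat;
};

/* Return NULL when every set bit is supported, else an error string. */
static const char *sketch_check_features(const struct sketch_features *f)
{
    if (f->incompat & ~SKETCH_INCOMPAT_SUPP)
        return "Unsupported incompatible feature found";
    return NULL;
}
```

Any bit outside the supported mask makes registration fail, which is the safe default for features the running kernel does not understand.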
      
      Fixes: d721a43f ("bcache: increase super block version for cache device and backing device")
      Fixes: ffa47032 ("bcache: add bucket_size_hi into struct cache_sb_disk for large bucket")
      Signed-off-by: Coly Li <colyli@suse.de>
      Cc: stable@vger.kernel.org # 5.9+
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      1dfc0686
    • Coly Li's avatar
      bcache: fix typo from SUUP to SUPP in features.h · f7b4943d
      Coly Li authored
      This patch fixes the following typos,
      from BCH_FEATURE_COMPAT_SUUP to BCH_FEATURE_COMPAT_SUPP
      from BCH_FEATURE_INCOMPAT_SUUP to BCH_FEATURE_INCOMPAT_SUPP
      from BCH_FEATURE_RO_COMPAT_SUUP to BCH_FEATURE_RO_COMPAT_SUPP
      
      Fixes: d721a43f ("bcache: increase super block version for cache device and backing device")
      Fixes: ffa47032 ("bcache: add bucket_size_hi into struct cache_sb_disk for large bucket")
      Signed-off-by: Coly Li <colyli@suse.de>
      Cc: stable@vger.kernel.org # 5.9+
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      f7b4943d
    • Yi Li's avatar
      bcache: set pdev_set_uuid before scond loop iteration · e8092707
      Yi Li authored
      There is no need to reassign pdev_set_uuid in the second loop iteration,
      so move it to the place before second loop.
      Signed-off-by: Yi Li <yili@winhong.com>
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      e8092707
    • Linus Torvalds's avatar
      Merge tag 'zonefs-5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs · 996e435f
      Linus Torvalds authored
      Pull zonefs fix from Damien Le Moal:
       "A single patch from Arnd to fix a missing dependency in zonefs
        Kconfig"
      
      * tag 'zonefs-5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs:
        zonefs: select CONFIG_CRC32
      996e435f
    • Linus Torvalds's avatar
      Merge tag 'linux-kselftest-kunit-fixes-5.11-rc3' of... · 263da333
      Linus Torvalds authored
      Merge tag 'linux-kselftest-kunit-fixes-5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
      
      Pull kunit fixes from Shuah Khan:
       "One fix to force the use of the 'tty' console for UML.
      
        Given that kunit tool requires the console output, explicitly stating
        the dependency makes more sense than relying on it being the default"
      
      * tag 'linux-kselftest-kunit-fixes-5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
        kunit: tool: Force the use of the 'tty' console for UML
      263da333