    mm/memory.c: fix race when faulting a device private page · 16ce101d
    Alistair Popple authored
    Patch series "Fix several device private page reference counting issues",
    v2
    
    This series aims to fix a number of page reference counting issues in
    drivers dealing with device private ZONE_DEVICE pages.  These result in
    use-after-free type bugs, either from accessing a struct page which no
    longer exists because it has been removed or accessing fields within the
    struct page which are no longer valid because the page has been freed.
    
    During normal usage it is unlikely these will cause any problems.  However,
    without these fixes it is possible to crash the kernel from userspace.
    These crashes can be triggered either by unloading the kernel module or
    unbinding the device from the driver prior to a userspace task exiting. 
    In modules such as Nouveau it is also possible to trigger some of these
    issues by explicitly closing the device file-descriptor prior to the task
    exiting and then accessing device private memory.
    
    This involves some minor changes to both PowerPC and AMD GPU code.
    Unfortunately I lack hardware to test either of those, so any help there
    would be appreciated.  The changes mimic what is done for both Nouveau
    and hmm-tests, though, so I doubt they will cause problems.
    
    
    This patch (of 8):
    
    When the CPU tries to access a device private page the migrate_to_ram()
    callback associated with the pgmap for the page is called.  However no
    reference is taken on the faulting page.  Therefore a concurrent migration
    of the device private page can free the page and possibly the underlying
    pgmap.  This results in a race which can crash the kernel due to the
    migrate_to_ram() function pointer becoming invalid.  It also means drivers
    can't reliably read the zone_device_data field because the page may have
    been freed with memunmap_pages().
    
    Close the race by getting a reference on the page while holding the ptl to
    ensure it has not been freed.  Unfortunately the elevated reference count
    will cause the migration required to handle the fault to fail.  To avoid
    this failure pass the faulting page into the migrate_vma functions so that
    if an elevated reference count is found it can be checked to see if it's
    expected or not.
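
    The two halves of the fix can be illustrated with a small userspace
    model.  This is only a sketch under stated assumptions, not the kernel
    code: struct model_page, struct model_pte, fault_get_page(),
    model_put_page() and migrate_can_proceed() are hypothetical names, a
    pthread mutex stands in for the ptl, and the page is assumed to hold
    exactly one reference for its mapping.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: only the reference count. */
struct model_page {
	int refcount;	/* modified only under the ptl in this model */
};

/* Hypothetical stand-in for a page table entry and the lock guarding it. */
struct model_pte {
	pthread_mutex_t ptl;
	struct model_page *page;	/* NULL once the entry no longer maps the page */
};

/*
 * Fault path: take the reference while holding the ptl, because that is
 * the only point where the entry is known to still map a live page.
 */
static struct model_page *fault_get_page(struct model_pte *pte)
{
	struct model_page *page;

	pthread_mutex_lock(&pte->ptl);
	page = pte->page;
	if (page)
		page->refcount++;
	pthread_mutex_unlock(&pte->ptl);

	return page;
}

/* Drop the reference taken by fault_get_page(). */
static void model_put_page(struct model_page *page)
{
	assert(page->refcount > 0);
	page->refcount--;
}

/*
 * Migration check: one reference is expected for the mapping, plus one
 * more if this page is the faulting page pinned by the caller.  Any other
 * elevated count is treated as unexpected and blocks the migration.
 */
static bool migrate_can_proceed(const struct model_page *page,
				const struct model_page *fault_page)
{
	int expected = 1 + (page == fault_page ? 1 : 0);

	return page->refcount == expected;
}
```

    In this model, a migration that is not told about the faulting page
    sees the extra reference and refuses to proceed, while passing the
    faulting page in lets the same check succeed.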
    
    [mpe@ellerman.id.au: fix build]
      Link: https://lkml.kernel.org/r/87fsgbf3gh.fsf@mpe.ellerman.id.au
    Link: https://lkml.kernel.org/r/cover.60659b549d8509ddecafad4f498ee7f03bb23c69.1664366292.git-series.apopple@nvidia.com
    Link: https://lkml.kernel.org/r/d3e813178a59e565e8d78d9b9a4e2562f6494f90.1664366292.git-series.apopple@nvidia.com
    Signed-off-by: Alistair Popple <apopple@nvidia.com>
    Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
    Cc: Jason Gunthorpe <jgg@nvidia.com>
    Cc: John Hubbard <jhubbard@nvidia.com>
    Cc: Ralph Campbell <rcampbell@nvidia.com>
    Cc: Michael Ellerman <mpe@ellerman.id.au>
    Cc: Lyude Paul <lyude@redhat.com>
    Cc: Alex Deucher <alexander.deucher@amd.com>
    Cc: Alex Sierra <alex.sierra@amd.com>
    Cc: Ben Skeggs <bskeggs@redhat.com>
    Cc: Christian König <christian.koenig@amd.com>
    Cc: Dan Williams <dan.j.williams@intel.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: "Huang, Ying" <ying.huang@intel.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Yang Shi <shy828301@gmail.com>
    Cc: Zi Yan <ziy@nvidia.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>