    cxl: Fix possible deadlock when processing page faults from cxllib · ad7b4e80
    Frederic Barrat authored
    cxllib_handle_fault() is called by an external driver when it needs to
    have the host resolve page faults for a buffer. The buffer can cover
    several pages and VMAs. The function iterates over all the pages used
    by the buffer, based on the page size of the VMA.
    
    To ensure some stability while processing the faults, the thread T1
    grabs the mm->mmap_sem semaphore with read access (R1). However, when
    processing a page fault for a single page, one of the underlying
    functions, copro_handle_mm_fault(), also grabs the same semaphore with
    read access (R2). So the thread T1 takes the semaphore twice.
    
    If another thread T2 tries to take the semaphore in write mode (W1),
    say because it wants to allocate memory and calls 'brk', then T2 has
    to wait because there is already a reader (R1). If the thread T1 then
    starts processing a new page, it is not granted R2 automatically,
    because a writer (T2) is now waiting ahead of it. T1 waits for T2 and
    T2 waits for T1: we have a deadlock.
    
    The timeline is:
    1. thread T1 owns the semaphore with read access R1
    2. thread T2 requests write access W1 and waits
    3. thread T1 requests read access R2 and waits
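
The three steps above can be replayed with a small userspace model. This is only a sketch, not kernel code: `model_rwsem`, `model_try_read()`, `model_try_write()` and `model_nested_read_deadlocks()` are hypothetical names, and the model assumes what the message describes, namely that a new reader is not granted access once a writer is queued.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of a reader/writer semaphore that makes new
 * readers wait once a writer is queued. This is not the kernel's rwsem. */
struct model_rwsem {
    int readers;          /* read holds currently granted */
    bool writer_active;   /* a writer holds the semaphore */
    bool writer_waiting;  /* a writer is queued behind the readers */
};

/* Returns true if read access is granted, false if the caller would block. */
static bool model_try_read(struct model_rwsem *s)
{
    if (s->writer_active || s->writer_waiting)
        return false;
    s->readers++;
    return true;
}

/* Returns true if write access is granted; otherwise queues the writer. */
static bool model_try_write(struct model_rwsem *s)
{
    if (s->readers > 0 || s->writer_active) {
        s->writer_waiting = true;
        return false;
    }
    s->writer_waiting = false;
    s->writer_active = true;
    return true;
}

/* Replays the timeline; returns true if both threads end up blocked. */
static bool model_nested_read_deadlocks(void)
{
    struct model_rwsem sem = { 0, false, false };

    bool r1 = model_try_read(&sem);   /* 1. T1 takes R1 */
    bool w1 = model_try_write(&sem);  /* 2. T2 requests W1 and must wait */
    bool r2 = model_try_read(&sem);   /* 3. T1 requests R2 and must wait */

    /* Only R1 was ever granted, and T1 cannot release it while stuck at R2. */
    assert(sem.readers == 1);

    return r1 && !w1 && !r2;
}
```

Under these assumptions `model_nested_read_deadlocks()` returns true: T1 holds R1 but is blocked at R2, while T2 stays queued behind R1.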
    
    The fix is for the thread T1 to release the semaphore (R1) as soon as
    it has the information it needs from the current VMA, so that it never
    holds R1 while requesting R2. The address space/VMAs could change
    while T1 iterates over the full buffer, but in the unlikely case where
    T1 misses a page, the external driver will raise a new page fault when
    retrying the memory access.
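
The locking pattern of the fix can be sketched in userspace as follows. This is an illustration under stated assumptions, not the code of the patch: every `fix_*` name is hypothetical, the page size is assumed to be a fixed 4096 bytes, and the "semaphore" is just a counter recording how many read holds T1 has at once.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for mm->mmap_sem, tracking only T1's read holds. */
struct fix_rwsem {
    int t1_read_holds;      /* read holds T1 currently has */
    int max_t1_read_holds;  /* highest value seen so far */
};

static void fix_down_read(struct fix_rwsem *s)
{
    s->t1_read_holds++;
    if (s->t1_read_holds > s->max_t1_read_holds)
        s->max_t1_read_holds = s->t1_read_holds;
}

static void fix_up_read(struct fix_rwsem *s)
{
    assert(s->t1_read_holds > 0);
    s->t1_read_holds--;
}

/* Stand-in for the VMA lookup: take R1, read what is needed, drop R1. */
static size_t fix_lookup_page_size(struct fix_rwsem *s, size_t addr)
{
    size_t page_size;

    (void)addr;          /* a real lookup would find the VMA for addr */
    fix_down_read(s);    /* R1 */
    page_size = 4096;    /* assumption: one fixed page size */
    fix_up_read(s);      /* released before any fault is handled */
    return page_size;
}

/* Stand-in for copro_handle_mm_fault(), which takes the semaphore itself. */
static void fix_handle_one_fault(struct fix_rwsem *s, size_t addr)
{
    (void)addr;
    fix_down_read(s);    /* R2, no longer nested inside R1 */
    fix_up_read(s);
}

/* Walks the buffer page by page; returns the number of faults handled. */
static size_t fix_handle_buffer(struct fix_rwsem *s, size_t addr, size_t size)
{
    size_t end = addr + size;
    size_t page_size = fix_lookup_page_size(s, addr);
    size_t dar = addr & ~(page_size - 1);  /* align down to a page */
    size_t pages = 0;

    while (dar < end) {
        page_size = fix_lookup_page_size(s, dar);
        fix_handle_one_fault(s, dar);
        pages++;
        dar += page_size;
    }
    return pages;
}
```

In this sketch T1 never holds more than one read hold at a time, so a writer queued between the lookup and the fault handling only delays T1 instead of blocking it forever.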
    
    Fixes: 3ced8d73 ("cxl: Export library to support IBM XSL")
    Cc: stable@vger.kernel.org # 4.13+
    Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
cxllib.c 6.7 KB