    fix pgd_lock deadlock · f4bced84
    Philipp Hahn authored
    commit a79e53d8 upstream.
    
    On Wednesday 16 February 2011 15:49:47 Andrea Arcangeli wrote:
    > Subject: fix pgd_lock deadlock
    >
    > From: Andrea Arcangeli <aarcange@redhat.com>
    >
    > It's forbidden to take the page_table_lock with the irq disabled or if
    > there's contention the IPIs (for tlb flushes) sent with the page_table_lock
    > held will never run leading to a deadlock.
    >
    > Apparently nobody takes the pgd_lock from irq so the _irqsave can be
    > removed.
    >
    > Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
    
    This patch (original commit Id for 2.6.38 a79e53d8)
    needs to be back-ported to 2.6.32.x as well.
    I observed a deadlock when running a PAE-enabled Debian 2.6.32.46+ kernel
    with 6 VCPUs as a KVM guest on a 2.6.32, 3.2 or 3.3 host kernel. It showed
    the following behaviour:
    
    1 VCPU is stuck in
      pgd_alloc() → pgd_prepopulate_pmd() → ... → flush_tlb_others_ipi()
    while (!cpumask_empty(to_cpumask(f->flush_cpumask)))
        cpu_relax();
    (gdb) print f->flush_cpumask
    $5 = {1}
    
    while all other VCPUs are stuck in
      pgd_alloc() → spin_lock_irqsave(pgd_lock)
    
    I tracked it down to the commit
     2.6.39-rc1: 4981d01e
     2.6.32.34: ba456fd7
     x86: Flush TLB if PGD entry is changed in i386 PAE mode
    which when reverted made the bug disappear.
    
    Comparing 3.2 to 2.6.32.34 showed that the 'pgd-deadlock'-patch went into
    2.6.38, that is before the 'PAE correctness'-patch, so the problem was
    probably never observed in the main development branch.
    But for 2.6.32 the 'pgd-deadlock' patch is still missing, so the 'PAE
    correctness'-patch made the problem worse with 2.6.32.
    
    The patch was also back-ported to the openSUSE kernel
    <http://kernel.opensuse.org/cgit/kernel-source/commit/?id=ac27c01aa880c65d17043ab87249c613ac4c3635>.
    Since the patch didn't apply cleanly on the current Debian kernel, I had to
    backport it for us and Debian. The patch is also available from our (German)
    Bugzilla <https://forge.univention.org/bugzilla/show_bug.cgi?id=26661> or
    from the Debian BTS at <http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=669335>.
    
    I have no easy test case, but running multiple parallel builds inside the VM
    normally triggers the bug within seconds to minutes. With the patch applied
    the VM survived a night building packages without any problem.
    Signed-off-by: Philipp Hahn <hahn@univention.de>
    
    Sincerely
    Philipp
    -
    Philipp Hahn           Open Source Software Engineer      hahn@univention.de
    Univention GmbH        be open.                       fon: +49 421 22 232- 0
    Mary-Somerville-Str.1  D-28359 Bremen                 fax: +49 421 22 232-99
                                                       http://www.univention.de/
    
    It's forbidden to take the page_table_lock with the irq disabled
    or if there's contention the IPIs (for tlb flushes) sent with
    the page_table_lock held will never run leading to a deadlock.
    
    Nobody takes the pgd_lock from irq context so the _irqsave can be
    removed.
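
The deadlock described above can be illustrated with a toy user-space model. This is only a sketch and not kernel code: `struct toy_cpu` and `flush_completes()` are hypothetical names invented for this illustration. One CPU holds the lock and waits for its TLB-flush IPI to be acknowledged, while the other CPU spins on the same lock either with interrupts disabled (as with `spin_lock_irqsave()`) or enabled (as with plain `spin_lock()`).

```c
#include <stdbool.h>

/* Toy model of one CPU; hypothetical, for illustration only. */
struct toy_cpu {
    bool irqs_enabled;  /* can this CPU service an incoming IPI? */
    bool holds_lock;    /* does this CPU currently own the lock? */
    bool ipi_pending;   /* has a flush IPI been sent but not yet acked? */
};

/*
 * Returns true if the waiter eventually gets the lock, false if the
 * two CPUs are still stuck after a bounded number of steps.
 */
static bool flush_completes(bool waiter_irqs_off)
{
    /* Holder owns the lock and has already sent a flush IPI to the waiter. */
    struct toy_cpu holder = { .irqs_enabled = true, .holds_lock = true,
                              .ipi_pending = false };
    struct toy_cpu waiter = { .irqs_enabled = !waiter_irqs_off,
                              .holds_lock = false, .ipi_pending = true };

    for (int step = 0; step < 100; step++) {
        /* The waiter can only acknowledge the IPI with interrupts enabled. */
        if (waiter.ipi_pending && waiter.irqs_enabled)
            waiter.ipi_pending = false;

        /* The holder spins until the IPI is acked, then drops the lock. */
        if (holder.holds_lock && !waiter.ipi_pending)
            holder.holds_lock = false;

        /* The waiter takes the lock as soon as it is free. */
        if (!holder.holds_lock) {
            waiter.holds_lock = true;
            return true;
        }
    }
    return false; /* neither CPU made progress: deadlock */
}
```

In this model `flush_completes(true)` returns false, because the IPI is never serviced and the lock is never released, while `flush_completes(false)` returns true on the first step. That is the reason the patch replaces the `_irqsave` variant with a plain spin lock.
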
    Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
    Acked-by: Rik van Riel <riel@redhat.com>
    Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: <stable@kernel.org>
    LKML-Reference: <201102162345.p1GNjMjm021738@imap1.linux-foundation.org>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
    Git-commit: a79e53d8
    Signed-off-by: Willy Tarreau <w@1wt.eu>