• Michael Ellerman's avatar
    powerpc/slb: Force a full SLB flush when we insert for a bad EA · 668e0da8
    Michael Ellerman authored
    [Note this patch is not upstream. The bug fix was fixed differently in
    upstream prior to the bug being identified.]
    
    The SLB miss handler calls slb_allocate_realmode() in order to create an
    SLB entry for the faulting address. At the very start of that function
    we check that the faulting Effective Address (EA) is less than
    PGTABLE_RANGE (ignoring the region), ie. is it an address which could
    possibly fit in the virtual address space.
    
    For an EA which fails that test, we branch out of line (to label 8), but
    we still go on to create an SLB entry for the address. The SLB entry we
    create has a VSID of 0, which means it will never match anything in the
    hash table and so can't actually translate to a physical address.
    
    However that SLB entry will be inserted in the SLB, and so needs to be
    managed properly like any other SLB entry. In particular we need to
    insert the SLB entry in the SLB cache, so that it will be flushed when
    the process is descheduled.
    
    And that is where the bugs begin. The first bug is that slb_finish_load()
    uses cr7 to decide if it should insert the SLB entry into the SLB cache.
    When we come from the invalid EA case we don't set cr7, it just has some
    junk value from userspace. So we may or may not insert the SLB entry in
    the SLB cache. If we fail to insert it, we may then incorrectly leave it
    in the SLB when the process is descheduled.
    
    The second bug is that even if we do happen to add the entry to the SLB
    cache, we do not have enough bits in the SLB cache to remember the full
    ESID value for very large EAs.
    
    For example if a process branches to 0x788c545a18000000, that results in
    a 256MB SLB entry with an ESID of 0x788c545a1. But each entry in the SLB
    cache is only 32-bits, meaning we truncate the ESID to 0x88c545a1. This
    has the same effect as the first bug, we incorrectly leave the SLB entry
    in the SLB when the process is descheduled.
    
    When a process accesses an invalid EA it results in a SEGV signal being
    sent to the process, which typically results in the process being
    killed. Process death isn't instantaneous however, the process may catch
    the SEGV signal and continue somehow, or the kernel may start writing a
    core dump for the process, either of which means it's possible for the
    process to be preempted while its processing the SEGV but before it's
    been killed.
    
    If that happens, when the process is scheduled back onto the CPU we will
    allocate a new SLB entry for the NIP, which will insert a second entry
    into the SLB for the bad EA. Because we never flushed the original
    entry, due to either bug one or two, we now have two SLB entries that
    match the same EA.
    
    If another access is made to that EA, either by the process continuing
    after catching the SEGV, or by a second process accessing the same bad
    EA on the same CPU, we will trigger an SLB multi-hit machine check
    exception. This has been observed happening in the wild.
    
    The fix is when we hit the invalid EA case, we mark the SLB cache as
    being full. This causes us to not insert the truncated ESID into the SLB
    cache, and means when the process is switched out we will flush the
    entire SLB. Note that this works both for the original fault and for a
    subsequent call to slb_allocate_realmode() from switch_slb().
    
    Because we mark the SLB cache as full, it doesn't really matter what
    value is in cr7, but rather than leaving it as something random we set
    it to indicate the address was a kernel address. That also skips the
    attempt to insert it in the SLB cache which is a nice side effect.
    
    Another way to fix the bug would be to make the entries in the SLB cache
    wider, so that we don't truncate the ESID. However this would be a more
    intrusive change as it alters the size and layout of the paca.
    
    This bug was fixed in upstream by commit f0f558b1 ("powerpc/mm:
    Preserve CFAR value on SLB miss caused by access to bogus address"),
    which changed the way we handle a bad EA entirely removing this bug in
    the process.
    Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
    Reviewed-by: default avatarPaul Mackerras <paulus@ozlabs.org>
    Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
    668e0da8
slb_low.S 9.05 KB