    x86: rewrite '__copy_user_nocache' function · 034ff37d
    Linus Torvalds authored
    I didn't really want to do this, but as part of all the other changes to
    the user copy loops, I've been looking at this horror.
    
    I tried to clean it up multiple times, but every time I just found more
    problems, and the way it's written, it's just too hard to fix them.
    
    For example, the code is written to do quad-word alignment, and will use
    regular byte accesses to get to that point.  That's fairly simple, but
    it means that any initial 8-byte alignment will be done with cached
    copies.
    
    However, the code then is very careful to do any 4-byte _tail_ accesses
    using an uncached 4-byte write, and that was claimed to be relevant in
    commit a82eee74 ("x86/uaccess/64: Handle the caching of 4-byte
    nocache copies properly in __copy_user_nocache()").
    
    So if you do a 4-byte copy using that function, it carefully uses a
    4-byte 'movnti' for the destination.  But if you were to do a 12-byte
    copy that is 4-byte aligned, it would _not_ do a 4-byte 'movnti'
    followed by an 8-byte 'movnti' to keep it all uncached.
    
    Instead, it would align the destination to 8 bytes using a
    byte-at-a-time loop, and then do an 8-byte 'movnti' for the final 8
    bytes.
    
    The main caller that cares is __copy_user_flushcache(), which knows
    about this insanity, and has odd cases for it all.  But I just can't
    deal with looking at this kind of "it does one case right, and another
    related case entirely wrong".
    
    And the code really wasn't fixable without hard drugs, which I try to
    avoid.
    
    So instead, rewrite it in a form that hopefully not only gets this
    right, but is a bit more maintainable.  Knock wood.
    
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
copy_user_64.S 2.52 KB