    x86: fix performance regression in write() syscall · 30d697fa
    Salman Qazi authored
    While the introduction of __copy_from_user_nocache (see commit:
    0812a579) may have been an improvement for sufficiently large
    writes, there is evidence to show that it is detrimental for small
    writes.  Unixbench's fstime test gives the following results for
    256 byte writes with MAX_BLOCK of 2000:
    
        2.6.29-rc6 ( 5 samples, each in KB/sec ):
        283750, 295200, 294500, 293000, 293300
    
        2.6.29-rc6 + this patch (5 samples, each in KB/sec):
        313050, 3106750, 293350, 306300, 307900
    
        2.6.18
        395700, 342000, 399100, 366050, 359850
    
    See w_test() in src/fstime.c in unixbench version 4.1.0.  Basically,
    the above test consists of counting how much we can write in this
    manner:
    
        alarm(10);
        while (!sigalarm) {
                for (f_blocks = 0; f_blocks < 2000; ++f_blocks) {
                       write(f, buf, 256);
                }
                lseek(f, 0L, 0);
        }
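
    For anyone who wants to reproduce the measurement without the full
    unixbench harness, the loop above can be wrapped in a small
    standalone function.  This is only a sketch under assumptions, not
    the fstime.c code: the function name small_write_rate(), the
    on_alarm() handler, the caller-supplied path and the configurable
    duration are all hypothetical, and only the 256 byte buffer, the
    2000 block count and the alarm-driven loop come from the text above.

```c
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Set by the SIGALRM handler to stop the write loop. */
static volatile sig_atomic_t sigalarm;

static void on_alarm(int sig)
{
        (void)sig;
        sigalarm = 1;
}

/*
 * Hypothetical helper: write 256 byte blocks to 'path' for roughly
 * 'seconds' seconds, rewinding after every 2000 blocks, and return
 * the throughput in KB/sec (or -1 on error).
 */
long small_write_rate(const char *path, unsigned seconds)
{
        char buf[256];
        unsigned long long bytes = 0;
        int f_blocks, f, failed = 0;

        if (seconds == 0)
                return -1;

        f = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
        if (f < 0)
                return -1;

        memset(buf, 0, sizeof(buf));
        sigalarm = 0;
        signal(SIGALRM, on_alarm);
        alarm(seconds);

        while (!sigalarm && !failed) {
                for (f_blocks = 0; f_blocks < 2000; ++f_blocks) {
                        ssize_t n = write(f, buf, sizeof(buf));

                        if (n < 0) {
                                /* An interrupted write just ends the pass. */
                                if (errno != EINTR)
                                        failed = 1;
                                break;
                        }
                        bytes += (unsigned long long)n;
                }
                lseek(f, 0L, SEEK_SET);
        }

        alarm(0);       /* cancel any pending alarm on the error path */
        close(f);
        unlink(path);

        return failed ? -1 : (long)(bytes / 1024 / seconds);
}
```

    Calling small_write_rate() with a scratch file path and a duration
    of 10 mirrors the alarm(10) in the excerpt; the absolute numbers
    depend on the filesystem and machine, so they are only comparable
    between kernels on the same setup.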
    
    Note, there are other components to the write syscall regression
    that are not addressed here.
    
    Signed-off-by: Salman Qazi <sqazi@google.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
uaccess_64.h 6.07 KB