    powerpc/64: Implement clear_bit_unlock_is_negative_byte() · d11914b2
    Nicholas Piggin authored
    Commit b91e1302 ("mm: optimize PageWaiters bit use for
    unlock_page()") added a special bitop function to speed up
    unlock_page(). Implement this for 64-bit powerpc.
    
    This improves the unlock_page() core code from this:
    
    	li	9,1
    	lwsync
    1:	ldarx	10,0,3,0
    	andc	10,10,9
    	stdcx.	10,0,3
    	bne-	1b
    	ori	2,2,0
    	ld	9,0(3)
    	andi.	10,9,0x80
    	beqlr
    	li	4,0
    	b	wake_up_page_bit
    
    To this:
    
    	li	10,1
    	lwsync
    1:	ldarx	9,0,3,0
    	andc	9,9,10
    	stdcx.	9,0,3
    	bne-	1b
    	andi.	10,9,0x80
    	beqlr
    	li	4,0
    	b	wake_up_page_bit
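    
    The difference between the two listings can be sketched in portable C.
    This is only an illustration using C11 atomics, not the kernel
    implementation: the names unlock_two_step(), unlock_fused(),
    SKETCH_LOCK_BIT and SKETCH_WAITERS_BIT are hypothetical, and the bit
    positions are assumed from the "li ...,1" and "andi. ...,0x80"
    instructions above. C11 memory orders also do not map exactly onto the
    kernel's barrier semantics.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit positions, assumed from the assembly listings:
 * bit 0 is the lock bit being cleared, bit 7 (0x80) is the waiters bit. */
#define SKETCH_LOCK_BIT    0
#define SKETCH_WAITERS_BIT 7

/* Two-step form: release-clear the lock bit, then load the word again
 * to test the waiters bit (the separate "ld" in the first listing). */
static bool unlock_two_step(_Atomic unsigned long *flags)
{
	atomic_fetch_and_explicit(flags, ~(1UL << SKETCH_LOCK_BIT),
				  memory_order_release);
	unsigned long now = atomic_load_explicit(flags, memory_order_relaxed);
	return (now >> SKETCH_WAITERS_BIT) & 1UL;
}

/* Fused form: test the waiters bit in the value the atomic update
 * already fetched, so no second load of the word is needed. */
static bool unlock_fused(_Atomic unsigned long *flags)
{
	unsigned long old = atomic_fetch_and_explicit(flags,
			~(1UL << SKETCH_LOCK_BIT), memory_order_release);
	return (old >> SKETCH_WAITERS_BIT) & 1UL;
}
```

    In both forms a true result corresponds to the branch to
    wake_up_page_bit. The fused form takes its answer from the word the
    atomic update already fetched, which is why the second listing has no
    separate reload.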
    
    In a test of elapsed time for dd writing into 16GB of already-dirty
    pagecache on a POWER8 with 4K pages (one unlock_page() per 4kB
    written), this patch reduced elapsed time by about 1.1%:
    
        N           Min           Max        Median           Avg        Stddev
    x  19         2.578         2.619         2.594         2.595         0.011
    +  19         2.552         2.592         2.564         2.565         0.008
    Difference at 95.0% confidence
    	-0.030  +/- 0.006
    	-1.142% +/- 0.243%
    
    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
    [mpe: Made 64-bit only until I can test it properly on 32-bit]
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>