    arm64: Better optimised memchr() · 9e51cafd
    Robin Murphy authored
    Although we implement our own assembly version of memchr(), it turns
    out to be barely any better than what GCC can generate for the generic
    C version (and would go wrong if the size_t argument were ever large
    enough to be interpreted as negative). Unfortunately we can't import the
    tuned implementation from the Arm optimized-routines library, since that
    has some Advanced SIMD parts which are not really viable for general
    kernel library code. What we can do, however, is pep things up with some
    relatively straightforward word-at-a-time logic for larger calls.
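
    As a rough illustration of what word-at-a-time logic means here, the
    following is a minimal C sketch, not the assembly in memchr.S. The name
    memchr_words() and the 8-byte word size are assumptions made for the
    example. It XORs each word with the target byte repeated eight times, so
    a matching byte becomes zero, then uses the well-known
    (w - 0x01..01) & ~w & 0x80..80 test to check for a zero byte in the word.
    Any word that tests positive is handed to a plain byte loop, which keeps
    the sketch independent of endianness.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ONES  ((uint64_t)0x0101010101010101ULL)
#define HIGHS ((uint64_t)0x8080808080808080ULL)

/* Hypothetical name; a sketch of the technique, not the kernel's code. */
void *memchr_words(const void *s, int c, size_t n)
{
    const unsigned char *p = s;
    unsigned char ch = (unsigned char)c;
    uint64_t pattern = ONES * ch;   /* target byte copied into every lane */

    /* Byte-at-a-time until the pointer is 8-byte aligned. */
    while (n && ((uintptr_t)p & 7)) {
        if (*p == ch)
            return (void *)p;
        p++;
        n--;
    }

    /* Skip whole words that cannot contain the target byte. */
    while (n >= 8) {
        uint64_t w;

        memcpy(&w, p, sizeof(w));   /* aligned load without aliasing issues */
        w ^= pattern;               /* matching bytes become 0x00 */
        if ((w - ONES) & ~w & HIGHS)
            break;                  /* some byte in this word matches */
        p += 8;
        n -= 8;
    }

    /* Locate the exact byte in the matching word, or scan the tail. */
    while (n) {
        if (*p == ch)
            return (void *)p;
        p++;
        n--;
    }
    return NULL;
}
```

    Because n is only ever handled as an unsigned size_t, a very large count
    is never treated as negative in this sketch.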
    
    With some timing added to optimized-routines' memchr() test as a simple
    benchmark, this version comes in at around half the speed of the SIMD
    code overall, but is still nearly 4x faster than our existing
    implementation.

    Signed-off-by: Robin Murphy <robin.murphy@arm.com>
    Link: https://lore.kernel.org/r/58471b42f9287e039dafa9e5e7035077152438fd.1622128527.git.robin.murphy@arm.com
    Signed-off-by: Will Deacon <will@kernel.org>
memchr.S 1.34 KB