    ext4: Speedup ext4 orphan inode handling · 02f310fc
    Jan Kara authored
    Ext4 orphan inode handling is a bottleneck for workloads which heavily
    truncate / unlink small files since it contends on the global
    s_orphan_mutex lock (and generally it's difficult to improve scalability
    of the on-disk linked list of orphaned inodes).
    
    This patch implements a new way of handling orphan inodes. Instead of
    linking an orphaned inode into a linked list, we store its inode number
    in a new special file which we call the "orphan file". Only if there is
    no more space in the orphan file (too many inodes are currently
    orphaned) do we fall back to the old-style linked list. Currently we
    protect operations on the orphan file with a spinlock for simplicity,
    but even in this setting we can substantially reduce the length of the
    critical section and thus speed up some workloads. In the next patch we
    improve this by making orphan handling lockless.
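
The slot-based scheme described above can be illustrated with a small user-space sketch. This is not the kernel code from this patch: the names (`struct orphan_file`, `orphan_file_add`, `orphan_file_del`, `ORPHAN_SLOTS`) are hypothetical, the slot array lives in memory rather than in file blocks, and a C11 `atomic_flag` spin loop stands in for the kernel spinlock. It only shows the idea that the critical section shrinks to claiming or clearing one slot, and that a full table signals the caller to fall back to the old list.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define ORPHAN_SLOTS 4  /* hypothetical capacity, tiny for illustration */

/* Hypothetical in-memory stand-in for the orphan file. */
struct orphan_file {
    atomic_flag lock;             /* stands in for the kernel spinlock */
    uint32_t slot[ORPHAN_SLOTS];  /* 0 = free (inode number 0 is never valid) */
};

static void orphan_file_init(struct orphan_file *of)
{
    atomic_flag_clear(&of->lock);
    memset(of->slot, 0, sizeof(of->slot));
}

static void of_lock(struct orphan_file *of)
{
    while (atomic_flag_test_and_set_explicit(&of->lock, memory_order_acquire))
        ;  /* spin until the flag was previously clear */
}

static void of_unlock(struct orphan_file *of)
{
    atomic_flag_clear_explicit(&of->lock, memory_order_release);
}

/*
 * Record an orphaned inode number in the first free slot.
 * Returns the slot index, or -1 when the table is full, in which case
 * the caller would fall back to the old-style linked list.
 */
static int orphan_file_add(struct orphan_file *of, uint32_t ino)
{
    int found = -1;

    assert(ino != 0);
    of_lock(of);
    for (int i = 0; i < ORPHAN_SLOTS; i++) {
        if (of->slot[i] == 0) {
            of->slot[i] = ino;
            found = i;
            break;
        }
    }
    of_unlock(of);
    return found;
}

/* Release a slot once the inode is no longer orphaned. */
static void orphan_file_del(struct orphan_file *of, int idx)
{
    assert(idx >= 0 && idx < ORPHAN_SLOTS);
    of_lock(of);
    of->slot[idx] = 0;
    of_unlock(of);
}
```

In this sketch, removing an entry needs only the slot index, so no list neighbours have to be touched while the lock is held.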
    
    Note that the change is backwards compatible when the filesystem is
    clean: the existence of the orphan file is a compat feature, and we set
    an additional ro-compat feature, indicating that the orphan file needs
    scanning for orphaned inodes, when mounting the filesystem read-write.
    This ro-compat feature gets cleared on unmount / remount read-only.
    
    Some performance data from an 80-CPU Xeon server with 512 GB of RAM,
    filesystem located on an SSD, average of 5 runs:
    
    stress-orphan (microbenchmark truncating files byte-by-byte from N
    processes in parallel)
    
    Threads Vanilla time    Patched time
      1       1.057200        0.945600
      2       1.680400        1.331800
      4       2.547000        1.995000
      8       7.049400        6.424200
     16      14.827800       14.937600
     32      40.948200       33.038200
     64      87.787400       60.823600
    128     206.504000      122.941400
    
    So we can see wins at nearly every thread count, reaching about 40%
    less time at 128 threads; only the 16-thread case is flat (within 1%
    of vanilla).
    
    Reviewed-by: Theodore Ts'o <tytso@mit.edu>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Link: https://lore.kernel.org/r/20210816095713.16537-3-jack@suse.cz
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>