Btrfs: more efficient extent state insertions
Currently we do 2 traversals of an inode's extent_io_tree before
inserting an extent state structure: 1 to see if a matching extent
state already exists and 1 to do the insertion if the first traversal
did not find such an extent state.

This change just combines those tree traversals into a single one.

While running sysbench tests (random writes) I captured the number of
elements in extent_io_tree trees for a while (into a procfs file backed
by a seq_list from the seq_file module) and got this histogram:

Count: 9310
Range: 51.000 - 21386.000; Mean: 11785.243; Median: 18743.500; Stddev: 8923.688
Percentiles: 90th: 20985.000; 95th: 21155.000; 99th: 21369.000

       51.000 -    93.933:   693 ########
       93.933 -   172.314:   938 ##########
      172.314 -   315.408:   856 #########
      315.408 -   576.646:    95 #
      576.646 -  6415.830:   888 ##########
     6415.830 - 11713.809:  1024 ###########
    11713.809 - 21386.000:  4816 #####################################################

So traversing such trees can take a significant amount of time, and the
second traversal can easily be avoided.

Ran the following sysbench tests, 5 times each, for sequential and
random writes, and got the following results:

    sysbench --test=fileio --file-num=1 --file-total-size=2G \
        --file-test-mode=seqwr --num-threads=16 --file-block-size=65536 \
        --max-requests=0 --max-time=60 --file-io-mode=sync

    sysbench --test=fileio --file-num=1 --file-total-size=2G \
        --file-test-mode=rndwr --num-threads=16 --file-block-size=65536 \
        --max-requests=0 --max-time=60 --file-io-mode=sync

Before this change:

    sequential writes: 69.28Mb/sec (average of 5 runs)
    random writes:      4.14Mb/sec (average of 5 runs)

After this change:

    sequential writes: 69.91Mb/sec (average of 5 runs)
    random writes:      5.69Mb/sec (average of 5 runs)

Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
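
The idea of folding the lookup and the insertion into one descent can be sketched in plain C. This is only an illustration and not the code from this change: it uses a simple unbalanced binary search tree instead of the kernel's red-black tree, and the names range_node, search_for_insert, insert_range and nodes_visited are hypothetical. The point it shows is that the search remembers the empty link slot where it stopped, so a miss can be turned into an insertion without walking the tree again.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an extent state: a closed range [start, end]. */
struct range_node {
	unsigned long start, end;
	struct range_node *left, *right;
};

/* Counts nodes touched by searches, only to make the saving observable. */
static unsigned long nodes_visited;

/*
 * Single descent. Returns the node whose range contains @offset, or NULL.
 * On a miss, *link_ret is set to the empty child slot where a node
 * containing @offset would have to be linked.
 */
static struct range_node *search_for_insert(struct range_node **root,
					    unsigned long offset,
					    struct range_node ***link_ret)
{
	struct range_node **link = root;

	while (*link) {
		struct range_node *n = *link;

		nodes_visited++;
		if (offset < n->start)
			link = &n->left;
		else if (offset > n->end)
			link = &n->right;
		else
			return n;	/* a matching range already exists */
	}
	*link_ret = link;
	return NULL;
}

/*
 * Returns 1 if a new node was linked, 0 if an existing range already
 * contains @start, and -1 if the allocation failed.
 */
static int insert_range(struct range_node **root,
			unsigned long start, unsigned long end)
{
	struct range_node **link;
	struct range_node *n;

	assert(start <= end);

	if (search_for_insert(root, start, &link))
		return 0;

	n = calloc(1, sizeof(*n));
	if (!n)
		return -1;
	n->start = start;
	n->end = end;
	*link = n;	/* reuse the slot found above: no second traversal */
	return 1;
}
```

With a balanced tree the same pattern applies, except that the new node must also be rebalanced after it is linked; in the kernel's rbtree API that is what rb_link_node() followed by rb_insert_color() does. With tens of thousands of entries per tree, as in the histogram above, skipping the repeated descent saves a full root-to-leaf walk on every insertion.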