    md/raid1: read balance chooses idlest disk for SSD · 9dedf603
    Shaohua Li authored
    An SSD has no spindle, so the distance between requests means nothing. The
    original distance-based algorithm can sometimes cause severe performance
    issues for SSD raid.
    
    Consider two thread groups, one accessing file A and the other accessing
    file B. The first group will access one disk and the second will access the
    other disk, because requests are close together within a group and far apart
    between groups. In this case, read balance might keep one disk very busy
    while the other stays relatively idle. For SSD, we should try our best to
    distribute requests across as many disks as possible. There is no spindle
    move penalty anyway.
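
    The scenario above can be sketched in user space as follows. This is only
    an illustrative model, not the code in raid1.c: struct mirror and the
    functions pick_closest() and pick_idlest() are hypothetical names, and the
    pending count stands in for however the driver measures how busy a disk is.

```c
#include <limits.h>
#include <stdlib.h>

/* Hypothetical user-space model of one mirror; not the kernel's struct. */
struct mirror {
	long long head_position;	/* sector where the last request ended */
	int nr_pending;			/* requests currently in flight */
};

/* Distance-based choice: the disk whose head is nearest to the request. */
static int pick_closest(const struct mirror *m, int n, long long sector)
{
	int best = -1;
	long long best_dist = LLONG_MAX;

	for (int i = 0; i < n; i++) {
		long long dist = llabs(sector - m[i].head_position);

		if (dist < best_dist) {
			best_dist = dist;
			best = i;
		}
	}
	return best;
}

/* Idlest-disk choice: the disk with the fewest requests in flight. */
static int pick_idlest(const struct mirror *m, int n)
{
	int best = -1;
	int min_pending = INT_MAX;

	for (int i = 0; i < n; i++) {
		if (m[i].nr_pending < min_pending) {
			min_pending = m[i].nr_pending;
			best = i;
		}
	}
	return best;
}
```

    With one mirror busy near the requested sector and the other idle but far
    away, pick_closest() keeps loading the busy mirror while pick_idlest()
    moves the request to the idle one.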
    
    With the patch below, I sometimes see more than a 50% throughput
    improvement, depending on the workload.
    
    The only exception is small requests that can be merged into a big request,
    which typically drives higher throughput for SSD too. Such small requests
    are sequential reads. Unlike on a hard disk, sequential reads that can't be
    merged (for example direct IO, or reads without readahead) can be ignored
    for SSD. Again, there is no spindle move penalty. Readahead dispatches small
    requests, and such requests can be merged.
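
    A minimal sketch of how such a sequential read could be recognized,
    assuming each disk remembers the sector right after its last read. The
    struct mirror_seq and both helpers are hypothetical and only model the
    idea; they are not taken from raid1.c.

```c
#include <stdbool.h>

/* Hypothetical per-disk state for sequential read detection. */
struct mirror_seq {
	long long next_seq_sect;	/* sector right after the last read */
};

/* A read continues the stream if it starts where the last one ended. */
static bool continues_sequential_read(const struct mirror_seq *m,
				      long long sector)
{
	return m->next_seq_sect == sector;
}

/* Remember where the next sequential read on this disk would start. */
static void record_read(struct mirror_seq *m, long long sector,
			long long nr_sectors)
{
	m->next_seq_sect = sector + nr_sectors;
}
```

    A read that continues the stream is a merge candidate and is worth keeping
    on the same disk. Any other read can go to whichever disk is idlest.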
    
    The previous patch helps detect sequential reads well, at least if the
    number of concurrent reads isn't greater than the number of raid disks.
    When it is greater, the distance-based algorithm doesn't work well either.
    
    V2: For a raid that mixes hard disks and SSDs, don't use the distance-based
    algorithm for random IO either. This makes the algorithm generic for raid
    with SSD.
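
    The resulting policy can be sketched as below. This is a simplified
    user-space model under stated assumptions, not the actual read balance
    code: struct mirror_info, its fields and choose_read_disk() are
    hypothetical names, and details such as request size limits and faulty
    disks are left out.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical model of one mirror in a possibly mixed HDD/SSD array. */
struct mirror_info {
	long long head_position;	/* sector where the last request ended */
	long long next_seq_sect;	/* sector right after the last read */
	int nr_pending;			/* requests currently in flight */
	bool nonrot;			/* true for SSD, false for hard disk */
};

static int choose_read_disk(const struct mirror_info *m, int n,
			    long long sector)
{
	int best_dist_disk = -1, best_pending_disk = -1;
	long long best_dist = LLONG_MAX;
	int min_pending = INT_MAX;
	bool has_nonrot_disk = false;

	for (int i = 0; i < n; i++) {
		long long dist = llabs(sector - m[i].head_position);

		/* Sequential read: stay on this disk so requests can merge. */
		if (m[i].next_seq_sect == sector)
			return i;

		if (m[i].nonrot)
			has_nonrot_disk = true;

		if (m[i].nr_pending < min_pending) {
			min_pending = m[i].nr_pending;
			best_pending_disk = i;
		}
		if (dist < best_dist) {
			best_dist = dist;
			best_dist_disk = i;
		}
	}

	/* Any SSD in the array: prefer the idlest disk for random IO. */
	return has_nonrot_disk ? best_pending_disk : best_dist_disk;
}
```

    In this model, an array of only hard disks still gets the distance-based
    choice, while a single SSD in the array is enough to switch random IO to
    the idlest disk.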
    Signed-off-by: Shaohua Li <shli@fusionio.com>
    Signed-off-by: NeilBrown <neilb@suse.de>
raid1.c 79.6 KB