    mm: swap: properly update readahead statistics in unuse_pte_range() · ebc5951e
    Andrea Righi authored
    In unuse_pte_range() we blindly swap in pages without checking whether
    the swap entry is already present in the swap cache.
    
    By doing this, the hit/miss ratio used by the swap readahead heuristic
    is not properly updated and this leads to non-optimal performance during
    swapoff.
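
The effect described above can be illustrated with a small user-space toy
model. This is only a sketch under stated assumptions, not the kernel code:
the names toy_ra, toy_next_window() and toy_swapoff() are hypothetical, and
the window rule (round hits + 2 up to a power of two, cap at 8, never drop
below half the previous window) is a simplification chosen for illustration.

```c
#include <assert.h>

#define TOY_MAX_WIN 8u  /* assumed maximum readahead window, in pages */

/* Hypothetical readahead state: hits seen since the last swap-in. */
struct toy_ra {
    unsigned int hits;
    unsigned int prev_win;
};

static unsigned int toy_roundup_pow2(unsigned int n)
{
    unsigned int p = 1;

    while (p < n)
        p <<= 1;
    return p;
}

/* Pick the next window from the recorded hits, then reset the counter. */
static unsigned int toy_next_window(struct toy_ra *ra)
{
    unsigned int win = toy_roundup_pow2(ra->hits + 2);

    if (win > TOY_MAX_WIN)
        win = TOY_MAX_WIN;
    if (win < ra->prev_win / 2)
        win = ra->prev_win / 2;
    ra->hits = 0;
    ra->prev_win = win;
    return win;
}

/*
 * Walk npages sequential swap entries and return the average window size
 * chosen per swap-in call.
 * check_cache == 0: always swap in, so no hit is ever recorded.
 * check_cache != 0: look in the (simulated) cache first; a page that an
 * earlier readahead already brought in counts as a hit and is skipped.
 */
static double toy_swapoff(unsigned int npages, int check_cache)
{
    struct toy_ra ra = { 0, 0 };
    unsigned int cached_until = 0;  /* entries below this index are cached */
    unsigned long calls = 0, total = 0;

    assert(npages > 0);
    for (unsigned int i = 0; i < npages; i++) {
        if (check_cache && i < cached_until) {
            ra.hits++;
            continue;
        }
        unsigned int win = toy_next_window(&ra);

        calls++;
        total += win;
        if (i + win > cached_until)
            cached_until = i + win;
    }
    return (double)total / (double)calls;
}
```

In this model toy_swapoff(1000, 0) stays at a window of 2 for every call,
while toy_swapoff(1000, 1) quickly grows to the maximum window. The real
heuristic is more involved, so only the trend carries over.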
    
    Tracing the distribution of the readahead size returned by the swap
    readahead heuristic during swapoff shows that a small readahead size is
    used most of the time as if we had only misses (this happens both with
    cluster and vma readahead), for example:
    
    r::swapin_nr_pages(unsigned long offset):unsigned long:$retval
            COUNT      EVENT
            36948      $retval = 8
            44151      $retval = 4
            49290      $retval = 1
            527771     $retval = 2
    
    Checking whether the swap entry is already present in the swap cache
    before swapping it in, instead, lets the readahead statistics be updated
    properly, so the heuristic selects a bigger readahead size during swapoff:
    
    r::swapin_nr_pages(unsigned long offset):unsigned long:$retval
            COUNT      EVENT
            1618       $retval = 1
            4960       $retval = 2
            41315      $retval = 4
            103521     $retval = 8
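
To compare the two traces with a single number, one can compute the average
readahead size weighted by the event counts. The sketch below does that with
the counts copied from the two histograms in this changelog; the struct and
function names are hypothetical helpers, not part of the patch.

```c
#include <assert.h>
#include <stddef.h>

/* One histogram row: how many times a given readahead size was returned. */
struct ra_bucket {
    unsigned long count;
    unsigned long size;
};

/* Counts from the trace taken without the swap cache check. */
static const struct ra_bucket ra_before[] = {
    { 36948, 8 }, { 44151, 4 }, { 49290, 1 }, { 527771, 2 },
};

/* Counts from the trace taken with the swap cache check. */
static const struct ra_bucket ra_after[] = {
    { 1618, 1 }, { 4960, 2 }, { 41315, 4 }, { 103521, 8 },
};

/* Count-weighted mean readahead size over n buckets. */
static double ra_weighted_mean(const struct ra_bucket *b, size_t n)
{
    unsigned long events = 0, pages = 0;

    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        events += b[i].count;
        pages += b[i].count * b[i].size;
    }
    assert(events > 0);
    return (double)pages / (double)events;
}
```

With these counts, ra_weighted_mean(ra_before, 4) is roughly 2.40 pages per
call and ra_weighted_mean(ra_after, 4) is roughly 6.64 pages per call.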
    
    In terms of swapoff performance the result is the following:
    
    Testing environment
    ===================
    
     - Host:
       CPU: 1.8GHz Intel Core i7-8565U (quad-core, 8MB cache)
       HDD: PC401 NVMe SK hynix 512GB
       MEM: 16GB
    
     - Guest (kvm):
       8GB of RAM
       virtio block driver
       16GB swap file on ext4 (/swapfile)
    
    Test case
    =========
     - allocate 85% of memory
     - `systemctl hibernate` to force all the pages to be swapped out to the
       swap file
     - resume the system
     - measure the time that swapoff takes to complete:
       # /usr/bin/time swapoff /swapfile
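
The changelog does not say how the 85% of memory was allocated. One possible
way, shown purely as an assumption, is a small C helper that sizes a buffer
from the amount of physical memory and writes to every page so the pages are
really backed. All function names below are hypothetical, and
_SC_PHYS_PAGES is a Linux/glibc extension rather than a POSIX guarantee.

```c
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Bytes corresponding to `percent` of `total`. */
static size_t bytes_for_percent(size_t total, unsigned int percent)
{
    assert(percent <= 100);
    return total / 100 * percent;
}

/* Physical memory in bytes, or 0 if it cannot be determined. */
static size_t phys_mem_bytes(void)
{
    long pages = sysconf(_SC_PHYS_PAGES);
    long psize = sysconf(_SC_PAGESIZE);

    if (pages <= 0 || psize <= 0)
        return 0;
    return (size_t)pages * (size_t)psize;
}

/*
 * Allocate `bytes` and write one byte per page so that every page is
 * actually populated. Returns NULL on allocation failure.
 */
static unsigned char *alloc_and_touch(size_t bytes, size_t page_size)
{
    assert(page_size > 0);
    unsigned char *buf = malloc(bytes);

    if (!buf)
        return NULL;
    for (size_t off = 0; off < bytes; off += page_size)
        buf[off] = 1;
    return buf;
}
```

A test program would call
alloc_and_touch(bytes_for_percent(phys_mem_bytes(), 85), page_size) and keep
the buffer alive across the hibernate/resume cycle before running swapoff.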
    
    Result (swapoff time)
    =====================
                      5.6 vanilla   5.6 w/ this patch
                      -----------   -----------------
    cluster-readahead      22.09s              12.19s
        vma-readahead      18.20s              15.33s
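
The relative gain can be derived from the timings in the table. The helpers
below are a minimal sketch with hypothetical names; the only inputs are the
four swapoff times reported above.

```c
#include <assert.h>

/* Speedup factor: how many times faster the patched run is. */
static double speedup(double vanilla_s, double patched_s)
{
    assert(patched_s > 0.0);
    return vanilla_s / patched_s;
}

/* Percentage of swapoff time saved by the patch. */
static double time_saved_pct(double vanilla_s, double patched_s)
{
    assert(vanilla_s > 0.0);
    return (vanilla_s - patched_s) / vanilla_s * 100.0;
}
```

For cluster readahead, speedup(22.09, 12.19) is about 1.81x (roughly 45% less
time); for vma readahead, speedup(18.20, 15.33) is about 1.19x (roughly 16%
less time).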
    
    Conclusion
    ==========
    
    The specific use case this patch is addressing is to improve swapoff
    performance in cloud environments when a VM has been hibernated, resumed
    and all the memory needs to be forced back to RAM by disabling swap.
    
    This change makes better use of the readahead heuristic during swapoff,
    which speeds up the resume process of such VMs.
    
    [andrea.righi@canonical.com: update changelog]
      Link: http://lkml.kernel.org/r/20200418084705.GA147642@xps-13
    Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
    Cc: Minchan Kim <minchan@kernel.org>
    Cc: Anchal Agarwal <anchalag@amazon.com>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Vineeth Remanan Pillai <vpillai@digitalocean.com>
    Cc: Kelley Nielsen <kelleynnn@gmail.com>
    Link: http://lkml.kernel.org/r/20200416180132.GB3352@xps-13
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>