• Anatol Pomozov's avatar
    ext4: rate limit printk in buffer_io_error() · e8974c39
    Anatol Pomozov authored
    If there are a lot of outstanding buffered IOs when a device is
    taken offline (due to hardware errors etc), ext4_end_bio prints
    out a message for each failed logical block. While this is desirable,
    we see thousands of such lines being printed out before the
    serial console gets overwhelmed, causing ext4_end_bio() wait for
    the printk to complete.
    
    This in itself isn't a disaster, except for the detail that this
    function is being called with the queue lock held.
    This causes any other function in the block layer
    to spin on its spin_lock_irqsave while the serial console is
    draining. If NMI watchdog is enabled on this machine then it
    eventually comes along and shoots the machine in the head.
    
    The end result is that losing any one disk causes the machine to
    go down. This patch rate limits the printk to bandaid around the
    problem.
    
    Tested: xfstests
    Change-Id: I8ab5690dcf4f3a67e78be147d45e489fdf4a88d8
    Signed-off-by: default avatarAnatol Pomozov <anatol.pomozov@gmail.com>
    Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
    e8974c39
page-io.c 13.3 KB