    crash in md-raid1 and md-raid10 due to incorrect list manipulation · a452744b
    Mikulas Patocka authored
    Commit 55ce74d4 (md/raid1: ensure device failure recorded before write
    request returns) causes a crash in the LVM2 testsuite test
    shell/lvchange-raid.sh. For me the crash is 100% reproducible.
    
    The crash happens because the newly added code in raid1d moves the list
    from conf->bio_end_io_list to tmp, tests whether tmp is non-empty, and
    then incorrectly pops the bio from conf->bio_end_io_list (which is empty
    because the list was already moved).
    
    Raid-10 has a similar bug.
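
The list mix-up described above can be reproduced in userspace. The following is only a sketch under stated assumptions, not the kernel code or the actual patch: `struct list_head` is re-implemented minimally here, and `struct item`, `list_move_all`, `buggy_pop_hits_head` and `drain_fixed` are hypothetical names chosen for illustration. It shows that once the entries have been moved to `tmp`, "first entry" of the original list resolves to the list head itself, not to a real element, while popping from `tmp` drains the entries as intended.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal userspace stand-in for the kernel's circular doubly linked list. */
struct list_head { struct list_head *next, *prev; };

static void init_list(struct list_head *h) { h->next = h; h->prev = h; }
static int list_is_empty(const struct list_head *h) { return h->next == h; }

static void list_add_tail_node(struct list_head *n, struct list_head *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

static void list_del_node(struct list_head *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->next = n->prev = NULL;
}

/* Move every entry of src onto dst (dst is overwritten) and leave src empty. */
static void list_move_all(struct list_head *src, struct list_head *dst)
{
    assert(src != dst);
    init_list(dst);
    if (list_is_empty(src))
        return;
    dst->next = src->next;
    dst->prev = src->prev;
    dst->next->prev = dst;
    dst->prev->next = dst;
    init_list(src);
}

/* Hypothetical payload standing in for a queued request. */
struct item { int id; struct list_head node; };

#define ITEM_OF(ptr) \
    ((struct item *)((char *)(ptr) - offsetof(struct item, node)))

/*
 * Buggy pattern: test tmp, but take the "first entry" from the original list.
 * Returns 1 when that first entry is really the (empty) list head itself,
 * i.e. when treating it as an item would read unrelated memory.
 */
int buggy_pop_hits_head(struct list_head *pending)
{
    struct list_head tmp;
    int hit;

    list_move_all(pending, &tmp);
    hit = !list_is_empty(&tmp) && pending->next == pending;
    list_move_all(&tmp, pending);   /* put the entries back for the caller */
    return hit;
}

/* Corrected pattern: test tmp and pop from tmp. Returns the sum of ids seen. */
int drain_fixed(struct list_head *pending)
{
    struct list_head tmp;
    int sum = 0;

    list_move_all(pending, &tmp);
    while (!list_is_empty(&tmp)) {
        struct item *it = ITEM_OF(tmp.next);
        list_del_node(&it->node);
        sum += it->id;
    }
    return sum;
}
```

To try it, queue a couple of `struct item` objects on an initialised list with `list_add_tail_node` and call both functions: with a non-empty list, `buggy_pop_hits_head` reports that the buggy lookup lands on the list head, and `drain_fixed` visits every queued item and leaves the list empty.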
    
    Kernel Fault: Code=15 regs=000000006ccb8640 (Addr=0000000100000000)
    CPU: 3 PID: 1930 Comm: mdX_raid1 Not tainted 4.2.0-rc5-bisect+ #35
    task: 000000006cc1f258 ti: 000000006ccb8000 task.ti: 000000006ccb8000
    
         YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
    PSW: 00001000000001001111111000001111 Not tainted
    r00-03  000000ff0804fe0f 000000001059d000 000000001059f818 000000007f16be38
    r04-07  000000001059d000 000000007f16be08 0000000000200200 0000000000000001
    r08-11  000000006ccb8260 000000007b7934d0 0000000000000001 0000000000000000
    r12-15  000000004056f320 0000000000000000 0000000000013dd0 0000000000000000
    r16-19  00000000f0d00ae0 0000000000000000 0000000000000000 0000000000000001
    r20-23  000000000800000f 0000000042200390 0000000000000000 0000000000000000
    r24-27  0000000000000001 000000000800000f 000000007f16be08 000000001059d000
    r28-31  0000000100000000 000000006ccb8560 000000006ccb8640 0000000000000000
    sr00-03  0000000000249800 0000000000000000 0000000000000000 0000000000249800
    sr04-07  0000000000000000 0000000000000000 0000000000000000 0000000000000000
    
    IASQ: 0000000000000000 0000000000000000 IAOQ: 000000001059f61c 000000001059f620
     IIR: 0f8010c6    ISR: 0000000000000000  IOR: 0000000100000000
     CPU:        3   CR30: 000000006ccb8000 CR31: 0000000000000000
     ORIG_R28: 000000001059d000
     IAOQ[0]: call_bio_endio+0x34/0x1a8 [raid1]
     IAOQ[1]: call_bio_endio+0x38/0x1a8 [raid1]
     RP(r2): raid_end_bio_io+0x88/0x168 [raid1]
    Backtrace:
     [<000000001059f818>] raid_end_bio_io+0x88/0x168 [raid1]
     [<00000000105a4f64>] raid1d+0x144/0x1640 [raid1]
     [<000000004017fd5c>] kthread+0x144/0x160
    Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
    Fixes: 55ce74d4 ("md/raid1: ensure device failure recorded before write request returns.")
    Fixes: 95af587e ("md/raid10: ensure device failure recorded before write request returns.")
    Signed-off-by: NeilBrown <neilb@suse.com>
raid1.c 85.5 KB