    drbd: fix potential access of on-stack wait_queue_head_t after return · 725a97e4
    Lars Ellenberg authored
    I ran into something declaring itself a "spinlock lockup":
     BUG: spinlock lockup on CPU#1, kjournald/27816, ffff88000ad6bca0
     Pid: 27816, comm: kjournald Tainted: G        W 2.6.34.6 #2
     Call Trace:
      <IRQ>  [<ffffffff811ba0aa>] do_raw_spin_lock+0x11e/0x14d
      [<ffffffff81340fde>] _raw_spin_lock_irqsave+0x6a/0x81
      [<ffffffff8103b694>] ? __wake_up+0x22/0x50
      [<ffffffff8103b694>] __wake_up+0x22/0x50
      [<ffffffffa07ff661>] bm_async_io_complete+0x258/0x299 [drbd]
    but the call traces do not fit at all:
    all other CPUs are in cpu_idle.
    
    I think it may be this race:
    
    drbd_bm_write_page
     wait_queue_head_t io_wait;
     atomic_t in_flight;
     bm_async_io
      submit_bio
    					bm_async_io_complete
    					  if (atomic_dec_and_test(in_flight))
     wait_event(io_wait,
    	atomic_read(in_flight) == 0)
     return
    					    wake_up(io_wait)
    
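    Since in_flight is already zero, the wait_event condition is true on
    entry and drbd_bm_write_page returns without sleeping. The wake_up
    then takes the spinlock inside the wait_queue_head_t, which is no
    longer valid, since the stack frame of drbd_bm_write_page has already
    been clobbered.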
    
    Fix this by using struct completion, which does both the condition test
    as well as the wake_up inside its spinlock, so this race cannot happen.
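
The pattern behind the fix can be sketched in userspace C with pthreads. This is only an analogy under stated assumptions, not the drbd patch and not the kernel's implementation of struct completion: every demo_* name is hypothetical, a pthread mutex and condition variable stand in for the completion's internal spinlock and wait queue, and a helper thread stands in for the bio completion path. What it shows is that the "done" update and the wake-up happen under one lock, and that the waiter tests "done" under that same lock, so the waiter cannot see the condition as true while the completer is still between setting it and waking. In the kernel the corresponding calls are init_completion(), complete() and wait_for_completion().

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's struct completion. */
struct demo_completion {
	pthread_mutex_t lock;	/* protects 'done' and serializes the wake-up */
	pthread_cond_t cond;
	bool done;
};

static void demo_init_completion(struct demo_completion *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = false;
}

/* Completer side: the state change and the wake-up share one critical section. */
static void demo_complete(struct demo_completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_signal(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

/* Waiter side: the condition is only ever tested while holding the same lock. */
static void demo_wait_for_completion(struct demo_completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->done)
		pthread_cond_wait(&c->cond, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

/* Stand-in for bm_async_io_complete: runs in another context and signals. */
static void *demo_io_complete(void *arg)
{
	demo_complete(arg);
	return NULL;
}

/* Stand-in for drbd_bm_write_page: the completion lives on this stack frame. */
int demo_write_page(void)
{
	struct demo_completion io_done;
	pthread_t io;

	demo_init_completion(&io_done);
	if (pthread_create(&io, NULL, demo_io_complete, &io_done) != 0)
		return -1;

	demo_wait_for_completion(&io_done);

	/* Joining only keeps the demo tidy; a bio completion cannot be joined. */
	pthread_join(io, NULL);
	assert(io_done.done);

	pthread_cond_destroy(&io_done.cond);
	pthread_mutex_destroy(&io_done.lock);
	return 0;
}
```

Calling demo_write_page() repeatedly exercises both orderings: the helper may finish before the waiter takes the lock, or the waiter may sleep first. The original code tested an atomic counter outside the wait queue lock, which is what let the waiter return between the decrement and the wake_up.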
    Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
    Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
drbd_bitmap.c 44.4 KB