1. 27 Aug, 2010 1 commit
      writeback: do not lose wakeup events when forking bdi threads · 6628bc74
      Artem Bityutskiy authored
      This patch fixes the following issue:
      
      INFO: task mount.nfs4:1120 blocked for more than 120 seconds.
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      mount.nfs4    D 00000000fffc6a21     0  1120   1119 0x00000000
       ffff880235643948 0000000000000046 ffffffff00000000 ffffffff00000000
       ffff880235643fd8 ffff880235314760 00000000001d44c0 ffff880235643fd8
       00000000001d44c0 00000000001d44c0 00000000001d44c0 00000000001d44c0
      Call Trace:
       [<ffffffff813bc747>] schedule_timeout+0x34/0xf1
       [<ffffffff813bc530>] ? wait_for_common+0x3f/0x130
       [<ffffffff8106b50b>] ? trace_hardirqs_on+0xd/0xf
       [<ffffffff813bc5c3>] wait_for_common+0xd2/0x130
       [<ffffffff8104159c>] ? default_wake_function+0x0/0xf
       [<ffffffff813beaa0>] ? _raw_spin_unlock+0x26/0x2a
       [<ffffffff813bc6bb>] wait_for_completion+0x18/0x1a
       [<ffffffff81101a03>] sync_inodes_sb+0xca/0x1bc
       [<ffffffff811056a6>] __sync_filesystem+0x47/0x7e
       [<ffffffff81105798>] sync_filesystem+0x47/0x4b
       [<ffffffff810e7ffd>] generic_shutdown_super+0x22/0xd2
       [<ffffffff810e80f8>] kill_anon_super+0x11/0x4f
       [<ffffffffa00d06d7>] nfs4_kill_super+0x3f/0x72 [nfs]
       [<ffffffff810e7b68>] deactivate_locked_super+0x21/0x41
       [<ffffffff810e7fd6>] deactivate_super+0x40/0x45
       [<ffffffff810fc66c>] mntput_no_expire+0xb8/0xed
       [<ffffffff810fc73b>] release_mounts+0x9a/0xb0
       [<ffffffff810fc7bb>] put_mnt_ns+0x6a/0x7b
       [<ffffffffa00d0fb2>] nfs_follow_remote_path+0x19a/0x296 [nfs]
       [<ffffffffa00d11ca>] nfs4_try_mount+0x75/0xaf [nfs]
       [<ffffffffa00d1790>] nfs4_get_sb+0x276/0x2ff [nfs]
       [<ffffffff810e7dba>] vfs_kern_mount+0xb8/0x196
       [<ffffffff810e7ef6>] do_kern_mount+0x48/0xe8
       [<ffffffff810fdf68>] do_mount+0x771/0x7e8
       [<ffffffff810fe062>] sys_mount+0x83/0xbd
       [<ffffffff810089c2>] system_call_fastpath+0x16/0x1b
      
      The reason for this hang was a race condition: when the flusher thread is
      forking a bdi thread, we use 'kthread_run()', so we run it _before_ we make it
      visible in 'bdi->wb.task'. The bdi thread runs, does all its work, and goes to
      sleep. 'bdi->wb.task' is still NULL. And this is a dangerous time window.
      
      If at this time someone queues work for this bdi, they do not see the bdi
      thread and wake up the forker thread instead! But the forker has already
      forked this bdi thread and just has not made it visible yet!
      
      The result is that we lose the wakeup event for this bdi thread and the NFS4
      code waits forever.
      
      To fix the problem, we should use 'kthread_create()' for creating bdi threads,
      then make them visible in 'bdi->wb.task', and only after this wake them up.
      This is exactly what this patch does.
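
The ordering described above (create, publish, then wake) can be illustrated with a small user-space analogy. This is only a sketch using POSIX threads, not the kernel code from the patch: 'struct bdi_model', 'queue_work()', 'run_fixed_order()' and the 'may_run' start gate are hypothetical names invented for this example. The start gate stands in for the fact that a thread made by 'kthread_create()' does not run until it is woken up, and 'task_visible' stands in for 'bdi->wb.task' being non-NULL.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model of one bdi and its worker thread. */
struct bdi_model {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool task_visible;   /* models bdi->wb.task != NULL */
    bool may_run;        /* start gate: models the wake-up after creation */
    bool work_pending;   /* models one queued work item */
    int  work_done;      /* work items handled by the worker */
    int  forker_wakeups; /* wakeups that went to the forker instead */
};

static void *bdi_thread(void *arg)
{
    struct bdi_model *b = arg;

    pthread_mutex_lock(&b->lock);
    /* Like a freshly created kthread: stay parked until woken up. */
    while (!b->may_run)
        pthread_cond_wait(&b->cond, &b->lock);
    /* With the fixed ordering we are always visible before we run. */
    assert(b->task_visible);
    /* Sleep until work arrives, handle one item, then exit. */
    while (!b->work_pending)
        pthread_cond_wait(&b->cond, &b->lock);
    b->work_pending = false;
    b->work_done++;
    pthread_mutex_unlock(&b->lock);
    return NULL;
}

/* Queue work: wake the bdi thread if it is visible, else the forker. */
static void queue_work(struct bdi_model *b)
{
    pthread_mutex_lock(&b->lock);
    b->work_pending = true;
    if (b->task_visible)
        pthread_cond_broadcast(&b->cond);
    else
        b->forker_wakeups++;
    pthread_mutex_unlock(&b->lock);
}

/* Fixed ordering. Returns the number of work items handled, or -1. */
static int run_fixed_order(int *forker_wakeups)
{
    struct bdi_model b = { .task_visible = false };
    pthread_t tid;

    pthread_mutex_init(&b.lock, NULL);
    pthread_cond_init(&b.cond, NULL);

    /* 1. Create the thread; it stays parked behind the start gate. */
    if (pthread_create(&tid, NULL, bdi_thread, &b) != 0)
        return -1;

    pthread_mutex_lock(&b.lock);
    b.task_visible = true;           /* 2. Make it visible. */
    b.may_run = true;                /* 3. Only now wake it up. */
    pthread_cond_broadcast(&b.cond);
    pthread_mutex_unlock(&b.lock);

    queue_work(&b);                  /* the waker now finds the thread */
    pthread_join(tid, NULL);

    *forker_wakeups = b.forker_wakeups;
    pthread_cond_destroy(&b.cond);
    pthread_mutex_destroy(&b.lock);
    return b.work_done;
}
```

Calling 'run_fixed_order()' queues one work item after the thread has been published, so the wakeup goes to the worker and not to the forker. The model does not reproduce the lost wakeup itself; it only shows the order of the three steps.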
      Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
  2. 25 Aug, 2010 1 commit
  3. 23 Aug, 2010 18 commits
  4. 22 Aug, 2010 12 commits
  5. 21 Aug, 2010 6 commits
  6. 20 Aug, 2010 2 commits