1. 04 Nov, 2014 8 commits
    • sched/deadline: Implement cancel_dl_timer() to use in switched_from_dl() · 67dfa1b7
      Kirill Tkhai authored
      The currently used hrtimer_try_to_cancel() is racy:
      
      raw_spin_lock(&rq->lock)
      ...                            dl_task_timer                 raw_spin_lock(&rq->lock)
      ...                               raw_spin_lock(&rq->lock)   ...
         switched_from_dl()             ...                        ...
            hrtimer_try_to_cancel()     ...                        ...
         switched_to_fair()             ...                        ...
      ...                               ...                        ...
      ...                               ...                        ...
      raw_spin_unlock(&rq->lock)        ...                        (acquired)
      ...                               ...                        ...
      ...                               ...                        ...
      do_exit()                         ...                        ...
         schedule()                     ...                        ...
            raw_spin_lock(&rq->lock)    ...                        raw_spin_unlock(&rq->lock)
            ...                         ...                        ...
            raw_spin_unlock(&rq->lock)  ...                        raw_spin_lock(&rq->lock)
            ...                         ...                        (acquired)
            put_task_struct()           ...                        ...
                free_task_struct()      ...                        ...
            ...                         ...                        raw_spin_unlock(&rq->lock)
      ...                               (acquired)                 ...
      ...                               ...                        ...
      ...                               (use after free)           ...
      
      So, let's implement a 100% guaranteed way to cancel the timer, so that
      we are safe even in very unlikely situations.
      
      rq unlocking does not limit the area of switched_from_dl() use, because
      this has already been possible in pull_dl_task() below.
      
      Let's consider the safety of this unlocking. The new code in the patch
      runs only when hrtimer_try_to_cancel() fails. This means the callback
      is running. In this case hrtimer_cancel() just waits until the
      callback has finished. Two cases are possible:
      
      1) Since we are in switched_from_dl(), the new class is not dl_sched_class
      and the new prio is not less than MAX_DL_PRIO. So, the callback returns
      early, right after the !dl_task() check. After that hrtimer_cancel()
      returns too.
      
      The above is:
      
      raw_spin_lock(rq->lock);                  ...
      ...                                       dl_task_timer()
      ...                                          raw_spin_lock(rq->lock);
         switched_from_dl()                        ...
             hrtimer_try_to_cancel()               ...
                raw_spin_unlock(rq->lock);         ...
                hrtimer_cancel()                   ...
                ...                                raw_spin_unlock(rq->lock);
                ...                                return HRTIMER_NORESTART;
                ...                             ...
                raw_spin_lock(rq->lock);        ...
      
      2) But the below is also possible:
                                         dl_task_timer()
                                            raw_spin_lock(rq->lock);
                                            ...
                                            raw_spin_unlock(rq->lock);
      raw_spin_lock(rq->lock);              ...
         switched_from_dl()                 ...
             hrtimer_try_to_cancel()        ...
             ...                            return HRTIMER_NORESTART;
             raw_spin_unlock(rq->lock);  ...
             hrtimer_cancel();           ...
             raw_spin_lock(rq->lock);    ...
      
      In this case hrtimer_cancel() returns immediately. This is a very
      unlikely case, mentioned only for completeness.
      
      Nobody can manipulate the task, because check_class_changed() is
      always called with pi_lock locked. Nobody can force the task to
      participate in (concurrent) priority inheritance schemes (the same reason).
      
      All concurrent task operations require pi_lock, which is held by us.
      No deadlocks with dl_task_timer() are possible, because it returns
      right after !dl_task() check (it does nothing).
      
      If we receive a new dl task while the rq is unlocked, we simply
      don't have to do pull_dl_task() later in switched_from_dl().
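      The cancel sequence described above (try to cancel, and if the callback
      is already running, drop the rq lock, wait for it, then retake the lock)
      can be sketched as a small single-threaded userspace model. This is only
      an illustration under stated assumptions: every mock_* name is
      hypothetical and is not a kernel API, and the "callback" is simulated by
      running it inside the blocking cancel rather than on another CPU.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model types; they only mimic the roles of rq->lock and the dl timer. */
struct mock_rq {
    bool locked;              /* stands in for rq->lock being held */
};

struct mock_timer {
    bool callback_running;    /* callback has fired and is waiting for the rq lock */
    bool task_is_dl;          /* what a dl_task() check would report for the task */
    int  replenished;         /* counts callbacks that did real work */
};

static void mock_lock(struct mock_rq *rq)   { rq->locked = true; }
static void mock_unlock(struct mock_rq *rq) { rq->locked = false; }

/*
 * Models the non-blocking cancel: -1 when the callback is currently running.
 * The model has no "queued but not yet fired" state, so it returns 0 otherwise.
 */
static int mock_try_to_cancel(const struct mock_timer *t)
{
    return t->callback_running ? -1 : 0;
}

/* Models the callback body. Returns false if it could never get the rq lock. */
static bool mock_run_callback(struct mock_rq *rq, struct mock_timer *t)
{
    if (rq->locked)
        return false;         /* a real callback would spin here forever */
    mock_lock(rq);
    if (t->task_is_dl)        /* early return when the task left the dl class */
        t->replenished++;
    mock_unlock(rq);
    t->callback_running = false;
    return true;
}

/* Models the blocking cancel: it can only return once the callback has finished. */
static bool mock_cancel_blocking(struct mock_rq *rq, struct mock_timer *t)
{
    if (!t->callback_running)
        return true;          /* case 2: callback already done, returns at once */
    return mock_run_callback(rq, t);
}

/* The pattern from the commit message; the caller must hold the rq lock. */
static bool mock_cancel_dl_timer(struct mock_rq *rq, struct mock_timer *t)
{
    bool ok = true;

    assert(rq->locked);
    if (mock_try_to_cancel(t) == -1) {
        mock_unlock(rq);                  /* let the callback take the lock */
        ok = mock_cancel_blocking(rq, t); /* wait until it has finished */
        mock_lock(rq);                    /* restore the caller's locking state */
    }
    return ok;
}
```

      In this model, calling mock_cancel_blocking() while the rq is still
      marked locked reports failure, which mirrors why the blocking cancel
      cannot be called with the lock held.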
      Signed-off-by: Kirill Tkhai <ktkhai@parallels.com>
      [ Added comments]
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Juri Lelli <juri.lelli@arm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1414420852.19914.186.camel@tkhai
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • sched: Use WARN_ONCE for the might_sleep() TASK_RUNNING test · e7097e8b
      Peter Zijlstra authored
      In some cases this test can trigger a true flood of output.
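      The effect of a once-only warning can be sketched in userspace as
      follows. MOCK_WARN_ONCE and mock_check_states are hypothetical names for
      illustration; the kernel's WARN_ONCE() differs in detail (for example, it
      also prints a backtrace and evaluates to the condition).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for a once-only warning: each call site gets its own
 * static flag, so the message is printed at most once per site no matter how
 * often the condition is true.
 */
#define MOCK_WARN_ONCE(cond, counter, msg)                \
    do {                                                  \
        static bool warned_;                              \
        if ((cond) && !warned_) {                         \
            warned_ = true;                               \
            (*(counter))++;                               \
            fprintf(stderr, "WARNING: %s\n", (msg));      \
        }                                                 \
    } while (0)

/*
 * Walks a list of task states (0 plays the role of TASK_RUNNING) and returns
 * how many warnings were actually emitted during this call.
 */
static int mock_check_states(const int *states, int n)
{
    int reports = 0;

    assert(states != NULL || n == 0);
    for (int i = 0; i < n; i++)
        MOCK_WARN_ONCE(states[i] != 0, &reports,
                       "blocking call with state != TASK_RUNNING");
    return reports;
}
```

      With three offending entries in the input, the first call reports a
      single warning and any later call reports none, because the flag lives
      for the whole program run.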
      Requested-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • netdev, sched/wait: Fix sleeping inside wait event · ff960a73
      Peter Zijlstra authored
      rtnl_lock_unregistering*() take rtnl_lock() -- a mutex -- inside a
      wait loop. The wait loop relies on current->state to function, but so
      does mutex_lock(); nesting them makes the inner one destroy the
      outer one's state.
      
      Fix this using the new wait_woken() bits.
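      The state clobbering described above can be sketched with a toy model of
      the task state. All mock_* names are hypothetical; the model only tracks
      one state word and a "woken" flag stored in the wait entry, which is the
      idea behind wait_woken() as this message describes it.

```c
#include <assert.h>
#include <stdbool.h>

enum mock_state { MOCK_RUNNING, MOCK_SLEEPING };

struct mock_task { enum mock_state state; };   /* plays the role of current->state */
struct mock_wait { bool woken; };              /* per-wait-entry wakeup flag */

/* A sleeping lock may block internally, so the task is always RUNNING on return. */
static void mock_mutex_lock(struct mock_task *t)
{
    t->state = MOCK_RUNNING;
}

/* The scheduler only puts the task to sleep if its state is not RUNNING. */
static bool mock_schedule(struct mock_task *t)
{
    bool slept = (t->state != MOCK_RUNNING);

    t->state = MOCK_RUNNING;
    return slept;
}

/* Broken shape: prepare to sleep, take a mutex to check the condition, schedule. */
static bool mock_broken_wait(struct mock_task *t)
{
    t->state = MOCK_SLEEPING;   /* outer wait loop announces the sleep */
    mock_mutex_lock(t);         /* inner primitive overwrites the state */
    return mock_schedule(t);    /* does not sleep: the loop busy-spins */
}

/* Flag-based wait: the decision to sleep depends on the wait entry, not on state set earlier. */
static bool mock_wait_woken(struct mock_task *t, struct mock_wait *w)
{
    bool slept = false;

    assert(t->state == MOCK_RUNNING);
    t->state = MOCK_SLEEPING;
    if (!w->woken)              /* a wakeup during the condition check is not lost */
        slept = mock_schedule(t);
    t->state = MOCK_RUNNING;
    w->woken = false;
    return slept;
}

/* Fixed shape: check the condition under the mutex first, then wait on the flag. */
static bool mock_fixed_wait(struct mock_task *t, struct mock_wait *w)
{
    mock_mutex_lock(t);         /* condition check runs with the task RUNNING */
    return mock_wait_woken(t, w);
}
```

      In the model, the broken variant never sleeps, while the fixed variant
      sleeps when no wakeup is pending and skips the sleep when one arrived
      during the condition check.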
      Reported-by: Fengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: David S. Miller <davem@davemloft.net>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Cong Wang <cwang@twopensource.com>
      Cc: David Gibson <david@gibson.dropbear.id.au>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jerry Chu <hkchu@google.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Cc: sfeldma@cumulusnetworks.com <sfeldma@cumulusnetworks.com>
      Cc: stephen hemminger <stephen@networkplumber.org>
      Cc: Tom Gundersen <teg@jklm.no>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Veaceslav Falico <vfalico@gmail.com>
      Cc: Vlad Yasevich <vyasevic@redhat.com>
      Cc: netdev@vger.kernel.org
      Link: http://lkml.kernel.org/r/20141029173110.GE15602@worktop.programming.kicks-ass.net
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • rfcomm, sched/wait: Fix broken wait construct · eedf7e47
      Peter Zijlstra authored
      rfcomm_run() is a tad broken in that it has a nested wait loop. One
      cannot rely on p->state for the outer wait because the inner wait will
      overwrite it.
      
      Fix this using the new wait_woken() facility.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Peter Hurley <peter@hurleysoftware.com>
      Cc: Alexander Holler <holler@ahsoftware.de>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Gustavo Padovan <gustavo@padovan.org>
      Cc: Joe Perches <joe@perches.com>
      Cc: Johan Hedberg <johan.hedberg@gmail.com>
      Cc: Libor Pechacek <lpechacek@suse.cz>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Marcel Holtmann <marcel@holtmann.org>
      Cc: Seung-Woo Kim <sw0312.kim@samsung.com>
      Cc: Vignesh Raman <Vignesh_Raman@mentor.com>
      Cc: linux-bluetooth@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • audit, sched/wait: Fixup kauditd_thread() wait loop · 6b55fc63
      Peter Zijlstra authored
      The kauditd_thread wait loop is a bit iffy; it has a number of problems:
      
       - calls try_to_freeze() before schedule(); you typically want the
         thread to re-evaluate the sleep condition when unfreezing, also
         freeze_task() issues a wakeup.
      
       - it unconditionally does the {add,remove}_wait_queue(), even when the
         sleep condition is false.
      
      Use wait_event_freezable() that does the right thing.
      Reported-by: Mike Galbraith <umgwanakikbuti@gmail.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: oleg@redhat.com
      Cc: Eric Paris <eparis@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/20141002102251.GA6324@worktop.programming.kicks-ass.net
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • sched/wait: Remove wait_event_freezekillable() · 5d4d5658
      Peter Zijlstra (Intel) authored
      There are no users, so make it go away.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: oleg@redhat.com
      Cc: Rafael Wysocki <rjw@rjwysocki.net>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: linux-pm@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • sched/wait: Reimplement wait_event_freezable() · 36df04bc
      Peter Zijlstra authored
      Provide better implementations of the wait_event_freezable() APIs.
      
      The problem with freezer_do_not_count() is that it hides the thread
      from the freezer, even though this thread might not actually
      freeze/sleep at all.
      
      Cc: oleg@redhat.com
      Cc: Rafael Wysocki <rjw@rjwysocki.net>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: linux-pm@vger.kernel.org
      Link: http://lkml.kernel.org/n/tip-d86fz1jmso9wjxa8jfpinp8o@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • sched/wait: Fix a kthread race with wait_woken() · cb6538e7
      Peter Zijlstra authored
      There is a race between kthread_stop() and the new wait_woken() that
      can result in a lack of progress.
      
      CPU 0                                    | CPU 1
                                               |
      rfcomm_run()                             | kthread_stop()
        ...                                    |
        if (!test_bit(KTHREAD_SHOULD_STOP))    |
                                               |   set_bit(KTHREAD_SHOULD_STOP)
                                               |   wake_up_process()
          wait_woken()                         |   wait_for_completion()
            set_current_state(INTERRUPTIBLE)   |
            if (!WQ_FLAG_WOKEN)                |
              schedule_timeout()               |
                                               |
      
      After which both tasks will wait... forever.
      
      Fix this by having wait_woken() check for kthread_should_stop() but
      only for kthreads (obviously).
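      The interleaving in the diagram and the proposed check can be sketched
      with a toy model. The mock_* names are hypothetical, and the two
      functions only model the decision "would this task go to sleep?"; they
      are not the kernel implementation of wait_woken().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of the state the sleeping decision depends on. */
struct mock_kthread {
    bool is_kthread;    /* the task is a kernel thread */
    bool should_stop;   /* set by the stopping side before it wakes the thread */
    bool woken;         /* wait-entry flag, set only by wakeups through the wait queue */
};

/* Before the fix: only the wait-entry flag is consulted. */
static bool mock_would_sleep_old(const struct mock_kthread *k)
{
    assert(k != NULL);
    return !k->woken;
}

/* After the fix: a kthread that has been asked to stop never goes to sleep. */
static bool mock_would_sleep_fixed(const struct mock_kthread *k)
{
    assert(k != NULL);
    bool stop_requested = k->is_kthread && k->should_stop;

    return !k->woken && !stop_requested;
}
```

      Replaying the diagram: the thread samples should_stop as false, the
      stopper then sets it and wakes the process directly (which does not set
      the wait-entry flag), and only afterwards does the thread reach the
      wait. The old check lets it sleep while the stopper waits for it; the
      fixed check does not.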
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Peter Hurley <peter@hurleysoftware.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  2. 28 Oct, 2014 32 commits