1. 11 Oct, 2003 2 commits
    • [PATCH] Bug in timer_tsc cpufreq callback · 348e9f70
      Venkatesh Pallipadi authored
      There is a bug in the cpufreq callback function in the timer_tsc
      routines that can result in a system deadlock.  The issue: the code
      grabs the write lock on xtime_lock without disabling interrupts, so
      if a timer interrupt arrives while we are in this code, the system
      deadlocks.
      
      This bug only affects kernels that have CONFIG_CPU_FREQ enabled.
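
      A minimal sketch of the deadlock pattern and its repair, assuming
      the 2.6 seqlock-based xtime_lock (the callback context here is
      illustrative, not the literal timer_tsc code):

              /* buggy: a timer interrupt on this CPU also takes
               * xtime_lock for writing and spins on it forever */
              write_seqlock(&xtime_lock);
              /* ... recalibrate time/TSC state ... */
              write_sequnlock(&xtime_lock);

              /* fixed: keep local interrupts off across the section */
              write_seqlock_irq(&xtime_lock);
              /* ... recalibrate time/TSC state ... */
              write_sequnlock_irq(&xtime_lock);
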
    • [PATCH] SMP races in the timer code · 158fb15f
      Ingo Molnar authored
      This fixes two del_timer_sync() races that are still in the timer code. 
      
      The first race was actually triggered in a 2.4 backport of the 2.6 timer
      code.  The second race was never triggered - it is mostly theoretical on
      a standalone kernel.  (It's more likely in any virtualized or otherwise
      preemptable environment.)
      
      Both races happen when self-rearming timers are used.  One mainstream
      example is kernel/itimer.c.  The effect of the races is that
      del_timer_sync() leaves a timer running instead of synchronizing with
      it, causing logic bugs (and crashes) in the affected kernel code.  One
      typical incarnation of the race is a double add_timer().
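
      For reference, a minimal self-rearming timer in the style of
      kernel/itimer.c (the function and names here are illustrative, not
      taken from the patch):

              static void my_timer_fn(unsigned long data)
              {
                      struct timer_list *t = (struct timer_list *)data;

                      /* ... periodic work ... */
                      t->expires = jiffies + HZ;
                      add_timer(t);   /* re-arm: this add_timer() is what
                                       * races with del_timer_sync() on
                                       * another CPU */
              }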
      
      race #1:
      
      this code in __run_timers() is running on CPU0:
      
                        list_del(&timer->entry);
                        timer->base = NULL;
                        [*]
                        set_running_timer(base, timer);
                        spin_unlock_irq(&base->lock);
                        [**]
                        fn(data);
                        spin_lock_irq(&base->lock);
      
      CPU0 gets stuck at the [*] code-point briefly - after the timer->base has
      been set to NULL, but before the base->running_timer pointer has been set
      up. This is a fundamentally volatile scenario, as there's _zero_ knowledge
      in the data structures that this timer is about to be executed!
      
      Now CPU1 comes along and calls del_timer_sync().  It will find nothing -
      neither timer->base nor base->running_timer will cause it to synchronize.
      It will return and report that the timer has been deleted - shortly
      afterwards CPU0 unstalls and executes the timer fn, which will cause
      crashes.
      
      This particular race is easy to fix by reordering the timer->base
      clearing with set_running_timer() and putting a wmb() between them;
      a sketch of that reordering follows.
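
      A minimal sketch of the reordered __run_timers() fragment (the idea
      of the fix, not necessarily the literal patch):

              list_del(&timer->entry);
              set_running_timer(base, timer);  /* publish "running" first */
              wmb();                           /* order the store before ... */
              timer->base = NULL;              /* ... detaching the timer */
              spin_unlock_irq(&base->lock);
              fn(data);
              spin_lock_irq(&base->lock);

      With this ordering there is no window in which the timer is neither
      pending nor visibly running.  But there are more races: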
      
      race #2
      
      The del_timer_sync() check for a 'pending or running timer' is
      fundamentally fragile.  E.g., if CPU0 gets stuck at the [***] point
      below:
      
                base = &per_cpu(tvec_bases, i);
                if (base->running_timer == timer) {
                        while (base->running_timer == timer) {
                                cpu_relax();
                                preempt_check_resched();
                        }
                        [***]
                        break;
                }
        }
        smp_rmb();
        if (timer_pending(timer))
                goto del_again;
      
      
      then del_timer_sync() has already decided that this timer is not running
      (we just finished loop-waiting for it), but we have not done the
      timer_pending() check yet.
      
      If the timer has re-armed itself, and if the timer expires on CPU1
      (this needs a long delay on CPU0, but that's not hard to achieve,
      e.g. in UML or with kernel preemption enabled), then CPU1 could start
      to expire the timer and reach the [**] point in __run_timers() (see
      above).  Then CPU1 gets stalled and CPU0 is unstalled; the
      timer_pending() check in del_timer_sync() will not notice the running
      timer, and del_timer_sync() returns - while CPU1 is just about to run
      the timer!
      
      Fixing this second race is hard - it involves a heavy race-check
      operation that has to lock all bases and re-check the
      base->running_timer value and the timer_pending() condition
      atomically; a sketch follows.
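
      A sketch of such an atomic re-check, assuming per-CPU bases as in
      the fragment quoted above (illustrative, not the literal patch):

              unsigned long flags;
              int i, busy = 0;

              local_irq_save(flags);
              /* with every base lock held, neither ->running_timer nor
               * the pending state can change under us */
              for (i = 0; i < NR_CPUS; i++)
                      spin_lock(&per_cpu(tvec_bases, i).lock);

              for (i = 0; i < NR_CPUS; i++)
                      if (per_cpu(tvec_bases, i).running_timer == timer)
                              busy = 1;        /* still executing somewhere */
              if (timer_pending(timer))
                      busy = 1;                /* re-armed in the meantime */

              for (i = 0; i < NR_CPUS; i++)
                      spin_unlock(&per_cpu(tvec_bases, i).lock);
              local_irq_restore(flags);

              if (busy)
                      goto del_again;          /* redo the full wait/check */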
      
      This fix also cures the first race, by forcing del_timer_sync() to
      always observe the timer state atomically, so the [*] code point will
      always synchronize with del_timer_sync().
      
      The patch is ugly but safe, and it has fixed the crashes in the 2.4
      backport.  I tested the patch on 2.6.0-test7 with some heavy itimer
      use and it works fine.  Removing self-arming timers safely is the
      sole purpose of del_timer_sync(), so I think there is no way around
      this overhead.  I believe we should ultimately fix all major
      del_timer_sync() users to not use self-arming timers - having
      del_timer_sync() in the thread-exit path is now a considerable source
      of SMP overhead.  But that is out of the scope of the current 2.6
      fixes, of course, and we have to support self-arming timers as well.