1. 18 Feb, 2014 31 commits
  2. 13 Feb, 2014 1 commit
  3. 10 Feb, 2014 4 commits
  4. 09 Feb, 2014 4 commits
    • perf/x86/p4: Block PMIs on init to prevent a stream of unknown NMIs · 90ed5b0f
      Don Zickus authored
      A bunch of unknown NMIs have popped up on a Pentium4 recently when booting
      into a kdump kernel.  This was exposed because the watchdog timer went
      from 60 seconds down to 10 seconds (increasing the ability to reproduce
      this problem).
      
      What is happening is on boot up of the second kernel (the kdump one),
      the previous nmi_watchdogs were enabled on thread 0 and thread 1.  The
      second kernel only initializes one cpu but the perf counter on thread 1
      still counts.
      
      Normally in a kdump scenario, the other cpus are blocking in an NMI loop,
      but more importantly their local apics have the performance counters disabled
      (iow LVTPC is masked).  So any counters that fire are masked and never get
      through to the second kernel.
      
      However, on a P4 the local apic is shared by both threads and thread1's PMI
      (despite being configured to only interrupt thread1) will generate an NMI on
      thread0.  Because thread0 knows nothing about this NMI, it is seen as an
      unknown NMI.
      
      This would be fine on its own: it is a kdump kernel, strange things happen,
      and a single unknown NMI is no big deal.
      
      Unfortunately, the P4 comes with another quirk: the overflow bit has to be
      cleared to prevent a stream of NMIs.  Since thread0 knows nothing about the
      counter, it never clears the bit.  This is the problem.
      
      The kdump kernel cannot execute because of the endless NMIs.
      
      To solve this, I instrumented the p4 perf init code to walk all the counters
      and zero them out (just like a normal reset would).
      
      Now when the counters go off, they do not generate anything and no unknown
      NMIs are seen.
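
The counter walk described above can be sketched in userspace C. This is only a model under stated assumptions: `struct fake_pmu`, `fake_write_config()` and `fake_pmu_init()` are hypothetical names, a plain array stands in for the privileged registers, and the count of 18 is taken from the P4 description elsewhere on this page. The message does not say whether the control registers, the count registers, or both are cleared; the sketch clears one hypothetical control word per counter.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_COUNTERS 18 /* assumption: 18 counters, as described for the P4 */

/* Hypothetical stand-in for the per-counter control registers. */
struct fake_pmu {
    uint64_t config[NUM_COUNTERS];
};

/* Userspace stand-in for a privileged register write. */
static void fake_write_config(struct fake_pmu *pmu, int idx, uint64_t val)
{
    assert(idx >= 0 && idx < NUM_COUNTERS);
    pmu->config[idx] = val;
}

/*
 * Walk every counter, including those the previous kernel left
 * armed for the other thread, and zero its control word.
 */
static void fake_pmu_init(struct fake_pmu *pmu)
{
    for (int i = 0; i < NUM_COUNTERS; i++)
        fake_write_config(pmu, i, 0);
}

/* Number of counters whose control word is still non-zero. */
static int fake_pmu_active(const struct fake_pmu *pmu)
{
    int n = 0;

    for (int i = 0; i < NUM_COUNTERS; i++)
        if (pmu->config[i] != 0)
            n++;
    return n;
}
```

The point of walking all counters, rather than only those owned by the booting cpu, is that the leftover counter belongs to a thread the second kernel never initializes.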
      
      I tested this on a P4 we have in our lab.  After two or three crashes, I could
      normally reproduce the problem.  Now after 10 crashes, everything continues
      to boot correctly.
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20140120154115.GZ25953@redhat.com
      [ Fixed a stylistic detail. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf/x86/p4: Fix counter corruption when using lots of perf groups · 13beacee
      Don Zickus authored
      On a P4 box stressing perf with:
      
         ./perf record -o perf.data ./perf stat -v ./perf bench all
      
      it was noticed that a slew of unknown NMIs would pop out rather quickly.
      
      Painfully debugging this ancient platform led me to notice cross-cpu counter
      corruption.
      
      The P4 machine is special in that it has 18 counters: half are used for cpu0
      and the other half for cpu1 (or all 18 for one cpu if hyperthreading is
      disabled).  But the splitting of the counters has to be actively managed by
      the software.
      
      In this particular bug, one of the cpu0-specific counters was being used by
      cpu1 and caused all sorts of random unknown NMIs.
      
      I am not entirely sure about the corruption path, but what happens is:
      
       o perf schedules a group with p4_pmu_schedule_events()
       o inside p4_pmu_schedule_events(), it notices an hwc pointer is being reused
         but for a different cpu, so it 'swaps' the config bits and returns the
         updated 'assign' array with a _new_ index.
       o perf schedules another group with p4_pmu_schedule_events()
       o inside p4_pmu_schedule_events(), it notices an hwc pointer is being reused
         (the same one as above) but for the _same_ cpu [BUG!!], so it updates the
         'assign' array to use the _old_ (wrong cpu) index because the _new_ index is in
         an earlier part of the 'assign' array (and hasn't been committed yet).
       o perf commits the transaction using the wrong index and corrupts the other cpu
      
      The [BUG!!] is because the 'hwc->config' is updated but not the 'hwc->idx'.  So
      the check for 'p4_should_swap_ts()' is correct the first time around but
      incorrect the second time around (because hwc->config was updated in between).
      
      I think the spirit of perf was to not modify anything until all the
      transactions had a chance to 'test' if they would succeed, and if so, commit
      atomically.  However, P4 breaks this spirit by touching the hwc->config
      element.
      
      So my fix is to continue the un-perf-like breakage by setting hwc->idx to -1
      on swap, which tells follow-up group scheduling to find a new index.
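
The stale-index path and the effect of resetting the index can be modeled in a few lines of userspace C. This is a simplified sketch, not the kernel code: `struct hw_event`, `schedule_event()` and the `reset_idx_on_swap` switch are hypothetical, the used-counter mask is left out, and the assumption that counters 0-8 belong to thread 0 and 9-17 to thread 1 is made only for illustration (the real mapping between counters and threads may differ).

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_COUNTERS 18
#define PER_THREAD   (NUM_COUNTERS / 2)

/* Hypothetical, reduced view of a hardware event. */
struct hw_event {
    int config_thread; /* thread the config bits currently target */
    int idx;           /* last committed counter index, or -1 */
};

/* Assumed split: first half for thread 0, second half for thread 1. */
static bool counter_owned_by(int idx, int thread)
{
    assert(idx >= 0 && idx < NUM_COUNTERS);
    return idx / PER_THREAD == thread;
}

/* First counter of the thread's half (no used-mask in this model). */
static int next_counter(int thread)
{
    return thread * PER_THREAD;
}

/*
 * Pick a counter for 'ev' on 'thread' without committing it to ev->idx,
 * mirroring a schedule step that only fills an 'assign' array.
 */
static int schedule_event(struct hw_event *ev, int thread,
                          bool reset_idx_on_swap)
{
    /* Reuse the old index when the config already targets this thread. */
    if (ev->idx != -1 && ev->config_thread == thread)
        return ev->idx;

    if (ev->config_thread != thread) {
        if (reset_idx_on_swap)
            ev->idx = -1;           /* the fix: forget the old index */
        ev->config_thread = thread; /* config is modified right away */
    }
    return next_counter(thread);
}
```

Without the reset, the second call for the same thread sees matching config bits and hands back the old index, which belongs to the other thread. With the reset, both calls return a counter from the correct half.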
      
      Of course, if the transaction fails, rolling this back will be difficult, but
      that is no different from how the current code works. :-)  And I wasn't sure
      how much effort I should put into cleaning up the code for a platform that is
      almost 10 years old by now.
      
      Hence the lazy fix.
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1391024270-19469-1-git-send-email-dzickus@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/nmi: Push duration printk() to irq context · e90c7853
      Peter Zijlstra authored
      Calling printk() from NMI context is bad (TM), so move it to IRQ
      context.
      
      In doing so we slightly change (probably wreck) the debugfs
      nmi_longest_ns thingy: it no longer updates to reflect the longest
      duration, nor does writing to it reset the count.
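
The deferral pattern can be illustrated with a small userspace model: the NMI path only records the duration and sets a flag, and a later step running in IRQ context formats the message. Everything here is an assumption for illustration: `struct nmi_report`, `nmi_note_duration()`, `irq_flush_report()`, the threshold parameter and the message text are hypothetical, and the message above does not name the kernel mechanism that carries the deferral.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-cpu slot shared by the NMI and IRQ paths. */
struct nmi_report {
    uint64_t duration_ns;
    bool pending;
};

/*
 * NMI side: no printing, only bookkeeping. A report that is still
 * pending is not overwritten (a choice made for this sketch).
 */
static void nmi_note_duration(struct nmi_report *r, uint64_t duration_ns,
                              uint64_t threshold_ns)
{
    assert(r != NULL);
    if (duration_ns <= threshold_ns || r->pending)
        return;
    r->duration_ns = duration_ns;
    r->pending = true;
}

/*
 * IRQ side: format the message where printing is safe.
 * Returns true when a report was written into 'buf'.
 */
static bool irq_flush_report(struct nmi_report *r, char *buf, size_t len)
{
    assert(r != NULL && buf != NULL);
    if (!r->pending)
        return false;
    snprintf(buf, len, "NMI handler took %" PRIu64 " ns", r->duration_ns);
    r->pending = false;
    return true;
}
```

A single slot keeps the NMI side trivial, at the cost of dropping reports that arrive while one is still pending.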
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Link: http://lkml.kernel.org/n/tip-rdw0au56a5ymis1u8p48c12d@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf/x86: Push the duration-logging printk() to IRQ context · 6a02ad66
      Peter Zijlstra authored
      Calling printk() from NMI context is bad (TM), so move it to IRQ
      context.
      
      This also avoids the problem where the printk() time is measured by
      the generic NMI duration goo and triggers a second warning.
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Link: http://lkml.kernel.org/n/tip-75dv35xf6dhhmeb7nq6fua31@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>