1. 09 Mar, 2012 3 commits
    • perf report: Add support for taken branch sampling · b50311dc
      Roberto Agostino Vitillo authored
      This patch adds support for taken branch sampling, i.e., the
      PERF_SAMPLE_BRANCH_STACK feature, to perf report. In other
      words, it displays histograms based on taken branches rather
      than on the addresses of executed instructions.
      
      The new option is called -b and it takes no argument. To
      generate meaningful output, the perf.data must have been
      obtained using perf record -b xxx ... where xxx is a branch
      filter option.
      
      The output shows symbols, modules, sorted by 'who branches
      where' the most often. The percentages reported in the first
      column refer to the total number of branches captured and
      not the usual number of samples.
      
      Here is a quick example, where branchy is a simple test
      program that looks as follows:
      
      void f2(void)
      {}
      void f3(void)
      {}
      void f1(unsigned long n)
      {
        if (n & 1UL)
          f2();
        else
          f3();
      }
      int main(void)
      {
        unsigned long i;
      
        for (i = 0; i < N; i++)
          f1(i);
        return 0;
      }
      
      Here is the output captured on Nehalem when we are
      only interested in user-level function calls.
      
      $ perf record -b any_call,u -e cycles:u branchy
      
      $ perf report -b --sort=symbol
          52.34%  [.] main                   [.] f1
          24.04%  [.] f1                     [.] f3
          23.60%  [.] f1                     [.] f2
           0.01%  [k] _IO_new_file_xsputn    [k] _IO_file_overflow
           0.01%  [k] _IO_vfprintf_internal  [k] _IO_new_file_xsputn
           0.01%  [k] _IO_vfprintf_internal  [k] strchrnul
           0.01%  [k] __printf               [k] _IO_vfprintf_internal
           0.01%  [k] main                   [k] __printf
      
      About half (52%) of the call branches captured are from main()
      -> f1(). The other half (24% + 23%) is split into two roughly
      equal shares between f1() -> f2() and f1() -> f3(). The output
      is as expected given the code.
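
The expected split can be checked with a little arithmetic. The following is a minimal sketch that assumes only the call structure of branchy shown above: each loop iteration makes one main() -> f1() call and one call out of f1(), so N iterations yield 2N call branches. The struct and function names are hypothetical helpers for illustration and are not part of perf.

```c
/* Expected call-branch counts for the branchy test program.
 * Hypothetical helpers for illustration; not part of perf. */
struct call_counts {
	unsigned long main_to_f1;
	unsigned long f1_to_f2;
	unsigned long f1_to_f3;
};

/* Count the calls made by a loop running i = 0 .. n-1. */
struct call_counts branchy_expected_calls(unsigned long n)
{
	struct call_counts c;

	c.main_to_f1 = n;        /* one call per loop iteration */
	c.f1_to_f2 = n / 2;      /* taken for odd i: 1, 3, 5, ... */
	c.f1_to_f3 = n - n / 2;  /* taken for even i: 0, 2, 4, ... */
	return c;
}

/* Share of one branch pair as a percentage of all captured branches,
 * which is the quantity shown in the first column of perf report -b. */
double branch_share(unsigned long count, unsigned long total)
{
	return total ? 100.0 * (double)count / (double)total : 0.0;
}
```

For an even N the ideal shares are 50%, 25% and 25%. The measured 52.34%, 24.04% and 23.60% differ somewhat from these ideal values, which is plausible for sampled data.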
      
      Note that using -b in perf record does not eliminate
      information in the perf.data file. Consequently, a typical
      profile can still be obtained from perf report by simply
      omitting its -b option.
      
      It is possible to sort on branch related columns:
      
         - dso_from, symbol_from
         - dso_to, symbol_to
         - mispredict
      Signed-off-by: Roberto Agostino Vitillo <ravitillo@lbl.gov>
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Cc: peterz@infradead.org
      Cc: acme@redhat.com
      Cc: robert.richter@amd.com
      Cc: ming.m.lin@intel.com
      Cc: andi@firstfloor.org
      Cc: asharma@fb.com
      Cc: vweaver1@eecs.utk.edu
      Cc: khandual@linux.vnet.ibm.com
      Cc: dsahern@gmail.com
      Link: http://lkml.kernel.org/r/1328826068-11713-14-git-send-email-eranian@google.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf record: Add support for sampling taken branch · bdfebd84
      Roberto Agostino Vitillo authored
      This patch adds a new option to enable taken branch stack
      sampling, i.e., leverage the PERF_SAMPLE_BRANCH_STACK feature
      of perf_events.
      
      There is a new option to activate this mode: -b.
      It is possible to pass a set of filters to select the type of
      branches to sample.
      
      The following filters are available:
      
       - any: any type of branch
       - any_call: any function call or system call
       - any_ret: any function return or system call return
       - any_ind: any indirect branch
       - u: only when the branch target is at the user level
       - k: only when the branch target is in the kernel
       - hv: only when the branch target is in the hypervisor
      
      Filters can be combined by passing a comma separated list
      to the option:
      
      $ perf record -b any_call,u -e cycles:u branchy
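
Combining filters this way amounts to OR-ing one bit per name into a mask. The following is a minimal sketch of that idea, not the parser perf actually uses. The filter names are taken from the list above, while the enum, its bit values and parse_branch_filters() are hypothetical and chosen only for illustration. The real PERF_SAMPLE_BRANCH_* constants are defined in the kernel's perf_event header.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative bit values only (assumption); the real flags are the
 * PERF_SAMPLE_BRANCH_* constants from the kernel's perf_event header. */
enum branch_filter {
	BR_USER     = 1u << 0,
	BR_KERNEL   = 1u << 1,
	BR_HV       = 1u << 2,
	BR_ANY      = 1u << 3,
	BR_ANY_CALL = 1u << 4,
	BR_ANY_RET  = 1u << 5,
	BR_ANY_IND  = 1u << 6
};

struct filter_name {
	const char *name;
	unsigned int bit;
};

static const struct filter_name filter_names[] = {
	{ "u", BR_USER },         { "k", BR_KERNEL },
	{ "hv", BR_HV },          { "any", BR_ANY },
	{ "any_call", BR_ANY_CALL }, { "any_ret", BR_ANY_RET },
	{ "any_ind", BR_ANY_IND }
};

/* Parse a comma separated filter list such as "any_call,u".
 * Returns 0 and stores the combined mask on success, or -1 if a
 * name is not recognized (including an empty name between commas). */
int parse_branch_filters(const char *list, unsigned int *mask)
{
	const size_t nr = sizeof(filter_names) / sizeof(filter_names[0]);
	unsigned int m = 0;
	const char *p = list;

	while (*p) {
		size_t len = strcspn(p, ",");  /* length of current token */
		size_t i;
		int found = 0;

		for (i = 0; i < nr; i++) {
			if (strlen(filter_names[i].name) == len &&
			    strncmp(p, filter_names[i].name, len) == 0) {
				m |= filter_names[i].bit;
				found = 1;
				break;
			}
		}
		if (!found)
			return -1;
		p += len;
		if (*p == ',')
			p++;  /* skip the separator */
	}
	*mask = m;
	return 0;
}
```

With this sketch, "any_call,u" yields a mask with the call bit and the user-level bit set, mirroring the command line above.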
      Signed-off-by: Roberto Agostino Vitillo <ravitillo@lbl.gov>
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Cc: peterz@infradead.org
      Cc: acme@redhat.com
      Cc: robert.richter@amd.com
      Cc: ming.m.lin@intel.com
      Cc: andi@firstfloor.org
      Cc: asharma@fb.com
      Cc: vweaver1@eecs.utk.edu
      Cc: khandual@linux.vnet.ibm.com
      Cc: dsahern@gmail.com
      Link: http://lkml.kernel.org/r/1328826068-11713-13-git-send-email-eranian@google.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf tools: Add code to support PERF_SAMPLE_BRANCH_STACK · b5387528
      Roberto Agostino Vitillo authored
      This patch adds:
      
       - ability to parse samples with PERF_SAMPLE_BRANCH_STACK
       - sort on branches (dso_from, symbol_from, dso_to, symbol_to, mispredict)
       - build histograms on branches
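
To illustrate what building a histogram on branches means, here is a minimal sketch under simplifying assumptions. Each sample carries a list of taken branches, and every (from, to) pair is counted, so percentages relate to the total number of branches rather than the number of samples. The types and functions below are hypothetical stand-ins: they key on raw addresses in a fixed-size array, whereas perf resolves addresses to symbols and DSOs and uses its own histogram code.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for one taken-branch record (assumption):
 * source address, target address and a mispredict flag. */
struct branch_entry {
	uint64_t from;
	uint64_t to;
	int mispred;
};

struct branch_bucket {
	uint64_t from;
	uint64_t to;
	uint64_t count;        /* how often this from -> to pair was seen */
	uint64_t mispredicts;  /* how many of those were mispredicted */
};

#define MAX_BUCKETS 64

/* The caller must zero-initialize this structure before first use. */
struct branch_hist {
	struct branch_bucket buckets[MAX_BUCKETS];
	size_t nr_buckets;
	uint64_t total;        /* total number of branches accounted */
};

/* Account one branch. Returns 0 on success, -1 if the table is full. */
int branch_hist_add(struct branch_hist *h, const struct branch_entry *e)
{
	size_t i;

	/* Linear search for an existing bucket with the same pair. */
	for (i = 0; i < h->nr_buckets; i++) {
		if (h->buckets[i].from == e->from && h->buckets[i].to == e->to)
			break;
	}
	if (i == h->nr_buckets) {
		if (h->nr_buckets == MAX_BUCKETS)
			return -1;
		h->buckets[i].from = e->from;
		h->buckets[i].to = e->to;
		h->buckets[i].count = 0;
		h->buckets[i].mispredicts = 0;
		h->nr_buckets++;
	}
	h->buckets[i].count++;
	if (e->mispred)
		h->buckets[i].mispredicts++;
	h->total++;
	return 0;
}

/* Account every entry of one sample's branch stack. */
int branch_hist_add_stack(struct branch_hist *h,
			  const struct branch_entry *entries, size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		if (branch_hist_add(h, &entries[i]) < 0)
			return -1;
	}
	return 0;
}
```

Sorting the buckets by count, or grouping them by the mispredict flag, would then correspond to the branch sort keys listed above.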
      Signed-off-by: Roberto Agostino Vitillo <ravitillo@lbl.gov>
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Cc: peterz@infradead.org
      Cc: acme@redhat.com
      Cc: robert.richter@amd.com
      Cc: ming.m.lin@intel.com
      Cc: andi@firstfloor.org
      Cc: asharma@fb.com
      Cc: vweaver1@eecs.utk.edu
      Cc: khandual@linux.vnet.ibm.com
      Cc: dsahern@gmail.com
      Link: http://lkml.kernel.org/r/1328826068-11713-12-git-send-email-eranian@google.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  2. 05 Mar, 2012 12 commits
  3. 03 Mar, 2012 3 commits
  4. 02 Mar, 2012 2 commits
  5. 29 Feb, 2012 4 commits
  6. 28 Feb, 2012 3 commits
  7. 27 Feb, 2012 11 commits
  8. 26 Feb, 2012 2 commits
    • mod/file2alias: make modpost compile on darwin again · dd2a3aca
      Andreas Bießmann authored
      Commit e49ce141 breaks cross compiling
      the linux kernel on darwin hosts.
      This fix introduces some minimal glue to adapt linker section handling
      for darwin hosts.
      Signed-off-by: Andreas Bießmann <andreas@biessmann.de>
      CC: Rusty Russell <rusty@rustcorp.com.au>
      CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      CC: Jochen Friedrich <jochen@scram.de>
      CC: Samuel Ortiz <sameo@linux.intel.com>
      CC: "K. Y. Srinivasan" <kys@microsoft.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Tested-by: Bernhard Walle <bernhard@bwalle.de>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 203738e5
      Linus Torvalds authored
      1) ICMP sockets leave err uninitialized but we try to return it for the
         unsupported MSG_OOB case, reported by Dave Jones.
      
      2) Add new Zaurus device ID entries, from Dave Jones.
      
      3) Pointer calculation in hso driver memset is wrong, from Dan
         Carpenter.
      
      4) ks8851_probe() checks unsigned value as negative, fix also from Dan
         Carpenter.
      
      5) Fix crashes in atl1c driver due to TX queue handling, from Eric
         Dumazet.  I anticipate some TX side locking fixes coming in the near
         future for this driver as well.
      
      6) The inline directive fix in Bluetooth which was breaking the build
         only with very new versions of GCC, from Johan Hedberg.
      
      7) Fix crashes in the ATM CLIP code due to ARP cleanups this merge
         window, reported by Meelis Roos and fixed by Eric Dumazet.
      
      8) JME driver doesn't flush RX FIFO correctly, from Guo-Fu Tseng.
      
      9) Some ip6_route_output() callers test the return value for NULL, but
         this never happens as the convention is to return a dst entry with
         dst->error set.  Fixes from RonQing Li.
      
      10) Logitech Harmony 900 should be handled by zaurus driver not
         cdc_ether, update white lists and black lists accordingly.  From
         Scott Talbert.
      
      11) Receiving from certain kinds of devices there won't be a MAC header,
         so there is no MAC header to fixup in the IPSEC code, and if we try
         to do it we'll crash.  Fix from Eric Dumazet.
      
      12) Port type array indexing off-by-one in mlx4 driver, fix from Yevgeny
         Petrilin.
      
      13) Fix regression in link-down handling in davinci_emac which causes
         all RX descriptors to be freed up and therefore RX to wedge
         completely, from Christian Riesch.
      
      14) It took two attempts, but ctnetlink soft lockups seem to be
         cured now, from Pablo Neira Ayuso.
      
      15) Endianness bug fix in ENIC driver, from Santosh Nayak.
      
      16) The long ago conversion of the PPP fragmentation code over to
         abstracted SKB list handling wasn't perfect, once we get an
         out of sequence SKB we don't flush the rest of them like we
         should.  From Ben McKeegan.
      
      17) Fix regression of ->ip_summed initialization in sfc driver.
         From Ben Hutchings.
      
      18) Bluetooth timeout mistakenly using msecs instead of jiffies,
         from Andrzej Kaczmarek.
      
      19) Using _sync variant of work cancellation results in deadlocks,
         use the non _sync variants instead.  From Andre Guedes.
      
      20) Bluetooth rfcomm code had reference counting problems leading
         to crashes, fix from Octavian Purdila.
      
      21) The conversion of netem over to classful qdisc handling added
         two bugs to netem_dequeue(), fixes from Eric Dumazet.
      
      22) Missing pci_iounmap() in ATM Solos driver.  Fix from Julia Lawall.
      
      23) b44_pci_exit() should not have __exit tag since it's invoked from
         non-__exit code.  From Nikola Pajkovsky.
      
      24) The conversion of the neighbour hash tables over to RCU added a
         race, fixed here by adding the necessary reread of tbl->nht, fix
         from Michel Machado.
      
      25) When we added VF (virtual function) attributes for network device
         dumps, this potentially bloats up the size of the dump of one
         network device such that the dump size is too large for the buffer
         allocated by properly written netlink applications.
      
         In particular, if you add 255 VFs to a network device, parts of
         GLIBC stop working.
      
         To fix this, we add an attribute that is used to turn on these
         extended portions of the network device dump.  Sophisticated
         applications like 'ip' that want to see this stuff will be changed
         to set the attribute, whereas things like GLIBC that don't care
         about VFs simply will not, and therefore won't be busted by the
         mere presence of VFs on a network device.
      
         Thanks to the tireless work of Greg Rose on this fix.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (53 commits)
        sfc: Fix assignment of ip_summed for pre-allocated skbs
        ppp: fix 'ppp_mp_reconstruct bad seq' errors
        enic: Fix endianness bug.
        gre: fix spelling in comments
        netfilter: ctnetlink: fix soft lockup when netlink adds new entries (v2)
        Revert "netfilter: ctnetlink: fix soft lockup when netlink adds new entries"
        davinci_emac: Do not free all rx dma descriptors during init
        mlx4_core: Fixing array indexes when setting port types
        phy: IC+101G and PHY_HAS_INTERRUPT flag
        netdev/phy/icplus: Correct broken phy_init code
        ipsec: be careful of non existing mac headers
        Move Logitech Harmony 900 from cdc_ether to zaurus
        hso: memsetting wrong data in hso_get_count()
        netfilter: ip6_route_output() never returns NULL.
        ethernet/broadcom: ip6_route_output() never returns NULL.
        ipv6: ip6_route_output() never returns NULL.
        jme: Fix FIFO flush issue
        atm: clip: remove clip_tbl
        ipv4: ping: Fix recvmsg MSG_OOB error handling.
        rtnetlink: Fix problem with buffer allocation
        ...