1. 26 Jun, 2017 3 commits
    • Len Brown's avatar
      intel_pstate: skip scheduler hook when in "performance" mode · 82b4e03e
      Len Brown authored
      When the governor is set to "performance", intel_pstate does not
      need the scheduler hook for doing any calculations.  Under these
      conditions, its only purpose is to continue to maintain
      cpufreq/scaling_cur_freq.
      
      The cpufreq/scaling_cur_freq sysfs attribute is now provided by
      shared x86 cpufreq code on modern x86 systems, including
      all systems supported by the intel_pstate driver.
      
      So in "performance" governor mode, the scheduler hook can be skipped.
      This applies to both in Software and Hardware P-state control modes.
      Suggested-by: default avatarSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      Signed-off-by: default avatarLen Brown <len.brown@intel.com>
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      82b4e03e
    • Len Brown's avatar
      intel_pstate: delete scheduler hook in HWP mode · 62611cb9
      Len Brown authored
      The cpufreq/scaling_cur_freq sysfs attribute is now provided by
      shared x86 cpufreq code on modern x86 systems, including
      all systems supported by the intel_pstate driver.
      
      In HWP mode, maintaining that value was the sole purpose of
      the scheduler hook, intel_pstate_update_util_hwp(),
      so it can now be removed.
      Signed-off-by: default avatarLen Brown <len.brown@intel.com>
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      62611cb9
    • Len Brown's avatar
      x86: use common aperfmperf_khz_on_cpu() to calculate KHz using APERF/MPERF · f8475cef
      Len Brown authored
      The goal of this change is to give users a uniform and meaningful
      result when they read /sys/...cpufreq/scaling_cur_freq
      on modern x86 hardware, as compared to what they get today.
      
      Modern x86 processors include the hardware needed
      to accurately calculate frequency over an interval --
      APERF, MPERF, and the TSC.
      
      Here we provide an x86 routine to make this calculation
      on supported hardware, and use it in preference to any
      driver driver-specific cpufreq_driver.get() routine.
      
      MHz is computed like so:
      
      MHz = base_MHz * delta_APERF / delta_MPERF
      
      MHz is the average frequency of the busy processor
      over a measurement interval.  The interval is
      defined to be the time between successive invocations
      of aperfmperf_khz_on_cpu(), which are expected to to
      happen on-demand when users read sysfs attribute
      cpufreq/scaling_cur_freq.
      
      As with previous methods of calculating MHz,
      idle time is excluded.
      
      base_MHz above is from TSC calibration global "cpu_khz".
      
      This x86 native method to calculate MHz returns a meaningful result
      no matter if P-states are controlled by hardware or firmware
      and/or if the Linux cpufreq sub-system is or is-not installed.
      
      When this routine is invoked more frequently, the measurement
      interval becomes shorter.  However, the code limits re-computation
      to 10ms intervals so that average frequency remains meaningful.
      
      Discerning users are encouraged to take advantage of
      the turbostat(8) utility, which can gracefully handle
      concurrent measurement intervals of arbitrary length.
      Signed-off-by: default avatarLen Brown <len.brown@intel.com>
      Reviewed-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      f8475cef
  2. 23 Jun, 2017 2 commits
    • Srinivas Pandruvada's avatar
      cpufreq: intel_pstate: Remove max/min fractions to limit performance · 1a4fe38a
      Srinivas Pandruvada authored
      In the current model the max/min perf limits are a fraction of current
      user space limits to the allowed max_freq or 100% for global limits.
      This results in wrong ratio limits calculation because of rounding
      issues for some user space limits.
      
      Initially we tried to solve this issue by issue by having more shift
      bits to increase precision. Still there are isolated cases where we still
      have error.
      
      This can be avoided by using ratios all together. Since the way we get
      cpuinfo.max_freq is by multiplying scaling factor to max ratio, we can
      easily keep the max/min ratios in terms of ratios and not fractions.
      
      For example:
      if the max ratio = 36
      cpuinfo.max_freq = 36 * 100000 = 3600000
      
      Suppose user space sets a limit of 1200000, then we can calculate
      max ratio limit as
      = 36 * 1200000 / 3600000
      = 12
      This will be correct for any user limits.
      
      The other advantage is that, we don't need to do any calculation in the
      fast path as ratio limit is already calculated via set_policy() callback.
      Signed-off-by: default avatarSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      1a4fe38a
    • Len Brown's avatar
      x86: do not use cpufreq_quick_get() for /proc/cpuinfo "cpu MHz" · 51204e06
      Len Brown authored
      cpufreq_quick_get() allows cpufreq drivers to over-ride cpu_khz
      that is otherwise reported in x86 /proc/cpuinfo "cpu MHz".
      
      There are four problems with this scheme,
      any of them is sufficient justification to delete it.
      
       1. Depending on which cpufreq driver is loaded, the behavior
          of this field is different.
      
       2. Distros complain that they have to explain to users
          why and how this field changes.  Distros have requested a constant.
      
       3. The two major providers of this information, acpi_cpufreq
          and intel_pstate, both "get it wrong" in different ways.
      
          acpi_cpufreq lies to the user by telling them that
          they are running at whatever frequency was last
          requested by software.
      
          intel_pstate lies to the user by telling them that
          they are running at the average frequency computed
          over an undefined measurement.  But an average computed
          over an undefined interval, is itself, undefined...
      
       4. On modern processors, user space utilities, such as
          turbostat(1), are more accurate and more precise, while
          supporing concurrent measurement over arbitrary intervals.
      
      Users who have been consulting /proc/cpuinfo to
      track changing CPU frequency will be dissapointed that
      it no longer wiggles -- perhaps being unaware of the
      limitations of the information they have been consuming.
      
      Yes, they can change their scripts to look in sysfs
      cpufreq/scaling_cur_frequency.  Here they will find the same
      data of dubious quality here removed from /proc/cpuinfo.
      The value in sysfs will be addressed in a subsequent patch
      to address issues 1-3, above.
      
      Issue 4 will remain -- users that really care about
      accurate frequency information should not be using either
      proc or sysfs kernel interfaces.
      They should be using using turbostat(8), or a similar
      purpose-built analysis tool.
      Signed-off-by: default avatarLen Brown <len.brown@intel.com>
      Reviewed-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      51204e06
  3. 19 Jun, 2017 8 commits
    • Linus Torvalds's avatar
      Linux 4.12-rc6 · 41f1830f
      Linus Torvalds authored
      41f1830f
    • Hugh Dickins's avatar
      mm: larger stack guard gap, between vmas · 1be7107f
      Hugh Dickins authored
      Stack guard page is a useful feature to reduce a risk of stack smashing
      into a different mapping. We have been using a single page gap which
      is sufficient to prevent having stack adjacent to a different mapping.
      But this seems to be insufficient in the light of the stack usage in
      userspace. E.g. glibc uses as large as 64kB alloca() in many commonly
      used functions. Others use constructs liks gid_t buffer[NGROUPS_MAX]
      which is 256kB or stack strings with MAX_ARG_STRLEN.
      
      This will become especially dangerous for suid binaries and the default
      no limit for the stack size limit because those applications can be
      tricked to consume a large portion of the stack and a single glibc call
      could jump over the guard page. These attacks are not theoretical,
      unfortunatelly.
      
      Make those attacks less probable by increasing the stack guard gap
      to 1MB (on systems with 4k pages; but make it depend on the page size
      because systems with larger base pages might cap stack allocations in
      the PAGE_SIZE units) which should cover larger alloca() and VLA stack
      allocations. It is obviously not a full fix because the problem is
      somehow inherent, but it should reduce attack space a lot.
      
      One could argue that the gap size should be configurable from userspace,
      but that can be done later when somebody finds that the new 1MB is wrong
      for some special case applications.  For now, add a kernel command line
      option (stack_guard_gap) to specify the stack gap size (in page units).
      
      Implementation wise, first delete all the old code for stack guard page:
      because although we could get away with accounting one extra page in a
      stack vma, accounting a larger gap can break userspace - case in point,
      a program run with "ulimit -S -v 20000" failed when the 1MB gap was
      counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK
      and strict non-overcommit mode.
      
      Instead of keeping gap inside the stack vma, maintain the stack guard
      gap as a gap between vmas: using vm_start_gap() in place of vm_start
      (or vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few
      places which need to respect the gap - mainly arch_get_unmapped_area(),
      and and the vma tree's subtree_gap support for that.
      Original-patch-by: default avatarOleg Nesterov <oleg@redhat.com>
      Original-patch-by: default avatarMichal Hocko <mhocko@suse.com>
      Signed-off-by: default avatarHugh Dickins <hughd@google.com>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Tested-by: Helge Deller <deller@gmx.de> # parisc
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      1be7107f
    • Linus Torvalds's avatar
      Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · 1132d5e7
      Linus Torvalds authored
      Pull ARM SoC fixes from Olof Johansson:
       "Stream of fixes has slowed down, only a few this week:
      
         - Some DT fixes for Allwinner platforms, and addition of a clock to
           the R_CCU clock controller that had been missed.
      
         - A couple of small DT fixes for am335x-sl50"
      
      * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
        arm64: allwinner: a64: Add PLL_PERIPH0 clock to the R_CCU
        ARM: sunxi: h3-h5: Add PLL_PERIPH0 clock to the R_CCU
        ARM: dts: am335x-sl50: Fix cannot claim requested pins for spi0
        ARM: dts: am335x-sl50: Fix card detect pin for mmc1
        arm64: allwinner: h5: Remove syslink to shared DTSI
        ARM: sunxi: h3/h5: fix the compatible of R_CCU
      1132d5e7
    • Olof Johansson's avatar
      Merge tag 'sunxi-fixes-for-4.12' of... · a1858df9
      Olof Johansson authored
      Merge tag 'sunxi-fixes-for-4.12' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into fixes
      
      Allwinner fixes for 4.12
      
      A few fixes around the PRCM support that got in 4.12 with a wrong
      compatible, and a missing clock in the binding.
      
      * tag 'sunxi-fixes-for-4.12' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux:
        arm64: allwinner: a64: Add PLL_PERIPH0 clock to the R_CCU
        ARM: sunxi: h3-h5: Add PLL_PERIPH0 clock to the R_CCU
        arm64: allwinner: h5: Remove syslink to shared DTSI
        ARM: sunxi: h3/h5: fix the compatible of R_CCU
      Signed-off-by: default avatarOlof Johansson <olof@lixom.net>
      a1858df9
    • Olof Johansson's avatar
      Merge tag 'omap-for-v4.12/fixes-sl50' of... · 51b6e281
      Olof Johansson authored
      Merge tag 'omap-for-v4.12/fixes-sl50' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes
      
      Two fixes for am335x-sl50 to fix a boot time error
      for claiming SPI pins, and to fix a SDIO card detect
      pin for production version of the device.
      
      * tag 'omap-for-v4.12/fixes-sl50' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
        ARM: dts: am335x-sl50: Fix cannot claim requested pins for spi0
        ARM: dts: am335x-sl50: Fix card detect pin for mmc1
      Signed-off-by: default avatarOlof Johansson <olof@lixom.net>
      51b6e281
    • Linus Torvalds's avatar
      Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost · 3696e4f0
      Linus Torvalds authored
      Pull virtio bugfix from Michael Tsirkin:
       "It turns out balloon does not handle IOMMUs correctly. We should fix
        that at some point, for now let's just disable this configuration"
      
      * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
        virtio_balloon: disable VIOMMU support
      3696e4f0
    • Linus Torvalds's avatar
      Merge branch 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 7d62d947
      Linus Torvalds authored
      Pull i2c fixes from Wolfram Sang:
       "Two driver bugfixes"
      
      * 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: ismt: fix wrong device address when unmap the data buffer
        i2c: rcar: use correct length when unmapping DMA
      7d62d947
    • Linus Torvalds's avatar
      Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus · b3ee4edd
      Linus Torvalds authored
      Pull MIPS fixes from Ralf Baechle:
      
       - Three highmem fixes:
          + Fixed mapping initialization
          + Adjust the pkmap location
          + Ensure we use at most one page for PTEs
      
       - Fix makefile dependencies for .its targets to depend on vmlinux
      
       - Fix reversed condition in BNEZC and JIALC software branch emulation
      
       - Only flush initialized flush_insn_slot to avoid NULL pointer
         dereference
      
       - perf: Remove incorrect odd/even counter handling for I6400
      
       - ftrace: Fix init functions tracing
      
      * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
        MIPS: .its targets depend on vmlinux
        MIPS: Fix bnezc/jialc return address calculation
        MIPS: kprobes: flush_insn_slot should flush only if probe initialised
        MIPS: ftrace: fix init functions tracing
        MIPS: mm: adjust PKMAP location
        MIPS: highmem: ensure that we don't use more than one page for PTEs
        MIPS: mm: fixed mappings: correct initialisation
        MIPS: perf: Remove incorrect odd/even counter handling for I6400
      b3ee4edd
  4. 18 Jun, 2017 7 commits
  5. 17 Jun, 2017 7 commits
  6. 16 Jun, 2017 13 commits
    • Linus Torvalds's avatar
      Merge tag 'pci-v4.12-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci · 1439ccf7
      Linus Torvalds authored
      Pull PCI fixes from Bjorn Helgaas:
      
       - fix another PCI_ENDPOINT build error (merged for v4.12)
      
       - fix error codes added to config accessors for v4.12
      
      * tag 'pci-v4.12-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
        PCI: endpoint: Select CRC32 to fix test build error
        PCI: Make error code types consistent in pci_{read,write}_config_*
      1439ccf7
    • Linus Torvalds's avatar
      Merge tag 'fbdev-v4.12-rc6' of git://github.com/bzolnier/linux · 3a448294
      Linus Torvalds authored
      Pull fbdev fixes from Bartlomiej Zolnierkiewicz:
      
       - fix udlfb driver to stop spamming logs (Mike Gerow)
      
       - add missing endianness conversions in smscufx & udlfb drivers (Johan
         Hovold)
      
       - fix few gcc warnings/errors (Arnd Bergmann)
      
      * tag 'fbdev-v4.12-rc6' of git://github.com/bzolnier/linux:
        video: fbdev: udlfb: drop log level for blanking
        video: fbdev: via: remove possibly unused variables
        video: fbdev: add missing USB-descriptor endianness conversions
        video: fbdev: avoid int-in-bool-context warning
      3a448294
    • Linus Torvalds's avatar
      Merge branch 'akpm' (patches from Andrew) · 162f73f4
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "5 fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        mm: correct the comment when reclaimed pages exceed the scanned pages
        userfaultfd: shmem: handle coredumping in handle_userfault()
        mm: numa: avoid waiting on freed migrated pages
        swap: cond_resched in swap_cgroup_prepare()
        mm/memory-failure.c: use compound_head() flags for huge pages
      162f73f4
    • zhongjiang's avatar
      mm: correct the comment when reclaimed pages exceed the scanned pages · d7143e31
      zhongjiang authored
      Commit e1587a49 ("mm: vmpressure: fix sending wrong events on
      underflow") declared that reclaimed pages exceed the scanned pages due
      to the thp reclaim.
      
      That is incorrect because THP will be spilt to normal page and loop
      again, which will result in the scanned pages increment.
      
      [akpm@linux-foundation.org: tweak comment text]
      Link: http://lkml.kernel.org/r/1496824266-25235-1-git-send-email-zhongjiang@huawei.comSigned-off-by: default avatarzhongjiang <zhongjiang@huawei.com>
      Acked-by: default avatarMinchan Kim <minchan@kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      d7143e31
    • Andrea Arcangeli's avatar
      userfaultfd: shmem: handle coredumping in handle_userfault() · 64c2b203
      Andrea Arcangeli authored
      Anon and hugetlbfs handle FOLL_DUMP set by get_dump_page() internally to
      __get_user_pages().
      
      shmem as opposed has no special FOLL_DUMP handling there so
      handle_mm_fault() is invoked without mmap_sem and ends up calling
      handle_userfault() that isn't expecting to be invoked without mmap_sem
      held.
      
      This makes handle_userfault() fail immediately if invoked through
      shmem_vm_ops->fault during coredumping and solves the problem.
      
      The side effect is a BUG_ON with no lock held triggered by the
      coredumping process which exits.  Only 4.11 is affected, pre-4.11 anon
      memory holes are skipped in __get_user_pages by checking FOLL_DUMP
      explicitly against empty pagetables (mm/gup.c:no_page_table()).
      
      It's zero cost as we already had a check for current->flags to prevent
      futex to trigger userfaults during exit (PF_EXITING).
      
      Link: http://lkml.kernel.org/r/20170615214838.27429-1-aarcange@redhat.comSigned-off-by: default avatarAndrea Arcangeli <aarcange@redhat.com>
      Reported-by: default avatar"Dr. David Alan Gilbert" <dgilbert@redhat.com>
      Cc: <stable@vger.kernel.org>	[4.11+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      64c2b203
    • Mark Rutland's avatar
      mm: numa: avoid waiting on freed migrated pages · 3c226c63
      Mark Rutland authored
      In do_huge_pmd_numa_page(), we attempt to handle a migrating thp pmd by
      waiting until the pmd is unlocked before we return and retry.  However,
      we can race with migrate_misplaced_transhuge_page():
      
          // do_huge_pmd_numa_page                // migrate_misplaced_transhuge_page()
          // Holds 0 refs on page                 // Holds 2 refs on page
      
          vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd);
          /* ... */
          if (pmd_trans_migrating(*vmf->pmd)) {
                  page = pmd_page(*vmf->pmd);
                  spin_unlock(vmf->ptl);
                                                  ptl = pmd_lock(mm, pmd);
                                                  if (page_count(page) != 2)) {
                                                          /* roll back */
                                                  }
                                                  /* ... */
                                                  mlock_migrate_page(new_page, page);
                                                  /* ... */
                                                  spin_unlock(ptl);
                                                  put_page(page);
                                                  put_page(page); // page freed here
                  wait_on_page_locked(page);
                  goto out;
          }
      
      This can result in the freed page having its waiters flag set
      unexpectedly, which trips the PAGE_FLAGS_CHECK_AT_PREP checks in the
      page alloc/free functions.  This has been observed on arm64 KVM guests.
      
      We can avoid this by having do_huge_pmd_numa_page() take a reference on
      the page before dropping the pmd lock, mirroring what we do in
      __migration_entry_wait().
      
      When we hit the race, migrate_misplaced_transhuge_page() will see the
      reference and abort the migration, as it may do today in other cases.
      
      Fixes: b8916634 ("mm: Prevent parallel splits during THP migration")
      Link: http://lkml.kernel.org/r/1497349722-6731-2-git-send-email-will.deacon@arm.comSigned-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      Acked-by: default avatarSteve Capper <steve.capper@arm.com>
      Acked-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      3c226c63
    • Yu Zhao's avatar
      swap: cond_resched in swap_cgroup_prepare() · ef707629
      Yu Zhao authored
      I saw need_resched() warnings when swapping on large swapfile (TBs)
      because continuously allocating many pages in swap_cgroup_prepare() took
      too long.
      
      We already cond_resched when freeing page in swap_cgroup_swapoff().  Do
      the same for the page allocation.
      
      Link: http://lkml.kernel.org/r/20170604200109.17606-1-yuzhao@google.comSigned-off-by: default avatarYu Zhao <yuzhao@google.com>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Acked-by: default avatarVladimir Davydov <vdavydov.dev@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      ef707629
    • James Morse's avatar
      mm/memory-failure.c: use compound_head() flags for huge pages · 7258ae5c
      James Morse authored
      memory_failure() chooses a recovery action function based on the page
      flags.  For huge pages it uses the tail page flags which don't have
      anything interesting set, resulting in:
      
      > Memory failure: 0x9be3b4: Unknown page state
      > Memory failure: 0x9be3b4: recovery action for unknown page: Failed
      
      Instead, save a copy of the head page's flags if this is a huge page,
      this means if there are no relevant flags for this tail page, we use the
      head pages flags instead.  This results in the me_huge_page() recovery
      action being called:
      
      > Memory failure: 0x9b7969: recovery action for huge page: Delayed
      
      For hugepages that have not yet been allocated, this allows the hugepage
      to be dequeued.
      
      Fixes: 524fca1e ("HWPOISON: fix misjudgement of page_action() for errors on mlocked pages")
      Link: http://lkml.kernel.org/r/20170524130204.21845-1-james.morse@arm.comSigned-off-by: default avatarJames Morse <james.morse@arm.com>
      Tested-by: default avatarPunit Agrawal <punit.agrawal@arm.com>
      Acked-by: default avatarPunit Agrawal <punit.agrawal@arm.com>
      Acked-by: default avatarNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      7258ae5c
    • Linus Torvalds's avatar
      Merge tag 'powerpc-4.12-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 5ac447d2
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
       "Three small fixes for recently merged code:
      
         - remove a spurious WARN_ON when a PCI device has no of_node, it's
           allowed in some circumstances for there to be no of_node.
      
         - fix the offset for store EOI MMIOs in the XIVE interrupt
           controller.
      
         - fix non-const WARN_ONs which were becoming BUGs due to them losing
           BUGFLAG_WARNING in a recent cleanup patch.
      
        Thanks to: Alexey Kardashevskiy, Alistair Popple, Benjamin
        Herrenschmidt"
      
      * tag 'powerpc-4.12-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/debug: Add missing warn flag to WARN_ON's non-builtin path
        powerpc/xive: Fix offset for store EOI MMIOs
        powerpc/npu-dma: Remove spurious WARN_ON when a PCI device has no of_node
      5ac447d2
    • Ingo Molnar's avatar
      Merge tag 'perf-urgent-for-mingo-4.12-20170616' of... · 531c221d
      Ingo Molnar authored
      Merge tag 'perf-urgent-for-mingo-4.12-20170616' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
      
      Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
      
      - Fix probing of precise_ip level for default cycles event, that
        got broken recently on x86_64 when its arch code started
        considering invalid requesting precise samples when not sampling
        (i.e. when attr.sample_period == 0).
      
        This also fixes another problem in s/390 where the precision
        probing with sample_period == 0 returned precise_ip > 0, that
        then, when setting up the real cycles event (not probing) would
        return EOPNOTSUPP for precise_ip > 0 (as determined previously
        by probing) and sample_period > 0.
      
        These problems resulted in attr_precise not being set to the
        highest precision available on x86.64 when no event was specified,
        i.e. the canonical:
      
      	perf record ./workload
      
        would end up using attr.precise_ip = 0. As a workaround this would
        need to be done:
      
      	perf record -e cycles:P ./workload
      
        And on s/390 it would plain not work, requiring using:
      
              perf record -e cycles ./workload
      
        as a workaround.  (Arnaldo Carvalho de Melo)
      
      - Fix perf build with ARCH=x86_64, when ARCH should be transformed
        into ARCH=x86, just like with the main kernel Makefile and
        tools/objtool's, i.e. use SRCARCH. (Jiada Wang)
      
      - Avoid accessing uninitialized data structures when unwinding with
        elfutils's libdw, making it more closely mimic libunwind's unwinder.
        (Milian Wolff)
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      531c221d
    • Milian Wolff's avatar
      perf unwind: Report module before querying isactivation in dwfl unwind · 9126cbba
      Milian Wolff authored
      The PC returned by dwfl_frame_pc() may map into a not-yet-reported
      module. We have to report it before we continue unwinding. But when we
      query for the isactivation flag in dwfl_frame_pc, libdw will actually do
      one more unwinding step internally which can then break and lead to
      missed frames or broken stacks.
      
      With libunwind we get e.g.:
      
      ~~~~~
        heaptrack_gui  2228 135073.400474:     613969 cycles:
      	          108c8e [unknown] (/usr/lib/libQt5Core.so.5.8.0)
      	          1093bc [unknown] (/usr/lib/libQt5Core.so.5.8.0)
      	          109e7b QLocale::QLocale (/usr/lib/libQt5Core.so.5.8.0)
      	          1470ff [unknown] (/usr/lib/libQt5Core.so.5.8.0)
      	          147f67 QSystemLocale::query (/usr/lib/libQt5Core.so.5.8.0)
      	          109fbf QLocalePrivate::updateSystemPrivate (/usr/lib/libQt5Core.so.5.8.0)
      	          10aa27 QLocale::QLocale (/usr/lib/libQt5Core.so.5.8.0)
      	          1e02c3 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
      	          2113bb [unknown] (/usr/lib/libQt5Core.so.5.8.0)
      	          211505 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
      	          1b5df0 QFileInfo::exists (/usr/lib/libQt5Core.so.5.8.0)
      	           92eb2 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
      	           93423 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
      	           93d2a QLibraryInfo::location (/usr/lib/libQt5Core.so.5.8.0)
      	          2170af [unknown] (/usr/lib/libQt5Core.so.5.8.0)
      	          297c53 QCoreApplicationPrivate::init (/usr/lib/libQt5Core.so.5.8.0)
      	           f7cde QGuiApplicationPrivate::init (/usr/lib/libQt5Gui.so.5.8.0)
      	          1589e8 QApplicationPrivate::init (/usr/lib/libQt5Widgets.so.5.8.0)
      	           78622 main (/home/milian/projects/compiled/other/bin/heaptrack_gui)
      	           20439 __libc_start_main (/usr/lib/libc-2.25.so)
      	           78299 _start (/home/milian/projects/compiled/other/bin/heaptrack_gui)
      
        heaptrack_gui  2228 135073.401156:     569521 cycles:
      	          131633 QString::endsWith (/usr/lib/libQt5Core.so.5.8.0)
      	          1a0701 QDir::cleanPath (/usr/lib/libQt5Core.so.5.8.0)
      	          21b82d [unknown] (/usr/lib/libQt5Core.so.5.8.0)
      	          1b3727 QFileInfo::canonicalFilePath (/usr/lib/libQt5Core.so.5.8.0)
      	          2780c7 QFactoryLoader::update (/usr/lib/libQt5Core.so.5.8.0)
      	          279525 QFactoryLoader::QFactoryLoader (/usr/lib/libQt5Core.so.5.8.0)
      	           e5bd0 QPlatformIntegrationFactory::create (/usr/lib/libQt5Gui.so.5.8.0)
      	           f5a1c QGuiApplicationPrivate::createPlatformIntegration (/usr/lib/libQt5Gui.so.5.8.0)
      	           f650c QGuiApplicationPrivate::createEventDispatcher (/usr/lib/libQt5Gui.so.5.8.0)
      	          298524 QCoreApplicationPrivate::init (/usr/lib/libQt5Core.so.5.8.0)
      	           f7cde QGuiApplicationPrivate::init (/usr/lib/libQt5Gui.so.5.8.0)
      	          1589e8 QApplicationPrivate::init (/usr/lib/libQt5Widgets.so.5.8.0)
      	           78622 main (/home/milian/projects/compiled/other/bin/heaptrack_gui)
      	           20439 __libc_start_main (/usr/lib/libc-2.25.so)
      	           78299 _start (/home/milian/projects/compiled/other/bin/heaptrack_gui)
      ~~~~~
      
      Note the two frames 1589e8 and 78622 in the first sample. These are
      missing when unwinding with libdw. The second sample's breakage is
      more obvious:
      
      ~~~~~
        heaptrack_gui  2228 135073.400474:     613969 cycles:
      	          108c8e [unknown] (/usr/lib/libQt5Core.so.5.8.0)
      	          1093bc [unknown] (/usr/lib/libQt5Core.so.5.8.0)
      	          109e7b QLocale::QLocale (/usr/lib/libQt5Core.so.5.8.0)
      	          1470ff [unknown] (/usr/lib/libQt5Core.so.5.8.0)
      	          147f67 QSystemLocale::query (/usr/lib/libQt5Core.so.5.8.0)
      	          109fbf QLocalePrivate::updateSystemPrivate (/usr/lib/libQt5Core.so.5.8.0)
      	          10aa27 QLocale::QLocale (/usr/lib/libQt5Core.so.5.8.0)
      	          1e02c3 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
      	          2113bb [unknown] (/usr/lib/libQt5Core.so.5.8.0)
      	          211505 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
      	          1b5df0 QFileInfo::exists (/usr/lib/libQt5Core.so.5.8.0)
      	           92eb2 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
      	           93423 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
      	           93d2a QLibraryInfo::location (/usr/lib/libQt5Core.so.5.8.0)
      	          2170af [unknown] (/usr/lib/libQt5Core.so.5.8.0)
      	          297c53 QCoreApplicationPrivate::init (/usr/lib/libQt5Core.so.5.8.0)
      	           f7cde QGuiApplicationPrivate::init (/usr/lib/libQt5Gui.so.5.8.0)
      	           20439 __libc_start_main (/usr/lib/libc-2.25.so)
      	           78299 _start (/home/milian/projects/compiled/other/bin/heaptrack_gui)
      
      heaptrack_gui  2228 135073.401156:     569521 cycles:
      	          131633 QString::endsWith (/usr/lib/libQt5Core.so.5.8.0)
      	          1a0701 QDir::cleanPath (/usr/lib/libQt5Core.so.5.8.0)
      	          21b82d [unknown] (/usr/lib/libQt5Core.so.5.8.0)
      	          1b3727 QFileInfo::canonicalFilePath (/usr/lib/libQt5Core.so.5.8.0)
      	          2780c7 QFactoryLoader::update (/usr/lib/libQt5Core.so.5.8.0)
      	          279525 QFactoryLoader::QFactoryLoader (/usr/lib/libQt5Core.so.5.8.0)
      	           e5bd0 QPlatformIntegrationFactory::create (/usr/lib/libQt5Gui.so.5.8.0)
      	          723dbf [unknown] ([unknown])
      ~~~~~
      
      This patch fixes this issue and the libdw unwinder mimicks the libunwind
      behavior more closely.
      Signed-off-by: default avatarMilian Wolff <milian.wolff@kdab.com>
      Acked-by: default avatarJan Kratochvil <jan.kratochvil@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/20170602143753.16907-2-milian.wolff@kdab.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      9126cbba
    • Linus Torvalds's avatar
      Merge tag 'configfs-for-4.12' of git://git.infradead.org/users/hch/configfs · ab2789b7
      Linus Torvalds authored
      Pull configfs updates from Christoph Hellwig:
       "A fix from Nic for a race seen in production (including a stable tag).
      
        And while I'm sending you this I'm also sneaking in a trivial new
        helper from Bart so that we don't need inter-tree dependencies for the
        next merge window"
      
      * tag 'configfs-for-4.12' of git://git.infradead.org/users/hch/configfs:
        configfs: Introduce config_item_get_unless_zero()
        configfs: Fix race between create_link and configfs_rmdir
      ab2789b7
    • Christoph Hellwig's avatar
      fs: pass on flags in compat_writev · 20223f0f
      Christoph Hellwig authored
      Fixes: 793b80ef ("vfs: pass a flags argument to vfs_readv/vfs_writev")
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      20223f0f