1. 29 Oct, 2014 5 commits
    • mm: free compound page with correct order · 5ddacbe9
      Yu Zhao authored
      A compound page should be freed by put_page(), or by free_pages() with
      the correct order.  Not doing so leaks the tail pages.
      
      The compound order can be obtained with compound_order(), or by using
      HPAGE_PMD_ORDER in our case.  Some people would argue the latter is
      faster, but I prefer the former, which is more general.
      
      This bug was observed not just on our servers (the worst case we saw was
      11G leaked on a 48G machine) but also on our workstations running an
      Ubuntu-based distro.
      
        $ cat /proc/vmstat  | grep thp_zero_page_alloc
        thp_zero_page_alloc 55
        thp_zero_page_alloc_failed 0
      
      This means there is (thp_zero_page_alloc - 1) * (2M - 4K) memory leaked.
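
      The estimate above can be worked through as a small user-space sketch.
      The helper name and constants below are illustrative assumptions (a 2M
      huge zero page and a 4K base page, as in the formula), not kernel code.

```c
#include <assert.h>
#include <stdint.h>

#define HUGE_PAGE_BYTES (2ULL * 1024 * 1024) /* 2M huge zero page, per the formula */
#define BASE_PAGE_BYTES (4ULL * 1024)        /* 4K base page, per the formula */

/*
 * Hypothetical helper: estimate leaked bytes from the thp_zero_page_alloc
 * counter using (thp_zero_page_alloc - 1) * (2M - 4K).
 */
uint64_t thp_zero_page_leak_bytes(uint64_t thp_zero_page_alloc)
{
	assert(thp_zero_page_alloc >= 1);
	return (thp_zero_page_alloc - 1) * (HUGE_PAGE_BYTES - BASE_PAGE_BYTES);
}
```

      For the counter value 55 shown above, this gives 54 * 2044K = 110376K,
      or roughly 108M.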
      
      Fixes: 97ae1749 ("thp: implement refcounting for huge zero page")
      Signed-off-by: Yu Zhao <yuzhao@google.com>
      Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Bob Liu <lliubbo@gmail.com>
      Cc: <stable@vger.kernel.org>	[3.8+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • gcov: add ARM64 to GCOV_PROFILE_ALL · f601de20
      Riku Voipio authored
      Following up the arm testing of gcov, turns out gcov on ARM64 works fine
      as well.  Only change needed is adding ARM64 to Kconfig depends.
      
      Tested with qemu and mach-virt
      Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
      Acked-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • fsnotify: next_i is freed during fsnotify_unmount_inodes. · 6424babf
      Jerry Hoemann authored
      During file system stress testing on 3.10 and 3.12 based kernels, the
      umount command occasionally hung in fsnotify_unmount_inodes in the
      section of code:
      
                      spin_lock(&inode->i_lock);
                      if (inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW)) {
                              spin_unlock(&inode->i_lock);
                              continue;
                      }
      
      As this section of code holds the global inode_sb_list_lock, eventually
      the system hangs trying to acquire the lock.
      
      Multiple crash dumps showed:
      
      The inode->i_state == 0x60 and i_count == 0 and i_sb_list would point
      back at itself.  As this is not the value of list upon entry to the
      function, the kernel never exits the loop.
      
      To help narrow down the problem, the call to list_del_init in
      inode_sb_list_del was changed to list_del.  This poisons the pointers in
      the i_sb_list and causes the kernel to panic if it traverses a freed
      inode.
      
      Subsequent stress testing panicked in fsnotify_unmount_inodes at the
      bottom of the list_for_each_entry_safe loop, showing next_i had become
      free.
      
      We believe the root cause of the problem is that next_i is being freed
      during the window of time that the list_for_each_entry_safe loop
      temporarily releases inode_sb_list_lock to call fsnotify and
      fsnotify_inode_delete.
      
      The code in fsnotify_unmount_inodes attempts to prevent the freeing of
      inode and next_i by calling __iget.  However, the code doesn't do the
      __iget call on next_i
      
      	if i_count == 0 or
      	if i_state & (I_FREEING | I_WILL_FREE)
      
      The patch addresses this issue by advancing next_i in the above two cases
      until we either find a next_i which we can __iget or we reach the end of
      the list.  This makes the handling of next_i more closely match the
      handling of the variable "inode."
      
      The time to reproduce the hang is highly variable (from hours to days.) We
      ran the stress test on a 3.10 kernel with the proposed patch for a week
      without failure.
      
      During list_for_each_entry_safe, next_i is becoming free causing
      the loop to never terminate.  Advance next_i in those cases where
      __iget is not done.
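
      The idea of advancing next_i past entries that cannot be pinned can be
      illustrated with a user-space toy.  This is a sketch under assumptions,
      not the actual patch: the struct, the flag values and the toy_* names are
      hypothetical stand-ins for struct inode, I_FREEING/I_WILL_FREE and
      __iget(), and locking is omitted.

```c
#include <stddef.h>

#define TOY_FREEING   0x1u /* hypothetical stand-in for I_FREEING */
#define TOY_WILL_FREE 0x2u /* hypothetical stand-in for I_WILL_FREE */

struct toy_inode {
	unsigned int state;     /* stand-in for i_state */
	int count;              /* stand-in for i_count */
	struct toy_inode *next; /* NULL marks the end of the list */
};

/* A node can be pinned only if it is referenced and not being torn down. */
int toy_can_pin(const struct toy_inode *n)
{
	return n->count > 0 && !(n->state & (TOY_FREEING | TOY_WILL_FREE));
}

/*
 * Starting from the successor of cur, skip nodes that cannot be pinned.
 * Returns the first pinnable successor with its count raised, or NULL
 * when the end of the list is reached.
 */
struct toy_inode *toy_pin_next(struct toy_inode *cur)
{
	struct toy_inode *n = cur->next;

	while (n && !toy_can_pin(n))
		n = n->next;
	if (n)
		n->count++; /* stand-in for __iget() */
	return n;
}
```

      Because the returned node holds an extra reference, it cannot be freed
      while the list lock is dropped, so the walk can safely resume from it.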
      Signed-off-by: Jerry Hoemann <jerry.hoemann@hp.com>
      Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Cc: Ken Helias <kenhelias@firemail.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/compaction.c: avoid premature range skip in isolate_migratepages_range · 6ea41c0c
      Joonsoo Kim authored
      Commit edc2ca61 ("mm, compaction: move pageblock checks up from
      isolate_migratepages_range()") commonizes the isolate_migratepages
      variants and makes them use isolate_migratepages_block().
      
      isolate_migratepages_block() can stop early once enough pages are
      isolated, but there is no code in isolate_migratepages_range() to
      handle this case.  As a result, even if isolate_migratepages_block()
      returns prematurely without checking all pages in the range, it is
      called again on the following pageblock, and the unchecked pages in the
      previous range are skipped.  CMA then fails frequently as a result.
      
      To fix this problem, this patch lets isolate_migratepages_range() know
      when enough pages have been isolated and stops the isolation in that
      case.
      
      Note that isolate_migratepages() has no such problem, because, it always
      stops the isolation after just one call of isolate_migratepages_block().
      Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Michal Nazarewicz <mina86@mina86.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • cgroup/kmemleak: add kmemleak_free() for cgroup deallocations. · 401507d6
      Wang Nan authored
      Commit ff7ee93f ("cgroup/kmemleak: Annotate alloc_page() for cgroup
      allocations") introduces kmemleak_alloc() for alloc_page_cgroup(), but
      the corresponding kmemleak_free() is missing, which causes kmemleak to
      be wrongly disabled after memory offlining.  The log is pasted at the
      end of this commit message.
      
      This patch adds kmemleak_free() to free_page_cgroup().  During page
      offlining, this patch removes the corresponding entries in the kmemleak
      rbtree.
      After that, the freed memory can be allocated again by other subsystems
      without killing kmemleak.
      
        bash # for x in 1 2 3 4; do echo offline > /sys/devices/system/memory/memory$x/state ; sleep 1; done ; dmesg | grep leak
      
        Offlined Pages 32768
        kmemleak: Cannot insert 0xffff880016969000 into the object search tree (overlaps existing)
        CPU: 0 PID: 412 Comm: sleep Not tainted 3.17.0-rc5+ #86
        Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
        Call Trace:
          dump_stack+0x46/0x58
          create_object+0x266/0x2c0
          kmemleak_alloc+0x26/0x50
          kmem_cache_alloc+0xd3/0x160
          __sigqueue_alloc+0x49/0xd0
          __send_signal+0xcb/0x410
          send_signal+0x45/0x90
          __group_send_sig_info+0x13/0x20
          do_notify_parent+0x1bb/0x260
          do_exit+0x767/0xa40
          do_group_exit+0x44/0xa0
          SyS_exit_group+0x17/0x20
          system_call_fastpath+0x16/0x1b
      
        kmemleak: Kernel memory leak detector disabled
        kmemleak: Object 0xffff880016900000 (size 524288):
        kmemleak:   comm "swapper/0", pid 0, jiffies 4294667296
        kmemleak:   min_count = 0
        kmemleak:   count = 0
        kmemleak:   flags = 0x1
        kmemleak:   checksum = 0
        kmemleak:   backtrace:
              log_early+0x63/0x77
              kmemleak_alloc+0x4b/0x50
              init_section_page_cgroup+0x7f/0xf5
              page_cgroup_init+0xc5/0xd0
              start_kernel+0x333/0x408
              x86_64_start_reservations+0x2a/0x2c
              x86_64_start_kernel+0xf5/0xfc
      
      Fixes: ff7ee93f (cgroup/kmemleak: Annotate alloc_page() for cgroup allocations)
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: <stable@vger.kernel.org>	[3.2+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 28 Oct, 2014 4 commits
    • Merge branch 'for-3.18' of git://linux-nfs.org/~bfields/linux · 9f76628d
      Linus Torvalds authored
      Pull two nfsd fixes from Bruce Fields:
       "One regression from the 3.16 xdr rewrite, one an older bug exposed by
        a separate bug in the client's new SEEK code"
      
      * 'for-3.18' of git://linux-nfs.org/~bfields/linux:
        nfsd4: fix crash on unknown operation number
        nfsd4: fix response size estimation for OP_SEQUENCE
    • Merge tag 'trace-fixes-v3.18-rc1' of... · 6234056e
      Linus Torvalds authored
      Merge tag 'trace-fixes-v3.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
      
      Pull ftrace trampoline accounting fixes from Steven Rostedt:
       "Adding the new code for 3.19, I discovered a couple of minor bugs with
        the accounting of the ftrace_ops trampoline logic.
      
        One was that the old hash was not updated before calling the modify
        code for an ftrace_ops.  The second bug was what let the first bug go
        unnoticed, as the update would check the current hash for all
        ftrace_ops (where it should only check the old hash for modified
        ones).  This let things work when only one ftrace_ops was registered
        to a function, but could break if more than one was registered
        depending on the order of the look ups.
      
        The worst thing that can happen if this bug triggers is that the
        ftrace self checks would find an anomaly and shut itself down"
      
      * tag 'trace-fixes-v3.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        ftrace: Fix checking of trampoline ftrace_ops in finding trampoline
        ftrace: Set ops->old_hash on modifying what an ops hooks to
    • Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm · 6e2028aa
      Linus Torvalds authored
      Pull ARM fixes from Russell King:
       "A couple of ARM fixes.
      
        We fix some printk formats for ptrdiff_t quantities which cause GCC
        4.9 to complain, and we also blacklist known buggy GCC 4.8.x compilers
        as their miscompilation is serious enough to cause filesystem
        corruption, even though many distros have fixed their versions"
      
      * 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
        ARM: fix some printk formats
        ARM: Blacklist GCC 4.8.0 to GCC 4.8.2 - PR58854
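
      As background for the printk format fix mentioned above: C99 defines the
      %td conversion for ptrdiff_t, and printk accepts it as well.  The
      user-space sketch below is only an illustration; the helper name is
      hypothetical and it does not reproduce the actual ARM changes.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: format the distance between two pointers into the
 * same buffer.  %td is the C99 conversion for ptrdiff_t; passing such a
 * value to %d or %ld is the kind of mismatch compilers warn about.
 */
int toy_format_span(char *out, size_t out_len, const char *start, const char *end)
{
	ptrdiff_t span = end - start;

	return snprintf(out, out_len, "%td", span);
}
```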
    • zap_pte_range: update addr when forcing flush after TLB batching faiure · ce9ec37b
      Will Deacon authored
      When unmapping a range of pages in zap_pte_range, the page being
      unmapped is added to an mmu_gather_batch structure for asynchronous
      freeing. If we run out of space in the batch structure before the range
      has been completely unmapped, then we break out of the loop, force a
      TLB flush and free the pages that we have batched so far. If there are
      further pages to unmap, then we resume the loop where we left off.
      
      Unfortunately, we forget to update addr when we break out of the loop,
      which causes us to truncate the range being invalidated as the end
      address is exclusive. When we re-enter the loop at the same address, the
      page has already been freed and the pte_present test will fail, meaning
      that we do not reconsider the address for invalidation.
      
      This patch fixes the problem by incrementing addr by the PAGE_SIZE
      before breaking out of the loop on batch failure.
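
      The effect of the exclusive end address can be seen in a user-space toy
      model of the loop.  This is a sketch under assumptions, not the kernel
      code: the function name, the page size constant and the batch_slots
      parameter are hypothetical, and no real unmapping or flushing happens.

```c
#define TOY_PAGE_SIZE 4096UL /* assumed page size for the illustration */

/*
 * Walk [addr, end) one page at a time, "batching" each page.  When the
 * batch fills, stop and return the exclusive end address of the range
 * that would be flushed.  With advance_on_full == 0 this mimics the bug:
 * the page just handled at addr falls outside [start, returned address).
 */
unsigned long toy_zap_until_batch_full(unsigned long addr, unsigned long end,
				       unsigned int batch_slots,
				       int advance_on_full)
{
	unsigned int used = 0;

	for (; addr != end; addr += TOY_PAGE_SIZE) {
		/* The page at addr is unmapped here and queued for freeing. */
		used++;
		if (used == batch_slots) { /* batch full: a flush is forced */
			if (advance_on_full)
				addr += TOY_PAGE_SIZE;
			break;
		}
	}
	return addr;
}
```

      With three batch slots and a start address of 0, the buggy variant
      returns 2 * TOY_PAGE_SIZE, so the third page is unmapped but not covered
      by the flush; the fixed variant returns 3 * TOY_PAGE_SIZE.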
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. 27 Oct, 2014 7 commits
  4. 26 Oct, 2014 4 commits
    • Linux 3.18-rc2 · cac7f242
      Linus Torvalds authored
    • Merge tag 'armsoc-for-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · 88e23761
      Linus Torvalds authored
      Pull ARM SoC fixes from Olof Johansson:
       "Another week, another small batch of fixes.
      
        Most of these make zynq, socfpga and sunxi platforms work a bit
        better:
      
         - due to new requirements for regulators, DWMMC on socfpga broke past
           v3.17
         - SMP spinup fix for socfpga
         - a few DT fixes for zynq
         - another option (FIXED_REGULATOR) for sunxi is needed that used to
           be selected by other options but no longer is.
         - a couple of small DT fixes for at91
         - ...and a couple for i.MX"
      
      * tag 'armsoc-for-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
        ARM: dts: imx28-evk: Let i2c0 run at 100kHz
        ARM: i.MX6: Fix "emi" clock name typo
        ARM: multi_v7_defconfig: enable CONFIG_MMC_DW_ROCKCHIP
        ARM: sunxi_defconfig: enable CONFIG_REGULATOR_FIXED_VOLTAGE
        ARM: dts: socfpga: Add a 3.3V fixed regulator node
        ARM: dts: socfpga: Fix SD card detect
        ARM: dts: socfpga: rename gpio nodes
        ARM: at91/dt: sam9263: fix PLLB frequencies
        power: reset: at91-reset: fix power down register
        MAINTAINERS: add atmel ssc driver maintainer entry
        arm: socfpga: fix fetching cpu1start_addr for SMP
        ARM: zynq: DT: trivial: Fix mc node
        ARM: zynq: DT: Add cadence watchdog node
        ARM: zynq: DT: Add missing reference for memory-controller
        ARM: zynq: DT: Add missing reference for ADC
        ARM: zynq: DT: Add missing address for L2 pl310
        ARM: zynq: DT: Remove 222 MHz OPP
        ARM: zynq: DT: Fix GEM register area size
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · d1e14f1d
      Linus Torvalds authored
      Pull vfs updates from Al Viro:
       "overlayfs merge + leak fix for d_splice_alias() failure exits"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        overlayfs: embed middle into overlay_readdir_data
        overlayfs: embed root into overlay_readdir_data
        overlayfs: make ovl_cache_entry->name an array instead of pointer
        overlayfs: don't hold ->i_mutex over opening the real directory
        fix inode leaks on d_splice_alias() failure exits
        fs: limit filesystem stacking depth
        overlay: overlay filesystem documentation
        overlayfs: implement show_options
        overlayfs: add statfs support
        overlay filesystem
        shmem: support RENAME_WHITEOUT
        ext4: support RENAME_WHITEOUT
        vfs: add RENAME_WHITEOUT
        vfs: add whiteout support
        vfs: export check_sticky()
        vfs: introduce clone_private_mount()
        vfs: export __inode_permission() to modules
        vfs: export do_splice_direct() to modules
        vfs: add i_op->dentry_open()
    • Merge tag 'imx-fixes-3.18' of... · efc176a8
      Olof Johansson authored
      Merge tag 'imx-fixes-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into fixes
      
      Merge "ARM: imx: fixes for 3.18" from Shawn Guo:
      
      The i.MX fixes for 3.18:
       - Revert one patch which increases I2C bus frequency on imx28-evk
       - Fix a typo on imx6q EIM clock name
      
      * tag 'imx-fixes-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
        ARM: dts: imx28-evk: Let i2c0 run at 100kHz
        ARM: i.MX6: Fix "emi" clock name typo
      Signed-off-by: Olof Johansson <olof@lixom.net>
  5. 25 Oct, 2014 6 commits
  6. 24 Oct, 2014 14 commits
    • ftrace: Fix checking of trampoline ftrace_ops in finding trampoline · 4fc40904
      Steven Rostedt (Red Hat) authored
      When modifying code, ftrace has several checks to make sure things
      are being done correctly. One of them is to make sure any code it
      modifies is exactly what it expects it to be before it modifies it.
      In order to do so with the new trampoline logic, it must be able
      to find out what trampoline a function is hooked to in order to
      see if the code that hooks to it is what's expected.
      
      The logic to find the trampoline from a record (accounting descriptor
      for a function that is hooked) needs to only look at the "old_hash"
      of an ops that is being modified. The old_hash is the list of functions
      an ops is hooked to before its update. This is sufficient, since a
      record would only be pointing to an ops that is being modified if it
      was already hooked before.
      
      Currently, it can pick a modified ops based on its new functions it
      will be hooked to, and this picks the wrong trampoline and causes
      the check to fail, disabling ftrace.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      
      ftrace: squash into ordering of ops for modification
    • ftrace: Set ops->old_hash on modifying what an ops hooks to · 8252ecf3
      Steven Rostedt (Red Hat) authored
      The code that checks for trampolines when modifying function hooks
      tests against a modified ops "old_hash". But the ops old_hash pointer
      is not being updated before the changes are made, making it possible
      to not find the right hash to the callback and possibly causing
      ftrace to break in accounting and disable itself.
      
      Have the ops set its old_hash before the modifying takes place.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus · 2cc91884
      Linus Torvalds authored
      Pull MIPS fixes from Ralf Baechle:
       "This is the first round of fixes and tying up loose ends for MIPS.
      
         - plenty of fixes for build errors in specific obscure configurations
         - remove redundant code on the Lantiq platform
         - removal of a useless SEAD I2C driver that was causing a build issue
         - fix an earlier TLB exception handler fix to also work on Octeon.
         - fix ISA level dependencies in FPU emulator's instruction decoding.
         - don't hardcode kernel command line in Octeon software emulator.
         - fix an earlier fix for the Loongson 2 clock setting"
      
      * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
        MIPS: SEAD3: Fix I2C device registration.
        MIPS: SEAD3: Nuke PIC32 I2C driver.
        MIPS: ftrace: Fix a microMIPS build problem
        MIPS: MSP71xx: Fix build error
        MIPS: Malta: Do not build the malta-amon.c file if CMP is not enabled
        MIPS: Prevent compiler warning from cop2_{save,restore}
        MIPS: Kconfig: Add missing MIPS_CPS dependencies to PM and cpuidle
        MIPS: idle: Remove leftover __pastwait symbol and its references
        MIPS: Sibyte: Include the swarm subdir to the sb1250 LittleSur builds
        MIPS: ptrace.h: Add a missing include
        MIPS: ath79: Fix compilation error when CONFIG_PCI is disabled
        MIPS: MSP71xx: Remove compilation error when CONFIG_MIPS_MT is present
        MIPS: Octeon: Remove special case for simulator command line.
        MIPS: tlbex: Properly fix HUGE TLB Refill exception handler
        MIPS: loongson2_cpufreq: Fix CPU clock rate setting mismerge
        pci: pci-lantiq: remove duplicate check on resource
        MIPS: Lasat: Add missing CONFIG_PROC_FS dependency to PICVUE_PROC
        MIPS: cp1emu: Fix ISA restrictions for cop1x_op instructions
    • Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · cdc63a05
      Linus Torvalds authored
      Pull arm64 fixes from Catalin Marinas:
      
       - enable 48-bit VA space now that KVM has been fixed, together with a
         couple of fixes for pgd allocation alignment and initial memblock
         current_limit.  There is still a dependency on !ARM_SMMU which needs
         to be updated as it uses the page table manipulation macros of the
         host kernel
       - eBPF fixes following changes/conflicts during the merging window
       - Compat types affecting compat_elf_prpsinfo
       - Compilation error on UP builds
       - ASLR fix when /proc/sys/kernel/randomize_va_space == 0
       - DT definitions for CLCD support on ARMv8 model platform
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64: Fix memblock current_limit with 64K pages and 48-bit VA
        arm64: ASLR: Don't randomise text when randomise_va_space == 0
        arm64: vexpress: Add CLCD support to the ARMv8 model platform
        arm64: Fix compilation error on UP builds
        Documentation/arm64/memory.txt: fix typo
        net: bpf: arm64: minor fix of type in jited
        arm64: bpf: add 'load 64-bit immediate' instruction
        arm64: bpf: add 'shift by register' instructions
        net: bpf: arm64: address randomize and write protect JIT code
        arm64: mm: Correct fixmap pagetable types
        arm64: compat: fix compat types affecting struct compat_elf_prpsinfo
        arm64: Align less than PAGE_SIZE pgds naturally
        arm64: Allow 48-bits VA space without ARM_SMMU
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc · 83da00fb
      Linus Torvalds authored
      Pull two sparc fixes from David Miller:
      
       1) Fix boots with gcc-4.9 compiled sparc64 kernels.
      
       2) Add missing __get_user_pages_fast() on sparc64 to fix hangs on
          futexes used in transparent hugepage areas.
      
          It's really idiotic to have a weak symbolled fallback that just
          returns zero, and causes this kind of bug.  There should be no
          backup implementation and the link should fail if the architecture
          fails to provide __get_user_pages_fast() and supports transparent
          hugepages.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
        sparc64: Implement __get_user_pages_fast().
        sparc64: Fix register corruption in top-most kernel stack frame during boot.
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 96971e9a
      Linus Torvalds authored
      Pull kvm fixes from Paolo Bonzini:
       "This is a pretty large update.  I think it is roughly as big as what I
        usually had for the _whole_ rc period.
      
        There are a few bad bugs where the guest can OOPS or crash the host.
        We have also started looking at attack models for nested
        virtualization; bugs that usually result in the guest ring 0 crashing
        itself become more worrisome if you have nested virtualization,
        because the nested guest might bring down the non-nested guest as
        well.  For current uses of nested virtualization these do not really
        have a security impact, but you never know and bugs are bugs
        nevertheless.
      
        A lot of these bugs are in 3.17 too, resulting in a large number of
        stable@ Ccs.  I checked that all the patches apply there with no
        conflicts"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        kvm: vfio: fix unregister kvm_device_ops of vfio
        KVM: x86: Wrong assertion on paging_tmpl.h
        kvm: fix excessive pages un-pinning in kvm_iommu_map error path.
        KVM: x86: PREFETCH and HINT_NOP should have SrcMem flag
        KVM: x86: Emulator does not decode clflush well
        KVM: emulate: avoid accessing NULL ctxt->memopp
        KVM: x86: Decoding guest instructions which cross page boundary may fail
        kvm: x86: don't kill guest on unknown exit reason
        kvm: vmx: handle invvpid vm exit gracefully
        KVM: x86: Handle errors when RIP is set during far jumps
        KVM: x86: Emulator fixes for eip canonical checks on near branches
        KVM: x86: Fix wrong masking on relative jump/call
        KVM: x86: Improve thread safety in pit
        KVM: x86: Prevent host from panicking on shared MSR writes.
        KVM: x86: Check non-canonical addresses upon WRMSR
    • Merge tag 'stable/for-linus-3.18-b-rc1-tag' of... · 20ca57cd
      Linus Torvalds authored
      Merge tag 'stable/for-linus-3.18-b-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
      
      Pull xen bug fixes from David Vrabel:
      
       - Fix regression in xen_clocksource_read() which caused all Xen guests
         to crash early in boot.
       - Several fixes for super rare race conditions in the p2m.
       - Assorted other minor fixes.
      
      * tag 'stable/for-linus-3.18-b-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        xen/pci: Allocate memory for physdev_pci_device_add's optarr
        x86/xen: panic on bad Xen-provided memory map
        x86/xen: Fix incorrect per_cpu accessor in xen_clocksource_read()
        x86/xen: avoid race in p2m handling
        x86/xen: delay construction of mfn_list_list
        x86/xen: avoid writing to freed memory after race in p2m handling
        xen/balloon: Don't continue ballooning when BP_ECANCELED is encountered
    • Merge tag 'sound-3.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · c6d13403
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "Here are a chunk of small fixes since rc1: two PCM core fixes, one is
        a long-standing annoyance about lockdep and another is an ARM64 mmap
        fix.
      
        The rest are a HD-audio HDMI hotplug notification fix, a fix for
        missing NULL termination in Realtek codec quirks and a few new
        device/codec-specific quirks as usual"
      
      * tag 'sound-3.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: hda - Add missing terminating entry to SND_HDA_PIN_QUIRK macro
        ALSA: pcm: Fix false lockdep warnings
        ALSA: hda - Fix inverted LED gpio setup for Lenovo Ideapad
        ALSA: hda - hdmi: Fix missing ELD change event on plug/unplug
        ALSA: usb-audio: Add support for Steinberg UR22 USB interface
        ALSA: ALC283 codec - Avoid pop noise on headphones during suspend/resume
        ALSA: pcm: use the same dma mmap codepath both for arm and arm64
    • Merge tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random · 14d4cc08
      Linus Torvalds authored
      Pull /dev/random updates from Ted Ts'o:
       "This adds a memzero_explicit() call which is guaranteed not to be
        optimized away by GCC.  This is important when we are wiping
        cryptographically sensitive material"
      
      * tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random:
        crypto: memzero_explicit - make sure to clear out sensitive data
        random: add and use memzero_explicit() for clearing data
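
      The pattern described in the pull message (a clear that the compiler may
      not optimize away) can be sketched in user space as follows.  This is an
      illustration under assumptions, not the kernel's implementation: the
      function name is hypothetical, and the empty asm statement is a
      GCC/Clang extension used here as a compiler barrier.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical user-space analogue: zero a buffer, then place a compiler
 * barrier after the memset().  The asm statement takes p as an input and
 * clobbers "memory", so the compiler must assume the zeroed bytes are
 * observed and cannot drop the memset() as a dead store.
 */
void toy_memzero_explicit(void *p, size_t n)
{
	memset(p, 0, n);
	__asm__ __volatile__("" : : "r"(p) : "memory");
}
```

      A plain memset() on a buffer that is never read again is a typical
      candidate for dead-store elimination, which is why sensitive data such
      as key material is wiped with a helper of this kind.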
    • Merge tag 'pm+acpi-3.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 1c45d9a9
      Linus Torvalds authored
      Pull ACPI and power management updates from Rafael Wysocki:
       "This is material that didn't make it to my 3.18-rc1 pull request for
        various reasons, mostly related to timing and travel (LinuxCon EU /
        LPC) plus a couple of fixes for recent bugs.
      
        The only really new thing here is the PM QoS class for memory
        bandwidth, but it is simple enough and users of it will be added in
        the next cycle.  One major change in behavior is that platform devices
        enumerated by ACPI will use 32-bit DMA mask by default.  Also included
        is an ACPICA update to a new upstream release, but that's mostly
        cleanups, changes in tools and similar.  The rest is fixes and
        cleanups mostly.
      
        Specifics:
      
         - Fix for a recent PCI power management change that overlooked the
           fact that some IRQ chips might not be able to configure PCIe PME
           for system wakeup from Lucas Stach.
      
         - Fix for a bug introduced in 3.17 where acpi_device_wakeup() is
           called with a wrong ordering of arguments from Zhang Rui.
      
         - A bunch of intel_pstate driver fixes (all -stable candidates) from
           Dirk Brandewie, Gabriele Mazzotta and Pali Rohár.
      
         - Fixes for a rather long-standing problem with the OOM killer and
           the freezer that frozen processes killed by the OOM do not actually
           release any memory until they are thawed, so OOM-killing them is
           rather pointless, with a couple of cleanups on top (Michal Hocko,
           Cong Wang, Rafael J Wysocki).
      
         - ACPICA update to upstream release 20140926, including mostly
           cleanups reducing differences between the upstream ACPICA and the
           kernel code, tools changes (acpidump, acpiexec) and support for the
           _DDN object (Bob Moore, Lv Zheng).
      
         - New PM QoS class for memory bandwidth from Tomeu Vizoso.
      
         - Default 32-bit DMA mask for platform devices enumerated by ACPI
           (this change is mostly needed for some drivers development in
           progress targeted at 3.19) from Heikki Krogerus.
      
         - ACPI EC driver cleanups, mostly related to debugging, from Lv
           Zheng.
      
         - cpufreq-dt driver updates from Thomas Petazzoni.
      
         - powernv cpuidle driver update from Preeti U Murthy"
      
      * tag 'pm+acpi-3.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (34 commits)
        intel_pstate: Correct BYT VID values.
        intel_pstate: Fix BYT frequency reporting
        intel_pstate: Don't lose sysfs settings during cpu offline
        cpufreq: intel_pstate: Reflect current no_turbo state correctly
        cpufreq: expose scaling_cur_freq sysfs file for set_policy() drivers
        cpufreq: intel_pstate: Fix setting max_perf_pct in performance policy
        PCI / PM: handle failure to enable wakeup on PCIe PME
        ACPI: invoke acpi_device_wakeup() with correct parameters
        PM / freezer: Clean up code after recent fixes
        PM: convert do_each_thread to for_each_process_thread
        OOM, PM: OOM killed task shouldn't escape PM suspend
        freezer: remove obsolete comments in __thaw_task()
        freezer: Do not freeze tasks killed by OOM killer
        ACPI / platform: provide default DMA mask
        cpuidle: powernv: Populate cpuidle state details by querying the device-tree
        cpufreq: cpufreq-dt: adjust message related to regulators
        cpufreq: cpufreq-dt: extend with platform_data
        cpufreq: allow driver-specific data
        ACPI / EC: Cleanup coding style.
        ACPI / EC: Refine event/query debugging messages.
        ...
      1c45d9a9
    • Linus Torvalds's avatar
      Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux · 8264fce6
      Linus Torvalds authored
      Pull thermal management updates from Zhang Rui:
       "Sorry that I missed the merge window as there is a bug found in the
        last minute, and I have to fix it and wait for the code to be tested
        in linux-next tree for a few days.  Now the buggy patch has been
        dropped entirely from my next branch.  Thus I hope those changes can
        still be merged in 3.18-rc2 as most of them are platform thermal
        driver changes.
      
        Specifics:
      
         - introduce ACPI INT340X thermal drivers.
      
           Newer laptops and tablets may have thermal sensors and other
           devices with thermal control capabilities that are exposed for the
           OS to use via the ACPI INT340x device objects.  Several drivers are
           introduced to expose the temperature information and cooling
           ability from these objects to user-space via the normal thermal
           framework.
      
           From: Lu Aaron, Lan Tianyu, Jacob Pan and Zhang Rui.
      
         - introduce a new thermal governor, which just uses a hysteresis to
           switch abruptly on/off a cooling device.  This governor can be used
           to control certain fan devices that can not be throttled but just
           switched on or off.  From: Peter Feuerer.
      
         - introduce support for some new thermal interrupt functions on
           i.MX6SX, in IMX thermal driver.  From: Anson, Huang.
      
         - introduce tracing support on thermal framework.  From: Punit
           Agrawal.
      
         - small fixes in OF thermal and thermal step_wise governor"
      
      * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: (25 commits)
        Thermal: int340x thermal: select ACPI fan driver
        Thermal: int3400_thermal: use acpi_thermal_rel parsing APIs
        Thermal: int340x_thermal: expose acpi thermal relationship tables
        Thermal: introduce int3403 thermal driver
        Thermal: introduce INT3402 thermal driver
        Thermal: move the KELVIN_TO_MILLICELSIUS macro to thermal.h
        ACPI / Fan: support INT3404 thermal device
        ACPI / Fan: add ACPI 4.0 style fan support
        ACPI / fan: convert to platform driver
        ACPI / fan: use acpi_device_xxx_power instead of acpi_bus equivelant
        ACPI / fan: remove no need check for device pointer
        ACPI / fan: remove unused macro
        Thermal: int3400 thermal: register to thermal framework
        Thermal: int3400 thermal: add capability to detect supporting UUIDs
        Thermal: introduce int3400 thermal driver
        ACPI: add ACPI_TYPE_LOCAL_REFERENCE support to acpi_extract_package()
        ACPI: make acpi_create_platform_device() an external API
        thermal: step_wise: fix: Prevent from binary overflow when trend is dropping
        ACPI: introduce ACPI int340x thermal scan handler
        thermal: Added Bang-bang thermal governor
        ...
      8264fce6
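The on/off governor described in the pull message above can be pictured with a small userspace sketch. This is not the kernel's governor code: `struct demo_trip`, `demo_bang_bang()` and the exact comparison operators are assumptions made for illustration. It only shows how a hysteresis window keeps a fan that cannot be throttled from toggling rapidly around the trip point.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical trip point, in millidegrees Celsius. */
struct demo_trip {
	int temp_mc;	/* switch the fan on at or above this temperature */
	int hyst_mc;	/* switch it off only once below temp_mc - hyst_mc */
};

/*
 * Return the new fan state for the current temperature.
 * Inside the hysteresis window the previous state is kept,
 * so the device is only ever fully on or fully off.
 */
bool demo_bang_bang(const struct demo_trip *trip, int temp_mc, bool fan_on)
{
	assert(trip->hyst_mc >= 0);

	if (temp_mc >= trip->temp_mc)
		return true;
	if (temp_mc < trip->temp_mc - trip->hyst_mc)
		return false;
	return fan_on;
}
```

With a 60000 m°C trip point and 5000 m°C of hysteresis, a reading of 57000 m°C leaves the fan in whatever state it already had; it only switches off once the temperature drops below 55000 m°C.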
    • Catalin Marinas's avatar
      arm64: Fix memblock current_limit with 64K pages and 48-bit VA · 3dec0fe4
      Catalin Marinas authored
      With 48-bit VA space, the 64K page configuration uses 3 levels instead
      of 2 and PUD_SIZE != PMD_SIZE. Since with 64K pages we only cover
      PMD_SIZE with the initial swapper_pg_dir populated in head.S, the
      memblock current_limit needs to be set accordingly in map_mem() to avoid
      allocating unmapped memory. The memblock current_limit is progressively
      increased as more blocks are mapped.
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      3dec0fe4
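The size mismatch behind this fix can be worked through numerically. The sketch below is a userspace calculation, not kernel code: the `demo_*` helpers are hypothetical, and it assumes the usual layout in which each translation level resolves `page_shift - 3` bits and unused levels are folded into the top-level table. It only illustrates why, with 64K pages, PUD_SIZE equals PMD_SIZE for 2 levels but not for 3.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes covered by one PMD entry: one level above the PTE level. */
uint64_t demo_pmd_size(unsigned page_shift)
{
	return 1ULL << (page_shift + (page_shift - 3));
}

/*
 * Bytes covered by one PUD entry. With 2 levels the PUD is folded
 * down to the same level as the PMD; with 3 or 4 levels it sits two
 * levels above the PTE level.
 */
uint64_t demo_pud_size(unsigned page_shift, unsigned levels)
{
	assert(levels >= 2 && levels <= 4);

	unsigned n = levels > 2 ? 2 : 1;
	return 1ULL << (page_shift + n * (page_shift - 3));
}

/*
 * Initial allocation limit for 64K pages (page_shift == 16): only
 * PMD_SIZE from the start of RAM is covered by the initial mapping.
 */
uint64_t demo_initial_limit_64k(uint64_t phys_offset)
{
	return phys_offset + demo_pmd_size(16);
}
```

Under these assumptions PMD_SIZE is 512 MiB for 64K pages, while PUD_SIZE grows to 4 TiB once a third level is in use, so a limit based on PUD_SIZE would let early allocations land in memory that is not yet mapped.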
    • David S. Miller's avatar
      sparc64: Implement __get_user_pages_fast(). · 06090e8e
      David S. Miller authored
      It is not sufficient to only implement get_user_pages_fast(), you
      must also implement the atomic version __get_user_pages_fast()
      otherwise you end up using the weak symbol fallback implementation
      which simply returns zero.
      
      This is dangerous, because it causes the futex code to loop forever
      if transparent hugepages are supported (see get_futex_key()).
      Signed-off-by: David S. Miller <davem@davemloft.net>
      06090e8e
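The failure mode described in this commit can be modelled in userspace. The names `demo_pin_pages_fast()` and `demo_lookup_key()` are hypothetical stand-ins, not the kernel functions, and the retry bound exists only so the demonstration terminates; the commit message describes the real caller as looping forever. `__attribute__((weak))` is a GCC/Clang extension.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical generic fallback: a weak definition that pins no pages
 * and returns zero. An architecture would override it by providing a
 * strong definition of the same symbol in another object file.
 */
__attribute__((weak))
int demo_pin_pages_fast(unsigned long start, int nr_pages, void **pages)
{
	(void)start;
	(void)nr_pages;
	(void)pages;
	return 0;
}

/*
 * Models a caller that retries until exactly one page is pinned.
 * Returns the number of attempts used, or -1 once max_tries is
 * exhausted.
 */
int demo_lookup_key(unsigned long addr, int max_tries)
{
	void *page = NULL;
	int tries;

	assert(max_tries > 0);

	for (tries = 1; tries <= max_tries; tries++) {
		if (demo_pin_pages_fast(addr, 1, &page) == 1)
			return tries;
	}
	return -1;
}
```

When only the weak fallback is linked in, `demo_lookup_key()` never succeeds no matter how many attempts it is given, which is the userspace analogue of the endless retry described above.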
    • David S. Miller's avatar
      sparc64: Fix register corruption in top-most kernel stack frame during boot. · ef3e035c
      David S. Miller authored
      Meelis Roos reported that kernels built with gcc-4.9 do not boot, we
      eventually narrowed this down to only impacting machines using
      UltraSPARC-III and derivative cpus.
      
      The crash happens right when the first user process is spawned:
      
      [   54.451346] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004
      [   54.451346]
      [   54.571516] CPU: 1 PID: 1 Comm: init Not tainted 3.16.0-rc2-00211-gd7933ab7 #96
      [   54.666431] Call Trace:
      [   54.698453]  [0000000000762f8c] panic+0xb0/0x224
      [   54.759071]  [000000000045cf68] do_exit+0x948/0x960
      [   54.823123]  [000000000042cbc0] fault_in_user_windows+0xe0/0x100
      [   54.902036]  [0000000000404ad0] __handle_user_windows+0x0/0x10
      [   54.978662] Press Stop-A (L1-A) to return to the boot prom
      [   55.050713] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004
      
      Further investigation showed that compiling only per_cpu_patch() with
      an older compiler fixes the boot.
      
      Detailed analysis showed that the function is not being miscompiled by
      gcc-4.9, but it is using a different register allocation ordering.
      
      With the gcc-4.9 compiled function, something during the code patching
      causes some of the %i* input registers to get corrupted.  Perhaps
      we have a TLB miss path into the firmware that is deep enough to
      cause a register window spill and subsequent restore when we get
      back from the TLB miss trap.
      
      Let's plug this up by doing two things:
      
      1) Stop using the firmware stack for client interface calls into
         the firmware.  Just use the kernel's stack.
      
      2) As soon as we can, call into a new function "start_early_boot()"
         to put a one-register-window buffer between the firmware's
         deepest stack frame and the top-most initial kernel one.
      Reported-by: Meelis Roos <mroos@linux.ee>
      Tested-by: Meelis Roos <mroos@linux.ee>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ef3e035c