1. 11 Dec, 2012 4 commits
    • x86/mm: Introduce pte_accessible() · 2c3cf556
      Rik van Riel authored
      We need pte_present to return true for _PAGE_PROTNONE pages, to indicate that
      the pte is associated with a page.
      
      However, for TLB flushing purposes, we would like to know whether the pte
      points to an actually accessible page.  This allows us to skip remote TLB
      flushes for pages that are not actually accessible.
      
      Fill in this method for x86 and provide a safe (but slower) method
      on other architectures.
      Signed-off-by: Rik van Riel <riel@redhat.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Fixed-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/n/tip-66p11te4uj23gevgh4j987ip@git.kernel.org
      [ Added Linus's review fixes. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
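The distinction between "present" and "accessible" described in the pte_accessible() commit can be sketched in user space. This is a minimal model, not the kernel's code: the `model_` names and the flag bit positions are assumptions chosen for illustration, and the "generic" variant only mirrors the commit's description of a safe but slower fallback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits; the real x86 values live in the kernel headers. */
#define MODEL_PAGE_PRESENT  (1u << 0)
#define MODEL_PAGE_PROTNONE (1u << 8)

typedef struct { uint32_t flags; } model_pte_t;

/* True when the PTE is associated with a page, including PROTNONE mappings. */
static bool model_pte_present(model_pte_t pte)
{
    return (pte.flags & (MODEL_PAGE_PRESENT | MODEL_PAGE_PROTNONE)) != 0;
}

/* True only when the hardware could actually use the translation. */
static bool model_pte_accessible(model_pte_t pte)
{
    return (pte.flags & MODEL_PAGE_PRESENT) != 0;
}

/* Safe fallback for architectures without a specific test: assume accessible. */
static bool model_pte_accessible_generic(model_pte_t pte)
{
    (void)pte;
    return true;
}

/* A remote TLB flush is only worth doing if the old PTE was accessible. */
static bool model_needs_remote_flush(model_pte_t old_pte)
{
    /* Invariant of the model: anything accessible is also present. */
    assert(!model_pte_accessible(old_pte) || model_pte_present(old_pte));
    return model_pte_accessible(old_pte);
}
```

In this model a PROTNONE entry is present but never triggers a remote flush, while the generic fallback always answers "accessible" and therefore always flushes.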
    • mm,generic: only flush the local TLB in ptep_set_access_flags · cef23d9d
      Rik van Riel authored
      The function ptep_set_access_flags is only ever used to upgrade
      access permissions to a page. That means the only negative side
      effect of not flushing remote TLBs is that other CPUs may incur
      spurious page faults, if they happen to access the same address,
      and still have a PTE with the old permissions cached in their
      TLB.
      
      Having another CPU maybe incur a spurious page fault is faster
      than always incurring the cost of a remote TLB flush, so replace
      the remote TLB flush with a purely local one.
      
      This should be safe on every architecture that correctly
      implements flush_tlb_fix_spurious_fault() to actually invalidate
      the local TLB entry that caused a page fault, as well as on
      architectures where the hardware invalidates TLB entries that
      cause page faults.
      
      In the unlikely event that you are hitting what appears to be
      an infinite loop of page faults, and 'git bisect' took you to
      this changeset, your architecture needs to implement
      flush_tlb_fix_spurious_fault to actually flush the TLB entry.
      Signed-off-by: Rik van Riel <riel@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
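The reasoning in the ptep_set_access_flags commit (permissions are only ever upgraded, so a stale remote TLB entry costs at most one spurious fault) can be illustrated with a toy two-CPU model. Everything here is an assumption made for the sketch: the `model_` structures, the single tracked page, and the way a spurious fault drops the stale entry. It is not how the kernel represents a TLB.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 2

/* One virtual page, tracked per CPU: is a translation cached, and is it writable? */
struct model_tlb_entry { bool valid; bool writable; };

struct model_system {
    bool pte_writable;                          /* the in-memory PTE */
    struct model_tlb_entry tlb[MODEL_NR_CPUS];  /* per-CPU cached copy */
    int spurious_faults;
    int remote_flushes;
};

/* Load the current PTE into a CPU's TLB if nothing is cached yet. */
static void model_tlb_fill(struct model_system *s, int cpu)
{
    if (!s->tlb[cpu].valid) {
        s->tlb[cpu].valid = true;
        s->tlb[cpu].writable = s->pte_writable;
    }
}

/* Upgrade the PTE to writable and invalidate only the calling CPU's entry. */
static void model_set_access_flags(struct model_system *s, int cpu)
{
    assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
    s->pte_writable = true;
    s->tlb[cpu].valid = false;   /* local flush only; remote_flushes is never bumped */
}

/* A write through a stale read-only entry faults once, drops the entry, and retries. */
static bool model_write(struct model_system *s, int cpu)
{
    model_tlb_fill(s, cpu);
    if (!s->tlb[cpu].writable) {
        if (!s->pte_writable)
            return false;            /* genuine fault: the PTE itself forbids writes */
        s->spurious_faults++;
        s->tlb[cpu].valid = false;   /* the effect flush_tlb_fix_spurious_fault() must have */
        model_tlb_fill(s, cpu);
    }
    return s->tlb[cpu].writable;
}
```

In the model, the CPU that did not perform the upgrade takes exactly one spurious fault on its next write and none afterwards. If the stale entry were not dropped on that fault, the retry would fault again indefinitely, which is the infinite loop the commit message warns about.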
    • x86: mm: drop TLB flush from ptep_set_access_flags · e4a1cc56
      Rik van Riel authored
      Intel has an architectural guarantee that the TLB entry causing
      a page fault gets invalidated automatically. This means
      we should be able to drop the local TLB invalidation.
      
      Because of the way other areas of the page fault code work,
      chances are good that all x86 CPUs do this.  However, if
      someone somewhere has an x86 CPU that does not invalidate
      the TLB entry causing a page fault, this one-liner should
      be easy to revert.
      Signed-off-by: Rik van Riel <riel@redhat.com>
      Cc: Linus Torvalds <torvalds@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@redhat.com>
    • x86: mm: only do a local tlb flush in ptep_set_access_flags() · 0f9a921c
      Rik van Riel authored
      The function ptep_set_access_flags() is only ever invoked to set access
      flags or add write permission on a PTE.  The write bit is only ever set
      together with the dirty bit.
      
      Because we only ever upgrade a PTE, it is safe to skip flushing entries on
      remote TLBs. The worst that can happen is a spurious page fault on other
      CPUs, which would flush that TLB entry.
      
      Lazily letting another CPU incur a spurious page fault occasionally is
      (much!) cheaper than aggressively flushing everybody else's TLB.
      Signed-off-by: Rik van Riel <riel@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
  2. 17 Nov, 2012 2 commits
  3. 16 Nov, 2012 34 commits
    • Merge branch 'akpm' (Fixes from Andrew) · 0cad3ff4
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton.
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (12 patches)
        revert "mm: fix-up zone present pages"
        tmpfs: change final i_blocks BUG to WARNING
        tmpfs: fix shmem_getpage_gfp() VM_BUG_ON
        mm: highmem: don't treat PKMAP_ADDR(LAST_PKMAP) as a highmem address
        mm: revert "mm: vmscan: scale number of pages reclaimed by reclaim/compaction based on failures"
        rapidio: fix kernel-doc warnings
        swapfile: fix name leak in swapoff
        memcg: fix hotplugged memory zone oops
        mips, arc: fix build failure
        memcg: oom: fix totalpages calculation for memory.swappiness==0
        mm: fix build warning for uninitialized value
        mm: add anon_vma_lock to validate_mm()
    • revert "mm: fix-up zone present pages" · 5576646f
      Andrew Morton authored
      Revert commit 7f1290f2 ("mm: fix-up zone present pages")
      
      That patch tried to fix an issue when calculating zone->present_pages,
      but it caused a regression on 32bit systems with HIGHMEM.  With that
      change, reset_zone_present_pages() resets all zone->present_pages to
      zero, and fixup_zone_present_pages() is called to recalculate
      zone->present_pages when the boot allocator frees core memory pages into
      buddy allocator.  Because highmem pages are not freed by bootmem
      allocator, all highmem zones' present_pages becomes zero.
      
      Various options for improving the situation are being discussed but for
      now, let's return to the 3.6 code.
      
      Cc: Jianguo Wu <wujianguo@huawei.com>
      Cc: Jiang Liu <jiang.liu@huawei.com>
      Cc: Petr Tesarik <ptesarik@suse.cz>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: David Rientjes <rientjes@google.com>
      Tested-by: Chris Clayton <chris2553@googlemail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • tmpfs: change final i_blocks BUG to WARNING · 0f3c42f5
      Hugh Dickins authored
      Under a particular load on one machine, I have hit shmem_evict_inode()'s
      BUG_ON(inode->i_blocks), enough times to narrow it down to a particular
      race between swapout and eviction.
      
      It comes from the "if (freed > 0)" asymmetry in shmem_recalc_inode(),
      and the lack of coherent locking between mapping's nrpages and shmem's
      swapped count.  There's a window in shmem_writepage(), between lowering
      nrpages in shmem_delete_from_page_cache() and then raising swapped
      count, when the freed count appears to be +1 when it should be 0, and
      then the asymmetry stops it from being corrected with -1 before hitting
      the BUG.
      
      One answer is coherent locking: using tree_lock throughout, without
      info->lock; reasonable, but the raw_spin_lock in percpu_counter_add() on
      used_blocks makes that messier than expected.  Another answer may be a
      further effort to eliminate the weird shmem_recalc_inode() altogether,
      but previous attempts at that failed.
      
      So far undecided, but for now change the BUG_ON to WARN_ON: in usual
      circumstances it remains a useful consistency check.
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
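The "if (freed > 0)" asymmetry described in the tmpfs i_blocks commit can be followed with a small arithmetic model. This is a sketch under stated assumptions: the `model_` struct and helpers are hypothetical, the blocks-per-page value is assumed, locking is ignored, and the model only replays the accounting window. It does not try to reproduce the path that eventually reaches the BUG_ON.

```c
#include <assert.h>

#define MODEL_BLOCKS_PER_PAGE 8   /* assumed: 4096-byte pages, 512-byte blocks */

struct model_shmem {
    long alloced;    /* pages charged to the inode */
    long swapped;    /* pages currently out on swap */
    long nrpages;    /* pages currently in the page cache */
    long i_blocks;   /* alloced * MODEL_BLOCKS_PER_PAGE when consistent */
};

/* Mirrors the one-sided correction: only a positive "freed" is ever applied. */
static long model_recalc(struct model_shmem *m)
{
    long freed = m->alloced - m->swapped - m->nrpages;

    if (freed > 0) {
        m->alloced -= freed;
        m->i_blocks -= freed * MODEL_BLOCKS_PER_PAGE;
    }
    return freed;
}

/* Replays the swapout window with a recalculation squeezed into the middle. */
static struct model_shmem model_race(void)
{
    struct model_shmem m = { 1, 0, 1, MODEL_BLOCKS_PER_PAGE };
    long freed;

    m.nrpages--;                 /* page leaves the page cache */
    freed = model_recalc(&m);    /* racing recalculation sees +1, should see 0 */
    assert(freed == 1);
    m.swapped++;                 /* swapout finishes its accounting */
    freed = model_recalc(&m);    /* now -1, but the "> 0" test ignores it */
    assert(freed == -1);
    return m;
}
```

After the replay, the model holds one swapped page but zero charged pages and zero blocks, and nothing in `model_recalc()` can add the missing charge back.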
    • tmpfs: fix shmem_getpage_gfp() VM_BUG_ON · 215c02bc
      Hugh Dickins authored
      Fuzzing with trinity hit the "impossible" VM_BUG_ON(error) (which Fedora
      has converted to WARNING) in shmem_getpage_gfp():
      
        WARNING: at mm/shmem.c:1151 shmem_getpage_gfp+0xa5c/0xa70()
        Pid: 29795, comm: trinity-child4 Not tainted 3.7.0-rc2+ #49
        Call Trace:
          warn_slowpath_common+0x7f/0xc0
          warn_slowpath_null+0x1a/0x20
          shmem_getpage_gfp+0xa5c/0xa70
          shmem_fault+0x4f/0xa0
          __do_fault+0x71/0x5c0
          handle_pte_fault+0x97/0xae0
          handle_mm_fault+0x289/0x350
          __do_page_fault+0x18e/0x530
          do_page_fault+0x2b/0x50
          page_fault+0x28/0x30
          tracesys+0xe1/0xe6
      
      Thanks to Johannes for pointing to truncation: free_swap_and_cache()
      only does a trylock on the page, so the page lock we've held since
      before confirming swap is not enough to protect against truncation.
      
      What cleanup is needed in this case? Just delete_from_swap_cache(),
      which takes care of the memcg uncharge.
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Reported-by: Dave Jones <davej@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: highmem: don't treat PKMAP_ADDR(LAST_PKMAP) as a highmem address · 498c2280
      Will Deacon authored
      kmap_to_page returns the corresponding struct page for a virtual address
      of an arbitrary mapping.  This works by checking whether the address
      falls in the pkmap region and using the pkmap page tables instead of the
      linear mapping if appropriate.
      
      Unfortunately, the bounds checking means that PKMAP_ADDR(LAST_PKMAP) is
      incorrectly treated as a highmem address and we can end up walking off
      the end of pkmap_page_table and subsequently passing junk to pte_page.
      
      This patch fixes the bound check to stay within the pkmap tables.
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
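The off-by-one in the kmap_to_page bounds check can be shown with a user-space sketch. The base address, page shift, and table size below are hypothetical values picked for illustration, and the `model_` helpers only imitate the shape of the check. The commit message does not show the exact kernel expression.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical layout for illustration only. */
#define MODEL_PAGE_SHIFT  12
#define MODEL_PKMAP_BASE  0xff800000UL
#define MODEL_LAST_PKMAP  512UL
#define MODEL_PKMAP_ADDR(nr) \
    (MODEL_PKMAP_BASE + ((unsigned long)(nr) << MODEL_PAGE_SHIFT))

/* Inclusive upper bound: accepts the first address past the table. */
static bool model_in_pkmap_buggy(unsigned long addr)
{
    return addr >= MODEL_PKMAP_ADDR(0) &&
           addr <= MODEL_PKMAP_ADDR(MODEL_LAST_PKMAP);
}

/* Exclusive upper bound: every accepted address maps to a valid slot. */
static bool model_in_pkmap_fixed(unsigned long addr)
{
    return addr >= MODEL_PKMAP_ADDR(0) &&
           addr < MODEL_PKMAP_ADDR(MODEL_LAST_PKMAP);
}

/* Slot index into a table with valid indices 0..MODEL_LAST_PKMAP-1. */
static unsigned long model_pkmap_index(unsigned long addr)
{
    assert(addr >= MODEL_PKMAP_ADDR(0));
    return (addr - MODEL_PKMAP_ADDR(0)) >> MODEL_PAGE_SHIFT;
}
```

With the inclusive check, `MODEL_PKMAP_ADDR(MODEL_LAST_PKMAP)` is accepted and yields index `MODEL_LAST_PKMAP`, one past the last valid slot. The exclusive check rejects it.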
    • mm: revert "mm: vmscan: scale number of pages reclaimed by reclaim/compaction based on failures" · 96710098
      Mel Gorman authored
      Jiri Slaby reported the following:
      
      	(It's an effective revert of "mm: vmscan: scale number of pages
      	reclaimed by reclaim/compaction based on failures".) Given kswapd
      	had hours of runtime in ps/top output yesterday in the morning
      	and after the revert it's now 2 minutes in sum for the last 24h,
      	I would say, it's gone.
      
      The intention of the patch in question was to compensate for the loss of
      lumpy reclaim.  Part of the reason lumpy reclaim worked is because it
      aggressively reclaimed pages and this patch was meant to be a sane
      compromise.
      
      When compaction fails, it gets deferred, and both compaction and
      reclaim/compaction are deferred to avoid excessive reclaim.  However, since
      commit c6543459 ("mm: remove __GFP_NO_KSWAPD"), kswapd is woken up
      each time and continues reclaiming which was not taken into account when
      the patch was developed.
      
      Attempts to address the problem ended up just changing the shape of the
      problem instead of fixing it.  The release window gets closer and while
      a THP allocation failing is not a major problem, kswapd chewing up a lot
      of CPU is.
      
      This patch reverts commit 83fde0f2 ("mm: vmscan: scale number of
      pages reclaimed by reclaim/compaction based on failures") and will be
      revisited in the future.
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Cc: Zdenek Kabelac <zkabelac@redhat.com>
      Tested-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
      Cc: Jiri Slaby <jirislaby@gmail.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: Johannes Hirte <johannes.hirte@fem.tu-ilmenau.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • rapidio: fix kernel-doc warnings · 2ca3cb50
      Randy Dunlap authored
      Fix rapidio kernel-doc warnings:
      
        Warning(drivers/rapidio/rio.c:415): No description found for parameter 'local'
        Warning(drivers/rapidio/rio.c:415): Excess function parameter 'lstart' description in 'rio_map_inb_region'
        Warning(include/linux/rio.h:290): No description found for parameter 'switches'
        Warning(include/linux/rio.h:290): No description found for parameter 'destid_table'
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Cc: Matt Porter <mporter@kernel.crashing.org>
      Acked-by: Alexandre Bounine <alexandre.bounine@idt.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • swapfile: fix name leak in swapoff · f58b59c1
      Xiaotian Feng authored
      There's a name leak introduced by commit 91a27b2a ("vfs: define
      struct filename and have getname() return it").  Add the missing
      putname.
      
      [akpm@linux-foundation.org: cleanup]
      Signed-off-by: Xiaotian Feng <dannyfeng@tencent.com>
      Reviewed-by: Jeff Layton <jlayton@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • memcg: fix hotplugged memory zone oops · bea8c150
      Hugh Dickins authored
      When MEMCG is configured on (even when it's disabled by boot option),
      when adding or removing a page to/from its lru list, the zone pointer
      used for stats updates is nowadays taken from the struct lruvec.  (On
      many configurations, calculating zone from page is slower.)
      
      But we have no code to update all the lruvecs (per zone, per memcg) when
      a memory node is hotadded.  Here's an extract from the oops which
      results when running numactl to bind a program to a newly onlined node:
      
        BUG: unable to handle kernel NULL pointer dereference at 0000000000000f60
        IP:  __mod_zone_page_state+0x9/0x60
        Pid: 1219, comm: numactl Not tainted 3.6.0-rc5+ #180 Bochs Bochs
        Process numactl (pid: 1219, threadinfo ffff880039abc000, task ffff8800383c4ce0)
        Call Trace:
          __pagevec_lru_add_fn+0xdf/0x140
          pagevec_lru_move_fn+0xb1/0x100
          __pagevec_lru_add+0x1c/0x30
          lru_add_drain_cpu+0xa3/0x130
          lru_add_drain+0x2f/0x40
         ...
      
      The natural solution might be to use a memcg callback whenever memory is
      hotadded; but that solution has not been scoped out, and it happens that
      we do have an easy location at which to update lruvec->zone.  The lruvec
      pointer is discovered either by mem_cgroup_zone_lruvec() or by
      mem_cgroup_page_lruvec(), and both of those do know the right zone.
      
      So check and set lruvec->zone in those; and remove the inadequate
      attempt to set lruvec->zone from lruvec_init(), which is called before
      NODE_DATA(node) has been allocated in such cases.
      
      Ah, there was one exception.  For no particularly good reason,
      mem_cgroup_force_empty_list() has its own code for deciding lruvec.
      Change it to use the standard mem_cgroup_zone_lruvec() and
      mem_cgroup_get_lru_size() too.  In fact it was already safe against such
      an oops (the lru lists in danger could only be empty), but we're better
      proofed against future changes this way.
      
      I've marked this for stable (3.6) since we introduced the problem in 3.5
      (now closed to stable); but I have no idea if this is the only fix
      needed to get memory hotadd working with memcg in 3.6, and received no
      answer when I enquired twice before.
      Reported-by: Tang Chen <tangchen@cn.fujitsu.com>
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mips, arc: fix build failure · 18f69427
      David Rientjes authored
      Using a cross-compiler to fix another issue, the following build error
      occurred for mips defconfig:
      
        arch/mips/fw/arc/misc.c: In function 'ArcHalt':
        arch/mips/fw/arc/misc.c:25:2: error: implicit declaration of function 'local_irq_disable'
      
      Fix it up by including irqflags.h.
      Signed-off-by: David Rientjes <rientjes@google.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • memcg: oom: fix totalpages calculation for memory.swappiness==0 · 9a5a8f19
      Michal Hocko authored
      oom_badness() takes a totalpages argument which says how many pages are
      available and it uses it as a base for the score calculation.  The value
      is calculated by mem_cgroup_get_limit which considers both limit and
      total_swap_pages (resp.  memsw portion of it).
      
      This is usually correct but since fe35004f ("mm: avoid swapping out
      with swappiness==0") we do not swap when swappiness is 0 which means
      that we cannot really use up all the totalpages pages.  This in turn
      confuses oom score calculation if the memcg limit is much smaller than
      the available swap because the used memory (capped by the limit) is
      negligible compared to totalpages, so the resulting score is too small
      if adj!=0 (typically a task with CAP_SYS_ADMIN or a non-zero oom_score_adj).
      The wrong process might be selected as a result.
      
      The problem can be worked around by checking mem_cgroup_swappiness==0
      and not considering swap at all in such a case.
      Signed-off-by: Michal Hocko <mhocko@suse.cz>
      Acked-by: David Rientjes <rientjes@google.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
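A rough numeric model may help show why counting swap in totalpages distorts the score when swappiness is 0. The helpers below are hypothetical and deliberately simplified: the badness formula is reduced to "usage plus an adjustment scaled by totalpages/1000", and the adjustment of -30 is an assumed stand-in for a privileged task's bonus. Neither is taken from the commit message.

```c
#include <assert.h>

/* Pages the memcg can really consume; swap only counts when swapping is allowed. */
static unsigned long model_totalpages(unsigned long limit,
                                      unsigned long memsw_limit,
                                      unsigned long total_swap,
                                      int swappiness)
{
    unsigned long with_swap;

    assert(memsw_limit >= limit);
    if (swappiness == 0)
        return limit;                 /* the fix: ignore swap entirely */
    with_swap = limit + total_swap;
    return with_swap < memsw_limit ? with_swap : memsw_limit;
}

/* Simplified score: usage plus an adjustment scaled by totalpages, floored at 1. */
static long model_badness(long used_pages, long adj, unsigned long totalpages)
{
    long points = used_pages + adj * (long)(totalpages / 1000);

    return points > 0 ? points : 1;
}
```

With a 100000-page limit and 4000000 pages of swap, the inflated totalpages drives a task using 90000 pages with adj = -30 down to the floor score of 1, below a 1000-page task with adj = 0. With swap excluded, the large task scores 87000 and is ranked first, as expected.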
    • mm: fix build warning for uninitialized value · 1756954c
      David Rientjes authored
      do_wp_page() sets mmun_called if mmun_start and mmun_end were
      initialized and, if so, may call mmu_notifier_invalidate_range_end()
      with these values.  This doesn't prevent gcc from emitting a build
      warning though:
      
        mm/memory.c: In function `do_wp_page':
        mm/memory.c:2530: warning: `mmun_start' may be used uninitialized in this function
        mm/memory.c:2531: warning: `mmun_end' may be used uninitialized in this function
      
      It's much easier to initialize the variables to impossible values and
      then do a simple comparison to tell whether they were ever set, which
      removes the bool entirely.
      Signed-off-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
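The "impossible values" pattern from the do_wp_page() commit can be sketched outside the kernel. The function names, the 4 KiB page size, and the exact comparison below are assumptions for illustration. Only the idea, an empty range doubling as the "never initialized" marker, comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>

static int model_range_end_calls;

/* Stand-in for mmu_notifier_invalidate_range_end(); only counts invocations. */
static void model_invalidate_range_end(unsigned long start, unsigned long end)
{
    assert(end > start);
    model_range_end_calls++;
}

/* start == end == 0 describes an empty range, so it doubles as "never set". */
static void model_do_wp_page(bool take_cow_path, unsigned long address)
{
    unsigned long mmun_start = 0;
    unsigned long mmun_end = 0;

    if (take_cow_path) {
        mmun_start = address & ~0xfffUL;      /* assumed 4 KiB page */
        mmun_end = mmun_start + 0x1000UL;
    }

    /* No separate flag: a non-empty range means the start call happened. */
    if (mmun_end > mmun_start)
        model_invalidate_range_end(mmun_start, mmun_end);
}
```

Because both variables are always initialized, the compiler has nothing to warn about, and the comparison replaces the separate boolean.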
    • mm: add anon_vma_lock to validate_mm() · 63c3b902
      Michel Lespinasse authored
      Iterating over the vma->anon_vma_chain without anon_vma_lock may cause
      NULL ptr deref in anon_vma_interval_tree_verify(), because the node in the
      chain might have been removed.
      
        BUG: unable to handle kernel paging request at fffffffffffffff0
        IP: [<ffffffff8122c29c>] anon_vma_interval_tree_verify+0xc/0xa0
        PGD 4e28067 PUD 4e29067 PMD 0
        Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
        CPU 0
        Pid: 9050, comm: trinity-child64 Tainted: G        W    3.7.0-rc2-next-20121025-sasha-00001-g673f98e-dirty #77
        RIP: 0010: anon_vma_interval_tree_verify+0xc/0xa0
        Process trinity-child64 (pid: 9050, threadinfo ffff880045f80000, task ffff880048eb0000)
        Call Trace:
          validate_mm+0x58/0x1e0
          vma_adjust+0x635/0x6b0
          __split_vma.isra.22+0x161/0x220
          split_vma+0x24/0x30
          sys_madvise+0x5da/0x7b0
          tracesys+0xe1/0xe6
        RIP  anon_vma_interval_tree_verify+0xc/0xa0
        CR2: fffffffffffffff0
      
      Figured out by Bob Liu.
      Reported-by: Sasha Levin <sasha.levin@oracle.com>
      Cc: Bob Liu <lliubbo@gmail.com>
      Signed-off-by: Michel Lespinasse <walken@google.com>
      Reviewed-by: Rik van Riel <riel@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • KVM: x86: Fix invalid secondary exec controls in vmx_cpuid_update() · 29282fde
      Takashi Iwai authored
      The commit [ad756a16: KVM: VMX: Implement PCID/INVPCID for guests with
      EPT] introduced the unconditional access to SECONDARY_VM_EXEC_CONTROL,
      and this triggers kernel warnings like below on old CPUs:
      
          vmwrite error: reg 401e value a0568000 (err 12)
          Pid: 13649, comm: qemu-kvm Not tainted 3.7.0-rc4-test2+ #154
          Call Trace:
           [<ffffffffa0558d86>] vmwrite_error+0x27/0x29 [kvm_intel]
           [<ffffffffa054e8cb>] vmcs_writel+0x1b/0x20 [kvm_intel]
           [<ffffffffa054f114>] vmx_cpuid_update+0x74/0x170 [kvm_intel]
           [<ffffffffa03629b6>] kvm_vcpu_ioctl_set_cpuid2+0x76/0x90 [kvm]
           [<ffffffffa0341c67>] kvm_arch_vcpu_ioctl+0xc37/0xed0 [kvm]
           [<ffffffff81143f7c>] ? __vunmap+0x9c/0x110
           [<ffffffffa0551489>] ? vmx_vcpu_load+0x39/0x1a0 [kvm_intel]
           [<ffffffffa0340ee2>] ? kvm_arch_vcpu_load+0x52/0x1a0 [kvm]
           [<ffffffffa032dcd4>] ? vcpu_load+0x74/0xd0 [kvm]
           [<ffffffffa032deb0>] kvm_vcpu_ioctl+0x110/0x5e0 [kvm]
           [<ffffffffa032e93d>] ? kvm_dev_ioctl+0x4d/0x4a0 [kvm]
           [<ffffffff8117dc6f>] do_vfs_ioctl+0x8f/0x530
           [<ffffffff81139d76>] ? remove_vma+0x56/0x60
           [<ffffffff8113b708>] ? do_munmap+0x328/0x400
           [<ffffffff81187c8c>] ? fget_light+0x4c/0x100
           [<ffffffff8117e1a1>] sys_ioctl+0x91/0xb0
           [<ffffffff815a942d>] system_call_fastpath+0x1a/0x1f
      
      This patch adds a check for the availability of secondary exec
      control to avoid these warnings.
      
      Cc: <stable@vger.kernel.org> [v3.6+]
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 1d567e19
      Linus Torvalds authored
      Pull networking updates from David Miller:
      
       1) tx_filtered/ps_tx_buf queues need to be accessed with the SKB queue
          lock, from Arik Nemtsov.
      
       2) Don't call 802.11 driver's filter configure method until it's
          actually open, from Felix Fietkau.
      
       3) Use ieee80211_free_txskb otherwise we leak control information.
          From Johannes Berg.
      
       4) Fix memory leak in bluetooth UUID removal, from Johan Hedberg.
      
       5) The shift mask trick doesn't work properly when 'optname' is out of
          range in do_ip_setsockopt().  Use a straightforward switch statement
          instead, the compiler emits essentially the same code but without
          the missing range check.  From Xi Wang.
      
       6) Fix when we call tcp_replace_ts_recent() otherwise we can
          erroneously accept a too-high tsval.  From Eric Dumazet.
      
       7) VXLAN bug fixes, mostly to do with VLAN header length handling, from
          Alexander Duyck.
      
       8) Missing return value initialization for IPV6_MINHOPCOUNT socket
          option handling.  From Hannes Frederic.
      
       9) Fix regression in tasklet handling in jme/ksz884x/xilinx drivers,
          from Xiaotian Feng.
      
      10) At smsc911x driver init time, we don't know if the chip is in word
          swap mode or not.  However we do need to wait for the control
          register's ready bit to be set before we program any other part of
          the chip.  Adjust the wait loop to account for this.  From Kamlakant
          Patel.
      
      11) Revert erroneous MDIO bus unregister change to mdio-bitbang.c
      
      12) Fix memory leak in /proc/net/sctp/, from Tommi Rantala.
      
      13) tilegx driver registers IRQ with NULL name, oops, from Simon Marchi.
      
      14) TCP metrics hash table kzalloc() based allocation can fail, back
          down to using vmalloc() if it does.  From Eric Dumazet.
      
      15) Fix packet steering out-of-order delivery regression, from Tom
          Herbert.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (40 commits)
        net-rps: Fix brokeness causing OOO packets
        tcp: handle tcp_net_metrics_init() order-5 memory allocation failures
        batman-adv: process broadcast packets in BLA earlier
        batman-adv: don't add TEMP clients belonging to other backbone nodes
        batman-adv: correctly pass the client flag on tt_response
        batman-adv: fix tt_global_entries flags update
        tilegx: request_irq with a non-null device name
        net: correct check in dev_addr_del()
        tcp: fix retransmission in repair mode
        sctp: fix /proc/net/sctp/ memory leak
        Revert "drivers/net/phy/mdio-bitbang.c: Call mdiobus_unregister before mdiobus_free"
        net/smsc911x: Fix ready check in cases where WORD_SWAP is needed
        drivers/net: fix tasklet misuse issue
        ipv4/ip_vti.c: VTI fix post-decryption forwarding
        brcmfmac: fix typo in CONFIG_BRCMISCAN
        vxlan: Update hard_header_len based on lowerdev when instantiating VXLAN
        vxlan: fix a typo.
        ipv6: setsockopt(IPIPPROTO_IPV6, IPV6_MINHOPCOUNT) forgot to set return value
        doc/net: Fix typo in netdev-features.txt
        vxlan: Fix error that was resulting in VXLAN MTU size being 10 bytes too large
        ...
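Item 5 of the networking summary above, replacing a shift-and-mask membership test with a switch in do_ip_setsockopt(), can be illustrated generically. The option numbers and helper names below are hypothetical and are not the real IP socket options. The sketch only contrasts the two ways of testing "is optname one of these values".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical option numbers for illustration. */
enum { MODEL_OPT_A = 1, MODEL_OPT_B = 5, MODEL_OPT_C = 12 };

#define MODEL_INT_OPTS \
    ((1u << MODEL_OPT_A) | (1u << MODEL_OPT_B) | (1u << MODEL_OPT_C))

/*
 * Mask trick with the range check spelled out. Without the check,
 * shifting by a negative count or by 32 or more is undefined in C, and
 * hardware that masks the shift count can make an out-of-range optname
 * alias a valid bit.
 */
static bool model_is_int_opt_mask(int optname)
{
    if (optname < 0 || optname >= 32)
        return false;
    return ((1u << optname) & MODEL_INT_OPTS) != 0;
}

/* Switch form: out-of-range values simply fall through to the default. */
static bool model_is_int_opt_switch(int optname)
{
    switch (optname) {
    case MODEL_OPT_A:
    case MODEL_OPT_B:
    case MODEL_OPT_C:
        return true;
    default:
        return false;
    }
}
```

The switch cannot forget the range check, which is the failure mode the fix addresses. Both helpers agree for every input, including values such as `MODEL_OPT_A + 32` that would alias a valid bit if the shift count were masked.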
    • Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless · a8203d3c
      David S. Miller authored
      John W. Linville says:
      
      ====================
      This batch of fixes is intended for the 3.7 stream...
      
      This includes a pull of the Bluetooth tree.  Gustavo says:
      
      "A few important fixes to go into 3.7. There is new hw support by Marcos
      Chaparro. Johan added a memory leak fix and an hci device index list fix.
      Also Marcel fixed a race condition in the device setup that was preventing the
      bt monitor from working properly. Last, Paulo Sérgio added a fix to the error
      status when pairing for LE fails. This was preventing userspace from handling
      the failure properly."
      
      Regarding the mac80211 pull, Johannes says:
      
      "I have a locking fix for some SKB queues, a variable initialization to
      avoid crashes in a certain failure case, another free_txskb fix from
      Felix and another fix from him to avoid calling a stopped driver, a fix
      for a (very unlikely) memory leak and a fix to not send null data
      packets when resuming while not associated."
      
      Regarding the iwlwifi pull, Johannes says:
      
      "Two more fixes for iwlwifi ... one to use ieee80211_free_txskb(), and
      one to check DMA mapping errors, please pull."
      
      On top of that, Johannes also included a wireless regulatory fix
      to allow 40 MHz on channels 12 and 13 in world roaming mode.  Also,
      Hauke Mehrtens fixes a #ifdef typo in brcmfmac.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net-rps: Fix brokeness causing OOO packets · baefa31d
      Tom Herbert authored
      In commit c445477d which adds aRFS to the kernel, the CPU
      selected for RFS is not set correctly when CPU is changing.
      This is causing OOO packets and probably other issues.
      Signed-off-by: Tom Herbert <therbert@google.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Ben Hutchings <bhutchings@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge · 2a953883
      David S. Miller authored
      Included fixes are:
      - update the client entry status flags when using the "early client
        detection". This makes the Distributed AP isolation correctly work;
      - transfer the client entry status flags when recovering the translation
        table from another node. This makes the Distributed AP isolation correctly
        work;
      - prevent the "early client detection mechanism" from adding clients belonging to
        other backbone nodes in the same LAN. This breaks connectivity when using this
        mechanism together with the Bridge Loop Avoidance;
      - process broadcast packets with the Bridge Loop Avoidance before any other
        component. BLA can possibly drop the packets based on the source address. This
        makes the "early client detection mechanism" correctly work when used with
        BLA.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: handle tcp_net_metrics_init() order-5 memory allocation failures · 976a702a
      Eric Dumazet authored
      Order-5 allocations can fail with current kernels, so we should
      try vmalloc() as well.
      Reported-by: Julien Tinnes <jln@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      976a702a
    • Zhang Rui's avatar
      Thermal: Add Linux/Thermal subsystem info in MAINTAINER file · d3fb6955
      Zhang Rui authored
      All the changes made to the generic thermal layer, or platform thermal
      drivers that make use of the thermal layer, should be sent to
      linux-pm@vger.kernel.org for discussion.
      
      And as the maintainer, I will only apply the patches that have been sent
      to linux-pm@vger.kernel.org.
      Signed-off-by: Zhang Rui <rui.zhang@intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d3fb6955
    • David Rientjes's avatar
      mm, oom: reintroduce /proc/pid/oom_adj · fa0cbbf1
      David Rientjes authored
      This is mostly a revert of 01dc52eb ("oom: remove deprecated oom_adj")
      from Davidlohr Bueso.
      
      It reintroduces /proc/pid/oom_adj for backwards compatibility with earlier
      kernels.  It simply scales the value linearly when /proc/pid/oom_score_adj
      is written.
      
      The major difference is that its scheduled removal is no longer included
      in Documentation/feature-removal-schedule.txt.  We do warn users with a
      single printk, though, to suggest the more powerful and supported
      /proc/pid/oom_score_adj interface.
      Reported-by: Artem S. Tashkinov <t.artem@lycos.com>
      Signed-off-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      fa0cbbf1
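
The linear scaling between /proc/pid/oom_adj and /proc/pid/oom_score_adj
described in the commit above can be sketched in plain C as follows. This is
an illustrative user-space sketch, not the kernel code: the function names
are hypothetical, and the constant values are assumed to match the kernel's
oom.h header (oom_adj range -17..15, oom_score_adj range -1000..1000).

```c
/* Assumed to mirror the kernel's oom.h constants. */
#define OOM_DISABLE        (-17)   /* oom_adj value that disables OOM killing */
#define OOM_ADJUST_MAX     15      /* largest legacy oom_adj value */
#define OOM_SCORE_ADJ_MAX  1000    /* largest oom_score_adj value */

/* Hypothetical helper: scale a legacy oom_adj value to oom_score_adj. */
static int oom_adj_to_score_adj(int oom_adj)
{
    /* Map the legacy maximum exactly onto the new maximum. */
    if (oom_adj == OOM_ADJUST_MAX)
        return OOM_SCORE_ADJ_MAX;
    /* Linear scaling; integer division truncates toward zero. */
    return (oom_adj * OOM_SCORE_ADJ_MAX) / -OOM_DISABLE;
}

/* Hypothetical helper: scale oom_score_adj back to the legacy range. */
static int score_adj_to_oom_adj(int score_adj)
{
    if (score_adj == OOM_SCORE_ADJ_MAX)
        return OOM_ADJUST_MAX;
    return (score_adj * -OOM_DISABLE) / OOM_SCORE_ADJ_MAX;
}
```

Because the two ranges differ in resolution, a round trip through both
helpers is lossy for most values; only the endpoints and zero are preserved
exactly.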
    • Linus Torvalds's avatar
      Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · f4bcd79c
      Linus Torvalds authored
      Pull ARM SoC fixes from Olof Johansson:
       "We've been sitting on this longer than we meant to due to travel and
        other activities, but the number of patches is luckily not that high.
      
        Biggest changes are from a batch of OMAP bugfixes, but there are a few
        for the broader set of SoCs too (bcm2835, pxa, highbank, tegra, at91
        and i.MX).
      
        The OMAP patches contain some fixes for MUSB/PHY on omap4 which ends
        up being a bit on the large side but needed for legacy (non-DT)
        platforms.  Beyond that there are a handful of hwmod/pm changes.
      
        So, fairly noncontroversial stuff all in all, and as usual around this
        time the fixes are well targeted at specific problems."
      
      * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
        ARM: imx: ehci: fix host power mask bit
        ARM i.MX: fix error-valued pointer dereference in clk_register_gate2()
        ARM: at91/usbh: fix overcurrent gpio setup
        ARM: at91/AT91SAM9G45: fix crypto peripherals irq issue due to sparse irq support
        ARM: boot: Fix usage of kecho
        ARM: OMAP: ocp2scp: create omap device for ocp2scp
        ARM: OMAP4: add _dev_attr_ to ocp2scp for representing usb_phy
        drivers: bus: ocp2scp: add pdata support
        irqchip: irq-bcm2835: Add terminating entry for of_device_id table
        ARM: highbank: retry wfi on reset request
        ARM: OMAP4: PM: fix regulator name for VDD_MPU
        ARM: OMAP4: hwmod data: do not enable or reset the McPDM during kernel init
        ARM: OMAP2+: hwmod: add flag to prevent hwmod code from touching IP block during init
        ARM: dt: tegra: fix length of pad control and mux registers
        ARM: OMAP: hwmod: wait for sysreset complete after enabling hwmod
        ARM: OMAP2+: clockdomain: Fix OMAP4 ISS clk domain to support only SWSUP
        ARM: pxa/spitz_pm: Fix hang when resuming from STR
        ARM: pxa: hx4700: Fix backlight PWM device number
        ARM: OMAP2+: PM: add missing newline to VC warning message
      f4bcd79c
    • John W. Linville's avatar
      Merge branch 'master' of... · 26c6e808
      John W. Linville authored
      Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
      26c6e808
    • Linus Torvalds's avatar
      Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64 · 5a0c02ba
      Linus Torvalds authored
      Pull arm64 bugfix from Catalin Marinas:
       "Arm64 page permission bug fix.
      
        Without this fix, the CPU speculatively accesses the interrupt
        controller memory causing random IRQ acknowledge."
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64:
        arm64: Distinguish between user and kernel XN bits
      5a0c02ba
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid · 1a1e8c6f
      Linus Torvalds authored
      Pull HID fix from Jiri Kosina:
       "This has a build fix for architectures where memcmp() is a macro, from
        Jiri Slaby"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
        HID: microsoft: do not use compound literal - fix build
      1a1e8c6f
    • Catalin Marinas's avatar
      arm64: Distinguish between user and kernel XN bits · 8e620b04
      Catalin Marinas authored
      On AArch64, the meaning of the XN bit has changed to UXN (user). The PXN
      (privileged) bit must be set to prevent kernel execution. Without the
      PXN bit set, the CPU may speculatively access device memory. This patch
      ensures that all the mappings that the kernel must not execute from
      (including user mappings) have the PXN bit set.
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      8e620b04
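
The UXN/PXN split described in the commit above can be illustrated with a
small C sketch. The bit positions (53 for PXN, 54 for UXN) follow the ARMv8
stage-1 descriptor layout; the helper name xn_bits() and its boolean
parameters are hypothetical and are not taken from the patch itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Execute-never bits in an AArch64 stage-1 page table entry. */
#define PTE_PXN (UINT64_C(1) << 53)  /* privileged (kernel) execute-never */
#define PTE_UXN (UINT64_C(1) << 54)  /* user execute-never */

/*
 * Hypothetical helper: return the execute-never bits for a mapping.
 * A mapping the kernel must not execute from gets PXN, even when it
 * is a user-executable mapping.
 */
static uint64_t xn_bits(bool user_exec, bool kernel_exec)
{
    uint64_t bits = 0;

    if (!user_exec)
        bits |= PTE_UXN;
    if (!kernel_exec)
        bits |= PTE_PXN;
    return bits;
}
```

Under this sketch, a user-executable page yields PTE_PXN only, while device
memory (executable by neither) yields both bits.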
    • Linus Torvalds's avatar
      Merge tag 'usb-3.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · ec014873
      Linus Torvalds authored
      Pull USB fixes from Greg Kroah-Hartman:
       "Here are some USB fixes for the 3.7 tree.
      
        Nothing huge here, just a number of tiny bugfixes resolving issues
        that have been found, and two reverts of patches that were found to
        have caused problems.
      
        All of these have been in linux-next already.
      
        Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
      
      * tag 'usb-3.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
        Revert "USB/host: Cleanup unneccessary irq disable code"
        USB: option: add Alcatel X220/X500D USB IDs
        USB: option: add Novatel E362 and Dell Wireless 5800 USB IDs
        USB: keyspan: fix typo causing GPF on open
        USB: fix build with XEN and EARLY_PRINTK_DBGP enabled but USB_SUPPORT disabled
        USB: usb_wwan: fix bulk-urb allocation
        usb: otg: Fix build errors if USB_MUSB_OMAP2PLUS is selected as module
        usb: musb: ux500: fix 'musbid' undeclared error in ux500_remove()
        Revert "usb: musb: use DMA mode 1 whenever possible"
      ec014873
    • Linus Torvalds's avatar
      Merge tag 'tty-3.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · d6ee1a28
      Linus Torvalds authored
      Pull TTY fixes from Greg Kroah-Hartman:
       "Here are two TTY driver fixes for 3.7-rc5.
      
        They resolve a bug in the hvc driver that has been reported, and fix a
        problem with the list of device ids in the max310x serial driver.
      
        Both have been in linux-next for a while.
      
        Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
      
      * tag 'tty-3.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
        tty: serial: max310x: Add terminating entry for spi_device_id table
        TTY: hvc_console, fix port reference count going to zero prematurely
      d6ee1a28
    • Linus Torvalds's avatar
      Merge tag 'staging-3.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging · 7e111565
      Linus Torvalds authored
      Pull staging tree fix from Greg Kroah-Hartman:
       "Here is a single patch, a revert of an android driver patch, that
        resolves a bug that has been reported in the Android alarm driver.
      
        Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
      
      * tag 'staging-3.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
        Revert "Staging: Android alarm: IOCTL command encoding fix"
      7e111565
    • Arnd Bergmann's avatar
      Merge tag 'at91-fixes' of git://github.com/at91linux/linux-at91 into fixes · 6658d6a5
      Arnd Bergmann authored
      From Nicolas Ferre <nicolas.ferre@atmel.com>:
      
      Two little fixes, one related to the move to sparse irq and
      another one fixing the check of a GPIO for USB host overcurrent.
      
      * tag 'at91-fixes' of git://github.com/at91linux/linux-at91:
        ARM: at91/usbh: fix overcurrent gpio setup
        ARM: at91/AT91SAM9G45: fix crypto peripherals irq issue due to sparse irq support
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      6658d6a5
    • Arnd Bergmann's avatar
      Merge tag 'imx-fixes-rc' of git://git.pengutronix.de/git/imx/linux-2.6 into fixes · 57260e40
      Arnd Bergmann authored
      From Sascha Hauer <s.hauer@pengutronix.de>:
      
      ARM i.MX fixes for 3.7-rc
      
      * tag 'imx-fixes-rc' of git://git.pengutronix.de/git/imx/linux-2.6:
        ARM: imx: ehci: fix host power mask bit
        ARM i.MX: fix error-valued pointer dereference in clk_register_gate2()
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      57260e40
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 62735e52
      Linus Torvalds authored
      Pull s390 patches from Martin Schwidefsky:
       "Some more bug fixes and a config change.
      
        The signal bug is nasty, if the clock_gettime vdso function is
        interrupted by a signal while in access-register-mode we end up with
        an endless signal loop until the signal stack is full.  The config
        change is for aligned struct pages, gives us 8% improvement with
        hackbench."
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390/3215: fix tty close handling
        s390/mm: have 16 byte aligned struct pages
        s390/gup: fix access_ok() usage in __get_user_pages_fast()
        s390/gup: add missing TASK_SIZE check to get_user_pages_fast()
        s390/topology: fix core id vs physical package id mix-up
        s390/signal: set correct address space control
      62735e52
    • Linus Torvalds's avatar
      Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · 7279d7cb
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "All pretty normal: one TTM oops fix, one radeon, a few intel and a
        vmwgfx fix."
      
      * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
        drm/ttm: remove unneeded preempt_disable/enable
        ttm: Clear the ttm page allocated from high memory zone correctly
        vmwgfx: return an -EFAULT if copy_to_user() fails
        drm/radeon: fix logic error in atombios_encoders.c
        drm/i915: do not ignore eDP bpc settings from vbt
        drm/i915/sdvo: clean up connectors on intel_sdvo_init() failures
        drm/i915/crt: fix DPMS standby and suspend mode handling
      7279d7cb
    • Linus Torvalds's avatar
      Merge tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mturquette/linux · e057aad1
      Linus Torvalds authored
      Pull another clk layer fix from Michael Turquette:
       "GCC 4.7 users get compilation errors from unnecessary use of inline in
        clk-provider.h.  This pull request fixes the regression by removing
        inline usage from those function declarations."
      
      * tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mturquette/linux:
        clk: remove inline usage from clk-provider.h
      e057aad1