1. 29 Oct, 2014 29 commits
    • Merge branch 'akpm' (incoming from Andrew Morton) · a7ca10f2
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "21 fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (21 commits)
        mm/balloon_compaction: fix deflation when compaction is disabled
        sh: fix sh770x SCIF memory regions
        zram: avoid NULL pointer access in concurrent situation
        mm/slab_common: don't check for duplicate cache names
        ocfs2: fix d_splice_alias() return code checking
        mm: rmap: split out page_remove_file_rmap()
        mm: memcontrol: fix missed end-writeback page accounting
        mm: page-writeback: inline account_page_dirtied() into single caller
        lib/bitmap.c: fix undefined shift in __bitmap_shift_{left|right}()
        drivers/rtc/rtc-bq32k.c: fix register value
        memory-hotplug: clear pgdat which is allocated by bootmem in try_offline_node()
        drivers/rtc/rtc-s3c.c: fix initialization failure without rtc source clock
        kernel/kmod: fix use-after-free of the sub_info structure
        drivers/rtc/rtc-pm8xxx.c: rework to support pm8941 rtc
        mm, thp: fix collapsing of hugepages on madvise
        drivers: of: add return value to of_reserved_mem_device_init()
        mm: free compound page with correct order
        gcov: add ARM64 to GCOV_PROFILE_ALL
        fsnotify: next_i is freed during fsnotify_unmount_inodes.
        mm/compaction.c: avoid premature range skip in isolate_migratepages_range
        ...
    • mm/balloon_compaction: fix deflation when compaction is disabled · 4d88e6f7
      Konstantin Khlebnikov authored
      If CONFIG_BALLOON_COMPACTION=n balloon_page_insert() does not link pages
      with balloon and doesn't set PagePrivate flag, as a result
      balloon_page_dequeue() cannot get any pages because it thinks that all
      of them are isolated.  Without balloon compaction nobody can isolate
      ballooned pages.  It's safe to remove this check.
      
      Fixes: d6d86c0a ("mm/balloon_compaction: redesign ballooned pages management").
      Signed-off-by: Konstantin Khlebnikov <k.khlebnikov@samsung.com>
      Reported-by: Matt Mullins <mmullins@mmlx.us>
      Cc: <stable@vger.kernel.org>	[3.17]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • sh: fix sh770x SCIF memory regions · 5417421b
      Andriy Skulysh authored
      Resources scif1_resources & scif2_resources overlap.  Actual SCIF region
      size is 0x10.
      
      This is a regression from commit d850acf9 ("sh: Declare SCIF register
      base and IRQ as resources").
      Signed-off-by: Andriy Skulysh <askulysh@gmail.com>
      Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • zram: avoid NULL pointer access in concurrent situation · 5a99e95b
      Weijie Yang authored
      There is a rare NULL pointer bug in mem_used_total_show() and
      mem_used_max_store() under concurrency, like this:
      
      zram is not yet initialized.  Process A is a mem_used_total reader that
      runs periodically, while process B tries to init zram.
      
      	process A 				process B
        access meta, get a NULL value
      						init zram, done
        init_done() is true
        access meta->mem_pool, get a NULL pointer BUG
      
      This patch fixes this issue.
      Signed-off-by: Weijie Yang <weijie.yang@samsung.com>
      Acked-by: Minchan Kim <minchan@kernel.org>
      Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/slab_common: don't check for duplicate cache names · 8aba7e0a
      Mikulas Patocka authored
      The SLUB allocator merges caches with the same size and alignment, and
      there was a long-standing bug with this behavior:
      
       - create the cache named "foo"
       - create the cache named "bar" (which is merged with "foo")
       - delete the cache named "foo" (but it stays allocated because "bar"
         uses it)
       - create the cache named "foo" again - it fails because the name "foo"
         is already used
      
      That bug was fixed in commit 69461747 ("slab_common: fix the check
      for duplicate slab names") by not warning on duplicate cache names when
      the SLUB subsystem is used.
      
      Recently, cache merging was implemented with the SLAB subsystem too, in
      12220dea ("mm/slab: support slab merge").  Therefore we need to stop
      checking for duplicate names even for the SLAB subsystem.
      
      This patch fixes the bug by removing the check.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Acked-by: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: fix d_splice_alias() return code checking · d3556bab
      Richard Weinberger authored
      d_splice_alias() can return a valid dentry, NULL or an ERR_PTR.
      Currently the code does not check for ERR_PTR, which will cause an oops in
      ocfs2_dentry_attach_lock().  Fix this by using IS_ERR_OR_NULL().
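      
      The three possible return values can be illustrated with a small
      userspace sketch.  The helpers below re-create the kernel's
      error-pointer convention for illustration only; fake_splice() and
      result_is_usable() are hypothetical names and are not part of ocfs2
      or the VFS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace re-creation of the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }

static inline int IS_ERR(const void *ptr)
{
    /* Error codes live in the top MAX_ERRNO values of the address space. */
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline int IS_ERR_OR_NULL(const void *ptr)
{
    return !ptr || IS_ERR(ptr);
}

/* Hypothetical stand-in for d_splice_alias(): mode selects the outcome. */
static inline void *fake_splice(int mode, void *dentry)
{
    assert(mode >= 0 && mode <= 2);
    if (mode == 0)
        return dentry;       /* a valid dentry */
    if (mode == 1)
        return NULL;         /* no alias found */
    return ERR_PTR(-12);     /* an error, here -ENOMEM */
}

/* Only a non-NULL, non-error pointer may be dereferenced. */
static inline int result_is_usable(const void *ret)
{
    return !IS_ERR_OR_NULL(ret);
}
```

      A plain NULL test would treat the third case as a valid dentry, which
      is the situation the message describes.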
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: rmap: split out page_remove_file_rmap() · 8186eb6a
      Johannes Weiner authored
      page_remove_rmap() has too many branches on PageAnon() and is hard to
      follow.  Move the file part into a separate function.
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Reviewed-by: Michal Hocko <mhocko@suse.cz>
      Cc: Vladimir Davydov <vdavydov@parallels.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: memcontrol: fix missed end-writeback page accounting · d7365e78
      Johannes Weiner authored
      Commit 0a31bc97 ("mm: memcontrol: rewrite uncharge API") changed
      page migration to uncharge the old page right away.  The page is locked,
      unmapped, truncated, and off the LRU, but it could race with writeback
      ending, which then doesn't unaccount the page properly:
      
      test_clear_page_writeback()              migration
                                                 wait_on_page_writeback()
        TestClearPageWriteback()
                                                 mem_cgroup_migrate()
                                                   clear PCG_USED
        mem_cgroup_update_page_stat()
          if (PageCgroupUsed(pc))
            decrease memcg pages under writeback
      
        release pc->mem_cgroup->move_lock
      
      The per-page statistics interface is heavily optimized to avoid a
      function call and a lookup_page_cgroup() in the file unmap fast path,
      which means it doesn't verify whether a page is still charged before
      clearing PageWriteback() and it has to do it in the stat update later.
      
      Rework it so that it looks up the page's memcg once at the beginning of
      the transaction and then uses it throughout.  The charge will be
      verified before clearing PageWriteback() and migration can't uncharge
      the page as long as that is still set.  The RCU lock will protect the
      memcg past uncharge.
      
      As far as losing the optimization goes, the following test results are
      from a microbenchmark that maps, faults, and unmaps a 4GB sparse file
      three times in a nested fashion, so that there are two negative passes
      that don't account but still go through the new transaction overhead.
      There is no actual difference:
      
       old:     33.195102545 seconds time elapsed       ( +-  0.01% )
       new:     33.199231369 seconds time elapsed       ( +-  0.03% )
      
      The time spent in page_remove_rmap()'s callees still adds up to the
      same, but the time spent in the function itself seems reduced:
      
           # Children      Self  Command        Shared Object       Symbol
       old:     0.12%     0.11%  filemapstress  [kernel.kallsyms]   [k] page_remove_rmap
       new:     0.12%     0.08%  filemapstress  [kernel.kallsyms]   [k] page_remove_rmap
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Cc: Vladimir Davydov <vdavydov@parallels.com>
      Cc: <stable@vger.kernel.org>	[3.17.x]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: page-writeback: inline account_page_dirtied() into single caller · 3a3c02ec
      Johannes Weiner authored
      A follow-up patch would have changed the call signature.  To save the
      trouble, just fold it instead.
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Cc: Vladimir Davydov <vdavydov@parallels.com>
      Cc: <stable@vger.kernel.org>	[3.17.x]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • lib/bitmap.c: fix undefined shift in __bitmap_shift_{left|right}() · ea5d05b3
      Jan Kara authored
      If __bitmap_shift_left() or __bitmap_shift_right() are asked to shift by
      a multiple of BITS_PER_LONG, they will try to shift a long value by
      BITS_PER_LONG bits which is undefined.  Change the functions to avoid
      the undefined shift.
      
      Coverity id: 1192175
      Coverity id: 1192174
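      
      As a minimal sketch of the problem, assuming a simplified per-word
      helper rather than the real lib/bitmap.c code: when the shift amount is
      a multiple of BITS_PER_LONG, the remainder is 0 and the complementary
      shift becomes a shift by BITS_PER_LONG, which C leaves undefined.  The
      name shift_right_word() is hypothetical.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/*
 * Build one output word of a multi-word right shift from two adjacent
 * source words.  rem is the shift amount modulo BITS_PER_LONG.
 */
static inline unsigned long shift_right_word(unsigned long lower,
                                             unsigned long upper,
                                             unsigned int rem)
{
    assert(rem < BITS_PER_LONG);
    if (rem == 0)
        return lower; /* avoids upper << BITS_PER_LONG, which is undefined */
    return (lower >> rem) | (upper << (BITS_PER_LONG - rem));
}
```

      Guarding on the zero remainder keeps every shift count strictly below
      the word width.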
      Signed-off-by: Jan Kara <jack@suse.cz>
      Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • drivers/rtc/rtc-bq32k.c: fix register value · 5a6e7599
      Pavel Machek authored
      Fix register value in bq32000 trickle charging.
      
      Mike reported that I was using the wrong value in one trickle-charging
      case, and after checking the docs, I must admit he's right.
      Signed-off-by: Pavel Machek <pavel@denx.de>
      Reported-by: Mike Bremford <mike@bfo.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • memory-hotplug: clear pgdat which is allocated by bootmem in try_offline_node() · 35dca71c
      Yasuaki Ishimatsu authored
      When hot adding the same memory after hot removal, the following
      messages are shown:
      
        WARNING: CPU: 20 PID: 6 at mm/page_alloc.c:4968 free_area_init_node+0x3fe/0x426()
        ...
        Call Trace:
          dump_stack+0x46/0x58
          warn_slowpath_common+0x81/0xa0
          warn_slowpath_null+0x1a/0x20
          free_area_init_node+0x3fe/0x426
          hotadd_new_pgdat+0x90/0x110
          add_memory+0xd4/0x200
          acpi_memory_device_add+0x1aa/0x289
          acpi_bus_attach+0xfd/0x204
          acpi_bus_attach+0x178/0x204
          acpi_bus_scan+0x6a/0x90
          acpi_device_hotplug+0xe8/0x418
          acpi_hotplug_work_fn+0x1f/0x2b
          process_one_work+0x14e/0x3f0
          worker_thread+0x11b/0x510
          kthread+0xe1/0x100
          ret_from_fork+0x7c/0xb0
      
      The detailed explanation is as follows:
      
      When hot removing memory, pgdat is set to 0 in try_offline_node().  But
      if the pgdat is allocated by bootmem allocator, the clearing step is
      skipped.
      
      When the same memory is hot added again, the pgdat that was never
      cleared is reused.  But free_area_init_node() checks whether the pgdat
      is set to zero.  As a result, free_area_init_node() hits WARN_ON().
      
      This patch clears pgdat which is allocated by bootmem allocator in
      try_offline_node().
      Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Zhang Zhen <zhenzhang.zhang@huawei.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: Tang Chen <tangchen@cn.fujitsu.com>
      Reviewed-by: Toshi Kani <toshi.kani@hp.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • drivers/rtc/rtc-s3c.c: fix initialization failure without rtc source clock · eaf3a659
      Marek Szyprowski authored
      Fix unconditional initialization failure on non-exynos3250 SoCs.
      
      Commit df9e26d0 ("rtc: s3c: add support for RTC of Exynos3250 SoC")
      introduced rtc source clock support, but also added an initialization
      failure on SoCs that do not need such a clock.
      Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
      Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kernel/kmod: fix use-after-free of the sub_info structure · 0baf2a4d
      Martin Schwidefsky authored
      Found this in the message log on a s390 system:
      
          BUG kmalloc-192 (Not tainted): Poison overwritten
          Disabling lock debugging due to kernel taint
          INFO: 0x00000000684761f4-0x00000000684761f7. First byte 0xff instead of 0x6b
          INFO: Allocated in call_usermodehelper_setup+0x70/0x128 age=71 cpu=2 pid=648
           __slab_alloc.isra.47.constprop.56+0x5f6/0x658
           kmem_cache_alloc_trace+0x106/0x408
           call_usermodehelper_setup+0x70/0x128
           call_usermodehelper+0x62/0x90
           cgroup_release_agent+0x178/0x1c0
           process_one_work+0x36e/0x680
           worker_thread+0x2f0/0x4f8
           kthread+0x10a/0x120
           kernel_thread_starter+0x6/0xc
           kernel_thread_starter+0x0/0xc
          INFO: Freed in call_usermodehelper_exec+0x110/0x1b8 age=71 cpu=2 pid=648
           __slab_free+0x94/0x560
           kfree+0x364/0x3e0
           call_usermodehelper_exec+0x110/0x1b8
           cgroup_release_agent+0x178/0x1c0
           process_one_work+0x36e/0x680
           worker_thread+0x2f0/0x4f8
           kthread+0x10a/0x120
           kernel_thread_starter+0x6/0xc
           kernel_thread_starter+0x0/0xc
      
      There is a use-after-free bug on the subprocess_info structure allocated
      by the user mode helper.  In case do_execve() returns with an error
      ____call_usermodehelper() stores the error code to sub_info->retval, but
      sub_info can already have been freed.
      
      Regarding UMH_NO_WAIT, the sub_info structure can be freed by
      __call_usermodehelper() before the worker thread returns from
      do_execve(), allowing memory corruption when do_execve() failed after
      exec_mmap() is called.
      
      Regarding UMH_WAIT_EXEC, the call to umh_complete() allows
      call_usermodehelper_exec() to continue which then frees sub_info.
      
      To fix this race the code needs to make sure that the call to
      call_usermodehelper_freeinfo() is always done after the last store to
      sub_info->retval.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Reviewed-by: Oleg Nesterov <oleg@redhat.com>
      Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • drivers/rtc/rtc-pm8xxx.c: rework to support pm8941 rtc · c8d523a4
      Stanimir Varbanov authored
      Add support for the RTC device inside the PM8941 PMIC.  The RTC in this
      PMIC has two register spaces, so rtc-pm8xxx is slightly reworked to
      reflect these differences.
      
      The register set for each PMIC chip is selected based on the DT
      compatible string.
      
      [akpm@linux-foundation.org: coding-style fixes]
      [akpm@linux-foundation.org: simplify and fix locking in pm8xxx_rtc_set_time()]
      Signed-off-by: Stanimir Varbanov <svarbanov@mm-sol.com>
      Cc: Alessandro Zummo <a.zummo@towertech.it>
      Cc: Stephen Boyd <sboyd@codeaurora.org>
      Cc: Josh Cartwright <joshc@codeaurora.org>
      Cc: Stanimir Varbanov <svarbanov@mm-sol.com>
      Cc: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm, thp: fix collapsing of hugepages on madvise · 6d50e60c
      David Rientjes authored
      If an anonymous mapping is not allowed to fault thp memory and then
      madvise(MADV_HUGEPAGE) is used after fault, khugepaged will never
      collapse this memory into thp memory.
      
      This occurs because the madvise(2) handler for thp, hugepage_madvise(),
      clears VM_NOHUGEPAGE on the stack and it isn't stored in vma->vm_flags
      until the final action of madvise_behavior().  This causes the
      khugepaged_enter_vma_merge() to be a no-op in hugepage_madvise() when
      the vma had previously had VM_NOHUGEPAGE set.
      
      Fix this by passing the correct vma flags to the khugepaged mm slot
      handler.  There's no chance khugepaged can run on this vma until after
      madvise_behavior() returns since we hold mm->mmap_sem.
      
      It would be possible to clear VM_NOHUGEPAGE directly from vma->vm_flags
      in hugepage_madvise(), but I didn't want to introduce special-case
      behavior into madvise_behavior().  I think it's best to just let it
      always set vma->vm_flags itself.
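      
      The stale-flags problem can be sketched with a toy model.  Everything
      here is an illustrative assumption: the flag values, struct toy_vma,
      and the three function names are hypothetical and only mimic the shape
      of the logic described above, not the actual mm/huge_memory.c code.

```c
#include <assert.h>

/* Illustrative flag values, not the kernel's. */
#define VM_HUGEPAGE   0x1UL
#define VM_NOHUGEPAGE 0x2UL

struct toy_vma {
    unsigned long vm_flags;
};

/* Models the khugepaged registration check: refuse if VM_NOHUGEPAGE is set. */
static inline int khugepaged_would_enter(unsigned long flags)
{
    return !(flags & VM_NOHUGEPAGE);
}

/* Before the fix: the check reads vma->vm_flags, which is still stale. */
static inline int madvise_hugepage_old(struct toy_vma *vma,
                                       unsigned long *vm_flags)
{
    assert(vma && vm_flags);
    *vm_flags &= ~VM_NOHUGEPAGE;   /* updated only in the caller's copy */
    *vm_flags |= VM_HUGEPAGE;
    return khugepaged_would_enter(vma->vm_flags);
}

/* After the fix: the check is given the updated flags explicitly. */
static inline int madvise_hugepage_new(struct toy_vma *vma,
                                       unsigned long *vm_flags)
{
    assert(vma && vm_flags);
    *vm_flags &= ~VM_NOHUGEPAGE;
    *vm_flags |= VM_HUGEPAGE;
    return khugepaged_would_enter(*vm_flags);
}
```

      In this model the caller still stores the new flags into the vma
      afterwards, which matches the preference stated above for letting
      madvise_behavior() own that assignment.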
      Signed-off-by: David Rientjes <rientjes@google.com>
      Reported-by: Suleiman Souhlal <suleiman@google.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • drivers: of: add return value to of_reserved_mem_device_init() · 47f29df7
      Marek Szyprowski authored
      A driver calling of_reserved_mem_device_init() might want to know whether
      the initialization succeeded, so add support for returning an
      error code.
      
      This fixes a build warning caused by commit 7bfa5ab6 ("drivers:
      dma-coherent: add initialization from device tree"), which was
      merged without this change and without fixing the function return value.
      
      Fixes: 7bfa5ab6 ("drivers: dma-coherent: add initialization from device tree")
      Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
      Acked-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Michal Nazarewicz <mina86@mina86.com>
      Cc: Grant Likely <grant.likely@linaro.org>
      Cc: Laura Abbott <lauraa@codeaurora.org>
      Cc: Josh Cartwright <joshc@codeaurora.org>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Kyungmin Park <kyungmin.park@samsung.com>
      Cc: Russell King <rmk+kernel@arm.linux.org.uk>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: free compound page with correct order · 5ddacbe9
      Yu Zhao authored
      A compound page should be freed by put_page() or by free_pages() with
      the correct order.  Not doing so leaks the tail pages.
      
      The compound order can be obtained from compound_order(), or
      HPAGE_PMD_ORDER can be used in our case.  Some people would argue the
      latter is faster but I prefer the former which is more general.
      
      This bug was observed not just on our servers (the worst case we saw is
      11G leaked on a 48G machine) but also on our workstations running Ubuntu
      based distro.
      
        $ cat /proc/vmstat  | grep thp_zero_page_alloc
        thp_zero_page_alloc 55
        thp_zero_page_alloc_failed 0
      
      This means there is (thp_zero_page_alloc - 1) * (2M - 4K) memory leaked.
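      
      As a worked example of that formula, assuming 2 MiB huge pages and
      4 KiB base pages as in the text, the counter value of 55 shown above
      corresponds to 113025024 bytes, roughly 108 MiB.  The helper name
      leaked_bytes() is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define HUGE_PAGE_BYTES (2ULL * 1024 * 1024) /* 2M huge zero page */
#define BASE_PAGE_BYTES (4ULL * 1024)        /* 4K head page that is freed */

/*
 * Estimate leaked memory from the thp_zero_page_alloc counter using the
 * formula in the text: every allocation except the current one lost its
 * tail pages.
 */
static inline uint64_t leaked_bytes(uint64_t thp_zero_page_alloc)
{
    assert(thp_zero_page_alloc >= 1);
    return (thp_zero_page_alloc - 1) * (HUGE_PAGE_BYTES - BASE_PAGE_BYTES);
}
```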
      
      Fixes: 97ae1749 ("thp: implement refcounting for huge zero page")
      Signed-off-by: Yu Zhao <yuzhao@google.com>
      Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Bob Liu <lliubbo@gmail.com>
      Cc: <stable@vger.kernel.org>	[3.8+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • gcov: add ARM64 to GCOV_PROFILE_ALL · f601de20
      Riku Voipio authored
      Following up on the ARM testing of gcov, it turns out gcov on ARM64 works
      fine as well.  The only change needed is adding ARM64 to the Kconfig
      depends.
      
      Tested with qemu and mach-virt.
      Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
      Acked-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • fsnotify: next_i is freed during fsnotify_unmount_inodes. · 6424babf
      Jerry Hoemann authored
      During file system stress testing on 3.10 and 3.12 based kernels, the
      umount command occasionally hung in fsnotify_unmount_inodes in the
      section of code:
      
                      spin_lock(&inode->i_lock);
                      if (inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW)) {
                              spin_unlock(&inode->i_lock);
                              continue;
                      }
      
      As this section of code holds the global inode_sb_list_lock, eventually
      the system hangs trying to acquire the lock.
      
      Multiple crash dumps showed:
      
      The inode->i_state == 0x60 and i_count == 0 and i_sb_list would point
      back at itself.  As this is not the value of list upon entry to the
      function, the kernel never exits the loop.
      
      To help narrow down the problem, the call to list_del_init in
      inode_sb_list_del was changed to list_del.  This poisons the pointers in
      the i_sb_list and causes the kernel to panic if it traverses a freed
      inode.
      
      Subsequent stress testing panicked in fsnotify_unmount_inodes at the
      bottom of the list_for_each_entry_safe loop, showing next_i had been
      freed.
      
      We believe the root cause of the problem is that next_i is being freed
      during the window of time that the list_for_each_entry_safe loop
      temporarily releases inode_sb_list_lock to call fsnotify and
      fsnotify_inode_delete.
      
      The code in fsnotify_unmount_inodes attempts to prevent the freeing of
      inode and next_i by calling __iget.  However, the code doesn't do the
      __iget call on next_i
      
      	if i_count == 0 or
      	if i_state & (I_FREEING | I_WILL_FREE)
      
      The patch addresses this issue by advancing next_i in the above two cases
      until we either find a next_i which we can __iget or we reach the end of
      the list.  This makes the handling of next_i more closely match the
      handling of the variable "inode."
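      
      The advancing step can be sketched on a toy singly linked list.  This is
      a simplified model under stated assumptions: struct toy_inode, the flag
      values, can_pin() and next_pinnable() are hypothetical, there is no
      locking, and incrementing i_count merely stands in for __iget().

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative flag values, not the kernel's. */
#define I_FREEING   (1u << 0)
#define I_WILL_FREE (1u << 1)

struct toy_inode {
    unsigned int i_state;
    int i_count;
    struct toy_inode *next; /* NULL marks the end of the list */
};

/* An inode can be pinned only if it is referenced and not being freed. */
static inline int can_pin(const struct toy_inode *inode)
{
    return inode->i_count > 0 &&
           !(inode->i_state & (I_FREEING | I_WILL_FREE));
}

/*
 * Advance past entries that cannot be pinned, then take a reference on the
 * first one that can.  Returns NULL when the end of the list is reached.
 */
static inline struct toy_inode *next_pinnable(struct toy_inode *inode)
{
    struct toy_inode *next_i;

    assert(inode != NULL);
    next_i = inode->next;
    while (next_i && !can_pin(next_i))
        next_i = next_i->next;
    if (next_i)
        next_i->i_count++; /* stands in for __iget() */
    return next_i;
}
```

      Because the returned entry holds a reference, it cannot be freed while
      the list lock is dropped, which is the property the original loop
      lacked for next_i.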
      
      The time to reproduce the hang is highly variable (from hours to days.) We
      ran the stress test on a 3.10 kernel with the proposed patch for a week
      without failure.
      
      During list_for_each_entry_safe, next_i is becoming free causing
      the loop to never terminate.  Advance next_i in those cases where
      __iget is not done.
      Signed-off-by: Jerry Hoemann <jerry.hoemann@hp.com>
      Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Cc: Ken Helias <kenhelias@firemail.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/compaction.c: avoid premature range skip in isolate_migratepages_range · 6ea41c0c
      Joonsoo Kim authored
      Commit edc2ca61 ("mm, compaction: move pageblock checks up from
      isolate_migratepages_range()") commonizes the isolate_migratepages
      variants and makes them use isolate_migratepages_block().
      
      isolate_migratepages_block() can stop early once enough pages
      are isolated, but there is no code in isolate_migratepages_range() to
      handle this case.  As a result, even if isolate_migratepages_block()
      returns prematurely without checking all pages in the range,
      isolate_migratepages_block() is called again on the following
      pageblock, and some pages in the previous range are never checked.
      CMA then fails frequently because of this.
      
      To fix this problem, this patch lets isolate_migratepages_range() know
      when enough pages are isolated and stops the isolation in
      that case.
      
      Note that isolate_migratepages() has no such problem, because, it always
      stops the isolation after just one call of isolate_migratepages_block().
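      
      A toy model of the fixed control flow, assuming every page is
      isolatable and using made-up sizes: PAGEBLOCK, BATCH and both function
      names are hypothetical and do not reflect the real mm/compaction.c
      signatures.

```c
#include <assert.h>

#define PAGEBLOCK 4UL /* pages per pageblock in this toy model */
#define BATCH     2UL /* isolation budget per call, made up for the sketch */

/*
 * Scan [pfn, end) and stop as soon as BATCH pages have been isolated.
 * Returns the first pfn that was NOT scanned.
 */
static inline unsigned long toy_isolate_block(unsigned long pfn,
                                              unsigned long end,
                                              unsigned long *nr_isolated)
{
    while (pfn < end) {
        pfn++;
        if (++*nr_isolated == BATCH)
            break;
    }
    return pfn;
}

/* Walk the range one pageblock at a time, honoring an early stop. */
static inline unsigned long toy_isolate_range(unsigned long start,
                                              unsigned long end,
                                              unsigned long *nr_isolated)
{
    unsigned long pfn = start;

    assert(start <= end);
    *nr_isolated = 0;
    while (pfn < end) {
        unsigned long block_end = pfn - pfn % PAGEBLOCK + PAGEBLOCK;

        if (block_end > end)
            block_end = end;
        pfn = toy_isolate_block(pfn, block_end, nr_isolated);
        if (*nr_isolated == BATCH)
            break; /* stop here instead of jumping to the next pageblock */
    }
    return pfn;
}
```

      Returning the pfn where scanning stopped lets the caller resume there,
      so no page in a partially scanned pageblock is skipped.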
      Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Michal Nazarewicz <mina86@mina86.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • cgroup/kmemleak: add kmemleak_free() for cgroup deallocations. · 401507d6
      Wang Nan authored
      Commit ff7ee93f ("cgroup/kmemleak: Annotate alloc_page() for cgroup
      allocations") introduces kmemleak_alloc() for alloc_page_cgroup(), but
      the corresponding kmemleak_free() is missing, which causes kmemleak to
      be wrongly disabled after memory offlining.  The log is pasted at the
      end of this commit message.
      
      This patch adds kmemleak_free() to free_page_cgroup().  During page
      offlining, this patch removes the corresponding entries from the
      kmemleak rbtree.  After that, the freed memory can be allocated again by
      other subsystems without killing kmemleak.
      
        bash # for x in 1 2 3 4; do echo offline > /sys/devices/system/memory/memory$x/state ; sleep 1; done ; dmesg | grep leak
      
        Offlined Pages 32768
        kmemleak: Cannot insert 0xffff880016969000 into the object search tree (overlaps existing)
        CPU: 0 PID: 412 Comm: sleep Not tainted 3.17.0-rc5+ #86
        Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
        Call Trace:
          dump_stack+0x46/0x58
          create_object+0x266/0x2c0
          kmemleak_alloc+0x26/0x50
          kmem_cache_alloc+0xd3/0x160
          __sigqueue_alloc+0x49/0xd0
          __send_signal+0xcb/0x410
          send_signal+0x45/0x90
          __group_send_sig_info+0x13/0x20
          do_notify_parent+0x1bb/0x260
          do_exit+0x767/0xa40
          do_group_exit+0x44/0xa0
          SyS_exit_group+0x17/0x20
          system_call_fastpath+0x16/0x1b
      
        kmemleak: Kernel memory leak detector disabled
        kmemleak: Object 0xffff880016900000 (size 524288):
        kmemleak:   comm "swapper/0", pid 0, jiffies 4294667296
        kmemleak:   min_count = 0
        kmemleak:   count = 0
        kmemleak:   flags = 0x1
        kmemleak:   checksum = 0
        kmemleak:   backtrace:
              log_early+0x63/0x77
              kmemleak_alloc+0x4b/0x50
              init_section_page_cgroup+0x7f/0xf5
              page_cgroup_init+0xc5/0xd0
              start_kernel+0x333/0x408
              x86_64_start_reservations+0x2a/0x2c
              x86_64_start_kernel+0xf5/0xfc
      
      Fixes: ff7ee93f (cgroup/kmemleak: Annotate alloc_page() for cgroup allocations)
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: <stable@vger.kernel.org>	[3.2+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge branch 'for-linus' of git://git.kernel.dk/linux-block · d506aa68
      Linus Torvalds authored
      Pull block layer fixes from Jens Axboe:
       "A small collection of fixes for the current kernel.  This contains:
      
         - Two error handling fixes from Jan Kara.  One for null_blk on
           failure to add a device, and the other for the block/scsi_ioctl
           SCSI_IOCTL_SEND_COMMAND fixing up the error jump point.
      
         - A commit added in the merge window for the bio integrity bits
           unfortunately disabled merging for all requests if
           CONFIG_BLK_DEV_INTEGRITY wasn't set.  Reverse the logic, so that
           integrity checking won't disallow merges when not enabled.
      
         - A fix from Ming Lei for merging and generating too many segments.
           This caused a BUG in virtio_blk.
      
         - Two error handling printk() fixups from Robert Elliott, improving
           the information given when we rate limit.
      
         - Error handling fixup on elevator_init() failure from Sudip
           Mukherjee.
      
         - A fix from Tony Battersby, fixing up a memory leak in the
           scatterlist handling with scsi-mq"
      
      * 'for-linus' of git://git.kernel.dk/linux-block:
        block: Fix merge logic when CONFIG_BLK_DEV_INTEGRITY is not defined
        lib/scatterlist: fix memory leak with scsi-mq
        block: fix wrong error return in elevator_init()
        scsi: Fix error handling in SCSI_IOCTL_SEND_COMMAND
        null_blk: Cleanup error recovery in null_add_dev()
        blk-merge: recaculate segment if it isn't less than max segments
        fs: clarify rate limit suppressed buffer I/O errors
        fs: merge I/O error prints into one line
      d506aa68
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid · 7f474df0
      Linus Torvalds authored
      Pull HID fixes from Jiri Kosina:
       - workarounds for a couple of misbehaving Elan Touchscreens, by Adel
         Gadllah
       - fix for TransducerSerialNumber field implementation, by Jason Gerecke
       - a couple of new HID usages (added by HUT), by Olivier Gay
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
        HID: input: Fix TransducerSerialNumber implementation
        HID: add keyboard input assist hid usages
        HID: usbhid: enable always-poll quirk for Elan Touchscreen 016f
        HID: usbhid: enable always-poll quirk for Elan Touchscreen 009b
      7f474df0
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security · 8c782932
      Linus Torvalds authored
      Pull Integrity subsystem fix from James Morris:
       "These changes fix a bug in xattr handling, where the evm and ima
        inode_setxattr() functions do not check for empty xattrs being passed
        from userspace (leading to user-triggerable null pointer
        dereferences)"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
        evm: check xattr value length and type in evm_inode_setxattr()
        ima: check xattr value length and type in the ima_inode_setxattr()
      8c782932
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux · 19be9e8a
      Linus Torvalds authored
      Pull powerpc updates from Michael Ellerman:
       "There are some bug fixes or cleanups to facilitate fixes, a MAINTAINERS
        update, and a new syscall (bpf)"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux:
        powerpc/numa: ensure per-cpu NUMA mappings are correct on topology update
        powerpc/numa: use cached value of update->cpu in update_cpu_topology
        cxl: Fix PSL error due to duplicate segment table entries
        powerpc/mm: Use appropriate ESID mask in copro_calculate_slb()
        cxl: Refactor cxl_load_segment() and find_free_sste()
        cxl: Disable secondary hash in segment table
        Revert "powerpc/powernv: Fix endian bug in LPC bus debugfs accessors"
        powernv: Use _GLOBAL_TOC for opal wrappers
        powerpc: Wire up sys_bpf() syscall
        MAINTAINERS: nx-842 driver maintainer change
        powerpc/mm: Remove redundant #if case
        powerpc/mm: Fix build error with hugetlfs disabled
      19be9e8a
    • Jason Gerecke's avatar
      HID: input: Fix TransducerSerialNumber implementation · 5989a55a
      Jason Gerecke authored
      The commit which introduced TransducerSerialNumber (368c9664) is missing
      two crucial implementation details. Firstly, the commit does not set the
      type/code/bit/max fields as expected later down the code which can cause
      the driver to crash when a tablet with this usage is connected. Secondly,
      the call to 'set_bit' causes MSC_PULSELED to be sent instead of the
      expected MSC_SERIAL. This commit addresses both issues.
      Signed-off-by: Jason Gerecke <jason.gerecke@wacom.com>
      Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
      Reviewed-by: Ping Cheng <pingc@wacom.com>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      5989a55a
    • James Morris's avatar
      Merge branch 'for-linus' of... · 6c880ad5
      James Morris authored
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity into for-linus
      6c880ad5
    • Martin K. Petersen's avatar
      block: Fix merge logic when CONFIG_BLK_DEV_INTEGRITY is not defined · cb1a5ab6
      Martin K. Petersen authored
      Commit 4eaf99be switched to returning bool and as a result reversed
      the logic of the integrity merge checks.  However, the empty stubs used
      when the block integrity code is compiled out were still returning
      0. Make these stubs return "true".
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Reported-by: Michael L. Semon <mlsemon35@gmail.com>
      Tested-by: Michael L. Semon <mlsemon35@gmail.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      cb1a5ab6
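The stub fix in cb1a5ab6 can be illustrated with a small standalone sketch. This is not the kernel code: `toy_request`, `toy_integrity_merge_ok()` and `toy_can_merge()` are hypothetical names, and the contiguity rule is a simplification chosen for the example. It only shows why a compiled-out stub must return true once "true" means "the merge is allowed".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a block request; not the kernel's struct request. */
struct toy_request {
    int sector;
    int nr_sectors;
};

/*
 * Stub used when integrity support is compiled out. Under the bool
 * convention, true means "integrity does not object to the merge", so a
 * stub that returned 0 (false) would veto every merge.
 */
static bool toy_integrity_merge_ok(const struct toy_request *a,
                                   const struct toy_request *b)
{
    (void)a;
    (void)b;
    return true;
}

/* Two requests merge when integrity allows it and they are contiguous. */
static bool toy_can_merge(const struct toy_request *a,
                          const struct toy_request *b)
{
    assert(a != NULL && b != NULL);
    if (!toy_integrity_merge_ok(a, b))
        return false;
    return a->sector + a->nr_sectors == b->sector;
}
```

If the stub returned false instead, `toy_can_merge()` would reject even adjacent requests, which mirrors the "disabled merging for all requests" symptom described in the pull message.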
  2. 28 Oct, 2014 11 commits
    • Nishanth Aravamudan's avatar
      powerpc/numa: ensure per-cpu NUMA mappings are correct on topology update · 2c0a33f9
      Nishanth Aravamudan authored
      We received a report of a warning in kernel/sched/core.c where the sched
      group was NULL on an LPAR after a topology update. This seems to occur
      because, after the topology update has moved the CPUs, cpu_to_node still
      returns the old value, which breaks the consistency of the NUMA topology
      in the per-cpu maps. Ensure that we update the per-cpu fields when we
      re-map CPUs.
      Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      2c0a33f9
    • Nishanth Aravamudan's avatar
      powerpc/numa: use cached value of update->cpu in update_cpu_topology · 49f8d8c0
      Nishanth Aravamudan authored
      There isn't any need to keep referring to update->cpu, as we've already
      checked cpu == update->cpu at this point.
      Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      49f8d8c0
    • Linus Torvalds's avatar
      Merge branch 'for-3.18' of git://linux-nfs.org/~bfields/linux · 9f76628d
      Linus Torvalds authored
      Pull two nfsd fixes from Bruce Fields:
       "One regression from the 3.16 xdr rewrite, one an older bug exposed by
        a separate bug in the client's new SEEK code"
      
      * 'for-3.18' of git://linux-nfs.org/~bfields/linux:
        nfsd4: fix crash on unknown operation number
        nfsd4: fix response size estimation for OP_SEQUENCE
      9f76628d
    • Linus Torvalds's avatar
      Merge tag 'trace-fixes-v3.18-rc1' of... · 6234056e
      Linus Torvalds authored
      Merge tag 'trace-fixes-v3.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
      
      Pull ftrace trampoline accounting fixes from Steven Rostedt:
       "Adding the new code for 3.19, I discovered a couple of minor bugs with
        the accounting of the ftrace_ops trampoline logic.
      
        One was that the old hash was not updated before calling the modify
        code for an ftrace_ops.  The second bug was what let the first bug go
        unnoticed, as the update would check the current hash for all
        ftrace_ops (where it should only check the old hash for modified
        ones).  This let things work when only one ftrace_ops was registered
        to a function, but could break if more than one was registered
        depending on the order of the look ups.
      
        The worst thing that can happen if this bug triggers is that the
        ftrace self checks would find an anomaly and shut itself down"
      
      * tag 'trace-fixes-v3.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        ftrace: Fix checking of trampoline ftrace_ops in finding trampoline
        ftrace: Set ops->old_hash on modifying what an ops hooks to
      6234056e
    • Linus Torvalds's avatar
      Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm · 6e2028aa
      Linus Torvalds authored
      Pull ARM fixes from Russell King:
       "A couple of ARM fixes.
      
        We fix some printk formats for ptrdiff_t quantities which cause GCC
        4.9 to complain, and we also blacklist known buggy GCC 4.8.x compilers
        as their miscompilation is serious enough to cause filesystem
        corruption, even though many distros have fixed their versions"
      
      * 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
        ARM: fix some printk formats
        ARM: Blacklist GCC 4.8.0 to GCC 4.8.2 - PR58854
      6e2028aa
    • Will Deacon's avatar
      zap_pte_range: update addr when forcing flush after TLB batching faiure · ce9ec37b
      Will Deacon authored
      When unmapping a range of pages in zap_pte_range, the page being
      unmapped is added to an mmu_gather_batch structure for asynchronous
      freeing. If we run out of space in the batch structure before the range
      has been completely unmapped, then we break out of the loop, force a
      TLB flush and free the pages that we have batched so far. If there are
      further pages to unmap, then we resume the loop where we left off.
      
      Unfortunately, we forget to update addr when we break out of the loop,
      which causes us to truncate the range being invalidated as the end
      address is exclusive. When we re-enter the loop at the same address, the
      page has already been freed and the pte_present test will fail, meaning
      that we do not reconsider the address for invalidation.
      
      This patch fixes the problem by incrementing addr by PAGE_SIZE before
      breaking out of the loop on batch failure.
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ce9ec37b
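The off-by-one-page effect described in ce9ec37b can be reproduced with a toy model. The sketch below is not the kernel's zap_pte_range(): `toy_batch`, `toy_batch_add()` and `toy_zap()` are hypothetical, and the model assumes that the page is added to the batch before the "no room left" condition is reported. The return value stands for the exclusive end of the range that gets flushed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096UL

/* Hypothetical batch with a fixed number of page slots. */
struct toy_batch {
    unsigned long nr;
    unsigned long max;
};

/* Adds one page; returns false when no slots remain after the add. */
static bool toy_batch_add(struct toy_batch *b)
{
    b->nr++;
    return b->nr < b->max;
}

/*
 * Unmaps pages in [addr, end) until the batch fills up. Returns the
 * exclusive end address of the range that must be flushed.
 */
static unsigned long toy_zap(unsigned long addr, unsigned long end,
                             struct toy_batch *b, bool advance_on_full)
{
    assert(b != NULL && b->max > 0);
    for (; addr < end; addr += TOY_PAGE_SIZE) {
        /* The page at addr is unmapped and handed to the batch here. */
        if (!toy_batch_add(b)) {
            if (advance_on_full)
                addr += TOY_PAGE_SIZE; /* cover the page just unmapped */
            break;
        }
    }
    return addr;
}
```

With a two-slot batch starting at address 0, the variant without the increment reports a flush range that ends after one page even though two pages were unmapped; with the increment, the range covers both.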
    • Tony Battersby's avatar
      lib/scatterlist: fix memory leak with scsi-mq · c21e59d8
      Tony Battersby authored
      Fix a memory leak with scsi-mq triggered by commands with large data
      transfer length.
      
      Fixes: c53c6d6a ("scatterlist: allow chaining to preallocated chunks")
      Cc: <stable@vger.kernel.org> # 3.17.x
      Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
      Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      c21e59d8
    • Dmitry Kasatkin's avatar
      evm: check xattr value length and type in evm_inode_setxattr() · 3b1deef6
      Dmitry Kasatkin authored
      evm_inode_setxattr() can be called with no value. The function does not
      check the length, so the following command can be used to trigger a
      kernel oops: setfattr -n security.evm FOO. This patch fixes it.
      
      Changes in v3:
      * there is no reason to return different error codes for EVM_XATTR_HMAC
        and non EVM_XATTR_HMAC. Remove unnecessary test then.
      
      Changes in v2:
      * testing for validity of xattr type
      
      [ 1106.396921] BUG: unable to handle kernel NULL pointer dereference at           (null)
      [ 1106.398192] IP: [<ffffffff812af7b8>] evm_inode_setxattr+0x2a/0x48
      [ 1106.399244] PGD 29048067 PUD 290d7067 PMD 0
      [ 1106.399953] Oops: 0000 [#1] SMP
      [ 1106.400020] Modules linked in: bridge stp llc evdev serio_raw i2c_piix4 button fuse
      [ 1106.400020] CPU: 0 PID: 3635 Comm: setxattr Not tainted 3.16.0-kds+ #2936
      [ 1106.400020] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      [ 1106.400020] task: ffff8800291a0000 ti: ffff88002917c000 task.ti: ffff88002917c000
      [ 1106.400020] RIP: 0010:[<ffffffff812af7b8>]  [<ffffffff812af7b8>] evm_inode_setxattr+0x2a/0x48
      [ 1106.400020] RSP: 0018:ffff88002917fd50  EFLAGS: 00010246
      [ 1106.400020] RAX: 0000000000000000 RBX: ffff88002917fdf8 RCX: 0000000000000000
      [ 1106.400020] RDX: 0000000000000000 RSI: ffffffff818136d3 RDI: ffff88002917fdf8
      [ 1106.400020] RBP: ffff88002917fd68 R08: 0000000000000000 R09: 00000000003ec1df
      [ 1106.400020] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800438a0a00
      [ 1106.400020] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
      [ 1106.400020] FS:  00007f7dfa7d7740(0000) GS:ffff88005da00000(0000) knlGS:0000000000000000
      [ 1106.400020] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 1106.400020] CR2: 0000000000000000 CR3: 000000003763e000 CR4: 00000000000006f0
      [ 1106.400020] Stack:
      [ 1106.400020]  ffff8800438a0a00 ffff88002917fdf8 0000000000000000 ffff88002917fd98
      [ 1106.400020]  ffffffff812a1030 ffff8800438a0a00 ffff88002917fdf8 0000000000000000
      [ 1106.400020]  0000000000000000 ffff88002917fde0 ffffffff8116d08a ffff88002917fdc8
      [ 1106.400020] Call Trace:
      [ 1106.400020]  [<ffffffff812a1030>] security_inode_setxattr+0x5d/0x6a
      [ 1106.400020]  [<ffffffff8116d08a>] vfs_setxattr+0x6b/0x9f
      [ 1106.400020]  [<ffffffff8116d1e0>] setxattr+0x122/0x16c
      [ 1106.400020]  [<ffffffff811687e8>] ? mnt_want_write+0x21/0x45
      [ 1106.400020]  [<ffffffff8114d011>] ? __sb_start_write+0x10f/0x143
      [ 1106.400020]  [<ffffffff811687e8>] ? mnt_want_write+0x21/0x45
      [ 1106.400020]  [<ffffffff811687c0>] ? __mnt_want_write+0x48/0x4f
      [ 1106.400020]  [<ffffffff8116d3e6>] SyS_setxattr+0x6e/0xb0
      [ 1106.400020]  [<ffffffff81529da9>] system_call_fastpath+0x16/0x1b
      [ 1106.400020] Code: c3 0f 1f 44 00 00 55 48 89 e5 41 55 49 89 d5 41 54 49 89 fc 53 48 89 f3 48 c7 c6 d3 36 81 81 48 89 df e8 18 22 04 00 85 c0 75 07 <41> 80 7d 00 02 74 0d 48 89 de 4c 89 e7 e8 5a fe ff ff eb 03 83
      [ 1106.400020] RIP  [<ffffffff812af7b8>] evm_inode_setxattr+0x2a/0x48
      [ 1106.400020]  RSP <ffff88002917fd50>
      [ 1106.400020] CR2: 0000000000000000
      [ 1106.428061] ---[ end trace ae08331628ba3050 ]---
      Reported-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
      3b1deef6
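The class of bug fixed in 3b1deef6 (reading the first byte of an xattr value that may be empty) can be sketched in isolation. The code below is a userspace illustration, not the EVM implementation: `toy_check_xattr()`, the `toy_xattr_type` range, and the choice of -EINVAL for both failures are assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical type tags; the real EVM/IMA enums are not reproduced here. */
enum toy_xattr_type { TOY_TYPE_MIN = 1, TOY_TYPE_MAX = 3 };

/*
 * Validates an xattr before its first byte is interpreted as a type tag.
 * Returns 0 when the attribute is acceptable (or is not ours to check),
 * and a negative errno value otherwise.
 */
static int toy_check_xattr(const char *name, const void *value, size_t len)
{
    const unsigned char *data = value;

    assert(name != NULL);
    if (strcmp(name, "security.evm") != 0)
        return 0;               /* not the attribute this hook guards */
    if (value == NULL || len == 0)
        return -EINVAL;         /* empty value: never dereference it */
    if (data[0] < TOY_TYPE_MIN || data[0] > TOY_TYPE_MAX)
        return -EINVAL;         /* unknown type tag (assumed error code) */
    return 0;
}
```

The ordering is the point: the length check comes before any access to `data[0]`, which is the access that faults in the oops above when userspace passes no value.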
    • Dmitry Kasatkin's avatar
      ima: check xattr value length and type in the ima_inode_setxattr() · a48fda9d
      Dmitry Kasatkin authored
      ima_inode_setxattr() can be called with no value. The function does not
      check the length, so the following command can be used to trigger a
      kernel oops: setfattr -n security.ima FOO. This patch fixes it.
      
      Changes in v3:
      * for stable reverted "allow setting hash only in fix or log mode"
        It will be a separate patch.
      
      Changes in v2:
      * testing validity of xattr type
      * allow setting hash only in fix or log mode (Mimi)
      
      [  261.562522] BUG: unable to handle kernel NULL pointer dereference at           (null)
      [  261.564109] IP: [<ffffffff812af272>] ima_inode_setxattr+0x3e/0x5a
      [  261.564109] PGD 3112f067 PUD 42965067 PMD 0
      [  261.564109] Oops: 0000 [#1] SMP
      [  261.564109] Modules linked in: bridge stp llc evdev serio_raw i2c_piix4 button fuse
      [  261.564109] CPU: 0 PID: 3299 Comm: setxattr Not tainted 3.16.0-kds+ #2924
      [  261.564109] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      [  261.564109] task: ffff8800428c2430 ti: ffff880042be0000 task.ti: ffff880042be0000
      [  261.564109] RIP: 0010:[<ffffffff812af272>]  [<ffffffff812af272>] ima_inode_setxattr+0x3e/0x5a
      [  261.564109] RSP: 0018:ffff880042be3d50  EFLAGS: 00010246
      [  261.564109] RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000015
      [  261.564109] RDX: 0000001500000000 RSI: 0000000000000000 RDI: ffff8800375cc600
      [  261.564109] RBP: ffff880042be3d68 R08: 0000000000000000 R09: 00000000004d6256
      [  261.564109] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88002149ba00
      [  261.564109] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
      [  261.564109] FS:  00007f6c1e219740(0000) GS:ffff88005da00000(0000) knlGS:0000000000000000
      [  261.564109] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  261.564109] CR2: 0000000000000000 CR3: 000000003b35a000 CR4: 00000000000006f0
      [  261.564109] Stack:
      [  261.564109]  ffff88002149ba00 ffff880042be3df8 0000000000000000 ffff880042be3d98
      [  261.564109]  ffffffff812a101b ffff88002149ba00 ffff880042be3df8 0000000000000000
      [  261.564109]  0000000000000000 ffff880042be3de0 ffffffff8116d08a ffff880042be3dc8
      [  261.564109] Call Trace:
      [  261.564109]  [<ffffffff812a101b>] security_inode_setxattr+0x48/0x6a
      [  261.564109]  [<ffffffff8116d08a>] vfs_setxattr+0x6b/0x9f
      [  261.564109]  [<ffffffff8116d1e0>] setxattr+0x122/0x16c
      [  261.564109]  [<ffffffff811687e8>] ? mnt_want_write+0x21/0x45
      [  261.564109]  [<ffffffff8114d011>] ? __sb_start_write+0x10f/0x143
      [  261.564109]  [<ffffffff811687e8>] ? mnt_want_write+0x21/0x45
      [  261.564109]  [<ffffffff811687c0>] ? __mnt_want_write+0x48/0x4f
      [  261.564109]  [<ffffffff8116d3e6>] SyS_setxattr+0x6e/0xb0
      [  261.564109]  [<ffffffff81529da9>] system_call_fastpath+0x16/0x1b
      [  261.564109] Code: 48 89 f7 48 c7 c6 58 36 81 81 53 31 db e8 73 27 04 00 85 c0 75 28 bf 15 00 00 00 e8 8a a5 d9 ff 84 c0 75 05 83 cb ff eb 15 31 f6 <41> 80 7d 00 03 49 8b 7c 24 68 40 0f 94 c6 e8 e1 f9 ff ff 89 d8
      [  261.564109] RIP  [<ffffffff812af272>] ima_inode_setxattr+0x3e/0x5a
      [  261.564109]  RSP <ffff880042be3d50>
      [  261.564109] CR2: 0000000000000000
      [  261.599998] ---[ end trace 39a89a3fc267e652 ]---
      Reported-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
      a48fda9d
    • Ian Munsie's avatar
      cxl: Fix PSL error due to duplicate segment table entries · eb01d4c2
      Ian Munsie authored
      In certain circumstances the PSL (Power Service Layer, which provides
      translation services for CXL hardware) can send an interrupt for a
      segment miss that the kernel has already handled. This can happen if
      multiple translations for the same segment are queued in the PSL before
      the kernel has restarted the first translation.
      
      The CXL driver does not expect this situation and does not check whether
      a segment has already been handled. This could create a duplicate segment
      table entry, which in turn caused a PSL error that took down the card.
      
      This patch fixes the issue by checking for existing entries in the
      segment table that match the segment we are trying to insert, so as to
      avoid inserting duplicate entries.
      Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      eb01d4c2
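The "check for an existing entry before inserting" approach described in eb01d4c2 can be sketched as follows. This is a simplified model, not the cxl driver: `toy_sste` and `toy_insert_segment()` are hypothetical, the table is a flat array rather than a hashed segment table, and a full table is reported as an error instead of evicting an entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical segment table entry. */
struct toy_sste {
    uint64_t esid;
    uint64_t vsid;
    bool valid;
};

/*
 * Inserts (esid, vsid) unless an entry for esid already exists.
 * Returns 0 if inserted, 1 if the segment was already present,
 * and -1 if the table has no free slot.
 */
static int toy_insert_segment(struct toy_sste *table, size_t n,
                              uint64_t esid, uint64_t vsid)
{
    struct toy_sste *free_slot = NULL;

    assert(table != NULL);
    for (size_t i = 0; i < n; i++) {
        if (!table[i].valid) {
            if (free_slot == NULL)
                free_slot = &table[i];  /* remember the first free slot */
            continue;
        }
        if (table[i].esid == esid)
            return 1;                   /* duplicate fault: nothing to do */
    }
    if (free_slot == NULL)
        return -1;
    free_slot->esid = esid;
    free_slot->vsid = vsid;
    free_slot->valid = true;
    return 0;
}
```

A second fault for the same segment then becomes a no-op instead of producing a second entry, which is the condition the commit message identifies as fatal for the card.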
    • Ian Munsie's avatar
      powerpc/mm: Use appropriate ESID mask in copro_calculate_slb() · 03f54397
      Ian Munsie authored
      This patch makes copro_calculate_slb() mask the ESID by the correct mask
      for 1T vs 256M segments.
      
      This has no effect by itself as the extra bits were ignored, but it
      makes debugging the segment table entries easier and means that we can
      directly compare the ESID values for duplicates without needing to worry
      about masking in the comparison.
      
      This will be used to simplify a comparison in the following patch.
      Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      03f54397
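The masking described in 03f54397 can be shown with plain arithmetic. The sketch below assumes segment sizes of 256M (2^28 bytes) and 1T (2^40 bytes) and derives the mask from the segment shift; `toy_esid()` and the `TOY_` constants are hypothetical names, not the kernel's macros.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_SEG_SHIFT_256M 28u  /* 256M = 2^28 bytes */
#define TOY_SEG_SHIFT_1T   40u  /* 1T   = 2^40 bytes */

/*
 * Clears the offset-within-segment bits of an effective address, leaving
 * only the bits that identify the segment for the given segment size.
 */
static uint64_t toy_esid(uint64_t ea, unsigned int seg_shift)
{
    assert(seg_shift < 64);     /* shifting by 64 or more is undefined */
    return ea & ~((UINT64_C(1) << seg_shift) - 1);
}
```

Once both values are masked with the size-appropriate mask, two addresses in the same segment yield identical results, so a duplicate check can compare them directly without masking again at the comparison site.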