1. 05 Jul, 2014 1 commit
  2. 04 Jul, 2014 15 commits
  3. 03 Jul, 2014 24 commits
    • lz4: add overrun checks to lz4_uncompress_unknownoutputsize() · 4a3a9904
      Greg Kroah-Hartman authored
      Jan points out that I forgot to make the needed fixes to the
      lz4_uncompress_unknownoutputsize() function to mirror the changes done
      in lz4_decompress() with regard to potential pointer overflows.
      
      The only in-kernel user of this function is the zram code, which only
      takes data from a valid compressed buffer that it made itself, so it's
      not a big issue.  But due to external kernel modules using this
      function, it's better to be safe here.
      Reported-by: Jan Beulich <JBeulich@suse.com>
      Cc: "Don A. Bailey" <donb@securitymouse.com>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
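The lz4 commit above is about guarding a decompressor against pointer overflow. The patch itself is not shown here, so the following is only a minimal userspace sketch of one common way to write such a bounds check; the function name `span_fits` is hypothetical and is not part of the kernel's lz4 code.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Returns true when `len` bytes starting at `p` fit before `end`.
 * Comparing `len` against the remaining space avoids computing
 * `p + len`, which could wrap around for a huge, untrusted `len`.
 * Precondition: p <= end and both point into the same buffer.
 */
static bool span_fits(const unsigned char *p, const unsigned char *end,
		      size_t len)
{
	return len <= (size_t)(end - p);
}
```

A decoder would call a check like this before every copy whose length comes from the compressed stream, and bail out with an error when it returns false.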
    • Merge branch 'akpm' (patches from Andrew Morton) · 5170a3b2
      Linus Torvalds authored
      Merge fixes from Andrew Morton:
       "14 fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        shmem: fix init_page_accessed use to stop !PageLRU bug
        kernel/printk/printk.c: revert "printk: enable interrupts before calling console_trylock_for_printk()"
        tools/testing/selftests/ipc/msgque.c: improve error handling when not running as root
        fs/seq_file: fallback to vmalloc allocation
        /proc/stat: convert to single_open_size()
        hwpoison: fix the handling path of the victimized page frame that belong to non-LRU
        mm:vmscan: update the trace-vmscan-postprocess.pl for event vmscan/mm_vmscan_lru_isolate
        msync: fix incorrect fstart calculation
        zram: revalidate disk after capacity change
        tools: memory-hotplug fix unexpected operator error
        tools: cpu-hotplug fix unexpected operator error
        autofs4: fix false positive compile error
        slub: fix off by one in number of slab tests
        mm: page_alloc: fix CMA area initialisation when pageblock > MAX_ORDER
    • shmem: fix init_page_accessed use to stop !PageLRU bug · 66d2f4d2
      Hugh Dickins authored
      Under shmem swapping load, I sometimes hit the VM_BUG_ON_PAGE(!PageLRU)
      in isolate_lru_pages() at mm/vmscan.c:1281!
      
      Commit 2457aec6 ("mm: non-atomically mark page accessed during page
      cache allocation where possible") looks like interrupted work-in-progress.
      
      mm/filemap.c's call to init_page_accessed() is fine, but not mm/shmem.c's
      - shmem_write_begin() is clearly wrong to use it after shmem_getpage(),
      when the page is always visible in radix_tree, and often already on LRU.
      
      Revert change to shmem_write_begin(), and use init_page_accessed() or
      mark_page_accessed() appropriately for SGP_WRITE in shmem_getpage_gfp().
      
      SGP_WRITE also covers shmem_symlink(), which did not mark_page_accessed()
      before; but since many other filesystems use [__]page_symlink(), which did
      and does mark the page accessed, consider this as rectifying an oversight.
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Acked-by: Mel Gorman <mgorman@suse.de>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Prabhakar Lad <prabhakar.csengg@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kernel/printk/printk.c: revert "printk: enable interrupts before calling console_trylock_for_printk()" · d18bbc21
      Andrew Morton authored
      Revert commit 939f04be ("printk: enable interrupts before calling
      console_trylock_for_printk()").
      
      Andreas reported:
      
      : None of the post 3.15 kernel boot for me. They all hang at the GRUB
      : screen telling me it loaded and started the kernel, but the kernel
      : itself stops before it prints anything (or even replaces the GRUB
      : background graphics).
      
      939f04be is a modest latency reduction.  Revert it until we understand
      the reason for these failures.
      Reported-by: Andreas Bombe <aeb@debian.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • tools/testing/selftests/ipc/msgque.c: improve error handling when not running as root · e84f1ab3
      Shuah Khan authored
      When the test is not run as root, it fails partway through while accessing
      /proc/sys/kernel/msg_next_id.  Change it to check for root at the
      beginning of the test and exit if not root.
      Signed-off-by: Shuah Khan <shuah.kh@samsung.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Davidlohr Bueso <davidlohr@hp.com>
      Cc: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
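The msgque commit above moves the root check to the start of the test. The actual change is not shown here; a minimal sketch of such an early check, with the hypothetical helper name `require_root`, could look like this:

```c
#include <stdio.h>
#include <sys/types.h>

/*
 * Returns 0 when the given effective UID is root, otherwise prints a
 * message and returns -1 so the caller can exit before doing any work.
 */
static int require_root(uid_t euid)
{
	if (euid != 0) {
		fprintf(stderr, "this test must be run as root\n");
		return -1;
	}
	return 0;
}
```

A test program would call `require_root(geteuid())` (with `geteuid()` from `<unistd.h>`) as the first statement of `main()` and exit if it returns nonzero.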
    • fs/seq_file: fallback to vmalloc allocation · 058504ed
      Heiko Carstens authored
      There are a couple of seq_files which use the single_open() interface.
      This interface requires that the whole output must fit into a single
      buffer.
      
      For example, allocation failures have been observed for /proc/stat because
      an order-4 memory allocation failed due to memory fragmentation.  In such
      situations reading /proc/stat is no longer possible.
      
      Therefore change the seq_file code to fall back to vmalloc allocations,
      which will usually result in a couple of order-0 allocations and hence
      also work if memory is fragmented.
      
      For reference, a call trace where reading from /proc/stat failed:
      
        sadc: page allocation failure: order:4, mode:0x1040d0
        CPU: 1 PID: 192063 Comm: sadc Not tainted 3.10.0-123.el7.s390x #1
        [...]
        Call Trace:
          show_stack+0x6c/0xe8
          warn_alloc_failed+0xd6/0x138
          __alloc_pages_nodemask+0x9da/0xb68
          __get_free_pages+0x2e/0x58
          kmalloc_order_trace+0x44/0xc0
          stat_open+0x5a/0xd8
          proc_reg_open+0x8a/0x140
          do_dentry_open+0x1bc/0x2c8
          finish_open+0x46/0x60
          do_last+0x382/0x10d0
          path_openat+0xc8/0x4f8
          do_filp_open+0x46/0xa8
          do_sys_open+0x114/0x1f0
          sysc_tracego+0x14/0x1a
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Tested-by: David Rientjes <rientjes@google.com>
      Cc: Ian Kent <raven@themaw.net>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Thorsten Diehl <thorsten.diehl@de.ibm.com>
      Cc: Andrea Righi <andrea@betterlinux.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Stefan Bader <stefan.bader@canonical.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
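The seq_file commit above describes a "try the contiguous allocator first, then fall back" pattern. Kernel allocators cannot be called from userspace, so the sketch below only mimics the control flow with hypothetical stand-ins: `contiguous_alloc` plays the role of an allocator that fails for large requests under fragmentation, and `paged_alloc` plays the role of one built from single pages. The 4096-byte limit is an assumption chosen for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

typedef void *(*alloc_fn)(size_t size);

/* Hypothetical stand-in: fails for anything larger than one page. */
static void *contiguous_alloc(size_t size)
{
	const size_t limit = 4096; /* assumed page size, for illustration */

	return size <= limit ? malloc(size) : NULL;
}

/* Hypothetical stand-in: succeeds whenever memory is available. */
static void *paged_alloc(size_t size)
{
	return malloc(size);
}

/* Try the primary allocator and use the fallback only if it fails. */
static void *alloc_with_fallback(size_t size, alloc_fn primary,
				 alloc_fn fallback)
{
	void *buf = primary(size);

	if (!buf)
		buf = fallback(size);
	return buf;
}
```

In the kernel the two kinds of memory are released by different functions, so real code also has to pick the matching free routine; the sketch sidesteps that by using `malloc()` for both.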
    • /proc/stat: convert to single_open_size() · f74373a5
      Heiko Carstens authored
      These two patches are supposed to "fix" failed order-4 memory
      allocations which have been observed when reading /proc/stat.  The
      problem has been observed on s390 as well as on x86.
      
      To address the problem, change the seq_file memory allocations to
      fall back to vmalloc, so that allocations also work if memory is
      fragmented.
      
      This approach seems to be simpler and less intrusive than changing
      /proc/stat to use an iterator.  It also "fixes" other users
      that use seq_file's single_open() interface.
      
      This patch (of 2):
      
      Use seq_file's single_open_size() to preallocate a buffer that is large
      enough to hold the whole output, instead of open coding it.  Also
      calculate the requested size using the number of online cpus instead of
      possible cpus, since the size of the output only depends on the number
      of online cpus.
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Cc: Ian Kent <raven@themaw.net>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Thorsten Diehl <thorsten.diehl@de.ibm.com>
      Cc: Andrea Righi <andrea@betterlinux.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Stefan Bader <stefan.bader@canonical.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • hwpoison: fix the handling path of the victimized page frame that belong to non-LRU · 0bc1f8b0
      Chen Yucong authored
      Until now, the kernel has applied the same policy to victimized page
      frames that belong to kernel space (reserved/slab subsystem) and to
      non-LRU pages (unknown page state).  In other words, the result of
      handling either kind of victimized page frame is (IGNORED | FAILED),
      and the return value of memory_failure() is -EBUSY.
      
      This patch prevents memory_failure() from returning early merely
      because (!PageLRU(p)) is true, and it also ensures that
      action_result() can report more precise information ("reserved kernel",
      "kernel slab", and "unknown page state") instead of "non LRU",
      especially for memory errors detected by memory scrubbing.
      
      Andi said:
      
      : While running the mcelog test suite on 3.14 I hit the following VM_BUG_ON:
      :
      : soft_offline: 0x56d4: unknown non LRU page type 3ffff800008000
      : page:ffffea000015b400 count:3 mapcount:2097169 mapping:          (null) index:0xffff8800056d7000
      : page flags: 0x3ffff800004081(locked|slab|head)
      : ------------[ cut here ]------------
      : kernel BUG at mm/rmap.c:1495!
      :
      : I think what happened is that a LRU page turned into a slab page in
      : parallel with offlining.  memory_failure initially tests for this case,
      : but doesn't retest later after the page has been locked.
      :
      : ...
      :
      : I ran this patch in a loop over night with some stress plus
      : the mcelog test suite running in a loop. I cannot guarantee it hit it,
      : but it should have given it a good beating.
      :
      : The kernel survived with no messages, although the mcelog test suite
      : got killed at some point because it couldn't fork anymore. Probably
      : some unrelated problem.
      :
      : So the patch is ok for me for .16.
      Signed-off-by: Chen Yucong <slaoub@gmail.com>
      Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Reported-by: Andi Kleen <andi@firstfloor.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm:vmscan: update the trace-vmscan-postprocess.pl for event vmscan/mm_vmscan_lru_isolate · b27ebf77
      Chen Yucong authored
      When trace-vmscan-postprocess.pl is used to check the file/anon
      scanning rate, the script fails to run and reports the following
      message:
      
        WARNING: Format not as expected for event vmscan/mm_vmscan_lru_isolate
        'file' != 'contig_taken' Fewer fields than expected in format at
        ./trace-vmscan-postprocess.pl line 171, <FORMAT> line 76.
      
      In trace-vmscan-postprocess.pl, contig_taken, contig_dirty, and
      contig_failed are associated with nr_lumpy_taken, nr_lumpy_dirty, and
      nr_lumpy_failed, respectively, for lumpy reclaim.  Mel already removed
      lumpy reclaim in commit c53919ad ("mm: vmscan: remove lumpy reclaim"),
      but the corresponding update to trace-vmscan-postprocess.pl was
      missed.
      Signed-off-by: Chen Yucong <slaoub@gmail.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • msync: fix incorrect fstart calculation · 496a8e68
      Namjae Jeon authored
      Fix a regression caused by 7fc34a62 ("mm/msync.c: sync only the
      requested range in msync()").
      
      xfstests generic/075 failed on ext4 in data=journal mode because the
      intended range was not synced due to a wrong fstart calculation.
      Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
      Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
      Reported-by: Eric Whitney <enwlinux@gmail.com>
      Tested-by: Eric Whitney <enwlinux@gmail.com>
      Acked-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
      Reviewed-by: Lukas Czerner <lczerner@redhat.com>
      Tested-by: Lukas Czerner <lczerner@redhat.com>
      Reviewed-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
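The msync commit above does not show the arithmetic involved. As an illustration only, the sketch below assumes the usual relation between a virtual address inside a file mapping and the file offset that backs it: the distance from the start of the mapping plus the mapping's starting file page shifted by the page size. The function and parameter names are hypothetical and only loosely modelled on `vm_start` and `vm_pgoff`.

```c
#include <stdint.h>

/*
 * File offset backing virtual address `addr`, for a mapping that starts
 * at `vm_start` and whose first page corresponds to file page `pgoff`.
 * Precondition: addr >= vm_start.
 */
static uint64_t file_offset_of(uint64_t addr, uint64_t vm_start,
			       uint64_t pgoff, unsigned int page_shift)
{
	return (addr - vm_start) + (pgoff << page_shift);
}
```

With 4 KiB pages, an address 0x2000 bytes into a mapping that starts at file page 3 corresponds to file offset 0x5000, so a range sync starting there has to begin at that offset rather than at the raw virtual address.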
    • zram: revalidate disk after capacity change · 2e32baea
      Minchan Kim authored
      Alexander reported that mkswap on /dev/zram0 fails if another process
      has the block device file open.
      
      The steps are as follows:
      
      0. Reset the unused zram device.
      1. Use a program that opens /dev/zram0 with O_RDWR and sleeps
         until killed.
      2. While that program sleeps, echo the correct value to
         /sys/block/zram0/disksize.
      3. Verify (e.g. in /proc/partitions) that the disk size is applied
         correctly. It is.
      4. While that program still sleeps, attempt to mkswap /dev/zram0.
         This fails: mkswap: error: swap area needs to be at least 40 KiB
      
      When I investigated, the size that mkswap obtained via
      ioctl(fd, BLKGETSIZE64, xxx) for the block device was zero, although
      zram0 had the right size after step 2.
      
      The reason is that zram didn't revalidate the disk after changing its
      capacity, so the size in the blockdev's inode is not up to date until
      every open file on the device is closed.
      
      This patch should fix the bug.
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Reported-by: Alexander E. Patrakov <patrakov@gmail.com>
      Tested-by: Alexander E. Patrakov <patrakov@gmail.com>
      Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Acked-by: Jerome Marchand <jmarchan@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • tools: memory-hotplug fix unexpected operator error · e98f7762
      Shuah Khan authored
      on-off-test uses "$UID != 0" to test for root, but $UID is a construct
      specific to bash.  Using /bin/sh that isn't bash results in the
      following error (due to the "$UID" part expanding to nothing):
      
        ./on-off-test.sh: 9: [: !=: unexpected operator
      
      Change Makefile to use bash instead.
      Signed-off-by: Shuah Khan <shuah.kh@samsung.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • tools: cpu-hotplug fix unexpected operator error · 1bd702e6
      Shuah Khan authored
      on-off-test uses "$UID != 0" to test for root, but $UID is a construct
      specific to bash.  Using /bin/sh that isn't bash results in the
      following error (due to the "$UID" part expanding to nothing):
      
        ./on-off-test.sh: 9: [: !=: unexpected operator
      
      Change Makefile to use bash instead.
      Signed-off-by: Shuah Khan <shuah.kh@samsung.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • autofs4: fix false positive compile error · 571ff473
      Ian Kent authored
      On strict build environments we can see:
      
        fs/autofs4/inode.c: In function 'autofs4_fill_super':
        fs/autofs4/inode.c:312: error: 'pgrp' may be used uninitialized in this function
        make[2]: *** [fs/autofs4/inode.o] Error 1
        make[1]: *** [fs/autofs4] Error 2
        make: *** [fs] Error 2
        make: *** Waiting for unfinished jobs....
      
      This is because pgrp_set is used to indicate that pgrp has been set,
      rather than pgrp itself being initialized.
      Signed-off-by: Ian Kent <raven@themaw.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
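The autofs4 commit above concerns a value guarded by a separate "was it set" flag, which some compilers cannot prove is initialised before use. The actual patch is not shown here; the sketch below assumes the fix is simply to give the variable an initial value. The option parser `parse_pgrp` and the wrapper `effective_pgrp` are hypothetical and are not the autofs4 code.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parser: stores the value of "pgrp=<n>" if it is present. */
static bool parse_pgrp(const char *opts, int *pgrp)
{
	const char *p = strstr(opts, "pgrp=");

	if (!p)
		return false;
	*pgrp = atoi(p + 5);
	return true;
}

/* Returns the parsed process group, or `fallback` when none was given. */
static int effective_pgrp(const char *opts, int fallback)
{
	int pgrp = 0; /* initialised, so no "may be used uninitialized" */
	bool pgrp_set = parse_pgrp(opts, &pgrp);

	return pgrp_set ? pgrp : fallback;
}
```

The flag still decides which value is used; the initialiser only removes the path on which the compiler believes an indeterminate value could be read.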
    • slub: fix off by one in number of slab tests · 8a5b20ae
      Joonsoo Kim authored
      min_partial is the minimum number of slabs cached on the node partial
      list.  So, if nr_partial is less than min_partial, we keep a newly empty
      slab on the node partial list rather than freeing it.  But if nr_partial
      is equal to or greater than min_partial, we have enough partial slabs
      and should free the newly empty slab.  The current implementation misses
      the equal case, so if min_partial is set to 0, at least one slab can
      still be cached.  This is a critical problem for the kmemcg destroying
      logic, because it doesn't work properly if any slabs are cached.  This
      patch fixes the problem.
      
      Fixes 91cb69620284 ("slub: make dead memcg caches discard free slabs
      immediately").
      Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Acked-by: Vladimir Davydov <vdavydov@parallels.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Acked-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
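The rule described in the slub commit above comes down to a single comparison. A minimal sketch of the corrected condition follows, using the hypothetical helper name `should_free_empty_slab`; the real check lives inside the slub code and is not shown here.

```c
#include <stdbool.h>

/*
 * A newly empty slab is freed once the node already holds at least
 * min_partial partial slabs. Using `>` instead of `>=` would keep one
 * extra slab cached, which is the off-by-one the commit describes.
 */
static bool should_free_empty_slab(unsigned long nr_partial,
				   unsigned long min_partial)
{
	return nr_partial >= min_partial;
}
```

With `min_partial` set to 0 the function returns true even when `nr_partial` is 0, so no empty slab stays cached.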
    • mm: page_alloc: fix CMA area initialisation when pageblock > MAX_ORDER · dc78327c
      Michal Nazarewicz authored
      With a kernel configured with ARM64_64K_PAGES && !TRANSPARENT_HUGEPAGE,
      the following is triggered at early boot:
      
        SMP: Total of 8 processors activated.
        devtmpfs: initialized
        Unable to handle kernel NULL pointer dereference at virtual address 00000008
        pgd = fffffe0000050000
        [00000008] *pgd=00000043fba00003, *pmd=00000043fba00003, *pte=00e0000078010407
        Internal error: Oops: 96000006 [#1] SMP
        Modules linked in:
        CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.15.0-rc864k+ #44
        task: fffffe03bc040000 ti: fffffe03bc080000 task.ti: fffffe03bc080000
        PC is at __list_add+0x10/0xd4
        LR is at free_one_page+0x270/0x638
        ...
        Call trace:
          __list_add+0x10/0xd4
          free_one_page+0x26c/0x638
          __free_pages_ok.part.52+0x84/0xbc
          __free_pages+0x74/0xbc
          init_cma_reserved_pageblock+0xe8/0x104
          cma_init_reserved_areas+0x190/0x1e4
          do_one_initcall+0xc4/0x154
          kernel_init_freeable+0x204/0x2a8
          kernel_init+0xc/0xd4
      
      This happens because init_cma_reserved_pageblock() calls
      __free_one_page() with pageblock_order as page order but it is bigger
      than MAX_ORDER.  This in turn causes accesses past zone->free_list[].
      
      Fix the problem by changing init_cma_reserved_pageblock() such that it
      splits pageblock into individual MAX_ORDER pages if pageblock is bigger
      than a MAX_ORDER page.
      
      In cases where !CONFIG_HUGETLB_PAGE_SIZE_VARIABLE, which is all
      architectures except for ia64, powerpc and tile at the moment, the
      “pageblock_order > MAX_ORDER” condition will be optimised out since both
      sides of the operator are constants.  In cases where pageblock size is
      variable, the performance degradation should not be significant anyway
      since init_cma_reserved_pageblock() is called only at boot time at most
      MAX_CMA_AREAS times which by default is eight.
      Signed-off-by: Michal Nazarewicz <mina86@mina86.com>
      Reported-by: Mark Salter <msalter@redhat.com>
      Tested-by: Mark Salter <msalter@redhat.com>
      Tested-by: Christopher Covington <cov@codeaurora.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: <stable@vger.kernel.org>	[3.5+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
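The CMA commit above splits an oversized pageblock into pieces that the free lists can hold. The sketch below illustrates that splitting in userspace under stated assumptions: `max_free_order` stands for the largest order the allocator accepts, and `free_chunk_fn`, `free_pageblock`, and `count_pages` are hypothetical names rather than kernel functions.

```c
#include <stddef.h>

/* Hypothetical callback that releases 2^order pages at first_page. */
typedef void (*free_chunk_fn)(size_t first_page, unsigned int order,
			      void *ctx);

/*
 * Frees a pageblock of 2^pageblock_order pages in chunks no larger than
 * 2^max_free_order pages. Returns the number of chunks released.
 */
static size_t free_pageblock(size_t first_page, unsigned int pageblock_order,
			     unsigned int max_free_order,
			     free_chunk_fn free_chunk, void *ctx)
{
	size_t nr_pages = (size_t)1 << pageblock_order;
	unsigned int order = pageblock_order;
	size_t chunk, i, calls = 0;

	if (order > max_free_order)
		order = max_free_order;
	chunk = (size_t)1 << order;

	for (i = 0; i < nr_pages; i += chunk) {
		free_chunk(first_page + i, order, ctx);
		calls++;
	}
	return calls;
}

/* Example callback: tallies how many pages were handed back. */
static void count_pages(size_t first_page, unsigned int order, void *ctx)
{
	(void)first_page;
	*(size_t *)ctx += (size_t)1 << order;
}
```

When the pageblock is no larger than the maximum order, the loop runs once and behaves like a single free of the whole block, which matches the commit's point that the extra work only matters in the oversized case.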
    • Btrfs: fix crash when starting transaction · abdd2e80
      Filipe Manana authored
      Often when starting a transaction we commit the currently running transaction,
      which can end up writing block group caches when the current process has its
      journal_info set to NULL (and not to a transaction). This makes our assertion
      at btrfs_check_data_free_space() (current_journal != NULL) fail, resulting
      in a crash/hang. Therefore fix it by setting journal_info.
      
      Two different traces of this issue follow below.
      
      1)
      
          [51502.241936] BTRFS: assertion failed: current->journal_info, file: fs/btrfs/extent-tree.c, line: 3670
          [51502.242213] ------------[ cut here ]------------
          [51502.242493] kernel BUG at fs/btrfs/ctree.h:3964!
          [51502.242669] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
          (...)
          [51502.244010] Call Trace:
          [51502.244010]  [<ffffffffa02bc025>] btrfs_check_data_free_space+0x395/0x3a0 [btrfs]
          [51502.244010]  [<ffffffffa02c3bdc>] btrfs_write_dirty_block_groups+0x4ac/0x640 [btrfs]
          [51502.244010]  [<ffffffffa0357a6a>] commit_cowonly_roots+0x164/0x226 [btrfs]
          [51502.244010]  [<ffffffffa02d53cd>] btrfs_commit_transaction+0x4ed/0xab0 [btrfs]
          [51502.244010]  [<ffffffff8168ec7b>] ? _raw_spin_unlock+0x2b/0x40
          [51502.244010]  [<ffffffffa02d6259>] start_transaction+0x459/0x620 [btrfs]
          [51502.244010]  [<ffffffffa02d67ab>] btrfs_start_transaction+0x1b/0x20 [btrfs]
          [51502.244010]  [<ffffffffa02d73e1>] __unlink_start_trans+0x31/0xe0 [btrfs]
          [51502.244010]  [<ffffffffa02dea67>] btrfs_unlink+0x37/0xc0 [btrfs]
          [51502.244010]  [<ffffffff811bb054>] ? do_unlinkat+0x114/0x2a0
          [51502.244010]  [<ffffffff811baebc>] vfs_unlink+0xcc/0x150
          [51502.244010]  [<ffffffff811bb1a0>] do_unlinkat+0x260/0x2a0
          [51502.244010]  [<ffffffff811a9ef4>] ? filp_close+0x64/0x90
          [51502.244010]  [<ffffffff810aaea6>] ? trace_hardirqs_on_caller+0x16/0x1e0
          [51502.244010]  [<ffffffff81349cab>] ? trace_hardirqs_on_thunk+0x3a/0x3f
          [51502.244010]  [<ffffffff811be9eb>] SyS_unlinkat+0x1b/0x40
          [51502.244010]  [<ffffffff81698452>] system_call_fastpath+0x16/0x1b
          [51502.244010] Code: 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 89 f1 48 c7 c2 71 13 36 a0 48 89 fe 31 c0 48 c7 c7 b8 43 36 a0 48 89 e5 e8 5d b0 32 e1 <0f> 0b 0f 1f 44 00 00 55 b9 11 00 00 00 48 89 e5 41 55 49 89 f5
          [51502.244010] RIP  [<ffffffffa03575da>] assfail.constprop.88+0x1e/0x20 [btrfs]
      
      2)
      
          [25405.097230] BTRFS: assertion failed: current->journal_info, file: fs/btrfs/extent-tree.c, line: 3670
          [25405.097488] ------------[ cut here ]------------
          [25405.097767] kernel BUG at fs/btrfs/ctree.h:3964!
          [25405.097940] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
          (...)
          [25405.100008] Call Trace:
          [25405.100008]  [<ffffffffa02bc025>] btrfs_check_data_free_space+0x395/0x3a0 [btrfs]
          [25405.100008]  [<ffffffffa02c3bdc>] btrfs_write_dirty_block_groups+0x4ac/0x640 [btrfs]
          [25405.100008]  [<ffffffffa035755a>] commit_cowonly_roots+0x164/0x226 [btrfs]
          [25405.100008]  [<ffffffffa02d53cd>] btrfs_commit_transaction+0x4ed/0xab0 [btrfs]
          [25405.100008]  [<ffffffff8109c170>] ? bit_waitqueue+0xc0/0xc0
          [25405.100008]  [<ffffffffa02d6259>] start_transaction+0x459/0x620 [btrfs]
          [25405.100008]  [<ffffffffa02d67ab>] btrfs_start_transaction+0x1b/0x20 [btrfs]
          [25405.100008]  [<ffffffffa02e3407>] btrfs_create+0x47/0x210 [btrfs]
          [25405.100008]  [<ffffffffa02d74cc>] ? btrfs_permission+0x3c/0x80 [btrfs]
          [25405.100008]  [<ffffffff811bc63b>] vfs_create+0x9b/0x130
          [25405.100008]  [<ffffffff811bcf19>] do_last+0x849/0xe20
          [25405.100008]  [<ffffffff811b9409>] ? link_path_walk+0x79/0x820
          [25405.100008]  [<ffffffff811bd5b5>] path_openat+0xc5/0x690
          [25405.100008]  [<ffffffff810ab07d>] ? trace_hardirqs_on+0xd/0x10
          [25405.100008]  [<ffffffff811cdcd2>] ? __alloc_fd+0x32/0x1d0
          [25405.100008]  [<ffffffff811be2a3>] do_filp_open+0x43/0xa0
          [25405.100008]  [<ffffffff811cddf1>] ? __alloc_fd+0x151/0x1d0
          [25405.100008]  [<ffffffff811abcfc>] do_sys_open+0x13c/0x230
          [25405.100008]  [<ffffffff810aaea6>] ? trace_hardirqs_on_caller+0x16/0x1e0
          [25405.100008]  [<ffffffff811abe12>] SyS_open+0x22/0x30
          [25405.100008]  [<ffffffff81698452>] system_call_fastpath+0x16/0x1b
          [25405.100008] Code: 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 89 f1 48 c7 c2 51 13 36 a0 48 89 fe 31 c0 48 c7 c7 d0 43 36 a0 48 89 e5 e8 6d b5 32 e1 <0f> 0b 0f 1f 44 00 00 55 b9 11 00 00 00 48 89 e5 41 55 49 89 f5
          [25405.100008] RIP  [<ffffffffa03570ca>] assfail.constprop.88+0x1e/0x20 [btrfs]
      Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: Chris Mason <clm@fb.com>
    • Btrfs: fix btrfs_print_leaf for skinny metadata · be2c765d
      Josef Bacik authored
      We wouldn't actually print the extent information if we had a skinny metadata
      item; this fixes that.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Chris Mason <clm@fb.com>
    • Btrfs: fix race of using total_bytes_pinned · d288db5d
      Liu Bo authored
      The percpu counter @total_bytes_pinned was introduced to skip unnecessary
      'commit transaction' operations; it accounts for space that we may free
      but that is stuck in delayed refs.
      
      And we zero out @space_info->total_bytes_pinned every transaction period so
      we have a better idea of how much space we'll actually free up by committing
      this transaction.  However, we do the 'zero out' part a little earlier, before
      we actually unpin space, so we end up returning ENOSPC when we actually have
      free space that's just unpinned from committing transaction.
      
      xfstests/generic/074 complained then.
      
      This fixes it by actually accounting the percpu pinned number when 'unpin',
      and since it's protected by space_info->lock, the race is gone now.
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Reviewed-by: Miao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: Chris Mason <clm@fb.com>
    • btrfs: use E2BIG instead of EIO if compression does not help · 130d5b41
      David Sterba authored
      Return codes got updated in 60e1975a
      (btrfs: return errno instead of -1 from compression).
      The lzo wrapper returns E2BIG in this case; do the same for zlib.
      Signed-off-by: David Sterba <dsterba@suse.cz>
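The E2BIG commit above distinguishes "compression did not help" from a genuine I/O error. The exact condition used by the btrfs wrappers is not shown here; the sketch below assumes "does not help" means the output is not smaller than the input, and the helper name `compression_result` is hypothetical.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Returns 0 when the compressed output is smaller than the input and
 * -E2BIG otherwise, so callers can tell "not worth compressing" apart
 * from a real failure such as -EIO.
 */
static int compression_result(size_t in_len, size_t out_len)
{
	if (out_len >= in_len)
		return -E2BIG;
	return 0;
}
```

A caller that sees `-E2BIG` can store the data uncompressed instead of treating the write as failed.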
    • btrfs: remove stale comment from btrfs_flush_all_pending_stuffs · 0a4eaea8
      David Sterba authored
      Commit fcebe456 (Btrfs: rework qgroup
      accounting) removed the qgroup accounting after delayed refs.
      Signed-off-by: David Sterba <dsterba@suse.cz>
    • Btrfs: fix use-after-free when cloning a trailing file hole · 14f59796
      Filipe Manana authored
      The transaction handle was being used after being freed.
      
      Cc: Chris Mason <clm@fb.com>
      Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: Chris Mason <clm@fb.com>
    • btrfs: fix null pointer dereference in btrfs_show_devname when name is null · 0aeb8a6e
      Anand Jain authored
      dev->name is null but missing flag is not set.
      Strictly speaking the missing flag should have been set, but there
      are more places where code just checks if name is null. For now this
      patch does the same.
      
      stack:
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000064
      IP: [<ffffffffa0228908>] btrfs_show_devname+0x58/0xf0 [btrfs]
      
      [<ffffffff81198879>] show_vfsmnt+0x39/0x130
      [<ffffffff81178056>] m_show+0x16/0x20
      [<ffffffff8117d706>] seq_read+0x296/0x390
      [<ffffffff8115aa7d>] vfs_read+0x9d/0x160
      [<ffffffff8115b549>] SyS_read+0x49/0x90
      [<ffffffff817abe52>] system_call_fastpath+0x16/0x1b
      
      reproducer:
      mkfs.btrfs -draid1 -mraid1 /dev/sdg1 /dev/sdg2
      btrfstune -S 1 /dev/sdg1
      modprobe -r btrfs && modprobe btrfs
      mount -o degraded /dev/sdg1 /btrfs
      btrfs dev add /dev/sdg3 /btrfs
      Signed-off-by: Anand Jain <Anand.Jain@oracle.com>
      Signed-off-by: Chris Mason <clm@fb.com>
    • btrfs: fix null pointer dereference in clone_fs_devices when name is null · e755f780
      Anand Jain authored
      When one of the device paths is missing, the btrfs_device name is null, so this
      patch checks for that.
      
      stack:
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
      IP: [<ffffffff812e18c0>] strlen+0x0/0x30
      [<ffffffffa01cd92a>] ? clone_fs_devices+0xaa/0x160 [btrfs]
      [<ffffffffa01cdcf7>] btrfs_init_new_device+0x317/0xca0 [btrfs]
      [<ffffffff81155bca>] ? __kmalloc_track_caller+0x15a/0x1a0
      [<ffffffffa01d6473>] btrfs_ioctl+0xaa3/0x2860 [btrfs]
      [<ffffffff81132a6c>] ? handle_mm_fault+0x48c/0x9c0
      [<ffffffff81192a61>] ? __blkdev_put+0x171/0x180
      [<ffffffff817a784c>] ? __do_page_fault+0x4ac/0x590
      [<ffffffff81193426>] ? blkdev_put+0x106/0x110
      [<ffffffff81179175>] ? mntput+0x35/0x40
      [<ffffffff8116d4b0>] do_vfs_ioctl+0x460/0x4a0
      [<ffffffff8115c72e>] ? ____fput+0xe/0x10
      [<ffffffff81068033>] ? task_work_run+0xb3/0xd0
      [<ffffffff8116d547>] SyS_ioctl+0x57/0x90
      [<ffffffff817a793e>] ? do_page_fault+0xe/0x10
      [<ffffffff817abe52>] system_call_fastpath+0x16/0x1b
      
      reproducer:
      mkfs.btrfs -draid1 -mraid1 /dev/sdg1 /dev/sdg2
      btrfstune -S 1 /dev/sdg1
      modprobe -r btrfs && modprobe btrfs
      mount -o degraded /dev/sdg1 /btrfs
      btrfs dev add /dev/sdg3 /btrfs
      Signed-off-by: Anand Jain <Anand.Jain@oracle.com>
      Signed-off-by: Chris Mason <clm@fb.com>
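Both btrfs null-name commits above guard against a device whose name pointer is null, and the second trace shows the crash inside strlen(). The patches themselves are not shown here; a minimal userspace sketch of the same guard, with the hypothetical helper `clone_name`, could look like this:

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Duplicates `name`, or returns NULL when there is no name to copy.
 * Checking first avoids calling strlen() on a null pointer.
 * The caller owns the returned buffer and must free() it.
 */
static char *clone_name(const char *name)
{
	size_t len;
	char *copy;

	if (!name)
		return NULL;

	len = strlen(name) + 1;
	copy = malloc(len);
	if (copy)
		memcpy(copy, name, len);
	return copy;
}
```

Callers then have to treat a NULL result for a NULL input as "device has no name" rather than as an allocation failure, which is why the check is made on the input before any copying starts.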