1. 01 Jun, 2015 2 commits
  2. 27 May, 2015 26 commits
    • Andreas Schwab's avatar
      powerpc: Add vr save/restore functions · d6bb2a19
      Andreas Schwab authored
      commit 8fe9c93e upstream.
      
      GCC 4.8 now generates out-of-line vr save/restore functions when
      optimizing for size.  They are needed for the raid6 altivec support.
      Signed-off-by: default avatarAndreas Schwab <schwab@linux-m68k.org>
      Signed-off-by: default avatarBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      d6bb2a19
    • Lukas Czerner's avatar
      ext4: fix data corruption caused by unwritten and delayed extents · 93db0c19
      Lukas Czerner authored
      commit d2dc317d upstream.
      
      Currently it is possible to lose whole file system block worth of data
      when we hit the specific interaction with unwritten and delayed extents
      in status extent tree.
      
      The problem is that when we insert delayed extent into extent status
      tree the only way to get rid of it is when we write out delayed buffer.
      However there is a limitation in the extent status tree implementation
      so that when inserting unwritten extent should there be even a single
      delayed block the whole unwritten extent would be marked as delayed.
      
      At this point, there is no way to get rid of the delayed extents,
      because there are no delayed buffers to write out. So when a we write
      into said unwritten extent we will convert it to written, but it still
      remains delayed.
      
      When we try to write into that block later ext4_da_map_blocks() will set
      the buffer new and delayed and map it to invalid block which causes
      the rest of the block to be zeroed loosing already written data.
      
      For now we can fix this by simply not allowing to set delayed status on
      written extent in the extent status tree. Also add WARN_ON() to make
      sure that we notice if this happens in the future.
      
      This problem can be easily reproduced by running the following xfs_io.
      
      xfs_io -f -c "pwrite -S 0xaa 4096 2048" \
                -c "falloc 0 131072" \
                -c "pwrite -S 0xbb 65536 2048" \
                -c "fsync" /mnt/test/fff
      
      echo 3 > /proc/sys/vm/drop_caches
      xfs_io -c "pwrite -S 0xdd 67584 2048" /mnt/test/fff
      
      This can be theoretically also reproduced by at random by running fsx,
      but it's not very reliable, though on machines with bigger page size
      (like ppc) this can be seen more often (especially xfstest generic/127)
      Signed-off-by: default avatarLukas Czerner <lczerner@redhat.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      93db0c19
    • Thomas D's avatar
      tools/power turbostat: Use $(CURDIR) instead of $(PWD) and add support for O= option in Makefile · fd8b6562
      Thomas D authored
      commit f82263c6 upstream.
      
      Since commit ee0778a3
      ("tools/power: turbostat: make Makefile a bit more capable")
      turbostat's Makefile is using
      
        [...]
        BUILD_OUTPUT    := $(PWD)
        [...]
      
      which obviously causes trouble when building "turbostat" with
      
        make -C /usr/src/linux/tools/power/x86/turbostat ARCH=x86 turbostat
      
      because GNU make does not update nor guarantee that $PWD is set.
      
      This patch changes the Makefile to use $CURDIR instead, which GNU make
      guarantees to set and update (i.e. when using "make -C ...") and also
      adds support for the O= option (see "make help" in your root of your
      kernel source tree for more details).
      
      Link: https://bugs.gentoo.org/show_bug.cgi?id=533918
      Fixes: ee0778a3 ("tools/power: turbostat: make Makefile a bit more capable")
      Signed-off-by: default avatarThomas D. <whissi@whissi.de>
      Cc: Mark Asselstine <mark.asselstine@windriver.com>
      Signed-off-by: default avatarLen Brown <len.brown@intel.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      fd8b6562
    • Dan Carpenter's avatar
      memstick: mspro_block: add missing curly braces · b1ce2bd4
      Dan Carpenter authored
      commit 13f6b191 upstream.
      
      Using the indenting we can see the curly braces were obviously intended.
      This is a static checker fix, but my guess is that we don't read enough
      bytes, because we don't calculate "t_len" correctly.
      
      Fixes: f1d82698 ('memstick: use fully asynchronous request processing')
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Cc: Alex Dubov <oakad@yahoo.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      b1ce2bd4
    • Nicolas Iooss's avatar
      firmware/ihex2fw.c: restore missing default in switch statement · 204dec04
      Nicolas Iooss authored
      commit d43698e8 upstream.
      
      Commit 2473238e ("ihex: add support for CS:IP/EIP records") removes
      the "default:" statement in the switch block, making the "return
      usage();" line dead code and ihex2fw silently ignoring unknown options.
      Restore this statement.
      
      This bug was found by building with HOSTCC=clang and adding
      -Wunreachable-code-return to HOSTCFLAGS.
      
      Fixes: 2473238e ("ihex: add support for CS:IP/EIP records")
      Signed-off-by: default avatarNicolas Iooss <nicolas.iooss_linux@m4x.org>
      Cc: Mark Brown <broonie@opensource.wolfsonmicro.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      204dec04
    • Herbert Xu's avatar
      skbuff: Do not scrub skb mark within the same name space · 615e3227
      Herbert Xu authored
      commit 213dd74a upstream.
      
      On Wed, Apr 15, 2015 at 05:41:26PM +0200, Nicolas Dichtel wrote:
      > Le 15/04/2015 15:57, Herbert Xu a écrit :
      > >On Wed, Apr 15, 2015 at 06:22:29PM +0800, Herbert Xu wrote:
      > [snip]
      > >Subject: skbuff: Do not scrub skb mark within the same name space
      > >
      > >The commit ea23192e ("tunnels:
      > Maybe add a Fixes tag?
      > Fixes: ea23192e ("tunnels: harmonize cleanup done on skb on rx path")
      >
      > >harmonize cleanup done on skb on rx path") broke anyone trying to
      > >use netfilter marking across IPv4 tunnels.  While most of the
      > >fields that are cleared by skb_scrub_packet don't matter, the
      > >netfilter mark must be preserved.
      > >
      > >This patch rearranges skb_scurb_packet to preserve the mark field.
      > nit: s/scurb/scrub
      >
      > Else it's fine for me.
      
      Sure.
      
      PS I used the wrong email for James the first time around.  So
      let me repeat the question here.  Should secmark be preserved
      or cleared across tunnels within the same name space? In fact,
      do our security models even support name spaces?
      
      ---8<---
      The commit ea23192e ("tunnels:
      harmonize cleanup done on skb on rx path") broke anyone trying to
      use netfilter marking across IPv4 tunnels.  While most of the
      fields that are cleared by skb_scrub_packet don't matter, the
      netfilter mark must be preserved.
      
      This patch rearranges skb_scrub_packet to preserve the mark field.
      
      Fixes: ea23192e ("tunnels: harmonize cleanup done on skb on rx path")
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Acked-by: default avatarThomas Graf <tgraf@suug.ch>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [ luis: backported to 3.16: adjusted context ]
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      615e3227
    • Honggang LI's avatar
      mlx5: wrong page mask if CONFIG_ARCH_DMA_ADDR_T_64BIT enabled for 32Bit architectures · 96858104
      Honggang LI authored
      commit 59d2d18c upstream.
      
      If CONFIG_ARCH_DMA_ADDR_T_64BIT enabled for x86 systems and physical
      memory is more than 4GB, dma_map_page may return a valid memory
      address which greater than 0xffffffff. As a result, the mlx5 device page
      allocator RB tree will be initialized with valid addresses greater than
      0xfffffff.
      
      However, (addr & PAGE_MASK) set the high four bytes to zeros. So, it's
      impossible for the function, free_4k, to release the pages whose
      addresses greater than 4GB. Memory leaks. And mlx5_ib module can't
      release the pages when user try to remove the module, as a result,
      system hang.
      
      [root@rdma05 root]# dmesg  | grep addr | head
      addr             = 3fe384000
      addr & PAGE_MASK =  fe384000
      [root@rdma05 root]# rmmod mlx5_ib   <---- hang on
      
      ---------------------- cosnole log -----------------
      mlx5_ib 0000:04:00.0: irq 138 for MSI/MSI-X
        alloc irq_desc for 139 on node -1
        alloc kstat_irqs on node -1
      mlx5_ib 0000:04:00.0: irq 139 for MSI/MSI-X
      0000:04:00.0:free_4k:221:(pid 1519): page not found
      0000:04:00.0:free_4k:221:(pid 1519): page not found
      0000:04:00.0:free_4k:221:(pid 1519): page not found
      0000:04:00.0:free_4k:221:(pid 1519): page not found
      ---------------------- cosnole log -----------------
      
      Fixes: bf0bf77f ('mlx5: Support communicating arbitrary host page size to firmware')
      Signed-off-by: default avatarHonggang Li <honli@redhat.com>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      96858104
    • Eli Cohen's avatar
      mlx5_core: Fix PowerPC support · 4636c5f9
      Eli Cohen authored
      commit 05bdb2ab upstream.
      
      1. Fix derivation of sub-page index from the dma address in free_4k.
      2. Fix the DMA address passed to dma_unmap_page by masking it properly.
      Signed-off-by: default avatarEli Cohen <eli@mellanox.com>
      Signed-off-by: default avatarRoland Dreier <roland@purestorage.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      4636c5f9
    • Erez Shitrit's avatar
      IB/mlx4: Fix WQE LSO segment calculation · c8bf722e
      Erez Shitrit authored
      commit ca9b590c upstream.
      
      The current code decreases from the mss size (which is the gso_size
      from the kernel skb) the size of the packet headers.
      
      It shouldn't do that because the mss that comes from the stack
      (e.g IPoIB) includes only the tcp payload without the headers.
      
      The result is indication to the HW that each packet that the HW sends
      is smaller than what it could be, and too many packets will be sent
      for big messages.
      
      An easy way to demonstrate one more aspect of the problem is by
      configuring the ipoib mtu to be less than 2*hlen (2*56) and then
      run app sending big TCP messages. This will tell the HW to send packets
      with giant (negative value which under unsigned arithmetics becomes
      a huge positive one) length and the QP moves to SQE state.
      
      Fixes: b832be1e ('IB/mlx4: Add IPoIB LSO support')
      Reported-by: default avatarMatthew Finlay <matt@mellanox.com>
      Signed-off-by: default avatarErez Shitrit <erezsh@mellanox.com>
      Signed-off-by: default avatarOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      c8bf722e
    • Mark Brown's avatar
      i2c: core: Export bus recovery functions · bccbeee0
      Mark Brown authored
      commit c1c21f4e upstream.
      
      Current -next fails to link an ARM allmodconfig because drivers that use
      the core recovery functions can be built as modules but those functions
      are not exported:
      
      ERROR: "i2c_generic_gpio_recovery" [drivers/i2c/busses/i2c-davinci.ko] undefined!
      ERROR: "i2c_generic_scl_recovery" [drivers/i2c/busses/i2c-davinci.ko] undefined!
      ERROR: "i2c_recover_bus" [drivers/i2c/busses/i2c-davinci.ko] undefined!
      
      Add exports to fix this.
      
      Fixes: 5f9296ba (i2c: Add bus recovery infrastructure)
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarWolfram Sang <wsa@the-dreams.de>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      bccbeee0
    • Andrew Elble's avatar
      NFS: fix BUG() crash in notify_change() with patch to chown_common() · 54135205
      Andrew Elble authored
      commit c1b8940b upstream.
      
      We have observed a BUG() crash in fs/attr.c:notify_change(). The crash
      occurs during an rsync into a filesystem that is exported via NFS.
      
      1.) fs/attr.c:notify_change() modifies the caller's version of attr.
      2.) 6de0ec00 ("VFS: make notify_change pass ATTR_KILL_S*ID to
          setattr operations") introduced a BUG() restriction such that "no
          function will ever call notify_change() with both ATTR_MODE and
          ATTR_KILL_S*ID set". Under some circumstances though, it will have
          assisted in setting the caller's version of attr to this very
          combination.
      3.) 27ac0ffe ("locks: break delegations on any attribute
          modification") introduced code to handle breaking
          delegations. This can result in notify_change() being re-called. attr
          _must_ be explicitly reset to avoid triggering the BUG() established
          in #2.
      4.) The path that that triggers this is via fs/open.c:chmod_common().
          The combination of attr flags set here and in the first call to
          notify_change() along with a later failed break_deleg_wait()
          results in notify_change() being called again via retry_deleg
          without resetting attr.
      
      Solution is to move retry_deleg in chmod_common() a bit further up to
      ensure attr is completely reset.
      
      There are other places where this seemingly could occur, such as
      fs/utimes.c:utimes_common(), but the attr flags are not initially
      set in such a way to trigger this.
      
      Fixes: 27ac0ffe ("locks: break delegations on any attribute modification")
      Reported-by: default avatarEric Meddaugh <etmsys@rit.edu>
      Tested-by: default avatarEric Meddaugh <etmsys@rit.edu>
      Signed-off-by: default avatarAndrew Elble <aweits@rit.edu>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      54135205
    • Dave Olson's avatar
      powerpc: Fix missing L2 cache size in /sys/devices/system/cpu · a760d7cb
      Dave Olson authored
      commit f7e9e358 upstream.
      
      This problem appears to have been introduced in 2.6.29 by commit
      93197a36 "Rewrite sysfs processor cache info code".
      
      This caused lscpu to error out on at least e500v2 devices, eg:
      
        error: cannot open /sys/devices/system/cpu/cpu0/cache/index2/size: No such file or directory
      
      Some embedded powerpc systems use cache-size in DTS for the unified L2
      cache size, not d-cache-size, so we need to allow for both DTS names.
      Added a new CACHE_TYPE_UNIFIED_D cache_type_info structure to handle
      this.
      
      Fixes: 93197a36 ("powerpc: Rewrite sysfs processor cache info code")
      Signed-off-by: default avatarDave Olson <olson@cumulusnetworks.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      a760d7cb
    • Radim Krčmář's avatar
      KVM: use slowpath for cross page cached accesses · 26456762
      Radim Krčmář authored
      commit ca3f0874 upstream.
      
      kvm_write_guest_cached() does not mark all written pages as dirty and
      code comments in kvm_gfn_to_hva_cache_init() talk about NULL memslot
      with cross page accesses.  Fix all the easy way.
      
      The check is '<= 1' to have the same result for 'len = 0' cache anywhere
      in the page.  (nr_pages_needed is 0 on page boundary.)
      
      Fixes: 8f964525 ("KVM: Allow cross page reads and writes from cached translations.")
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      Message-Id: <20150408121648.GA3519@potion.brq.redhat.com>
      Reviewed-by: default avatarWanpeng Li <wanpeng.li@linux.intel.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      26456762
    • Alexander Duyck's avatar
      jhash: Update jhash_[321]words functions to use correct initval · 516430eb
      Alexander Duyck authored
      commit 2e7056c4 upstream.
      
      Looking over the implementation for jhash2 and comparing it to jhash_3words
      I realized that the two hashes were in fact very different.  Doing a bit of
      digging led me to "The new jhash implementation" in which lookup2 was
      supposed to have been replaced with lookup3.
      
      In reviewing the patch I noticed that jhash2 had originally initialized a
      and b to JHASH_GOLDENRATIO and c to initval, but after the patch a, b, and
      c were initialized to initval + (length << 2) + JHASH_INITVAL.  However the
      changes in jhash_3words simply replaced the initialization of a and b with
      JHASH_INITVAL.
      
      This change corrects what I believe was an oversight so that a, b, and c in
      jhash_3words all have the same value added consisting of initval + (length
      << 2) + JHASH_INITVAL so that jhash2 and jhash_3words will now produce the
      same hash result given the same inputs.
      
      Fixes: 60d509c8 ("The new jhash implementation")
      Signed-off-by: default avatarAlexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      516430eb
    • Vutla, Lokesh's avatar
      crypto: omap-aes - Fix support for unequal lengths · 4b6e84d6
      Vutla, Lokesh authored
      commit 6d7e7e02 upstream.
      
      For cases where total length of an input SGs is not same as
      length of the input data for encryption, omap-aes driver
      crashes. This happens in the case when IPsec is trying to use
      omap-aes driver.
      
      To avoid this, we copy all the pages from the input SG list
      into a contiguous buffer and prepare a single element SG list
      for this buffer with length as the total bytes to crypt, which is
      similar thing that is done in case of unaligned lengths.
      
      Fixes: 6242332f ("crypto: omap-aes - Add support for cases of unaligned lengths")
      Signed-off-by: default avatarLokesh Vutla <lokeshvutla@ti.com>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      4b6e84d6
    • Nishanth Menon's avatar
      C6x: time: Ensure consistency in __init · e7732475
      Nishanth Menon authored
      commit f4831605 upstream.
      
      time_init invokes timer64_init (which is __init annotation)
      since all of these are invoked at init time, lets maintain
      consistency by ensuring time_init is marked appropriately
      as well.
      
      This fixes the following warning with CONFIG_DEBUG_SECTION_MISMATCH=y
      
      WARNING: vmlinux.o(.text+0x3bfc): Section mismatch in reference from the function time_init() to the function .init.text:timer64_init()
      The function time_init() references
      the function __init timer64_init().
      This is often because time_init lacks a __init
      annotation or the annotation of timer64_init is wrong.
      
      Fixes: 546a3954 ("C6X: time management")
      Signed-off-by: default avatarNishanth Menon <nm@ti.com>
      Signed-off-by: default avatarMark Salter <msalter@redhat.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      e7732475
    • Junjie Mao's avatar
      driver core: bus: Goto appropriate labels on failure in bus_add_device · f7fe3687
      Junjie Mao authored
      commit 1c34203a upstream.
      
      It is not necessary to call device_remove_groups() when device_add_groups()
      fails.
      
      The group added by device_add_groups() should be removed if sysfs_create_link()
      fails.
      
      Fixes: fa6fdb33 ("driver core: bus_type: add dev_groups")
      Signed-off-by: default avatarJunjie Mao <junjie_mao@yeah.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      f7fe3687
    • mancha security's avatar
      lib: memzero_explicit: use barrier instead of OPTIMIZER_HIDE_VAR · 99c5da5e
      mancha security authored
      commit 0b053c95 upstream.
      
      OPTIMIZER_HIDE_VAR(), as defined when using gcc, is insufficient to
      ensure protection from dead store optimization.
      
      For the random driver and crypto drivers, calls are emitted ...
      
        $ gdb vmlinux
        (gdb) disassemble memzero_explicit
        Dump of assembler code for function memzero_explicit:
          0xffffffff813a18b0 <+0>:	push   %rbp
          0xffffffff813a18b1 <+1>:	mov    %rsi,%rdx
          0xffffffff813a18b4 <+4>:	xor    %esi,%esi
          0xffffffff813a18b6 <+6>:	mov    %rsp,%rbp
          0xffffffff813a18b9 <+9>:	callq  0xffffffff813a7120 <memset>
          0xffffffff813a18be <+14>:	pop    %rbp
          0xffffffff813a18bf <+15>:	retq
        End of assembler dump.
      
        (gdb) disassemble extract_entropy
        [...]
          0xffffffff814a5009 <+313>:	mov    %r12,%rdi
          0xffffffff814a500c <+316>:	mov    $0xa,%esi
          0xffffffff814a5011 <+321>:	callq  0xffffffff813a18b0 <memzero_explicit>
          0xffffffff814a5016 <+326>:	mov    -0x48(%rbp),%rax
        [...]
      
      ... but in case in future we might use facilities such as LTO, then
      OPTIMIZER_HIDE_VAR() is not sufficient to protect gcc from a possible
      eviction of the memset(). We have to use a compiler barrier instead.
      
      Minimal test example when we assume memzero_explicit() would *not* be
      a call, but would have been *inlined* instead:
      
        static inline void memzero_explicit(void *s, size_t count)
        {
          memset(s, 0, count);
          <foo>
        }
      
        int main(void)
        {
          char buff[20];
      
          snprintf(buff, sizeof(buff) - 1, "test");
          printf("%s", buff);
      
          memzero_explicit(buff, sizeof(buff));
          return 0;
        }
      
      With <foo> := OPTIMIZER_HIDE_VAR():
      
        (gdb) disassemble main
        Dump of assembler code for function main:
        [...]
         0x0000000000400464 <+36>:	callq  0x400410 <printf@plt>
         0x0000000000400469 <+41>:	xor    %eax,%eax
         0x000000000040046b <+43>:	add    $0x28,%rsp
         0x000000000040046f <+47>:	retq
        End of assembler dump.
      
      With <foo> := barrier():
      
        (gdb) disassemble main
        Dump of assembler code for function main:
        [...]
         0x0000000000400464 <+36>:	callq  0x400410 <printf@plt>
         0x0000000000400469 <+41>:	movq   $0x0,(%rsp)
         0x0000000000400471 <+49>:	movq   $0x0,0x8(%rsp)
         0x000000000040047a <+58>:	movl   $0x0,0x10(%rsp)
         0x0000000000400482 <+66>:	xor    %eax,%eax
         0x0000000000400484 <+68>:	add    $0x28,%rsp
         0x0000000000400488 <+72>:	retq
        End of assembler dump.
      
      As can be seen, movq, movq, movl are being emitted inlined
      via memset().
      
      Reference: http://thread.gmane.org/gmane.linux.kernel.cryptoapi/13764/
      Fixes: d4c5efdb ("random: add and use memzero_explicit() for clearing data")
      Cc: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarmancha security <mancha1@zoho.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Acked-by: default avatarStephan Mueller <smueller@chronox.de>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      99c5da5e
    • Nicolas Iooss's avatar
      wl18xx: show rx_frames_per_rates as an array as it really is · ca654131
      Nicolas Iooss authored
      commit a3fa71c4 upstream.
      
      In struct wl18xx_acx_rx_rate_stat, rx_frames_per_rates field is an
      array, not a number.  This means WL18XX_DEBUGFS_FWSTATS_FILE can't be
      used to display this field in debugfs (it would display a pointer, not
      the actual data).  Use WL18XX_DEBUGFS_FWSTATS_FILE_ARRAY instead.
      
      This bug has been found by adding a __printf attribute to
      wl1271_format_buffer.  gcc complained about "format '%u' expects
      argument of type 'unsigned int', but argument 5 has type 'u32 *'".
      
      Fixes: c5d94169 ("wl18xx: use new fw stats structures")
      Signed-off-by: default avatarNicolas Iooss <nicolas.iooss_linux@m4x.org>
      Signed-off-by: default avatarKalle Valo <kvalo@codeaurora.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      ca654131
    • Sabrina Dubroca's avatar
      e1000: add dummy allocator to fix race condition between mtu change and netpoll · b29d9b81
      Sabrina Dubroca authored
      commit 08e83316 upstream.
      
      There is a race condition between e1000_change_mtu's cleanups and
      netpoll, when we change the MTU across jumbo size:
      
      Changing MTU frees all the rx buffers:
          e1000_change_mtu -> e1000_down -> e1000_clean_all_rx_rings ->
              e1000_clean_rx_ring
      
      Then, close to the end of e1000_change_mtu:
          pr_info -> ... -> netpoll_poll_dev -> e1000_clean ->
              e1000_clean_rx_irq -> e1000_alloc_rx_buffers -> e1000_alloc_frag
      
      And when we come back to do the rest of the MTU change:
          e1000_up -> e1000_configure -> e1000_configure_rx ->
              e1000_alloc_jumbo_rx_buffers
      
      alloc_jumbo finds the buffers already != NULL, since data (shared with
      page in e1000_rx_buffer->rxbuf) has been re-alloc'd, but it's garbage,
      or at least not what is expected when in jumbo state.
      
      This results in an unusable adapter (packets don't get through), and a
      NULL pointer dereference on the next call to e1000_clean_rx_ring
      (other mtu change, link down, shutdown):
      
      BUG: unable to handle kernel NULL pointer dereference at           (null)
      IP: [<ffffffff81194d6e>] put_compound_page+0x7e/0x330
      
          [...]
      
      Call Trace:
       [<ffffffff81195445>] put_page+0x55/0x60
       [<ffffffff815d9f44>] e1000_clean_rx_ring+0x134/0x200
       [<ffffffff815da055>] e1000_clean_all_rx_rings+0x45/0x60
       [<ffffffff815df5e0>] e1000_down+0x1c0/0x1d0
       [<ffffffff811e2260>] ? deactivate_slab+0x7f0/0x840
       [<ffffffff815e21bc>] e1000_change_mtu+0xdc/0x170
       [<ffffffff81647050>] dev_set_mtu+0xa0/0x140
       [<ffffffff81664218>] do_setlink+0x218/0xac0
       [<ffffffff814459e9>] ? nla_parse+0xb9/0x120
       [<ffffffff816652d0>] rtnl_newlink+0x6d0/0x890
       [<ffffffff8104f000>] ? kvm_clock_read+0x20/0x40
       [<ffffffff810a2068>] ? sched_clock_cpu+0xa8/0x100
       [<ffffffff81663802>] rtnetlink_rcv_msg+0x92/0x260
      
      By setting the allocator to a dummy version, netpoll can't mess up our
      rx buffers.  The allocator is set back to a sane value in
      e1000_configure_rx.
      
      Fixes: edbbb3ca ("e1000: implement jumbo receive with partial descriptors")
      Signed-off-by: default avatarSabrina Dubroca <sd@queasysnail.net>
      Tested-by: default avatarAaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      b29d9b81
    • Jeff Layton's avatar
      nfs: fix high load average due to callback thread sleeping · f09a1ad6
      Jeff Layton authored
      commit 5d05e54a upstream.
      
      Chuck pointed out a problem that crept in with commit 6ffa30d3 (nfs:
      don't call blocking operations while !TASK_RUNNING). Linux counts tasks
      in uninterruptible sleep against the load average, so this caused the
      system's load average to be pinned at at least 1 when there was a
      NFSv4.1+ mount active.
      
      Not a huge problem, but it's probably worth fixing before we get too
      many complaints about it. This patch converts the code back to use
      TASK_INTERRUPTIBLE sleep, simply has it flush any signals on each loop
      iteration. In practice no one should really be signalling this thread at
      all, so I think this is reasonably safe.
      
      With this change, there's also no need to game the hung task watchdog so
      we can also convert the schedule_timeout call back to a normal schedule.
      Reported-by: default avatarChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: default avatarJeff Layton <jeff.layton@primarydata.com>
      Tested-by: default avatarChuck Lever <chuck.lever@oracle.com>
      Fixes: commit 6ffa30d3 (“nfs: don't call blocking . . .”)
      Signed-off-by: default avatarTrond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      f09a1ad6
    • Jeff Layton's avatar
      nfs: don't call blocking operations while !TASK_RUNNING · d6cfb99d
      Jeff Layton authored
      commit 6ffa30d3 upstream.
      
      Bruce reported seeing this warning pop when mounting using v4.1:
      
           ------------[ cut here ]------------
           WARNING: CPU: 1 PID: 1121 at kernel/sched/core.c:7300 __might_sleep+0xbd/0xd0()
          do not call blocking ops when !TASK_RUNNING; state=1 set at [<ffffffff810ff58f>] prepare_to_wait+0x2f/0x90
          Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc fscache ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw snd_hda_codec_generic snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_pcm snd_timer ppdev joydev snd virtio_console virtio_balloon pcspkr serio_raw parport_pc parport pvpanic floppy soundcore i2c_piix4 virtio_blk virtio_net qxl drm_kms_helper ttm drm virtio_pci virtio_ring ata_generic virtio pata_acpi
          CPU: 1 PID: 1121 Comm: nfsv4.1-svc Not tainted 3.19.0-rc4+ #25
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140709_153950- 04/01/2014
           0000000000000000 000000004e5e3f73 ffff8800b998fb48 ffffffff8186ac78
           0000000000000000 ffff8800b998fba0 ffff8800b998fb88 ffffffff810ac9da
           ffff8800b998fb68 ffffffff81c923e7 00000000000004d9 0000000000000000
          Call Trace:
           [<ffffffff8186ac78>] dump_stack+0x4c/0x65
           [<ffffffff810ac9da>] warn_slowpath_common+0x8a/0xc0
           [<ffffffff810aca65>] warn_slowpath_fmt+0x55/0x70
           [<ffffffff810ff58f>] ? prepare_to_wait+0x2f/0x90
           [<ffffffff810ff58f>] ? prepare_to_wait+0x2f/0x90
           [<ffffffff810dd2ad>] __might_sleep+0xbd/0xd0
           [<ffffffff8124c973>] kmem_cache_alloc_trace+0x243/0x430
           [<ffffffff810d941e>] ? groups_alloc+0x3e/0x130
           [<ffffffff810d941e>] groups_alloc+0x3e/0x130
           [<ffffffffa0301b1e>] svcauth_unix_accept+0x16e/0x290 [sunrpc]
           [<ffffffffa0300571>] svc_authenticate+0xe1/0xf0 [sunrpc]
           [<ffffffffa02fc564>] svc_process_common+0x244/0x6a0 [sunrpc]
           [<ffffffffa02fd044>] bc_svc_process+0x1c4/0x260 [sunrpc]
           [<ffffffffa03d5478>] nfs41_callback_svc+0x128/0x1f0 [nfsv4]
           [<ffffffff810ff970>] ? wait_woken+0xc0/0xc0
           [<ffffffffa03d5350>] ? nfs4_callback_svc+0x60/0x60 [nfsv4]
           [<ffffffff810d45bf>] kthread+0x11f/0x140
           [<ffffffff810ea815>] ? local_clock+0x15/0x30
           [<ffffffff810d44a0>] ? kthread_create_on_node+0x250/0x250
           [<ffffffff81874bfc>] ret_from_fork+0x7c/0xb0
           [<ffffffff810d44a0>] ? kthread_create_on_node+0x250/0x250
          ---[ end trace 675220a11e30f4f2 ]---
      
      nfs41_callback_svc does most of its work while in TASK_INTERRUPTIBLE,
      which is just wrong. Fix that by finishing the wait immediately if we've
      found that the list has something on it.
      
      Also, we don't expect this kthread to accept signals, so we should be
      using a TASK_UNINTERRUPTIBLE sleep instead. That however, opens us up
      hung task warnings from the watchdog, so have the schedule_timeout
      wake up every 60s if there's no callback activity.
      Reported-by: default avatar"J. Bruce Fields" <bfields@fieldses.org>
      Signed-off-by: default avatarJeff Layton <jlayton@primarydata.com>
      Signed-off-by: default avatarTrond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      d6cfb99d
    • Len Brown's avatar
      sched/idle/x86: Restore mwait_idle() to fix boot hangs, to improve power... · 1f766656
      Len Brown authored
      sched/idle/x86: Restore mwait_idle() to fix boot hangs, to improve power savings and to improve performance
      
      commit b253149b upstream.
      
      In Linux-3.9 we removed the mwait_idle() loop:
      
        69fb3676 ("x86 idle: remove mwait_idle() and "idle=mwait" cmdline param")
      
      The reasoning was that modern machines should be sufficiently
      happy during the boot process using the default_idle() HALT
      loop, until cpuidle loads and either acpi_idle or intel_idle
      invoke the newer MWAIT-with-hints idle loop.
      
      But two machines reported problems:
      
       1. Certain Core2-era machines support MWAIT-C1 and HALT only.
          MWAIT-C1 is preferred for optimal power and performance.
          But if they support just C1, cpuidle never loads and
          so they use the boot-time default idle loop forever.
      
       2. Some laptops will boot-hang if HALT is used,
          but will boot successfully if MWAIT is used.
          This appears to be a hidden assumption in BIOS SMI,
          that is presumably valid on the proprietary OS
          where the BIOS was validated.
      
             https://bugzilla.kernel.org/show_bug.cgi?id=60770
      
      So here we effectively revert the patch above, restoring
      the mwait_idle() loop.  However, we don't bother restoring
      the idle=mwait cmdline parameter, since it appears to add
      no value.
      
      Maintainer notes:
      
        For 3.9, simply revert 69fb3676
        for 3.10, patch -F3 applies, fuzz needed due to __cpuinit use in
        context For 3.11, 3.12, 3.13, this patch applies cleanly
      Tested-by: default avatarMike Galbraith <bitbucket@online.de>
      Signed-off-by: default avatarLen Brown <len.brown@intel.com>
      Acked-by: default avatarMike Galbraith <bitbucket@online.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Ian Malone <ibmalone@gmail.com>
      Cc: Josh Boyer <jwboyer@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/345254a551eb5a6a866e048d7ab570fd2193aca4.1389763084.git.len.brown@intel.com
      [ Ported to recent kernels. ]
      [ Mike: 3.10 backport ]
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarMike Galbraith <efault@gmx.de>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      1f766656
    • Krzysztof Kozlowski's avatar
      compal-laptop: Check return value of power_supply_register · 6f35715c
      Krzysztof Kozlowski authored
      commit 1915a718 upstream.
      
      The return value of power_supply_register() call was not checked and
      even on error probe() function returned 0. If registering failed then
      during unbind the driver tried to unregister power supply which was not
      actually registered.
      
      This could lead to memory corruption because power_supply_unregister()
      unconditionally cleans up given power supply.
      
      Fix this by checking return status of power_supply_register() call. In
      case of failure, clean up sysfs entries and fail the probe.
      Signed-off-by: default avatarKrzysztof Kozlowski <k.kozlowski@samsung.com>
      Fixes: 9be0fcb5 ("compal-laptop: add JHL90, battery & hwmon interface")
      Signed-off-by: default avatarSebastian Reichel <sre@kernel.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz> [backport to 3.12]
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      6f35715c
    • Al Viro's avatar
      RCU pathwalk breakage when running into a symlink overmounting something · b1aaebf1
      Al Viro authored
      commit 3cab989a upstream.
      
      Calling unlazy_walk() in walk_component() and do_last() when we find
      a symlink that needs to be followed doesn't acquire a reference to vfsmount.
      That's fine when the symlink is on the same vfsmount as the parent directory
      (which is almost always the case), but it's not always true - one _can_
      manage to bind a symlink on top of something.  And in such cases we end up
      with excessive mntput().
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      b1aaebf1
    • Dmitry Torokhov's avatar
      drm/i915: cope with large i2c transfers · 345c304f
      Dmitry Torokhov authored
      commit 9535c475 upstream.
      
      The hardware, according to the specs, is limited to 256 byte transfers,
      and current driver has no protections in case users attempt to do larger
      transfers. The code will just stomp over status register and mayhem
      ensues.
      
      Let's split larger transfers into digestable chunks. Doing this allows
      Atmel MXT driver on Pixel 1 function properly (it hasn't since commit
      9d8dc3e5 "Input: atmel_mxt_ts -
      implement T44 message handling" which tries to consume multiple
      touchscreen/touchpad reports in a single transaction).
      Reviewed-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: default avatarDmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: default avatarJani Nikula <jani.nikula@intel.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      345c304f
  3. 26 May, 2015 12 commits
    • James Bottomley's avatar
      mvsas: fix panic on expander attached SATA devices · 37aec83a
      James Bottomley authored
      commit 56cbd0cc upstream.
      
      mvsas is giving a General protection fault when it encounters an expander
      attached ATA device.  Analysis of mvs_task_prep_ata() shows that the driver is
      assuming all ATA devices are locally attached and obtaining the phy mask by
      indexing the local phy table (in the HBA structure) with the phy id.  Since
      expanders have many more phys than the HBA, this is causing the index into the
      HBA phy table to overflow and returning rubbish as the pointer.
      
      mvs_task_prep_ssp() instead does the phy mask using the port properties.
      Mirror this in mvs_task_prep_ata() to fix the panic.
      Reported-by: default avatarAdam Talbot <ajtalbot1@gmail.com>
      Tested-by: default avatarAdam Talbot <ajtalbot1@gmail.com>
      Signed-off-by: default avatarJames Bottomley <JBottomley@Odin.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      37aec83a
    • Oleg Nesterov's avatar
      ptrace: fix race between ptrace_resume() and wait_task_stopped() · 9317661c
      Oleg Nesterov authored
      commit b72c1869 upstream.
      
      ptrace_resume() is called when the tracee is still __TASK_TRACED.  We set
      tracee->exit_code and then wake_up_state() changes tracee->state.  If the
      tracer's sub-thread does wait() in between, task_stopped_code(ptrace => T)
      wrongly looks like another report from tracee.
      
      This confuses debugger, and since wait_task_stopped() clears ->exit_code
      the tracee can miss a signal.
      
      Test-case:
      
      	#include <stdio.h>
      	#include <unistd.h>
      	#include <sys/wait.h>
      	#include <sys/ptrace.h>
      	#include <pthread.h>
      	#include <assert.h>
      
      	int pid;
      
      	void *waiter(void *arg)
      	{
      		int stat;
      
      		for (;;) {
      			assert(pid == wait(&stat));
      			assert(WIFSTOPPED(stat));
      			if (WSTOPSIG(stat) == SIGHUP)
      				continue;
      
      			assert(WSTOPSIG(stat) == SIGCONT);
      			printf("ERR! extra/wrong report:%x\n", stat);
      		}
      	}
      
      	int main(void)
      	{
      		pthread_t thread;
      
      		pid = fork();
      		if (!pid) {
      			assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0);
      			for (;;)
      				kill(getpid(), SIGHUP);
      		}
      
      		assert(pthread_create(&thread, NULL, waiter, NULL) == 0);
      
      		for (;;)
      			ptrace(PTRACE_CONT, pid, 0, SIGCONT);
      
      		return 0;
      	}
      
      Note for stable: the bug is very old, but without 9899d11f "ptrace:
      ensure arch_ptrace/ptrace_request can never race with SIGKILL" the fix
      should use lock_task_sighand(child).
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      Reported-by: default avatarPavel Labath <labath@google.com>
      Tested-by: default avatarPavel Labath <labath@google.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      9317661c
    • Yann Droneaud's avatar
      IB/core: don't disallow registering region starting at 0x0 · 0239250c
      Yann Droneaud authored
      commit 66578b0b upstream.
      
      In a call to ib_umem_get(), if address is 0x0 and size is
      already page aligned, check added in commit 8494057a
      ("IB/uverbs: Prevent integer overflow in ib_umem_get address
      arithmetic") will refuse to register a memory region that
      could otherwise be valid (provided vm.mmap_min_addr sysctl
      and mmap_low_allowed SELinux knobs allow userspace to map
      something at address 0x0).
      
      This patch allows back such registration: ib_umem_get()
      should probably don't care of the base address provided it
      can be pinned with get_user_pages().
      
      There's two possible overflows, in (addr + size) and in
      PAGE_ALIGN(addr + size), this patch keep ensuring none
      of them happen while allowing to pin memory at address
      0x0. Anyway, the case of size equal 0 is no more (partially)
      handled as 0-length memory region are disallowed by an
      earlier check.
      
      Link: http://mid.gmane.org/cover.1428929103.git.ydroneaud@opteya.com
      Cc: Shachar Raindel <raindel@mellanox.com>
      Cc: Jack Morgenstein <jackm@mellanox.com>
      Cc: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: default avatarYann Droneaud <ydroneaud@opteya.com>
      Reviewed-by: default avatarSagi Grimberg <sagig@mellanox.com>
      Reviewed-by: default avatarHaggai Eran <haggaie@mellanox.com>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      0239250c
    • Yann Droneaud's avatar
      IB/core: disallow registering 0-sized memory region · c6d4f56d
      Yann Droneaud authored
      commit 8abaae62 upstream.
      
      If ib_umem_get() is called with a size equal to 0 and an
      non-page aligned address, one page will be pinned and a
      0-sized umem will be returned to the caller.
      
      This should not be allowed: it's not expected for a memory
      region to have a size equal to 0.
      
      This patch adds a check to explicitly refuse to register
      a 0-sized region.
      
      Link: http://mid.gmane.org/cover.1428929103.git.ydroneaud@opteya.com
      Cc: Shachar Raindel <raindel@mellanox.com>
      Cc: Jack Morgenstein <jackm@mellanox.com>
      Cc: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: default avatarYann Droneaud <ydroneaud@opteya.com>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      c6d4f56d
    • Michael Davidson's avatar
      fs/binfmt_elf.c: fix bug in loading of PIE binaries · 7b5436bf
      Michael Davidson authored
      commit a87938b2 upstream.
      
      With CONFIG_ARCH_BINFMT_ELF_RANDOMIZE_PIE enabled, and a normal top-down
      address allocation strategy, load_elf_binary() will attempt to map a PIE
      binary into an address range immediately below mm->mmap_base.
      
      Unfortunately, load_elf_ binary() does not take account of the need to
      allocate sufficient space for the entire binary which means that, while
      the first PT_LOAD segment is mapped below mm->mmap_base, the subsequent
      PT_LOAD segment(s) end up being mapped above mm->mmap_base into the are
      that is supposed to be the "gap" between the stack and the binary.
      
      Since the size of the "gap" on x86_64 is only guaranteed to be 128MB this
      means that binaries with large data segments > 128MB can end up mapping
      part of their data segment over their stack resulting in corruption of the
      stack (and the data segment once the binary starts to run).
      
      Any PIE binary with a data segment > 128MB is vulnerable to this although
      address randomization means that the actual gap between the stack and the
      end of the binary is normally greater than 128MB.  The larger the data
      segment of the binary the higher the probability of failure.
      
      Fix this by calculating the total size of the binary in the same way as
      load_elf_interp().
      Signed-off-by: default avatarMichael Davidson <md@google.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Kees Cook <keescook@chromium.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      7b5436bf
    • Gerald Schaefer's avatar
      mm/hugetlb: use pmd_page() in follow_huge_pmd() · 658cd810
      Gerald Schaefer authored
      commit 97534127 upstream.
      
      Commit 61f77eda ("mm/hugetlb: reduce arch dependent code around
      follow_huge_*") broke follow_huge_pmd() on s390, where pmd and pte
      layout differ and using pte_page() on a huge pmd will return wrong
      results.  Using pmd_page() instead fixes this.
      
      All architectures that were touched by that commit have pmd_page()
      defined, so this should not break anything on other architectures.
      
      Fixes: 61f77eda "mm/hugetlb: reduce arch dependent code around follow_huge_*"
      Signed-off-by: default avatarGerald Schaefer <gerald.schaefer@de.ibm.com>
      Acked-by: default avatarNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Michal Hocko <mhocko@suse.cz>, Andrea Arcangeli <aarcange@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Acked-by: default avatarDavid Rientjes <rientjes@google.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      658cd810
    • Nicholas Bellinger's avatar
      target: Fix COMPARE_AND_WRITE with SG_TO_MEM_NOALLOC handling · 8ef7da4c
      Nicholas Bellinger authored
      commit c8e63985 upstream.
      
      This patch fixes a bug for COMPARE_AND_WRITE handling with
      fabrics using SCF_PASSTHROUGH_SG_TO_MEM_NOALLOC.
      
      It adds the missing allocation for cmd->t_bidi_data_sg within
      transport_generic_new_cmd() that is used by COMPARE_AND_WRITE
      for the initial READ payload, even if the fabric is already
      providing a pre-allocated buffer for cmd->t_data_sg.
      
      Also, fix zero-length COMPARE_AND_WRITE handling within the
      compare_and_write_callback() and target_complete_ok_work()
      to queue the response, skipping the initial READ.
      
      This fixes COMPARE_AND_WRITE emulation with loopback, vhost,
      and xen-backend fabric drivers using SG_TO_MEM_NOALLOC.
      Reported-by: default avatarChristoph Hellwig <hch@lst.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      8ef7da4c
    • Daniel Vetter's avatar
      drm/i915: Dont enable CS_PARSER_ERROR interrupts at all · fa2c7758
      Daniel Vetter authored
      commit 37ef01ab upstream.
      
      We stopped handling them in
      
      commit aaecdf61
      Author: Daniel Vetter <daniel.vetter@ffwll.ch>
      Date:   Tue Nov 4 15:52:22 2014 +0100
      
          drm/i915: Stop gathering error states for CS error interrupts
      
      but just clearing is apparently not enough: A sufficiently dead gpu
      left behind by firmware (*cough* coreboot *cough*) can keep the gpu in
      an endless loop of such interrupts, eventually leading to the nmi
      firing. And definitely to what looks like a machine hang.
      
      Since we don't even enable these interrupts on gen5+ let's do the same
      on earlier platforms.
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=93171Tested-by: default avatarMono <mono-for-kernel-org@donderklumpen.de>
      Tested-by: info@gluglug.org.uk
      Reviewed-by: default avatarMika Kuoppala <mika.kuoppala@intel.com>
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@intel.com>
      Signed-off-by: default avatarJani Nikula <jani.nikula@intel.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      fa2c7758
    • Lv Zheng's avatar
      ACPICA: Utilities: split IO address types from data type models. · 93ff29e6
      Lv Zheng authored
      commit 2b876010 upstream.
      
      ACPICA commit aacf863cfffd46338e268b7415f7435cae93b451
      
      It is reported that on a physically 64-bit addressed machine, 32-bit kernel
      can trigger crashes in accessing the memory regions that are beyond the
      32-bit boundary. The region field's start address should still be 32-bit
      compliant, but after a calculation (adding some offsets), it may exceed the
      32-bit boundary. This case is rare and buggy, but there are real BIOSes
      leaked with such issues (see References below).
      
      This patch fixes this gap by always defining IO addresses as 64-bit, and
      allows OSPMs to optimize it for a real 32-bit machine to reduce the size of
      the internal objects.
      
      Internal acpi_physical_address usages in the structures that can be fixed
      by this change include:
       1. struct acpi_object_region:
          acpi_physical_address		address;
       2. struct acpi_address_range:
          acpi_physical_address		start_address;
          acpi_physical_address		end_address;
       3. struct acpi_mem_space_context;
          acpi_physical_address		address;
       4. struct acpi_table_desc
          acpi_physical_address		address;
      See known issues 1 for other usages.
      
      Note that acpi_io_address which is used for ACPI_PROCESSOR may also suffer
      from same problem, so this patch changes it accordingly.
      
      For iasl, it will enforce acpi_physical_address as 32-bit to generate
      32-bit OSPM compatible tables on 32-bit platforms, we need to define
      ACPI_32BIT_PHYSICAL_ADDRESS for it in acenv.h.
      
      Known issues:
       1. Cleanup of mapped virtual address
         In struct acpi_mem_space_context, acpi_physical_address is used as a virtual
         address:
          acpi_physical_address                   mapped_physical_address;
         It is better to introduce acpi_virtual_address or use acpi_size instead.
         This patch doesn't make such a change. Because this should be done along
         with a change to acpi_os_map_memory()/acpi_os_unmap_memory().
         There should be no functional problem to leave this unchanged except
         that only this structure is enlarged unexpectedly.
      
      Link: https://github.com/acpica/acpica/commit/aacf863c
      Reference: https://bugzilla.kernel.org/show_bug.cgi?id=87971
      Reference: https://bugzilla.kernel.org/show_bug.cgi?id=79501Reported-and-tested-by: default avatarPaul Menzel <paulepanter@users.sourceforge.net>
      Reported-and-tested-by: default avatarSial Nije <sialnije@gmail.com>
      Signed-off-by: default avatarLv Zheng <lv.zheng@intel.com>
      Signed-off-by: default avatarBob Moore <robert.moore@intel.com>
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      93ff29e6
    • Anton Blanchard's avatar
      powerpc/perf: Cap 64bit userspace backtraces to PERF_MAX_STACK_DEPTH · 09b5fa71
      Anton Blanchard authored
      commit 9a5cbce4 upstream.
      
      We cap 32bit userspace backtraces to PERF_MAX_STACK_DEPTH
      (currently 127), but we forgot to do the same for 64bit backtraces.
      Signed-off-by: default avatarAnton Blanchard <anton@samba.org>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      09b5fa71
    • Filipe Manana's avatar
      Btrfs: fix inode eviction infinite loop after cloning into it · 97c3d660
      Filipe Manana authored
      commit ccccf3d6 upstream.
      
      If we attempt to clone a 0 length region into a file we can end up
      inserting a range in the inode's extent_io tree with a start offset
      that is greater then the end offset, which triggers immediately the
      following warning:
      
      [ 3914.619057] WARNING: CPU: 17 PID: 4199 at fs/btrfs/extent_io.c:435 insert_state+0x4b/0x10b [btrfs]()
      [ 3914.620886] BTRFS: end < start 4095 4096
      (...)
      [ 3914.638093] Call Trace:
      [ 3914.638636]  [<ffffffff81425fd9>] dump_stack+0x4c/0x65
      [ 3914.639620]  [<ffffffff81045390>] warn_slowpath_common+0xa1/0xbb
      [ 3914.640789]  [<ffffffffa03ca44f>] ? insert_state+0x4b/0x10b [btrfs]
      [ 3914.642041]  [<ffffffff810453f0>] warn_slowpath_fmt+0x46/0x48
      [ 3914.643236]  [<ffffffffa03ca44f>] insert_state+0x4b/0x10b [btrfs]
      [ 3914.644441]  [<ffffffffa03ca729>] __set_extent_bit+0x107/0x3f4 [btrfs]
      [ 3914.645711]  [<ffffffffa03cb256>] lock_extent_bits+0x65/0x1bf [btrfs]
      [ 3914.646914]  [<ffffffff8142b2fb>] ? _raw_spin_unlock+0x28/0x33
      [ 3914.648058]  [<ffffffffa03cbac4>] ? test_range_bit+0xcc/0xde [btrfs]
      [ 3914.650105]  [<ffffffffa03cb3c3>] lock_extent+0x13/0x15 [btrfs]
      [ 3914.651361]  [<ffffffffa03db39e>] lock_extent_range+0x3d/0xcd [btrfs]
      [ 3914.652761]  [<ffffffffa03de1fe>] btrfs_ioctl_clone+0x278/0x388 [btrfs]
      [ 3914.654128]  [<ffffffff811226dd>] ? might_fault+0x58/0xb5
      [ 3914.655320]  [<ffffffffa03e0909>] btrfs_ioctl+0xb51/0x2195 [btrfs]
      (...)
      [ 3914.669271] ---[ end trace 14843d3e2e622fc1 ]---
      
      This later makes the inode eviction handler enter an infinite loop that
      keeps dumping the following warning over and over:
      
      [ 3915.117629] WARNING: CPU: 22 PID: 4228 at fs/btrfs/extent_io.c:435 insert_state+0x4b/0x10b [btrfs]()
      [ 3915.119913] BTRFS: end < start 4095 4096
      (...)
      [ 3915.137394] Call Trace:
      [ 3915.137913]  [<ffffffff81425fd9>] dump_stack+0x4c/0x65
      [ 3915.139154]  [<ffffffff81045390>] warn_slowpath_common+0xa1/0xbb
      [ 3915.140316]  [<ffffffffa03ca44f>] ? insert_state+0x4b/0x10b [btrfs]
      [ 3915.141505]  [<ffffffff810453f0>] warn_slowpath_fmt+0x46/0x48
      [ 3915.142709]  [<ffffffffa03ca44f>] insert_state+0x4b/0x10b [btrfs]
      [ 3915.143849]  [<ffffffffa03ca729>] __set_extent_bit+0x107/0x3f4 [btrfs]
      [ 3915.145120]  [<ffffffffa038c1e3>] ? btrfs_kill_super+0x17/0x23 [btrfs]
      [ 3915.146352]  [<ffffffff811548f6>] ? deactivate_locked_super+0x3b/0x50
      [ 3915.147565]  [<ffffffffa03cb256>] lock_extent_bits+0x65/0x1bf [btrfs]
      [ 3915.148785]  [<ffffffff8142b7e2>] ? _raw_write_unlock+0x28/0x33
      [ 3915.149931]  [<ffffffffa03bc325>] btrfs_evict_inode+0x196/0x482 [btrfs]
      [ 3915.151154]  [<ffffffff81168904>] evict+0xa0/0x148
      [ 3915.152094]  [<ffffffff811689e5>] dispose_list+0x39/0x43
      [ 3915.153081]  [<ffffffff81169564>] evict_inodes+0xdc/0xeb
      [ 3915.154062]  [<ffffffff81154418>] generic_shutdown_super+0x49/0xef
      [ 3915.155193]  [<ffffffff811546d1>] kill_anon_super+0x13/0x1e
      [ 3915.156274]  [<ffffffffa038c1e3>] btrfs_kill_super+0x17/0x23 [btrfs]
      (...)
      [ 3915.167404] ---[ end trace 14843d3e2e622fc2 ]---
      
      So just bail out of the clone ioctl if the length of the region to clone
      is zero, without locking any extent range, in order to prevent this issue
      (same behaviour as a pwrite with a 0 length for example).
      
      This is trivial to reproduce. For example, the steps for the test I just
      made for fstests:
      
        mkfs.btrfs -f SCRATCH_DEV
        mount SCRATCH_DEV $SCRATCH_MNT
      
        touch $SCRATCH_MNT/foo
        touch $SCRATCH_MNT/bar
      
        $CLONER_PROG -s 0 -d 4096 -l 0 $SCRATCH_MNT/foo $SCRATCH_MNT/bar
        umount $SCRATCH_MNT
      
      A test case for fstests follows soon.
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Reviewed-by: default avatarOmar Sandoval <osandov@osandov.com>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      97c3d660
    • Filipe Manana's avatar
      Btrfs: fix inode eviction infinite loop after extent_same ioctl · d51db72a
      Filipe Manana authored
      commit 113e8283 upstream.
      
      If we pass a length of 0 to the extent_same ioctl, we end up locking an
      extent range with a start offset greater then its end offset (if the
      destination file's offset is greater than zero). This results in a warning
      from extent_io.c:insert_state through the following call chain:
      
        btrfs_extent_same()
          btrfs_double_lock()
            lock_extent_range()
              lock_extent(inode->io_tree, offset, offset + len - 1)
                lock_extent_bits()
                  __set_extent_bit()
                    insert_state()
                      --> WARN_ON(end < start)
      
      This leads to an infinite loop when evicting the inode. This is the same
      problem that my previous patch titled
      "Btrfs: fix inode eviction infinite loop after cloning into it" addressed
      but for the extent_same ioctl instead of the clone ioctl.
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Reviewed-by: default avatarOmar Sandoval <osandov@osandov.com>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      d51db72a