1. 20 Aug, 2016 40 commits
    • drm/nouveau/fbcon: fix font width not divisible by 8 · c1b4d25b
      Mikulas Patocka authored
      [ Upstream commit 28668f43 ]
      
      The patch f045f459 ("drm/nouveau/fbcon: fix out-of-bounds memory accesses")
      tries to fix some out-of-bounds memory accesses. Unfortunately, the patch breaks
      the display when using fonts whose width is not divisible by 8.
      
      The monochrome bitmap for each character is stored in memory by lines from top
      to bottom. Each line is padded to a full byte.
      
      For example, for a 22x11 font (22 lines of 11 pixels), each line is padded
      to 16 bits, so each character consumes 44 bytes in total, that is, eleven
      32-bit words. The patch f045f459 changed the logic to "dsize =
      ALIGN(image->width * image->height, 32) >> 5", which yields just 8 words.
      This is incorrect and causes display corruption.
      
      This patch adds the necessary padding of each line to 8 bits.
      
      This patch should be backported to stable kernels where f045f459 was
      backported.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Fixes: f045f459 ("drm/nouveau/fbcon: fix out-of-bounds memory accesses")
      Cc: stable@vger.kernel.org
      Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
    • ubi: Make volume resize power cut aware · c027bc0c
      Richard Weinberger authored
      [ Upstream commit 4946784b ]
      
      When the volume resize operation shrinks a volume,
      LEBs will be unmapped. Since unmapping will not erase these
      LEBs immediately, we have to wait for that operation to finish.
      Otherwise, in case of a power cut right after writing the new
      volume table, the UBI attach process can find more LEBs than the
      volume table knows about. This will render the UBI image unattachable.
      
      Fix this issue by waiting for the erase to complete and writing the new
      volume table afterward.
      
      Cc: <stable@vger.kernel.org>
      Reported-by: Boris Brezillon <boris.brezillon@free-electrons.com>
      Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com>
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
    • ubi: Fix early logging · 304e9158
      Richard Weinberger authored
      [ Upstream commit bc743f34 ]
      
      We cannot use ubi_* logging functions before the UBI
      object is initialized.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 32608703 ("UBI: Extend UBI layer debug/messaging capabilities")
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
    • ubi: Fix race condition between ubi device creation and udev · ae32d1b9
      Iosif Harutyunov authored
      [ Upstream commit 714fb87e ]
      
      Install the UBI device object before we arm sysfs.
      Otherwise udev tries to read sysfs attributes before UBI is ready and
      udev rules will not match.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Iosif Harutyunov <iharutyunov@sonicwall.com>
      [rw: massaged commit message]
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
    • fuse: fix wrong assignment of ->flags in fuse_send_init() · eb61bdda
      Wei Fang authored
      [ Upstream commit 9446385f ]
      
      FUSE_HAS_IOCTL_DIR should be assigned to ->flags, it may be a typo.
      Signed-off-by: Wei Fang <fangwei1@huawei.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
      Fixes: 69fe05c9 ("fuse: add missing INIT flags")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
    • fuse: fuse_flush must check mapping->flags for errors · 614c3393
      Maxim Patlasov authored
      [ Upstream commit 9ebce595 ]
      
      fuse_flush() calls write_inode_now() that triggers writeback, but actual
      writeback will happen later, on fuse_sync_writes(). If an error happens,
      fuse_writepage_end() will set error bit in mapping->flags. So, we have to
      check mapping->flags after fuse_sync_writes().
      Signed-off-by: Maxim Patlasov <mpatlasov@virtuozzo.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
      Fixes: 4d99ff8f ("fuse: Turn writeback cache on")
      Cc: <stable@vger.kernel.org> # v3.15+
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
    • fuse: fsync() did not return IO errors · 3fc4a4a2
      Alexey Kuznetsov authored
      [ Upstream commit ac7f052b ]
      
      Due to the implementation of fuse writeback, filemap_write_and_wait_range()
      does not catch errors. We have to check for them directly after
      fuse_sync_writes().
      Signed-off-by: Alexey Kuznetsov <kuznet@virtuozzo.com>
      Signed-off-by: Maxim Patlasov <mpatlasov@virtuozzo.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
      Fixes: 4d99ff8f ("fuse: Turn writeback cache on")
      Cc: <stable@vger.kernel.org> # v3.15+
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
    • ARC: mm: don't loose PTE_SPECIAL in pte_modify() · 93c0b008
      Vineet Gupta authored
      [ Upstream commit 3925a16a ]
      
      LTP madvise05 was generating an mm splat:
      
      | [ARCLinux]# /sd/ltp/testcases/bin/madvise05
      | BUG: Bad page map in process madvise05  pte:80e08211 pmd:9f7d4000
      | page:9fdcfc90 count:1 mapcount:-1 mapping:  (null) index:0x0 flags: 0x404(referenced|reserved)
      | page dumped because: bad pte
      | addr:200b8000 vm_flags:00000070 anon_vma:  (null) mapping:  (null) index:1005c
      | file:  (null) fault:  (null) mmap:  (null) readpage:  (null)
      | CPU: 2 PID: 6707 Comm: madvise05
      
      And for newer kernels, the system was rendered unusable afterwards.
      
      The problem was mprotect->pte_modify() clearing PTE_SPECIAL (which is
      set to identify the special zero page wired to the pte).
      When pte was finally unmapped, special casing for zero page was not
      done, and instead it was treated as a "normal" page, tripping on the
      map counts etc.
      
      This fixes ARC STAR 9001053308
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
    • drm/radeon: fix firmware info version checks · ae9c7f35
      Alex Deucher authored
      [ Upstream commit 3edc38a0 ]
      
      Some of the checks didn't handle frev 2 tables properly.
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
    • KEYS: 64-bit MIPS needs to use compat_sys_keyctl for 32-bit userspace · 7afd374e
      David Howells authored
      [ Upstream commit 20f06ed9 ]
      
      MIPS64 needs to use compat_sys_keyctl for 32-bit userspace rather than
      calling sys_keyctl.  The latter will work in a lot of cases, thereby hiding
      the issue.
      Reported-by: Stephan Mueller <smueller@chronox.de>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Cc: stable@vger.kernel.org
      Cc: linux-mips@linux-mips.org
      Cc: linux-kernel@vger.kernel.org
      Cc: linux-security-module@vger.kernel.org
      Cc: keyrings@vger.kernel.org
      Patchwork: https://patchwork.linux-mips.org/patch/13832/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
    • KVM: PPC: Book3S HV: Save/restore TM state in H_CEDE · bd89870c
      Paul Mackerras authored
      [ Upstream commit 93d17397 ]
      
      It turns out that if the guest does a H_CEDE while the CPU is in
      a transactional state, and the H_CEDE does a nap, and the nap
      loses the architected state of the CPU (which it is allowed to do),
      then we lose the checkpointed state of the virtual CPU.  In addition,
      the transactional-memory state recorded in the MSR gets reset back
      to non-transactional, and when we try to return to the guest, we take
      a TM bad thing type of program interrupt because we are trying to
      transition from non-transactional to transactional with a hrfid
      instruction, which is not permitted.
      
      The result of the program interrupt occurring at that point is that
      the host CPU will hang in an infinite loop with interrupts disabled.
      Thus this is a denial of service vulnerability in the host which can
      be triggered by any guest (and depending on the guest kernel, it can
      potentially be triggered by unprivileged userspace in the guest).
      
      This vulnerability has been assigned the ID CVE-2016-5412.
      
      To fix this, we save the TM state before napping and restore it
      on exit from the nap, when handling a H_CEDE in real mode.  The
      case where H_CEDE exits to host virtual mode is already OK (as are
      other hcalls which exit to host virtual mode) because the exit
      path saves the TM state.
      
      Cc: stable@vger.kernel.org # v3.15+
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
    • KVM: PPC: Book3S HV: Pull out TM state save/restore into separate procedures · ae40dadb
      Paul Mackerras authored
      [ Upstream commit f024ee09 ]
      
      This moves the transactional memory state save and restore sequences
      out of the guest entry/exit paths into separate procedures.  This is
      so that these sequences can be used in going into and out of nap
      in a subsequent patch.
      
      The only code changes here are (a) saving and restoring LR on the
      stack, since these new procedures get called with a bl instruction,
      (b) explicitly saving r1 into the PACA instead of assuming that
      HSTATE_HOST_R1(r13) is already set, and (c) removing an unnecessary
      and redundant setting of MSR[TM] that should have been removed by
      commit 9d4d0bdd9e0a ("KVM: PPC: Book3S HV: Add transactional memory
      support", 2013-09-24) but wasn't.
      
      Cc: stable@vger.kernel.org # v3.15+
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
    • CIFS: Fix a possible invalid memory access in smb2_query_symlink() · a3b180a9
      Pavel Shilovsky authored
      [ Upstream commit 7893242e ]
      
      While following a symbolic link, we receive err_buf from SMB2_open().
      Although the validity of the SMB2 error response is checked beforehand
      in smb2_check_message(), the symbolic link payload is not checked at all.
      Fix this by adding such checks.
      
      Cc: Dan Carpenter <dan.carpenter@oracle.com>
      CC: Stable <stable@vger.kernel.org>
      Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
      Signed-off-by: Steve French <smfrench@gmail.com>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
    • fs/cifs: make share unaccessible at root level mountable · b7e61a10
      Aurelien Aptel authored
      [ Upstream commit a6b5058f ]
      
      If, when mounting //HOST/share/sub/dir/foo, we can query /sub/dir/foo but
      not any of the path components above it:
      
      - store the /sub/dir/foo prefix in the cifs super_block info
      - in the superblock, set root dentry to the subpath dentry (instead of
        the share root)
      - set a flag in the superblock to remember it
      - use prefixpath when building path from a dentry
      
      fixes bso#8950
      Signed-off-by: Aurelien Aptel <aaptel@suse.com>
      CC: Stable <stable@vger.kernel.org>
      Reviewed-by: Pavel Shilovsky <pshilovsky@samba.org>
      Signed-off-by: Steve French <smfrench@gmail.com>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
    • Input: i8042 - break load dependency between atkbd/psmouse and i8042 · b5e8e7f6
      Dmitry Torokhov authored
      [ Upstream commit 40974618 ]
      
      As explained in 1407814240-4275-1-git-send-email-decui@microsoft.com we
      have a hard load dependency between i8042 and atkbd which prevents
      keyboard from working on Gen2 Hyper-V VMs.
      
      > hyperv_keyboard invokes serio_interrupt(), which needs a valid serio
      > driver like atkbd.c.  atkbd.c depends on libps2.c because it invokes
      > ps2_command().  libps2.c depends on i8042.c because it invokes
      > i8042_check_port_owner().  As a result, hyperv_keyboard actually
      > depends on i8042.c.
      >
      > For a Generation 2 Hyper-V VM (meaning no i8042 device emulated), if a
      > Linux VM (like Arch Linux) happens to configure CONFIG_SERIO_I8042=m
      > rather than =y, atkbd.ko can't load because i8042.ko can't load(due to
      > no i8042 device emulated) and finally hyperv_keyboard can't work and
      > the user can't input: https://bugs.archlinux.org/task/39820
      > (Ubuntu/RHEL/SUSE aren't affected since they use CONFIG_SERIO_I8042=y)
      
      To break the dependency, we move away from using i8042_check_port_owner()
      and instead allow the serio port owner to specify a mutex that clients
      should use to serialize the PS/2 command stream.
      Reported-by: Mark Laws <mdl@60hz.org>
      Tested-by: Mark Laws <mdl@60hz.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
    • Documentation/module-signing.txt: Note need for version info if reusing a key · e9071d07
      Ben Hutchings authored
      [ Upstream commit b8612e51 ]
      
      Signing a module should only make it trusted by the specific kernel it
      was built for, not anything else.  If a module signing key is used for
      multiple ABI-incompatible kernels, the modules need to include enough
      version information to distinguish them.
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
      Cc: stable@vger.kernel.org
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
    • module: Invalidate signatures on force-loaded modules · 6ac98572
      Ben Hutchings authored
      [ Upstream commit bca014ca ]
      
      Signing a module should only make it trusted by the specific kernel it
      was built for, not anything else.  Loading a signed module meant for a
      kernel with a different ABI could have interesting effects.
      Therefore, treat all signatures as invalid when a module is
      force-loaded.
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
      Cc: stable@vger.kernel.org
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
    • net/irda: fix NULL pointer dereference on memory allocation failure · 7e6f0e1e
      Vegard Nossum authored
      [ Upstream commit d3e6952c ]
      
      I ran into this:
      
          kasan: CONFIG_KASAN_INLINE enabled
          kasan: GPF could be caused by NULL-ptr deref or user memory access
          general protection fault: 0000 [#1] PREEMPT SMP KASAN
          CPU: 2 PID: 2012 Comm: trinity-c3 Not tainted 4.7.0-rc7+ #19
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
          task: ffff8800b745f2c0 ti: ffff880111740000 task.ti: ffff880111740000
          RIP: 0010:[<ffffffff82bbf066>]  [<ffffffff82bbf066>] irttp_connect_request+0x36/0x710
          RSP: 0018:ffff880111747bb8  EFLAGS: 00010286
          RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000069dd8358
          RDX: 0000000000000009 RSI: 0000000000000027 RDI: 0000000000000048
          RBP: ffff880111747c00 R08: 0000000000000000 R09: 0000000000000000
          R10: 0000000069dd8358 R11: 1ffffffff0759723 R12: 0000000000000000
          R13: ffff88011a7e4780 R14: 0000000000000027 R15: 0000000000000000
          FS:  00007fc738404700(0000) GS:ffff88011af00000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
          CR2: 00007fc737fdfb10 CR3: 0000000118087000 CR4: 00000000000006e0
          Stack:
           0000000000000200 ffff880111747bd8 ffffffff810ee611 ffff880119f1f220
           ffff880119f1f4f8 ffff880119f1f4f0 ffff88011a7e4780 ffff880119f1f232
           ffff880119f1f220 ffff880111747d58 ffffffff82bca542 0000000000000000
          Call Trace:
           [<ffffffff82bca542>] irda_connect+0x562/0x1190
           [<ffffffff825ae582>] SYSC_connect+0x202/0x2a0
           [<ffffffff825b4489>] SyS_connect+0x9/0x10
           [<ffffffff8100334c>] do_syscall_64+0x19c/0x410
           [<ffffffff83295ca5>] entry_SYSCALL64_slow_path+0x25/0x25
          Code: 41 89 ca 48 89 e5 41 57 41 56 41 55 41 54 41 89 d7 53 48 89 fb 48 83 c7 48 48 89 fa 41 89 f6 48 c1 ea 03 48 83 ec 20 4c 8b 65 10 <0f> b6 04 02 84 c0 74 08 84 c0 0f 8e 4c 04 00 00 80 7b 48 00 74
          RIP  [<ffffffff82bbf066>] irttp_connect_request+0x36/0x710
           RSP <ffff880111747bb8>
          ---[ end trace 4cda2588bc055b30 ]---
      
      The problem is that irda_open_tsap() can fail and leave self->tsap = NULL,
      and then irttp_connect_request() almost immediately dereferences it.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
    • fs/dcache.c: avoid soft-lockup in dput() · 7d06f7f8
      Wei Fang authored
      [ Upstream commit 47be6184 ]
      
      We triggered a soft lockup under a stress test that opens, accesses,
      writes and closes one file concurrently on more than five different CPUs:
      
      WARN: soft lockup - CPU#0 stuck for 11s! [who:30631]
      ...
      [<ffffffc0003986f8>] dput+0x100/0x298
      [<ffffffc00038c2dc>] terminate_walk+0x4c/0x60
      [<ffffffc00038f56c>] path_lookupat+0x5cc/0x7a8
      [<ffffffc00038f780>] filename_lookup+0x38/0xf0
      [<ffffffc000391180>] user_path_at_empty+0x78/0xd0
      [<ffffffc0003911f4>] user_path_at+0x1c/0x28
      [<ffffffc00037d4fc>] SyS_faccessat+0xb4/0x230
      
      The ->d_lock trylock may fail many times because of concurrent
      operations, so dput() may run for a long time.
      
      Fix this by replacing cpu_relax() with cond_resched().
      dput() used to be sleepable, so making it sleepable again
      should be safe.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Wei Fang <fangwei1@huawei.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
    • MIPS: Don't register r4k sched clock when CPUFREQ enabled · 49e3c9ae
      Huacai Chen authored
      [ Upstream commit 07d69579 ]
      
      Don't register the r4k sched clock when CPUFREQ is enabled, because the
      sched clock needs a constant frequency.
      Signed-off-by: Huacai Chen <chenhc@lemote.com>
      Cc: John Crispin <john@phrozen.org>
      Cc: Steven J. Hill <Steven.Hill@caviumnetworks.com>
      Cc: Fuxin Zhang <zhangfx@lemote.com>
      Cc: Zhangjin Wu <wuzhangjin@gmail.com>
      Cc: linux-mips@linux-mips.org
      Cc: stable@vger.kernel.org
      Patchwork: https://patchwork.linux-mips.org/patch/13820/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
    • nfs: don't create zero-length requests · 0e2cbad6
      Benjamin Coddington authored
      [ Upstream commit 149a4fdd ]
      
      NFS doesn't expect requests with wb_bytes set to zero and may make
      unexpected decisions about how to handle that request at the page IO layer.
      Skip request creation if we won't have any wb_bytes in the request.
      Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Reviewed-by: Weston Andros Adamson <dros@primarydata.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
    • gpio: intel-mid: Remove potentially harmful code · e1cc0756
      Andy Shevchenko authored
      [ Upstream commit 3dbd3212 ]
      
      The commit d56d6b3d ("gpio: langwell: add Intel Merrifield support")
      doesn't look at all like proper support for Intel Merrifield, and I dare say
      that it distorts the behaviour of the hardware.
      
      The register map is different on Intel Merrifield, i.e. only 6 out of 8
      registers have the same purpose, but none of them has the same location in
      the address space. The current code is potentially harmful to existing
      hardware, since it pokes registers at the wrong offsets and may set some pin
      to be a GPIO output when the connected hardware doesn't expect that.
      
      Besides the above, GPIO and pinctrl on Intel Merrifield are located in
      different IP blocks. The functionality has been extended as well, i.e. support
      for level interrupts and special registers for wake-capable sources has been
      added, and thus, in my opinion, this requires a completely separate driver.
      
      If someone is wondering, the existing gpio-intel-mid.c could be converted to
      an actual pinctrl driver (which in fact it is now), though I wouldn't
      volunteer to do that.
      
      Fixes: d56d6b3d ("gpio: langwell: add Intel Merrifield support")
      Cc: stable@vger.kernel.org # v3.13+
      Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
    • iscsi-target: Fix panic when adding second TCP connection to iSCSI session · 7f5a3c76
      Feng Li authored
      [ Upstream commit 8abc718d ]
      
      In an MC/S scenario, conn->sess has been set to NULL in
      iscsi_login_non_zero_tsih_s1 when the second connection arrives here,
      and the kernel then panics.
      
      conn->sess is only assigned in iscsi_login_non_zero_tsih_s2, so
      we should check whether it is NULL before calling.
      Signed-off-by: Feng Li <lifeng1519@gmail.com>
      Tested-by: Sumit Rai <sumit.rai@calsoftinc.com>
      Cc: stable@vger.kernel.org # 3.14+
      Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
    • audit: fix a double fetch in audit_log_single_execve_arg() · 634a3fc5
      Paul Moore authored
      [ Upstream commit 43761473 ]
      
      There is a double fetch problem in audit_log_single_execve_arg()
      where we first check the execve(2) arguments for any "bad" characters
      which would require hex encoding and then re-fetch the arguments for
      logging in the audit record[1].  Of course this leaves a window of
      opportunity for an unsavory application to munge with the data.
      
      This patch reworks things by only fetching the argument data once[2]
      into a buffer where it is scanned and logged into the audit
      record(s).  In addition to fixing the double fetch, this patch
      improves on the original code in a few other ways: better handling
      of large arguments which require encoding, stricter record length
      checking, and some performance improvements (completely unverified,
      but we got rid of some strlen() calls, that's got to be a good
      thing).
      
      As part of the development of this patch, I've also created a basic
      regression test for the audit-testsuite, the test can be tracked on
      GitHub at the following link:
      
       * https://github.com/linux-audit/audit-testsuite/issues/25
      
      [1] If you pay careful attention, there is actually a triple fetch
      problem due to a strnlen_user() call at the top of the function.
      
      [2] This is a tiny white lie, we do make a call to strnlen_user()
      prior to fetching the argument data.  I don't like it, but due to the
      way the audit record is structured we really have no choice unless we
      copy the entire argument at once (which would require a rather
      wasteful allocation).  The good news is that with this patch the
      kernel no longer relies on this strnlen_user() value for anything
      beyond recording it in the log, we also update it with a trustworthy
      value whenever possible.
      Reported-by: Pengfei Wang <wpengfeinudt@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
    • Fix broken audit tests for exec arg len · a4664afa
      Linus Torvalds authored
      [ Upstream commit 45820c29 ]
      
      The "fix" in commit 0b08c5e5 ("audit: Fix check of return value of
      strnlen_user()") didn't fix anything, it broke things.  As reported by
      Steven Rostedt:
      
       "Yes, strnlen_user() returns 0 on fault, but if you look at what len is
        set to, than you would notice that on fault len would be -1"
      
      because we just subtracted one from the return value.  So testing
      against 0 doesn't test for a fault condition, it tests against a
      perfectly valid empty string.
      
      Also fix up the usual braindamage wrt using WARN_ON() inside a
      conditional - make it part of the conditional and remove the explicit
      unlikely() (which is already part of the WARN_ON*() logic, exactly so
      that you don't have to write unreadable code).
      Reported-and-tested-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Paul Moore <pmoore@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
    • audit: Fix check of return value of strnlen_user() · a49b282f
      Jan Kara authored
      [ Upstream commit 0b08c5e5 ]
      
      strnlen_user() returns 0 when it hits a fault, not -1. Fix the test in
      audit_log_single_execve_arg(). Luckily this shouldn't ever happen unless
      there's a kernel bug so it's mostly a cosmetic fix.
      
      CC: Paul Moore <pmoore@redhat.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Paul Moore <pmoore@redhat.com>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
    • cifs: fix crash due to race in hmac(md5) handling · dd265663
      Rabin Vincent authored
      [ Upstream commit bd975d1e ]
      
      The secmech hmac(md5) structures are present in the TCP_Server_Info
      struct and can be shared among multiple CIFS sessions.  However, the
      server mutex is not currently held when these structures are allocated
      and used, which can lead to kernel crashes, as in the scenario below:
      
      mount.cifs(8) #1				mount.cifs(8) #2
      
      Is secmech.sdeschmaccmd5 allocated?
      // false
      
      						Is secmech.sdeschmaccmd5 allocated?
      						// false
      
      secmech.hmacmd = crypto_alloc_shash..
      secmech.sdeschmaccmd5 = kzalloc..
      sdeschmaccmd5->shash.tfm = &secmec.hmacmd;
      
      						secmech.sdeschmaccmd5 = kzalloc
      						// sdeschmaccmd5->shash.tfm
      						// not yet assigned
      
      crypto_shash_update()
       deref NULL sdeschmaccmd5->shash.tfm
      
       Unable to handle kernel paging request at virtual address 00000030
       epc   : 8027ba34 crypto_shash_update+0x38/0x158
       ra    : 8020f2e8 setup_ntlmv2_rsp+0x4bc/0xa84
       Call Trace:
        crypto_shash_update+0x38/0x158
        setup_ntlmv2_rsp+0x4bc/0xa84
        build_ntlmssp_auth_blob+0xbc/0x34c
        sess_auth_rawntlmssp_authenticate+0xac/0x248
        CIFS_SessSetup+0xf0/0x178
        cifs_setup_session+0x4c/0x84
        cifs_get_smb_ses+0x2c8/0x314
        cifs_mount+0x38c/0x76c
        cifs_do_mount+0x98/0x440
        mount_fs+0x20/0xc0
        vfs_kern_mount+0x58/0x138
        do_mount+0x1e8/0xccc
        SyS_mount+0x88/0xd4
        syscall_common+0x30/0x54
      
      Fix this by locking the srv_mutex around the code which uses these
      hmac(md5) structures.  All the other secmech algos already have similar
      locking.
      
      Fixes: 95dc8dd1 ("Limit allocation of crypto mechanisms to dialect which requires")
      Signed-off-by: Rabin Vincent <rabinv@axis.com>
      Acked-by: Sachin Prabhu <sprabhu@redhat.com>
      CC: Stable <stable@vger.kernel.org>
      Signed-off-by: Steve French <smfrench@gmail.com>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
    • target: Fix race between iscsi-target connection shutdown + ABORT_TASK · b9090fe4
      Nicholas Bellinger authored
      [ Upstream commit 064cdd2d ]
      
      This patch fixes a race in iscsit_release_commands_from_conn() ->
      iscsit_free_cmd() -> transport_generic_free_cmd() + wait_for_tasks=1,
      where CMD_T_FABRIC_STOP could end up being set after the final
      kref_put() is called from core_tmr_abort_task() context.
      
      This results in transport_generic_free_cmd() blocking indefinitely
      on se_cmd->cmd_wait_comp, because the target_release_cmd_kref()
      check for CMD_T_FABRIC_STOP returns false.
      
      To address this bug, make iscsit_release_commands_from_conn()
      do list_splice and set CMD_T_FABRIC_STOP early while holding
      iscsi_conn->cmd_lock.  Also make iscsit_aborted_task() only
      remove iscsi_cmd_t if CMD_T_FABRIC_STOP has not already been
      set.
      
      Finally in target_release_cmd_kref(), only honor fabric_stop
      if CMD_T_ABORTED has been set.
      
      Cc: Mike Christie <mchristi@redhat.com>
      Cc: Quinn Tran <quinn.tran@qlogic.com>
      Cc: Himanshu Madhani <himanshu.madhani@qlogic.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Hannes Reinecke <hare@suse.de>
      Cc: stable@vger.kernel.org # 3.14+
      Tested-by: Nicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
      b9090fe4
    • Nicholas Bellinger's avatar
      target: Fix missing complete during ABORT_TASK + CMD_T_FABRIC_STOP · 6c631d3e
      Nicholas Bellinger authored
      [ Upstream commit 5e2c956b ]
      
      During transport_generic_free_cmd() with a concurrent TMR
      ABORT_TASK and shutdown CMD_T_FABRIC_STOP bit set, the
      caller will be blocked on se_cmd->cmd_wait_stop completion
      until the final kref_put() -> target_release_cmd_kref()
      has been invoked to call complete().
      
      However, when ABORT_TASK is completed with FUNCTION_COMPLETE
      in core_tmr_abort_task(), the aborted se_cmd will have already
      been removed from se_sess->sess_cmd_list via list_del_init().
      
      This results in target_release_cmd_kref() hitting the
      legacy list_empty() == true check, invoking ->release_cmd()
      but skipping complete() to wakeup se_cmd->cmd_wait_stop
      blocked earlier in transport_generic_free_cmd() code.
      
      To address this bug, it's safe to go ahead and drop the
      original list_empty() check so that fabric_stop invokes
      complete() as expected, since list_del_init() can
      safely be used on an empty list.
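
      The claim that list_del_init() is harmless on an already-removed entry
      can be checked with a small user-space model of a circular doubly linked
      list. This is only a sketch: the names init_list_head(), list_add_node()
      and list_del_init_node() are illustrative stand-ins, not the kernel
      helpers themselves.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal model of a circular doubly linked list node. */
struct list_head {
	struct list_head *next, *prev;
};

/* An initialised (or already removed) entry points at itself. */
static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static bool list_is_empty(const struct list_head *h)
{
	return h->next == h;
}

/* Insert node n right after head. */
static void list_add_node(struct list_head *n, struct list_head *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Unlink the entry from its neighbours, then make it point at itself. */
static void list_del_init_node(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	init_list_head(e);
}
```

      Calling list_del_init_node() a second time on the same entry only
      rewrites the entry's own pointers, so no other list is touched.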
      
      Cc: Mike Christie <mchristi@redhat.com>
      Cc: Quinn Tran <quinn.tran@qlogic.com>
      Cc: Himanshu Madhani <himanshu.madhani@qlogic.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Hannes Reinecke <hare@suse.de>
      Cc: stable@vger.kernel.org # 3.14+
      Tested-by: Nicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
      6c631d3e
    • Hector Palacios's avatar
      mtd: nand: fix bug writing 1 byte less than page size · 80d341fe
      Hector Palacios authored
      [ Upstream commit 144f4c98 ]
      
      nand_do_write_ops() determines if it is writing a partial page with the
      formula:
      	part_pagewr = (column || writelen < (mtd->writesize - 1))
      
      When 'writelen' is exactly 1 byte less than the NAND page size, the formula
      evaluates to zero, so the code does not treat it as a partial write,
      although it should.
      As a consequence, the function stays in the while(1) loop, with 'writelen'
      wrapping to 0xffffffff, and iterates endlessly.
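
      The off-by-one can be reproduced outside the kernel. The sketch below
      compares the quoted test with an assumed corrected form (writelen <
      writesize) and models the unsigned wrap of 'writelen'. The function
      names are hypothetical and the loop is reduced to a single subtraction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Partial-page test as quoted in the commit message (buggy). */
static bool part_pagewr_buggy(uint32_t column, uint32_t writelen,
			      uint32_t writesize)
{
	return column || writelen < (writesize - 1);
}

/* Assumed corrected test: any length short of a full page is partial. */
static bool part_pagewr_fixed(uint32_t column, uint32_t writelen,
			      uint32_t writesize)
{
	return column || writelen < writesize;
}

/*
 * Models the full-page path of the write loop: a whole page is taken
 * off writelen, so a request one byte short wraps the unsigned counter.
 */
static uint32_t writelen_after_full_page(uint32_t writelen, uint32_t writesize)
{
	assert(writesize > 0);
	return writelen - writesize;
}
```

      With a 2K page (0x800) and writelen = 0x7ff, the buggy test reports a
      full-page write and the remaining length wraps instead of reaching zero.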
      
      The bug may not be easy to reproduce in Linux, since user space tools
      usually force padding or round up the write size to a page-size
      multiple.
      This was discovered in U-Boot where the issue can be reproduced by
      writing any size that is 1 byte less than a page-size multiple.
      For example, on a NAND with 2K page (0x800):
      	=> nand erase.part <partition>
      	=> nand write $loadaddr <partition> 7ff
      
      [Editor's note: the bug was added in commit 29072b96, but moved
      around in commit 66507c7b ("mtd: nand: Add support to use nand_base
      poi databuf as bounce buffer")]
      
      Fixes: 29072b96 ("[MTD] NAND: add subpage write support")
      Signed-off-by: Hector Palacios <hector.palacios@digi.com>
      Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Brian Norris <computersforpeace@gmail.com>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
      80d341fe
    • Will Deacon's avatar
      arm64: debug: unmask PSTATE.D earlier · 8ae0073f
      Will Deacon authored
      [ Upstream commit 2ce39ad1 ]
      
      Clearing PSTATE.D is one of the requirements for generating a debug
      exception. The arm64 booting protocol requires that PSTATE.D is set,
      since many of the debug registers (for example, the hw_breakpoint
      registers) are UNKNOWN out of reset and could potentially generate
      spurious, fatal debug exceptions in early boot code if PSTATE.D was
      clear. Once the debug registers have been safely initialised, PSTATE.D
      is cleared, however this is currently broken for two reasons:
      
      (1) The boot CPU clears PSTATE.D in a postcore_initcall and secondary
          CPUs clear PSTATE.D in secondary_start_kernel. Since the initcall
          runs after SMP (and the scheduler) have been initialised, there is
          no guarantee that it is actually running on the boot CPU. In this
          case, the boot CPU is left with PSTATE.D set and is not capable of
          generating debug exceptions.
      
      (2) In a preemptible kernel, we may explicitly schedule on the IRQ
          return path to EL1. If an IRQ occurs with PSTATE.D set in the idle
          thread, then we may schedule the kthread_init thread, run the
          postcore_initcall to clear PSTATE.D and then context switch back
          to the idle thread before returning from the IRQ. The exception
          return path will then restore PSTATE.D from the stack, and set it
          again.
      
      This patch fixes the problem by moving the clearing of PSTATE.D earlier
      to proc.S. This has the desirable effect of clearing it in one place for
      all CPUs, long before we have to worry about the scheduler or any
      exception handling. We ensure that the previous reset of MDSCR_EL1 has
      completed before unmasking the exception, so that any spurious
      exceptions resulting from UNKNOWN debug registers are not generated.
      
      Without this patch applied, the kprobes selftests have been seen to fail
      under KVM, where we end up attempting to step the OOL instruction buffer
      with PSTATE.D set and therefore fail to complete the step.
      
      Cc: <stable@vger.kernel.org>
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Reported-by: Catalin Marinas <catalin.marinas@arm.com>
      Tested-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
      Tested-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
      8ae0073f
    • Alim Akhtar's avatar
      rtc: s3c: Add s3c_rtc_{enable/disable}_clk in s3c_rtc_setfreq() · f88ded26
      Alim Akhtar authored
      [ Upstream commit 70c96dfa ]
      
      In the current code flow, s3c_rtc_setfreq() is called with the rtc clock
      disabled, yet it reads and writes h/w registers. This results in a
      kernel crash on the exynos7 platform while probing the rtc driver.
      The code flow is:
      s3c_rtc_probe()
          clk_prepare_enable(info->rtc_clk) // rtc clock enabled
          s3c_rtc_gettime() // will enable clk if not done, and disable it upon exit
          s3c_rtc_setfreq() //then this will be called with clk disabled
      
      This patch takes care of the issue by adding s3c_rtc_{enable/disable}_clk to
      s3c_rtc_setfreq().
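
      The shape of the fix can be illustrated with a toy model in which
      register access is only valid while the clock is enabled. Everything
      here (struct rtc_model, the helper names, the 'faulted' flag standing in
      for the crash) is hypothetical and is not the driver's code.

```c
#include <assert.h>
#include <stdbool.h>

struct rtc_model {
	int clk_enable_count;  /* stands in for the RTC clock enable state */
	bool faulted;          /* set when registers are touched with clock off */
	unsigned int freq;     /* last frequency value written */
};

static void rtc_enable_clk(struct rtc_model *r)
{
	r->clk_enable_count++;
}

static void rtc_disable_clk(struct rtc_model *r)
{
	assert(r->clk_enable_count > 0);
	r->clk_enable_count--;
}

/* Register access is only valid while the clock is running. */
static void rtc_write_freq_reg(struct rtc_model *r, unsigned int freq)
{
	if (r->clk_enable_count == 0)
		r->faulted = true;
	else
		r->freq = freq;
}

/* Pre-fix shape: assumes the caller left the clock enabled. */
static void setfreq_unguarded(struct rtc_model *r, unsigned int freq)
{
	rtc_write_freq_reg(r, freq);
}

/* Post-fix shape: brackets the register access with enable/disable. */
static void setfreq_guarded(struct rtc_model *r, unsigned int freq)
{
	rtc_enable_clk(r);
	rtc_write_freq_reg(r, freq);
	rtc_disable_clk(r);
}
```

      The guarded variant leaves the enable count balanced, so it is safe
      whether or not the caller already holds the clock.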
      
      Fixes: 24e14554 ("drivers/rtc/rtc-s3c.c: delete duplicate clock control")
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Alim Akhtar <alim.akhtar@samsung.com>
      Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
      Reviewed-by: Pankaj Dubey <pankaj.dubey@samsung.com>
      Tested-by: Pankaj Dubey <pankaj.dubey@samsung.com>
      Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
      f88ded26
    • Sasha Levin's avatar
      dm: fix second blk_delay_queue() parameter to be in msec units not jiffies · abf95692
      Sasha Levin authored
      [ Upstream commit bd9f55ea ]
      
      Commit d548b34b ("dm: reduce the queue delay used in dm_request_fn
      from 100ms to 10ms") always intended the value to be 10 msecs -- it
      just expressed it in jiffies because earlier commit 7eaceacc ("block:
      remove per-queue plugging") did.
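
      To see why the unit matters, the sketch below assumes the pre-fix call
      passed HZ / 100 (10 ms expressed in jiffies) to a parameter that is
      interpreted as milliseconds. The helper name is hypothetical; the
      point is only that the resulting delay then depends on the configured HZ.

```c
#include <assert.h>

/*
 * Delay, in milliseconds, that results when a jiffies count meant to
 * represent 10 ms (hz / 100) is read as a millisecond value.
 */
static unsigned int misread_delay_msecs(unsigned int hz)
{
	assert(hz > 0);
	return hz / 100;
}
```

      Under this assumption the delay is 10 ms only when HZ is 1000. With HZ
      at 250 it shrinks to 2 ms, and with HZ at 100 to 1 ms, whereas passing
      the constant 10 gives the intended delay for every HZ.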
      Signed-off-by: Tahsin Erdogan <tahsin@google.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Fixes: d548b34b ("dm: reduce the queue delay used in dm_request_fn from 100ms to 10ms")
      Cc: stable@vger.kernel.org # 4.1+ -- stable@ backports must be applied to drivers/md/dm.c
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
      abf95692
    • Herbert Xu's avatar
      crypto: scatterwalk - Fix test in scatterwalk_done · 81292598
      Herbert Xu authored
      [ Upstream commit 5f070e81 ]
      
      When there is more data to be processed, the current test in
      scatterwalk_done may prevent us from calling pagedone even when
      we should.
      
      In particular, if we're on an SG entry spanning multiple pages
      where the last page is not a full page, we will incorrectly skip
      calling pagedone on the second last page.
      
      This patch fixes this by adding a separate test for whether we've
      reached the end of a page.
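
      The failing case can be modelled in user space. The sketch below is a
      simplified reconstruction, not the kernel source: struct walk_model and
      both predicates are illustrative, and the "new" test is an assumed form
      that checks for end of entry and end of page explicitly.

```c
#include <assert.h>
#include <stdbool.h>

#define PAGE_SIZE 4096u

struct walk_model {
	unsigned int sg_offset;  /* start of the SG entry */
	unsigned int sg_length;  /* length of the SG entry */
	unsigned int offset;     /* current position of the walk */
};

/* Bytes usable from the current position: limited by page end and entry end. */
static unsigned int pagelen(const struct walk_model *w)
{
	unsigned int len = w->sg_offset + w->sg_length - w->offset;
	unsigned int len_this_page = PAGE_SIZE - (w->offset & (PAGE_SIZE - 1));

	return len_this_page > len ? len : len_this_page;
}

/* Old-style test: relies on pagelen being a multiple of PAGE_SIZE. */
static bool pagedone_old(const struct walk_model *w, bool more)
{
	return !(pagelen(w) & (PAGE_SIZE - 1)) || !more;
}

/* Assumed new-style test: explicit end-of-entry and end-of-page checks. */
static bool pagedone_new(const struct walk_model *w, bool more)
{
	return !more ||
	       w->offset >= w->sg_offset + w->sg_length ||
	       !(w->offset & (PAGE_SIZE - 1));
}
```

      For an entry of one full page plus 100 bytes, with the walk positioned
      at the page boundary, the remaining length is 100, so the old test skips
      pagedone while the new one calls it.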
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
      81292598
    • Amadeusz Sławiński's avatar
      Bluetooth: Fix l2cap_sock_setsockopt() with optname BT_RCVMTU · 98953c4c
      Amadeusz Sławiński authored
      [ Upstream commit 23bc6ab0 ]
      
      When we retrieve the imtu value from userspace we should use a 16-bit
      pointer cast instead of a 32-bit one, as that is how it is defined in the
      headers. This fixes setsockopt calls on big-endian platforms.
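
      The effect of the wrong pointer width can be shown without big-endian
      hardware by laying the bytes out explicitly. This is a sketch with
      hypothetical helper names; it models how a big-endian CPU would load 2
      or 4 bytes from the address of a 16-bit value.

```c
#include <assert.h>
#include <stdint.h>

/* Read n bytes starting at p as a big-endian integer. */
static uint32_t load_be(const uint8_t *p, unsigned int n)
{
	uint32_t v = 0;

	assert(n <= 4);
	for (unsigned int i = 0; i < n; i++)
		v = (v << 8) | p[i];
	return v;
}

/* Lay out a 16-bit value in big-endian order, followed by two unrelated bytes. */
static void store_u16_be(uint8_t buf[4], uint16_t val,
			 uint8_t pad0, uint8_t pad1)
{
	buf[0] = (uint8_t)(val >> 8);
	buf[1] = (uint8_t)(val & 0xff);
	buf[2] = pad0;
	buf[3] = pad1;
}
```

      A 2-byte load returns the value that was stored, while a 4-byte load
      shifts it into the upper half and mixes in whatever bytes follow it in
      memory.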
      Signed-off-by: Amadeusz Sławiński <amadeusz.slawinski@tieto.com>
      Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
      98953c4c
    • Cao, Lei's avatar
      KVM: VMX: handle PML full VMEXIT that occurs during event delivery · 5436aa6c
      Cao, Lei authored
      [ Upstream commit b244c9fc ]
      
      With PML enabled, the guest will shut down if a PML full VMEXIT occurs during
      event delivery. According to Intel SDM 27.2.3, a PML full VMEXIT can occur while
      an event is being delivered through the IDT, so KVM should not exit to user space
      with an error. Instead, it should let EXIT_REASON_PML_FULL go through, and the
      event will be re-injected on the next VMENTRY.
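
      The decision described above can be sketched as a predicate over the
      IDT-vectoring information and the exit reason. The helper name and the
      allow_pml_full switch are hypothetical, and the list of tolerated exit
      reasons is an assumption about the surrounding check rather than a copy
      of it. The numeric exit reasons follow the Intel SDM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Basic exit reason numbers as listed in the Intel SDM. */
#define EXIT_REASON_EXCEPTION_NMI  0u
#define EXIT_REASON_TASK_SWITCH    9u
#define EXIT_REASON_EPT_VIOLATION  48u
#define EXIT_REASON_PML_FULL       62u

/* Bit 31 of the IDT-vectoring information field marks it as valid. */
#define VECTORING_INFO_VALID_MASK  0x80000000u

/*
 * Hypothetical helper: true when an exit taken during event delivery
 * would be reported to user space as an internal error.
 */
static bool exit_is_internal_error(uint32_t vectoring_info,
				   uint32_t exit_reason, bool allow_pml_full)
{
	if (!(vectoring_info & VECTORING_INFO_VALID_MASK))
		return false;  /* no event was being delivered */

	switch (exit_reason) {
	case EXIT_REASON_EXCEPTION_NMI:
	case EXIT_REASON_EPT_VIOLATION:
	case EXIT_REASON_TASK_SWITCH:
		return false;
	case EXIT_REASON_PML_FULL:
		return !allow_pml_full;  /* false models the pre-fix behaviour */
	default:
		return true;
	}
}
```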
      Signed-off-by: Lei Cao <lei.cao@stratus.com>
      Cc: stable@vger.kernel.org
      Fixes: 843e4330 ("KVM: VMX: Add PML support in VMX")
      [Shortened the summary and Cc'd stable.]
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
      5436aa6c
    • Daniele Palmas's avatar
      USB: serial: option: add support for Telit LE910 PID 0x1206 · aef1e06d
      Daniele Palmas authored
      [ Upstream commit 3c0415fa ]
      
      This patch adds support for 0x1206 PID of Telit LE910.
      
      Since the interface positions are the same as the ones for the
      0x1043 PID of Telit LE922, telit_le922_blacklist_usbcfg3 is used.
      Signed-off-by: Daniele Palmas <dnlplm@gmail.com>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: Johan Hovold <johan@kernel.org>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
      aef1e06d
    • Michael Neuling's avatar
      powerpc/tm: Fix stack pointer corruption in __tm_recheckpoint() · 4af80d9f
      Michael Neuling authored
      [ Upstream commit 6bcb8014 ]
      
      At the start of __tm_recheckpoint() we save the kernel stack pointer
      (r1) in SPRG SCRATCH0 (SPRG2) so that we can restore it after the
      trecheckpoint.
      
      Unfortunately, the same SPRG is used in the SLB miss handler.  If an
      SLB miss is taken between the save and restore of r1 to the SPRG, the
      SPRG is changed and hence r1 is also corrupted.  We can end up with
      the following crash when we start using r1 again after the restore
      from the SPRG:
      
        Oops: Bad kernel stack pointer, sig: 6 [#1]
        SMP NR_CPUS=2048 NUMA pSeries
        CPU: 658 PID: 143777 Comm: htm_demo Tainted: G            EL   X 4.4.13-0-default #1
        task: c0000b56993a7810 ti: c00000000cfec000 task.ti: c0000b56993bc000
        NIP: c00000000004f188 LR: 00000000100040b8 CTR: 0000000010002570
        REGS: c00000000cfefd40 TRAP: 0300   Tainted: G            EL   X  (4.4.13-0-default)
        MSR: 8000000300001033 <SF,ME,IR,DR,RI,LE>  CR: 02000424  XER: 20000000
        CFAR: c000000000008468 DAR: 00003ffd84e66880 DSISR: 40000000 SOFTE: 0
        PACATMSCRATCH: 00003ffbc865e680
        GPR00: fffffffcfabc4268 00003ffd84e667a0 00000000100d8c38 000000030544bb80
        GPR04: 0000000000000002 00000000100cf200 0000000000000449 00000000100cf100
        GPR08: 000000000000c350 0000000000002569 0000000000002569 00000000100d6c30
        GPR12: 00000000100d6c28 c00000000e6a6b00 00003ffd84660000 0000000000000000
        GPR16: 0000000000000003 0000000000000449 0000000010002570 0000010009684f20
        GPR20: 0000000000800000 00003ffd84e5f110 00003ffd84e5f7a0 00000000100d0f40
        GPR24: 0000000000000000 0000000000000000 0000000000000000 00003ffff0673f50
        GPR28: 00003ffd84e5e960 00000000003d0f00 00003ffd84e667a0 00003ffd84e5e680
        NIP [c00000000004f188] restore_gprs+0x110/0x17c
        LR [00000000100040b8] 0x100040b8
        Call Trace:
        Instruction dump:
        f8a1fff0 e8e700a8 38a00000 7ca10164 e8a1fff8 e821fff0 7c0007dd 7c421378
        7db142a6 7c3242a6 38800002 7c810164 <e9c100e0> e9e100e8 ea0100f0 ea2100f8
      
      We hit this on large memory machines (> 2TB) but it can also be hit on
      smaller machines when 1TB segments are disabled.
      
      To hit this, you also need to be virtualised to ensure SLBs are
      periodically removed by the hypervisor.
      
      This patch moves the saving of r1 to the SPRG into the region where we
      are guaranteed not to take any further SLB misses.
      
      Fixes: 98ae22e1 ("powerpc: Add helper functions for transactional memory context switching")
      Cc: stable@vger.kernel.org # v3.9+
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Acked-by: Cyril Bur <cyrilbur@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
      4af80d9f
    • Michael Neuling's avatar
      powerpc/tm: Avoid SLB faults in treclaim/trecheckpoint when RI=0 · 572c8b69
      Michael Neuling authored
      [ Upstream commit 190ce869 ]
      
      Currently we have 2 segments that are bolted for the kernel linear
      mapping (ie 0xc000... addresses). This is 0 to 1TB and also the kernel
      stacks. Anything accessed outside of these regions may need to be
      faulted in. (In practice machines with TM always have 1T segments)
      
      If a machine has < 2TB of memory we never fault on the kernel linear
      mapping as these two segments cover all physical memory. If a machine
      has > 2TB of memory, there may be structures outside of these two
      segments that need to be faulted in. This faulting can occur when
      running as a guest as the hypervisor may remove any SLB that's not
      bolted.
      
      When we treclaim and trecheckpoint we have a window where we need to
      run with the userspace GPRs. This means that we no longer have a valid
      stack pointer in r1. For this window we therefore clear MSR RI to
      indicate that any exceptions taken at this point won't be able to be
      handled. This means that we can't take segment misses in this RI=0
      window.
      
      In this RI=0 region, we currently access the thread_struct for the
      process being context switched to or from. This thread_struct access
      may cause a segment fault since it's not guaranteed to be covered by
      the two bolted segment entries described above.
      
      We've seen this with a crash when running as a guest with > 2TB of
      memory on PowerVM:
      
        Unrecoverable exception 4100 at c00000000004f138
        Oops: Unrecoverable exception, sig: 6 [#1]
        SMP NR_CPUS=2048 NUMA pSeries
        CPU: 1280 PID: 7755 Comm: kworker/1280:1 Tainted: G                 X 4.4.13-46-default #1
        task: c000189001df4210 ti: c000189001d5c000 task.ti: c000189001d5c000
        NIP: c00000000004f138 LR: 0000000010003a24 CTR: 0000000010001b20
        REGS: c000189001d5f730 TRAP: 4100   Tainted: G                 X  (4.4.13-46-default)
        MSR: 8000000100001031 <SF,ME,IR,DR,LE>  CR: 24000048  XER: 00000000
        CFAR: c00000000004ed18 SOFTE: 0
        GPR00: ffffffffc58d7b60 c000189001d5f9b0 00000000100d7d00 000000003a738288
        GPR04: 0000000000002781 0000000000000006 0000000000000000 c0000d1f4d889620
        GPR08: 000000000000c350 00000000000008ab 00000000000008ab 00000000100d7af0
        GPR12: 00000000100d7ae8 00003ffe787e67a0 0000000000000000 0000000000000211
        GPR16: 0000000010001b20 0000000000000000 0000000000800000 00003ffe787df110
        GPR20: 0000000000000001 00000000100d1e10 0000000000000000 00003ffe787df050
        GPR24: 0000000000000003 0000000000010000 0000000000000000 00003fffe79e2e30
        GPR28: 00003fffe79e2e68 00000000003d0f00 00003ffe787e67a0 00003ffe787de680
        NIP [c00000000004f138] restore_gprs+0xd0/0x16c
        LR [0000000010003a24] 0x10003a24
        Call Trace:
        [c000189001d5f9b0] [c000189001d5f9f0] 0xc000189001d5f9f0 (unreliable)
        [c000189001d5fb90] [c00000000001583c] tm_recheckpoint+0x6c/0xa0
        [c000189001d5fbd0] [c000000000015c40] __switch_to+0x2c0/0x350
        [c000189001d5fc30] [c0000000007e647c] __schedule+0x32c/0x9c0
        [c000189001d5fcb0] [c0000000007e6b58] schedule+0x48/0xc0
        [c000189001d5fce0] [c0000000000deabc] worker_thread+0x22c/0x5b0
        [c000189001d5fd80] [c0000000000e7000] kthread+0x110/0x130
        [c000189001d5fe30] [c000000000009538] ret_from_kernel_thread+0x5c/0xa4
        Instruction dump:
        7cb103a6 7cc0e3a6 7ca222a6 78a58402 38c00800 7cc62838 08860000 7cc000a6
        38a00006 78c60022 7cc62838 0b060000 <e8c701a0> 7ccff120 e8270078 e8a70098
        ---[ end trace 602126d0a1dedd54 ]---
      
      This patch fixes the problem by copying the required data from the thread_struct to
      the stack before we clear MSR RI. Then once we clear RI, we only access
      the stack, guaranteeing there's no segment miss.
      
      We also tighten the region over which we set RI=0 on the treclaim()
      path. This may have a slight performance impact since we're adding an
      mtmsr instruction.
      
      Fixes: 090b9284 ("powerpc/tm: Clear MSR RI in non-recoverable TM code")
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Reviewed-by: Cyril Bur <cyrilbur@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
      572c8b69
    • Vegard Nossum's avatar
      ext4: short-cut orphan cleanup on error · 881052c2
      Vegard Nossum authored
      [ Upstream commit c65d5c6c ]
      
      If we encounter a filesystem error during orphan cleanup, we should stop.
      Otherwise, we may end up in an infinite loop where the same inode is
      processed again and again.
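
      The loop behaviour can be illustrated with a toy model. In this sketch
      a failed cleanup sets an error flag and leaves the inode at the head of
      the list, so without an early exit the same inode is retried forever.
      All names are hypothetical, and max_iters exists only so the unguarded
      case terminates in the example.

```c
#include <assert.h>
#include <stdbool.h>

struct fs_model {
	bool error_flag;    /* stands in for an "errors detected" state */
	int fail_on_inode;  /* cleaning up this inode number raises an error */
};

/* Cleaning a corrupted inode flags an error and leaves it on the list. */
static bool cleanup_one(struct fs_model *fs, int ino)
{
	if (ino == fs->fail_on_inode) {
		fs->error_flag = true;
		return false;
	}
	return true;
}

/* Returns the number of loop iterations performed. */
static int orphan_cleanup(struct fs_model *fs, const int *orphans, int n,
			  bool stop_on_error, int max_iters)
{
	int i = 0, iters = 0;

	while (i < n && iters < max_iters) {
		iters++;
		/* Short-cut: give up as soon as an error has been recorded. */
		if (stop_on_error && fs->error_flag)
			break;
		if (cleanup_one(fs, orphans[i]))
			i++;  /* inode removed from the list, move on */
	}
	return iters;
}
```

      With the short-cut enabled the loop ends on the iteration after the
      error is recorded. Without it, only the iteration cap stops the loop.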
      
          EXT4-fs (loop0): warning: checktime reached, running e2fsck is recommended
          EXT4-fs error (device loop0): ext4_mb_generate_buddy:758: group 2, block bitmap and bg descriptor inconsistent: 6117 vs 0 free clusters
          Aborting journal on device loop0-8.
          EXT4-fs (loop0): Remounting filesystem read-only
          EXT4-fs error (device loop0) in ext4_free_blocks:4895: Journal has aborted
          EXT4-fs error (device loop0) in ext4_do_update_inode:4893: Journal has aborted
          EXT4-fs error (device loop0) in ext4_do_update_inode:4893: Journal has aborted
          EXT4-fs error (device loop0) in ext4_ext_remove_space:3068: IO failure
          EXT4-fs error (device loop0) in ext4_ext_truncate:4667: Journal has aborted
          EXT4-fs error (device loop0) in ext4_orphan_del:2927: Journal has aborted
          EXT4-fs error (device loop0) in ext4_do_update_inode:4893: Journal has aborted
          EXT4-fs (loop0): Inode 16 (00000000618192a0): orphan list check failed!
          [...]
          EXT4-fs (loop0): Inode 16 (0000000061819748): orphan list check failed!
          [...]
          EXT4-fs (loop0): Inode 16 (0000000061819bf0): orphan list check failed!
          [...]
      
      See-also: c9eb13a9 ("ext4: fix hang when processing corrupted orphaned inode list")
      Cc: Jan Kara <jack@suse.cz>
      Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@vger.kernel.org
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
      881052c2