1. 09 Sep, 2018 40 commits
    • crypto: vmx - Fix sleep-in-atomic bugs · 9f830cf2
      Ondrej Mosnacek authored
      commit 0522236d upstream.
      
      This patch fixes sleep-in-atomic bugs in AES-CBC and AES-XTS VMX
      implementations. The problem is that the blkcipher_* functions should
      not be called in atomic context.
      
      The bugs can be reproduced via the AF_ALG interface by trying to
      encrypt/decrypt sufficiently large buffers (at least 64 KiB) using the
      VMX implementations of 'cbc(aes)' or 'xts(aes)'. Such operations then
      trigger BUG in crypto_yield():
      
      [  891.863680] BUG: sleeping function called from invalid context at include/crypto/algapi.h:424
      [  891.864622] in_atomic(): 1, irqs_disabled(): 0, pid: 12347, name: kcapi-enc
      [  891.864739] 1 lock held by kcapi-enc/12347:
      [  891.864811]  #0: 00000000f5d42c46 (sk_lock-AF_ALG){+.+.}, at: skcipher_recvmsg+0x50/0x530
      [  891.865076] CPU: 5 PID: 12347 Comm: kcapi-enc Not tainted 4.19.0-0.rc0.git3.1.fc30.ppc64le #1
      [  891.865251] Call Trace:
      [  891.865340] [c0000003387578c0] [c000000000d67ea4] dump_stack+0xe8/0x164 (unreliable)
      [  891.865511] [c000000338757910] [c000000000172a58] ___might_sleep+0x2f8/0x310
      [  891.865679] [c000000338757990] [c0000000006bff74] blkcipher_walk_done+0x374/0x4a0
      [  891.865825] [c0000003387579e0] [d000000007e73e70] p8_aes_cbc_encrypt+0x1c8/0x260 [vmx_crypto]
      [  891.865993] [c000000338757ad0] [c0000000006c0ee0] skcipher_encrypt_blkcipher+0x60/0x80
      [  891.866128] [c000000338757b10] [c0000000006ec504] skcipher_recvmsg+0x424/0x530
      [  891.866283] [c000000338757bd0] [c000000000b00654] sock_recvmsg+0x74/0xa0
      [  891.866403] [c000000338757c10] [c000000000b00f64] ___sys_recvmsg+0xf4/0x2f0
      [  891.866515] [c000000338757d90] [c000000000b02bb8] __sys_recvmsg+0x68/0xe0
      [  891.866631] [c000000338757e30] [c00000000000bbe4] system_call+0x5c/0x70
      
      Fixes: 8c755ace ("crypto: vmx - Adding CBC routines for VMX module")
      Fixes: c07f5d3d ("crypto: vmx - Adding support for XTS")
      Cc: stable@vger.kernel.org
      Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • perf auxtrace: Fix queue resize · 300ec47a
      Adrian Hunter authored
      commit 99cbbe56 upstream.
      
      When the number of queues grows beyond 32, the array of queues is
      resized but not all members were being copied. Fix by also copying
      'tid', 'cpu' and 'set'.

      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: stable@vger.kernel.org
      Fixes: e5027893 ("perf auxtrace: Add helpers for queuing AUX area tracing data")
      Link: http://lkml.kernel.org/r/20180814084608.6563-1-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
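
The resize bug described in the perf auxtrace commit above can be illustrated with a minimal userspace sketch. The struct layout, the `priv` member and the function name `queues_grow` are assumptions made for illustration; only the member names 'tid', 'cpu' and 'set' come from the commit message, and this is not the actual perf code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a per-queue record. */
struct queue {
	int   tid;   /* thread the queue belongs to */
	int   cpu;   /* CPU the queue belongs to */
	int   set;   /* non-zero once tid/cpu have been assigned */
	void *priv;  /* assumed extra member, for illustration only */
};

/*
 * Grow an array of queues from old_nr to new_nr entries.
 * Returns the new array (the old one is freed), or NULL on failure,
 * in which case the old array is left untouched.
 */
static struct queue *queues_grow(struct queue *old, size_t old_nr, size_t new_nr)
{
	struct queue *q;
	size_t i;

	assert(new_nr >= old_nr);

	q = calloc(new_nr, sizeof(*q));
	if (!q)
		return NULL;

	for (i = 0; i < old_nr; i++) {
		/* Copy every member; skipping any of them loses state. */
		q[i].priv = old[i].priv;
		q[i].tid  = old[i].tid;
		q[i].cpu  = old[i].cpu;
		q[i].set  = old[i].set;
	}

	free(old);
	return q;
}
```

If the three assignments to `tid`, `cpu` and `set` were left out, the copied entries would silently revert to zero after the first resize, which is the kind of loss the commit message describes.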
    • cap_inode_getsecurity: use d_find_any_alias() instead of d_find_alias() · 5a842ecc
      Eddie.Horng authored
      commit 355139a8 upstream.
      
      The code in cap_inode_getsecurity(), introduced by commit 8db6c34f
      ("Introduce v3 namespaced file capabilities"), should use
      d_find_any_alias() instead of d_find_alias() to handle unhashed dentries
      correctly. This is needed, for example, if execveat() is called with an
      open but unlinked overlayfs file, because overlayfs unhashes the dentry on
      unlink.

      This is a regression affecting a real-life application, first reported at
      https://www.spinics.net/lists/linux-unionfs/msg05363.html

      The reproducer and setup below reproduce the case:
        const char* exec="echo";
        const char *newargv[] = { "echo", "hello", NULL};
        const char *newenviron[] = { NULL };
        int fd, err;
      
        fd = open(exec, O_PATH);
        unlink(exec);
        err = syscall(322/*SYS_execveat*/, fd, "", newargv, newenviron,
                      AT_EMPTY_PATH);
        if(err<0)
          fprintf(stderr, "execveat: %s\n", strerror(errno));
      
      Compile with gcc into ~/test/a.out, then run:
      mount -t overlay -orw,lowerdir=/mnt/l,upperdir=/mnt/u,workdir=/mnt/w none /mnt/m
      cd /mnt/m
      cp /bin/echo .
      ~/test/a.out
      
      Expected result:
      hello
      Actual result:
      execveat: Invalid argument
      dmesg:
      Invalid argument reading file caps for /dev/fd/3
      
      The second reproducer and setup emulate a similar case on a
      regular filesystem:
        const char* exec="echo";
        int fd, err;
        char buf[256];
      
        fd = open(exec, O_RDONLY);
        unlink(exec);
        err = fgetxattr(fd, "security.capability", buf, 256);
        if(err<0)
          fprintf(stderr, "fgetxattr: %s\n", strerror(errno));
      
      Compile with gcc into ~/test_fgetxattr, then run:
      
      cd /tmp
      cp /bin/echo .
      ~/test_fgetxattr
      
      Result:
      fgetxattr: Invalid argument
      
      On a regular filesystem such as ext4, execveat() reads the xattr from
      disk and does not trigger this issue. The overlayfs xattr handler,
      however, passes the real dentry to vfs_getxattr(), which does.
      This reproducer calls fgetxattr() on an unlinked fd, which invokes
      vfs_getxattr() and so reproduces the case where d_find_alias() in
      cap_inode_getsecurity() cannot find the unlinked dentry.

      Suggested-by: Amir Goldstein <amir73il@gmail.com>
      Acked-by: Amir Goldstein <amir73il@gmail.com>
      Acked-by: Serge E. Hallyn <serge@hallyn.com>
      Fixes: 8db6c34f ("Introduce v3 namespaced file capabilities")
      Cc: <stable@vger.kernel.org> # v4.14
      Signed-off-by: Eddie Horng <eddie.horng@mediatek.com>
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • bcache: release dc->writeback_lock properly in bch_writeback_thread() · d1a265da
      Shan Hai authored
      commit 3943b040 upstream.
      
      The writeback thread would exit with a lock held when the cache device
      is detached via sysfs interface, fix it by releasing the held lock
      before exiting the while-loop.
      
      Fixes: fadd94e0 (bcache: quit dc->writeback_thread when BCACHE_DEV_DETACHING is set)
      Signed-off-by: Shan Hai <shan.hai@oracle.com>
      Signed-off-by: Coly Li <colyli@suse.de>
      Tested-by: Shenghui Wang <shhuiw@foxmail.com>
      Cc: stable@vger.kernel.org #4.17+
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • libnvdimm: fix ars_status output length calculation · c8d875b4
      Vishal Verma authored
      commit 286e8771 upstream.
      
      Commit efda1b5d ("acpi, nfit, libnvdimm: fix / harden ars_status output
      length handling") introduced additional hardening for ambiguity in the
      ACPI spec for ars_status output sizing. However, it had a couple of cases
      mixed up.
      Where it should have been checking for (and returning) "out_field[1] -
      4" it was using "out_field[1] - 8" and vice versa.
      
      This caused a four byte discrepancy in the buffer size passed on to
      the command handler, and in some cases, this caused memory corruption
      like:
      
        ./daxdev-errors.sh: line 76: 24104 Aborted   (core dumped) ./daxdev-errors $busdev $region
        malloc(): memory corruption
        Program received signal SIGABRT, Aborted.
        [...]
        #5  0x00007ffff7865a2e in calloc () from /lib64/libc.so.6
        #6  0x00007ffff7bc2970 in ndctl_bus_cmd_new_ars_status (ars_cap=ars_cap@entry=0x6153b0) at ars.c:136
        #7  0x0000000000401644 in check_ars_status (check=0x7fffffffdeb0, bus=0x604c20) at daxdev-errors.c:144
        #8  test_daxdev_clear_error (region_name=<optimized out>, bus_name=<optimized out>)
            at daxdev-errors.c:332
      
      Cc: <stable@vger.kernel.org>
      Cc: Dave Jiang <dave.jiang@intel.com>
      Cc: Keith Busch <keith.busch@intel.com>
      Cc: Lukasz Dorau <lukasz.dorau@intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Fixes: efda1b5d ("acpi, nfit, libnvdimm: fix / harden ars_status output length handling")
      Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
      Reviewed-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Dave Jiang <dave.jiang@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • getxattr: use correct xattr length · ff0791f4
      Christian Brauner authored
      commit 82c9a927 upstream.
      
      When running in a container with a user namespace, if you call getxattr
      with name = "system.posix_acl_access" and size % 8 != 4, then getxattr
      silently skips the user namespace fixup that it normally does resulting in
      un-fixed-up data being returned.
      This is caused by posix_acl_fix_xattr_to_user() being passed the total
      buffer size and not the actual size of the xattr as returned by
      vfs_getxattr().
      This commit passes the actual length of the xattr as returned by
      vfs_getxattr() down.
      
      A reproducer for the issue is:
      
        touch acl_posix
      
        setfacl -m user:0:rwx acl_posix
      
      and then compile:
      
        #define _GNU_SOURCE
        #include <errno.h>
        #include <stdio.h>
        #include <stdlib.h>
        #include <string.h>
        #include <sys/types.h>
        #include <unistd.h>
        #include <sys/xattr.h>
      
        /* Run in user namespace with nsuid 0 mapped to uid != 0 on the host. */
        int main(int argc, char **argv)
        {
                ssize_t ret1, ret2;
                char buf1[128], buf2[132];
                int fret = EXIT_SUCCESS;
                char *file;
      
                if (argc < 2) {
                        fprintf(stderr,
                                "Please specify a file with "
                                "\"system.posix_acl_access\" permissions set\n");
                        _exit(EXIT_FAILURE);
                }
                file = argv[1];
      
                ret1 = getxattr(file, "system.posix_acl_access",
                                buf1, sizeof(buf1));
                if (ret1 < 0) {
                        fprintf(stderr, "%s - Failed to retrieve "
                                        "\"system.posix_acl_access\" "
                                        "from \"%s\"\n", strerror(errno), file);
                        _exit(EXIT_FAILURE);
                }
      
                ret2 = getxattr(file, "system.posix_acl_access",
                                buf2, sizeof(buf2));
                if (ret2 < 0) {
                        fprintf(stderr, "%s - Failed to retrieve "
                                        "\"system.posix_acl_access\" "
                                        "from \"%s\"\n", strerror(errno), file);
                        _exit(EXIT_FAILURE);
                }
      
                if (ret1 != ret2) {
                        fprintf(stderr, "The value of \"system.posix_acl_"
                                        "access\" for file \"%s\" changed "
                                        "between two successive calls\n", file);
                        _exit(EXIT_FAILURE);
                }
      
                for (ssize_t i = 0; i < ret2; i++) {
                        if (buf1[i] == buf2[i])
                                continue;
      
                        fprintf(stderr,
                                "Unexpected different in byte %zd: "
                                "%02x != %02x\n", i, buf1[i], buf2[i]);
                        fret = EXIT_FAILURE;
                }
      
                if (fret == EXIT_SUCCESS)
                        fprintf(stderr, "Test passed\n");
                else
                        fprintf(stderr, "Test failed\n");
      
                _exit(fret);
        }

      and run:
      
        ./tester acl_posix
      
      On a non-fixed up kernel this should return something like:
      
        root@c1:/# ./t
        Unexpected different in byte 16: ffffffa0 != 00
        Unexpected different in byte 17: ffffff86 != 00
        Unexpected different in byte 18: 01 != 00
      
      and on a fixed kernel:
      
        root@c1:~# ./t
        Test passed
      
      Cc: stable@vger.kernel.org
      Fixes: 2f6f0654 ("userns: Convert vfs posix_acl support to use kuids and kgids")
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=199945
      Reported-by: Colin Watson <cjwatson@ubuntu.com>
      Signed-off-by: Christian Brauner <christian@brauner.io>
      Acked-by: Serge Hallyn <serge@hallyn.com>
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
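
The "size % 8 != 4" condition in the getxattr commit above follows from the layout of a POSIX ACL xattr value: a 4-byte header followed by 8-byte entries. The sketch below shows why a size-based sanity check gives different answers for the two buffer sizes used in the reproducer (128 and 132). The helper names are hypothetical, the 44-byte value mentioned afterwards is an assumed example, and this is a userspace illustration rather than the kernel's code.

```c
#include <assert.h>
#include <stddef.h>

/* POSIX ACL xattr value layout: 4-byte header, then 8-byte entries. */
#define ACL_XATTR_HEADER_SIZE 4
#define ACL_XATTR_ENTRY_SIZE  8

/*
 * Hypothetical helper: number of ACL entries in an xattr value of
 * `size` bytes, or -1 if `size` cannot be a well-formed value.
 */
static int acl_xattr_entry_count(size_t size)
{
	if (size < ACL_XATTR_HEADER_SIZE)
		return -1;
	size -= ACL_XATTR_HEADER_SIZE;
	if (size % ACL_XATTR_ENTRY_SIZE)
		return -1;
	return (int)(size / ACL_XATTR_ENTRY_SIZE);
}

/* Returns 1 if a fixup keyed on `size` would run, 0 if it would be skipped. */
static int fixup_would_run(size_t size)
{
	return acl_xattr_entry_count(size) >= 0;
}
```

With the buffer size passed in, `fixup_would_run(128)` is 0 and `fixup_would_run(132)` is 1, so the two calls in the reproducer can disagree. Passing the actual value length instead (for instance 44 bytes for a five-entry ACL) gives the same answer regardless of how large the caller's buffer is.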
    • udlfb: set optimal write delay · 19b99719
      Mikulas Patocka authored
      commit bb24153a upstream.
      
      The default delay of 5 jiffies is too long when the kernel is compiled
      with HZ=100; it results in a jumpy cursor in X Window.
      
      In order to find out the optimal delay, I benchmarked the driver on
      1280x720x30fps video. I found out that with HZ=1000, 10ms is acceptable,
      but with HZ=250 or HZ=300, we need 4ms, so that the video is played
      without any frame skips.
      
      This patch changes the delay to this value.

      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
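
The udlfb commit above turns on the fact that a delay expressed in jiffies means a different wall-clock time for each HZ setting. A small worked conversion, using a hypothetical helper name `jiffies_to_ms` (not the kernel's own conversion routine), makes the numbers concrete:

```c
#include <assert.h>

/*
 * Duration of `jiffies` timer ticks in milliseconds for a tick rate of
 * `hz` ticks per second (integer division, rounds down).
 */
static unsigned int jiffies_to_ms(unsigned int jiffies, unsigned int hz)
{
	assert(hz > 0);
	return jiffies * 1000u / hz;
}
```

Under this arithmetic, 5 jiffies is 50 ms at HZ=100, 20 ms at HZ=250 and 5 ms at HZ=1000, which is why a fixed jiffies count behaves so differently across kernel configurations.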
    • fb: fix lost console when the user unplugs a USB adapter · d0f2eb3a
      Mikulas Patocka authored
      commit 8c5b0442 upstream.
      
      I have a USB display adapter using the udlfb driver and I use it on an ARM
      board that doesn't have any graphics card. When I plug the adapter in, the
      console is properly displayed, however when I unplug and re-plug the
      adapter, the console is not displayed and I can't access it until I reboot
      the board.
      
      The reason is this:
      When the adapter is unplugged, dlfb_usb_disconnect calls
      unlink_framebuffer, then it waits until the reference count drops to zero
      and then it deallocates the framebuffer. However, the console that is
      attached to the framebuffer device keeps the reference count non-zero, so
      the framebuffer device is never destroyed. When the USB adapter is plugged
      again, it creates a new device /dev/fb1 and the console is not attached to
      it.
      
      This patch fixes the bug by unbinding the console from unlink_framebuffer.
      The code to unbind the console is moved from do_unregister_framebuffer to
      a function unbind_console. When the console is unbound, the reference
      count drops to zero and the udlfb driver frees the framebuffer. When the
      adapter is plugged back, a new framebuffer is created and the console is
      attached to it.

      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Cc: Dave Airlie <airlied@redhat.com>
      Cc: Bernie Thompson <bernie@plugable.com>
      Cc: Ladislav Michl <ladis@linux-mips.org>
      Cc: stable@vger.kernel.org
      [b.zolnierkie: preserve old behavior for do_unregister_framebuffer()]
      Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • pwm: tiehrpwm: Fix disabling of output of PWMs · 9b0dd656
      Vignesh R authored
      commit 38dabd91 upstream.
      
      pwm-tiehrpwm driver disables PWM output by putting it in low output
      state via active AQCSFRC register in ehrpwm_pwm_disable(). But, the
      AQCSFRC shadow register is not updated. Therefore, when shadow AQCSFRC
      register is re-enabled in ehrpwm_pwm_enable() (say to enable second PWM
      output), previous settings are lost as shadow register value is loaded
      into active register. This results in things like PWMA getting enabled
      automatically, when PWMB is enabled and vice versa. Fix this by
      updating AQCSFRC shadow register as well during ehrpwm_pwm_disable().
      
      Fixes: 19891b20 ("pwm: pwm-tiehrpwm: PWM driver support for EHRPWM")
      Cc: stable@vger.kernel.org
      Signed-off-by: Vignesh R <vigneshr@ti.com>
      Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • pwm: tiehrpwm: Don't use emulation mode bits to control PWM output · 0ef9c771
      Vignesh R authored
      commit aa49d628 upstream.
      
      As per AM335x TRM SPRUH73P "15.2.2.11 ePWM Behavior During Emulation",
      TBCTL[15:14] only have effect during emulation suspend events (IOW,
      to stop PWM when debugging using a debugger). These bits have no effect
      on PWM output during normal running of system. Hence, remove code
      accessing these bits as they have no role in enabling/disabling PWMs.
      
      Fixes: 19891b20 ("pwm: pwm-tiehrpwm: PWM driver support for EHRPWM")
      Cc: stable@vger.kernel.org
      Signed-off-by: Vignesh R <vigneshr@ti.com>
      Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ubifs: Fix synced_i_size calculation for xattr inodes · 63bbaa14
      Richard Weinberger authored
      commit 59965593 upstream.
      
      In ubifs_jnl_update() we sync parent and child inodes to the flash.
      In the case of xattrs, the parent inode (AKA host inode) has a non-zero
      data_len, so we need to adjust synced_i_size too.

      This issue was reported by the ubifs self tests under an xattr-related
      workload:
      UBIFS error (ubi0:0 pid 1896): dbg_check_synced_i_size: ui_size is 4, synced_i_size is 0, but inode is clean
      UBIFS error (ubi0:0 pid 1896): dbg_check_synced_i_size: i_ino 65, i_mode 0x81a4, i_size 4
      
      Cc: <stable@vger.kernel.org>
      Fixes: 1e51764a ("UBIFS: add new flash file system")
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ubifs: xattr: Don't operate on deleted inodes · 8a23348d
      Richard Weinberger authored
      commit 11a6fc3d upstream.
      
      xattr operations can race with unlink and the following assert triggers:
      UBIFS assert failed in ubifs_jnl_change_xattr at 1606 (pid 6256)
      
      Fix this by checking i_nlink before working on the host inode.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 1e51764a ("UBIFS: add new flash file system")
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ubifs: Check data node size before truncate · f6d7acc1
      Richard Weinberger authored
      commit 95a22d20 upstream.
      
      Check whether the size is within bounds before using it.
      If the size is not correct, abort and dump the bad data node.
      
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Silvio Cesare <silvio.cesare@gmail.com>
      Cc: stable@vger.kernel.org
      Fixes: 1e51764a ("UBIFS: add new flash file system")
      Reported-by: Silvio Cesare <silvio.cesare@gmail.com>
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Revert "UBIFS: Fix potential integer overflow in allocation" · 3259dd71
      Richard Weinberger authored
      commit 08acbdd6 upstream.
      
      This reverts commit 353748a3.
      It bypassed the linux-mtd review process and does not fix the issue the
      way it should.
      
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Silvio Cesare <silvio.cesare@gmail.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ubifs: Fix memory leak in lprobs self-check · a230db38
      Richard Weinberger authored
      commit eef19816 upstream.
      
      Allocate the buffer after we return early.
      Otherwise memory is being leaked.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 1e51764a ("UBIFS: add new flash file system")
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
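
The ordering rule from the ubifs self-check commit above (do the early return first, allocate afterwards) can be sketched in userspace as follows. All names here, including `self_check`, `buf_alloc` and the allocation counter, are hypothetical and exist only to make the leak observable; this is not the UBIFS code.

```c
#include <assert.h>
#include <stdlib.h>

static int allocations_outstanding; /* buffers allocated but not yet freed */

static void *buf_alloc(size_t n)
{
	void *p = malloc(n);

	if (p)
		allocations_outstanding++;
	return p;
}

static void buf_free(void *p)
{
	if (p)
		allocations_outstanding--;
	free(p);
}

/* Hypothetical self-check that does nothing when checks are disabled. */
static int self_check(int checks_enabled)
{
	void *buf;

	/* The early return comes first, so there is nothing to free here. */
	if (!checks_enabled)
		return 0;

	buf = buf_alloc(4096);
	if (!buf)
		return -1;

	/* ... the actual checking would use buf here ... */

	buf_free(buf);
	return 0;
}
```

Had the allocation been placed before the `checks_enabled` test, every call with checks disabled would leave one buffer outstanding.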
    • userns: move user access out of the mutex · 656d6e6f
      Jann Horn authored
      commit 5820f140 upstream.
      
      The old code would hold the userns_state_mutex indefinitely if
      memdup_user_nul stalled due to e.g. a userfault region. Prevent that by
      moving the memdup_user_nul in front of the mutex_lock().
      
      Note: This changes the error precedence of invalid buf/count/*ppos vs
      map already written / capabilities missing.
      
      Fixes: 22d917d8 ("userns: Rework the user_namespace adding uid/gid...")
      Cc: stable@vger.kernel.org
      Signed-off-by: Jann Horn <jannh@google.com>
      Acked-by: Christian Brauner <christian@brauner.io>
      Acked-by: Serge Hallyn <serge@hallyn.com>
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • sys: don't hold uts_sem while accessing userspace memory · b692c405
      Jann Horn authored
      commit 42a0cc34 upstream.
      
      Holding uts_sem as a writer while accessing userspace memory allows a
      namespace admin to stall all processes that attempt to take uts_sem.
      Instead, move data through stack buffers and don't access userspace memory
      while uts_sem is held.
      
      Cc: stable@vger.kernel.org
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Jann Horn <jannh@google.com>
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • iommu/vt-d: Fix dev iotlb pfsid use · c2ea292b
      Jacob Pan authored
      commit 1c48db44 upstream.
      
      PFSID should be used in the invalidation descriptor for flushing
      device IOTLBs on SRIOV VFs.

      Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
      Cc: stable@vger.kernel.org
      Cc: "Ashok Raj" <ashok.raj@intel.com>
      Cc: "Lu Baolu" <baolu.lu@linux.intel.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • iommu/vt-d: Add definitions for PFSID · eb58c404
      Jacob Pan authored
      commit 0f725561 upstream.
      
      When SRIOV VF device IOTLB is invalidated, we need to provide
      the PF source ID such that IOMMU hardware can gauge the depth
      of invalidation queue which is shared among VFs. This is needed
      when device invalidation throttle (DIT) capability is supported.
      
      This patch adds bit definitions for checking and tracking PFSID.

      Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
      Cc: stable@vger.kernel.org
      Cc: "Ashok Raj" <ashok.raj@intel.com>
      Cc: "Lu Baolu" <baolu.lu@linux.intel.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mm/tlb: Remove tlb_remove_table() non-concurrent condition · 7cf82f3b
      Peter Zijlstra authored
      commit a6f57208 upstream.
      
      Will noted that only checking mm_users is incorrect; we should also
      check mm_count in order to cover CPUs that have a lazy reference to
      this mm (and could do speculative TLB operations).
      
      If removing this turns out to be a performance issue, we can
      re-instate a more complete check, but in tlb_table_flush() eliding the
      call_rcu_sched().
      
      Fixes: 26723911 ("mm, powerpc: move the RCU page-table freeing into generic code")
      Reported-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Rik van Riel <riel@surriel.com>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: stable@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ARM: tegra: Fix Tegra30 Cardhu PCA954x reset · ddcb9270
      Jon Hunter authored
      commit 6e181190 upstream.
      
      On all versions of Tegra30 Cardhu, the reset signal to the NXP PCA9546
      I2C mux is connected to the Tegra GPIO BB0. Currently, this pin on the
      Tegra is not configured as a GPIO but as a special-function IO (SFIO)
      that is multiplexing the pin to an I2S controller. On exiting system
      suspend, I2C commands sent to the PCA9546 are failing because there is
      no ACK. Although it is not possible to see exactly what is happening
      to the reset during suspend, by ensuring it is configured as a GPIO
      and driven high, to de-assert the reset, the failures are no longer
      seen.
      
      Please note that this GPIO is also used to drive the reset signal
      going to the camera connector on the board. However, given that there
      is no camera support currently for Cardhu, this should not have any
      impact.
      
      Fixes: 40431d16 ("ARM: tegra: enable PCA9546 on Cardhu")
      Cc: stable@vger.kernel.org
      Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
      Signed-off-by: Thierry Reding <treding@nvidia.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • NFSv4: Fix a sleep in atomic context in nfs4_callback_sequence() · d453f04e
      Trond Myklebust authored
      commit 8618289c upstream.
      
      We must drop the lock before we can sleep in referring_call_exists().

      Reported-by: Jia-Ju Bai <baijiaju1990@gmail.com>
      Fixes: 045d2a6d ("NFSv4.1: Delay callback processing...")
      Cc: stable@vger.kernel.org # v4.9+
      Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • NFSv4: Fix locking in pnfs_generic_recover_commit_reqs · c5759d5a
      Trond Myklebust authored
      commit d0fbb1d8 upstream.
      
      The use of the inode->i_lock was converted to a mutex, but we forgot
      to remove the old inode unlock/lock() pair that allowed the layout
      segment to be put inside the loop.

      Reported-by: Jia-Ju Bai <baijiaju1990@gmail.com>
      Fixes: e824f99a ("NFSv4: Use a mutex to protect the per-inode commit...")
      Cc: stable@vger.kernel.org # v4.14+
      Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • NFSv4 client live hangs after live data migration recovery · bf23ba37
      Bill Baker authored
      commit 0f90be13 upstream.
      
      After a live data migration event at the NFS server, the client may send
      I/O requests to the wrong server, causing a live hang due to repeated
      recovery events.  On the wire, this will appear as an I/O request failing
      with NFS4ERR_BADSESSION, followed by successful CREATE_SESSION, repeatedly.
      NFS4ERR_BADSESSION is returned because the session ID being used was
      issued by the other server and is not valid at the old server.
      
      The failure is caused by async worker threads having cached the transport
      (xprt) in the rpc_task structure.  After the migration recovery completes,
      the task is redispatched and the task resends the request to the wrong
      server based on the old value still present in tk_xprt.
      
      The solution is to recompute the tk_xprt field of the rpc_task structure
      so that the request goes to the correct server.

      Signed-off-by: Bill Baker <bill.baker@oracle.com>
      Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
      Tested-by: Helen Chao <helen.chao@oracle.com>
      Fixes: fb43d172 ("SUNRPC: Use the multipath iterator to assign a ...")
      Cc: stable@vger.kernel.org # v4.9+
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • pnfs/blocklayout: off by one in bl_map_stripe() · ec13c53d
      Dan Carpenter authored
      commit 0914bb96 upstream.
      
      "dev->nr_children" is the number of children which were parsed
      successfully in bl_parse_stripe().  It could be all of them and then, in
      that case, it is equal to v->stripe.volumes_count.  Either way, the >
      should be >= so that we don't go beyond the end of what we're supposed
      to.
      
      Fixes: 5c83746a ("pnfs/blocklayout: in-kernel GETDEVICEINFO XDR parsing")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Cc: stable@vger.kernel.org # 3.17+
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
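
The off-by-one in the pnfs/blocklayout commit above comes down to which comparison rejects an out-of-range index. A minimal sketch, with a hypothetical `child_at` lookup over a plain array standing in for the real data structures:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical lookup: return a pointer to child `idx`, or NULL when
 * `idx` is out of range. Valid indices are 0 .. nr_children - 1, so the
 * rejection test must be `>=`; with `>` the index nr_children itself
 * would slip through and address one element past the end.
 */
static const int *child_at(const int *children, size_t nr_children, size_t idx)
{
	if (idx >= nr_children)
		return NULL;
	return &children[idx];
}
```

For an array of three children, index 2 is the last valid one and index 3 must be rejected, which is exactly the case a `>` comparison lets through.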
    • block, bfq: return nbytes and not zero from struct cftype .write() method · ed480f2b
      Maciej S. Szmigiero authored
      commit fc8ebd01 upstream.
      
      The value that struct cftype .write() method returns is then directly
      returned to userspace as the value returned by write() syscall, so it
      should be the number of bytes actually written (or consumed) and not zero.
      
      Returning zero from write() syscall makes programs like /bin/echo or bash
      spin.
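      
      To see why a zero return makes callers spin, consider a minimal
      userspace-style sketch of a "write everything" loop. The two handlers
      are hypothetical stand-ins for a cftype .write() method, and the
      max_calls cap exists only so the demonstration terminates.

```c
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical handler: the value is consumed, but 0 is reported. */
static ssize_t handler_returns_zero(const char *buf, size_t nbytes)
{
    (void)buf;
    (void)nbytes;
    return 0;
}

/* Hypothetical handler: reports everything as consumed. */
static ssize_t handler_returns_nbytes(const char *buf, size_t nbytes)
{
    (void)buf;
    return (ssize_t)nbytes;
}

/*
 * Typical "write it all" loop. Returns the number of calls needed, or -1 on
 * error or if the data is still not written after max_calls attempts (a real
 * caller without such a cap would loop forever).
 */
static int write_all(ssize_t (*wr)(const char *, size_t),
                     const char *buf, size_t len, int max_calls)
{
    int calls = 0;

    while (len > 0) {
        ssize_t n;

        if (calls == max_calls)
            return -1;
        n = wr(buf, len);
        calls++;
        if (n < 0)
            return -1;
        buf += n;
        len -= (size_t)n;
    }
    return calls;
}
```

      The handler that returns nbytes finishes in one call; the one that
      returns zero never makes progress and only stops because of the cap.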
      
      Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
      Fixes: e21b7a0b ("block, bfq: add full hierarchical scheduling and cgroups support")
      Cc: stable@vger.kernel.org
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Max Filippov's avatar
      xtensa: increase ranges in ___invalidate_{i,d}cache_all · fe806eb5
      Max Filippov authored
      commit fec3259c upstream.
      
      Cache invalidation macros use the cache line size to iterate over
      invalidated cache lines, assuming that all cache ways are invalidated by
      a single instruction, but the xtensa ISA recommends not assuming that,
      for future compatibility:
        In some implementations all ways at index Addr[y-1..z] are invalidated
        regardless of the specified way, but for future compatibility this
        behavior should not be assumed.
      
      Iterate over all cache ways in ___invalidate_icache_all and
      ___invalidate_dcache_all.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Max Filippov's avatar
      xtensa: limit offsets in __loop_cache_{all,page} · 0d78efe0
      Max Filippov authored
      commit be75de25 upstream.
      
      When building the kernel for xtensa cores with big cache lines (e.g. 128
      bytes or more), __loop_cache_all and __loop_cache_page may generate
      assembly instructions with immediate fields that are too big. This
      results in the following build errors:
      
        arch/xtensa/mm/misc.S: Assembler messages:
        arch/xtensa/mm/misc.S:464: Error: operand 2 of 'diwbi' has invalid value '256'
        arch/xtensa/mm/misc.S:464: Error: operand 2 of 'diwbi' has invalid value '384'
        arch/xtensa/kernel/head.S: Assembler messages:
        arch/xtensa/kernel/head.S:172: Error: operand 2 of 'diu' has invalid value '256'
        arch/xtensa/kernel/head.S:172: Error: operand 2 of 'diu' has invalid value '384'
        arch/xtensa/kernel/head.S:176: Error: operand 2 of 'iiu' has invalid value '256'
        arch/xtensa/kernel/head.S:176: Error: operand 2 of 'iiu' has invalid value '384'
        arch/xtensa/kernel/head.S:255: Error: operand 2 of 'diwb' has invalid value '256'
        arch/xtensa/kernel/head.S:255: Error: operand 2 of 'diwb' has invalid value '384'
      
      Add parameter max_immed to these macros and use it to limit values of
      immediate operands. Extract common code of these macros into the new
      macro __loop_cache_unroll.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Paul Mackerras's avatar
      KVM: PPC: Book3S: Fix guest DMA when guest partially backed by THP pages · 025cc91f
      Paul Mackerras authored
      commit 8cfbdbdc upstream.
      
      Commit 76fa4975 ("KVM: PPC: Check if IOMMU page is contained in
      the pinned physical page", 2018-07-17) added some checks to ensure
      that guest DMA mappings don't attempt to map more than the guest is
      entitled to access. However, errors in the logic mean that legitimate
      guest requests to map pages for DMA are being denied in some
      situations. Specifically, if the first page of the range passed to
      mm_iommu_get() is mapped with a normal page, and subsequent pages are
      mapped with transparent huge pages, we end up with mem->pageshift ==
      0. That means that the page size checks in mm_iommu_ua_to_hpa() and
      mm_iommu_ua_to_hpa_rm() will always fail for every page in that
      region, and thus the guest can never map any memory in that region for
      DMA, typically leading to a flood of error messages like this:
      
        qemu-system-ppc64: VFIO_MAP_DMA: -22
        qemu-system-ppc64: vfio_dma_map(0x10005f47780, 0x800000000000000, 0x10000, 0x7fff63ff0000) = -22 (Invalid argument)
      
      The logic errors in mm_iommu_get() are:
      
        (a) use of 'ua' not 'ua + (i << PAGE_SHIFT)' in the find_linux_pte()
            call (meaning that find_linux_pte() returns the pte for the
            first address in the range, not the address we are currently up
            to);
        (b) use of 'pageshift' as the variable to receive the hugepage shift
            returned by find_linux_pte() - for a normal page this gets set
            to 0, leading to us setting mem->pageshift to 0 when we conclude
            that the pte returned by find_linux_pte() didn't match the page
            we were looking at;
        (c) comparing 'compshift', which is a page order, i.e. log base 2 of
            the number of pages, with 'pageshift', which is a log base 2 of
            the number of bytes.
      
      To fix these problems, this patch introduces 'cur_ua' to hold the
      current user address and uses that in the find_linux_pte() call;
      introduces 'pteshift' to hold the hugepage shift found by
      find_linux_pte(); and compares 'pteshift' with 'compshift +
      PAGE_SHIFT' rather than 'compshift'.
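      
      The unit mismatch in (c) and the address computation in (a) can be
      sketched as follows. This is only an illustration: the DEMO_PAGE_SHIFT
      value (64 KiB base pages) and the helper names are assumptions, not
      taken from the patch.

```c
#include <stdbool.h>

/* Assumption for illustration: 64 KiB base pages, i.e. a page shift of 16. */
#define DEMO_PAGE_SHIFT 16u

/* Address of the i-th page in a range starting at ua (what 'cur_ua' holds). */
static unsigned long demo_cur_ua(unsigned long ua, unsigned long i)
{
    return ua + (i << DEMO_PAGE_SHIFT);
}

/* Convert a page order (log2 of a page count) to a byte shift (log2 of bytes). */
static unsigned int order_to_byte_shift(unsigned int order)
{
    return order + DEMO_PAGE_SHIFT;
}

/* True when the sizes agree once both are expressed as byte shifts. */
static bool same_size(unsigned int pteshift, unsigned int compshift)
{
    return pteshift == order_to_byte_shift(compshift);
}
```

      Under these assumptions a 16 MiB huge page has order 8 and byte shift
      24, so comparing 24 against 8 directly can never match, while comparing
      24 against 8 + 16 does.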
      
      The patch also moves the local_irq_restore to the point after the PTE
      pointer returned by find_linux_pte() has been dereferenced because
      otherwise the PTE could change underneath us, and adds a check to
      avoid doing the find_linux_pte() call once mem->pageshift has been
      reduced to PAGE_SHIFT, as an optimization.
      
      Fixes: 76fa4975 ("KVM: PPC: Check if IOMMU page is contained in the pinned physical page")
      Cc: stable@vger.kernel.org # v4.12+
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Paolo Bonzini's avatar
      KVM: VMX: fixes for vmentry_l1d_flush module parameter · 58936d4d
      Paolo Bonzini authored
      commit 0027ff2a upstream.
      
      Two bug fixes:
      
      1) missing entries in the l1d_param array; this can cause a host crash
      if an access attempts to reach the missing entry. Future-proof the get
      function against any overflows as well.  However, the two entries
      VMENTER_L1D_FLUSH_EPT_DISABLED and VMENTER_L1D_FLUSH_NOT_REQUIRED must
      not be accepted by the parse function, so disable them there.
      
      2) invalid values must be rejected even if the CPU does not have the
      bug, so test for them before checking boot_cpu_has(X86_BUG_L1TF)
      
      ... and a small refactoring, since the .cmd field is redundant with
      the index in the array.
      
      Reported-by: Bandan Das <bsd@redhat.com>
      Cc: stable@vger.kernel.org
      Fixes: a7b9020b
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • zhangyi (F)'s avatar
      PM / sleep: wakeup: Fix build error caused by missing SRCU support · 015156f5
      zhangyi (F) authored
      commit 3df6f61f upstream.
      
      Commit ea0212f4 (power: auto select CONFIG_SRCU) made the code in
      drivers/base/power/wakeup.c use SRCU instead of RCU, but it forgot to
      select CONFIG_SRCU in Kconfig, which leads to the following build
      error if CONFIG_SRCU is not selected somewhere else:
      
      drivers/built-in.o: In function `wakeup_source_remove':
      (.text+0x3c6fc): undefined reference to `synchronize_srcu'
      drivers/built-in.o: In function `pm_print_active_wakeup_sources':
      (.text+0x3c7a8): undefined reference to `__srcu_read_lock'
      drivers/built-in.o: In function `pm_print_active_wakeup_sources':
      (.text+0x3c84c): undefined reference to `__srcu_read_unlock'
      drivers/built-in.o: In function `device_wakeup_arm_wake_irqs':
      (.text+0x3d1d8): undefined reference to `__srcu_read_lock'
      drivers/built-in.o: In function `device_wakeup_arm_wake_irqs':
      (.text+0x3d228): undefined reference to `__srcu_read_unlock'
      drivers/built-in.o: In function `device_wakeup_disarm_wake_irqs':
      (.text+0x3d24c): undefined reference to `__srcu_read_lock'
      drivers/built-in.o: In function `device_wakeup_disarm_wake_irqs':
      (.text+0x3d29c): undefined reference to `__srcu_read_unlock'
      drivers/built-in.o:(.data+0x4158): undefined reference to `process_srcu'
      
      Fix this error by selecting CONFIG_SRCU when PM_SLEEP is enabled.
      
      Fixes: ea0212f4 (power: auto select CONFIG_SRCU)
      Cc: 4.2+ <stable@vger.kernel.org> # 4.2+
      Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
      [ rjw: Minor subject/changelog fixups ]
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Henry Willard's avatar
      cpufreq: governor: Avoid accessing invalid governor_data · 924383ed
      Henry Willard authored
      commit 2a3eb51e upstream.
      
      If cppc_cpufreq.ko is deleted at the same time that tuned-adm is
      changing profiles, there is a small chance that a race can occur
      between cpufreq_dbs_governor_exit() and cpufreq_dbs_governor_limits()
      resulting in a system failure when the latter tries to use
      policy->governor_data that has been freed by the former.
      
      This patch uses gov_dbs_data_mutex to synchronize access.
      
      Fixes: e788892b (cpufreq: governor: Get rid of governor events)
      Signed-off-by: Henry Willard <henry.willard@oracle.com>
      [ rjw: Subject, minor white space adjustment ]
      Cc: 4.8+ <stable@vger.kernel.org> # 4.8+
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Peter Kalauskas's avatar
      drivers/block/zram/zram_drv.c: fix bug storing backing_dev · 256f63f5
      Peter Kalauskas authored
      commit c8bd134a upstream.
      
      The call to strlcpy in backing_dev_store is incorrect. It should take
      the size of the destination buffer instead of the size of the source
      buffer.  Additionally, ignore the newline character (\n) when reading
      the new file_name buffer. This makes it possible to set the backing_dev
      as follows:
      
      	echo /dev/sdX > /sys/block/zram0/backing_dev
      
      The reason it worked before was the fact that strlcpy() copies 'len - 1'
      bytes, which is strlen(buf) - 1 in our case, so it accidentally didn't
      copy the trailing newline.  That also means that "echo -n /dev/sdX"
      was most likely broken.
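      
      A small sketch of the size semantics, under stated assumptions:
      demo_strlcpy() is a local stand-in with strlcpy() behaviour (defined
      here because strlcpy() is not available in every C library), and
      store_name() is a hypothetical helper, not the driver's
      backing_dev_store().

```c
#include <stddef.h>
#include <string.h>

/*
 * Local stand-in with strlcpy() semantics: copies at most size - 1 bytes,
 * always NUL-terminates when size > 0, and returns strlen(src).
 */
static size_t demo_strlcpy(char *dst, const char *src, size_t size)
{
    size_t len = strlen(src);

    if (size > 0) {
        size_t n = len < size - 1 ? len : size - 1;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
}

/* Copy a sysfs-style input into dst, dropping one trailing newline if present. */
static void store_name(char *dst, size_t dst_size, const char *buf)
{
    /* Pass the size of the destination, not the length of the source. */
    size_t n = demo_strlcpy(dst, buf, dst_size);

    if (n > 0 && n < dst_size && dst[n - 1] == '\n')
        dst[n - 1] = '\0';
}
```

      Passing strlen(src) as the size drops the last character of the source,
      which is why input without a trailing newline lost its final character
      under the old call.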
      
      Signed-off-by: Peter Kalauskas <peskal@google.com>
      Link: http://lkml.kernel.org/r/20180813061623.GC64836@rodete-desktop-imager.corp.google.com
      Acked-by: Minchan Kim <minchan@kernel.org>
      Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: <stable@vger.kernel.org>    [4.14+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Amir Goldstein's avatar
      ovl: fix wrong use of impure dir cache in ovl_iterate() · 8840ca57
      Amir Goldstein authored
      commit 67810693 upstream.
      
      Only upper dir can be impure, but if we are in the middle of
      iterating a lower real dir, dir could be copied up and marked
      impure. We only want the impure cache if we started iterating
      a real upper dir to begin with.
      
      Aditya Kali reported that the following reproducer hits the
      WARN_ON(!cache->refcount) in ovl_get_cache():
      
       docker run --rm drupal:8.5.4-fpm-alpine \
          sh -c 'cd /var/www/html/vendor/symfony && \
                 chown -R www-data:www-data . && ls -l .'
      
      Reported-by: Aditya Kali <adityakali@google.com>
      Tested-by: Aditya Kali <adityakali@google.com>
      Fixes: 4edb83bb ('ovl: constant d_ino for non-merge dirs')
      Cc: <stable@vger.kernel.org> # v4.14
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Rafael David Tinoco's avatar
      mfd: hi655x: Fix regmap area declared size for hi655x · aa9ceea2
      Rafael David Tinoco authored
      commit 6afebb70 upstream.
      
      Fixes https://bugs.linaro.org/show_bug.cgi?id=3903
      
      LTP Functional tests have caused a bad paging request when triggering
      the regmap_read_debugfs() logic of the device PMIC Hi6553 (reading
      regmap/f8000000.pmic/registers file during read_all test):
      
      Unable to handle kernel paging request at virtual address ffff0
      [ffff00000984e000] pgd=0000000077ffe803, pud=0000000077ffd803,0
      Internal error: Oops: 96000007 [#1] SMP
      ...
      Hardware name: HiKey Development Board (DT)
      ...
      Call trace:
       regmap_mmio_read8+0x24/0x40
       regmap_mmio_read+0x48/0x70
       _regmap_bus_reg_read+0x38/0x48
       _regmap_read+0x68/0x170
       regmap_read+0x50/0x78
       regmap_read_debugfs+0x1a0/0x308
       regmap_map_read_file+0x48/0x58
       full_proxy_read+0x68/0x98
       __vfs_read+0x48/0x80
       vfs_read+0x94/0x150
       SyS_read+0x6c/0xd8
       el0_svc_naked+0x30/0x34
      Code: aa1e03e0 d503201f f9400280 8b334000 (39400000)
      
      Investigations have shown that, when triggered by the debugfs read()
      handler, the mmio regmap logic was reading a bigger (16k) register area
      than the one mapped by devm_ioremap_resource() during hi655x-pmic probe
      time (4k).
      
      This commit changes hi655x's max register, according to HW specs, to be
      the same as the one declared in the pmic device in hi6220's dts, fixing
      the issue.
      
      Cc: <stable@vger.kernel.org> #v4.9 #v4.14 #v4.16 #v4.17
      Signed-off-by: Rafael David Tinoco <rafael.tinoco@linaro.org>
      Signed-off-by: Lee Jones <lee.jones@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Steven Rostedt (VMware)'s avatar
      uprobes: Use synchronize_rcu() not synchronize_sched() · 4f6789ca
      Steven Rostedt (VMware) authored
      commit 016f8ffc upstream.
      
      While debugging another bug, I was looking at all the synchronize*()
      functions being used in kernel/trace, and noticed that trace_uprobes was
      using synchronize_sched(), with a comment to synchronize with
      {u,ret}_probe_trace_func(). When looking at those functions, the data is
      protected with "rcu_read_lock()" and not with "rcu_read_lock_sched()". This
      is using the wrong synchronize_*() function.
      
      Link: http://lkml.kernel.org/r/20180809160553.469e1e32@gandalf.local.home
      
      Cc: stable@vger.kernel.org
      Fixes: 70ed91c6 ("tracing/uprobes: Support ftrace_event_file base multibuffer")
      Acked-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Kamalesh Babulal's avatar
      livepatch: Validate module/old func name length · a36e2aa9
      Kamalesh Babulal authored
      commit 6e9df95b upstream.
      
      A livepatch module author can pass a module name or old function name
      longer than the defined character limit. With an obj->name length greater
      than MODULE_NAME_LEN, the livepatch module gets loaded but waits forever
      for the module specified by obj->name to be loaded. It also populates a
      /sys directory with an untruncated object name.
      
      A funcs->old_name with a length greater than KSYM_NAME_LEN cannot match
      any of the symbol table entries, yet the code still loops through the
      symbol table comparing entries against a nonexistent function, which can
      be avoided.
      
      The same issues apply to misspelled or incorrect names. At least gatekeep
      modules whose names exceed the limits by checking the string lengths
      during livepatch module registration.
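      
      A minimal sketch of such a registration-time length check follows. The
      DEMO_* limits and both function names are hypothetical and chosen only
      for illustration; the kernel uses its own MODULE_NAME_LEN and
      KSYM_NAME_LEN definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed limits for illustration only; the kernel defines its own values. */
#define DEMO_MODULE_NAME_LEN 56
#define DEMO_KSYM_NAME_LEN   128

/* True when s is NULL or fits in a buffer of `limit` bytes including the NUL. */
static bool name_fits(const char *s, size_t limit)
{
    size_t i;

    if (s == NULL)
        return true;
    for (i = 0; i < limit; i++) {
        if (s[i] == '\0')
            return true;
    }
    return false;
}

/* Returns 0 when both names are within their limits, -1 otherwise. */
static int demo_check_names(const char *obj_name, const char *old_name)
{
    if (!name_fits(obj_name, DEMO_MODULE_NAME_LEN))
        return -1;
    if (!name_fits(old_name, DEMO_KSYM_NAME_LEN))
        return -1;
    return 0;
}
```

      Rejecting over-long names up front avoids both the endless wait for a
      module that can never match and the pointless symbol table walk.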
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
      Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Steven Rostedt (VMware)'s avatar
      printk/tracing: Do not trace printk_nmi_enter() · 68a735eb
      Steven Rostedt (VMware) authored
      commit d1c392c9 upstream.
      
      I hit the following splat in my tests:
      
      ------------[ cut here ]------------
      IRQs not enabled as expected
      WARNING: CPU: 3 PID: 0 at kernel/time/tick-sched.c:982 tick_nohz_idle_enter+0x44/0x8c
      Modules linked in: ip6t_REJECT nf_reject_ipv6 ip6table_filter ip6_tables ipv6
      CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.19.0-rc2-test+ #2
      Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
      EIP: tick_nohz_idle_enter+0x44/0x8c
      Code: ec 05 00 00 00 75 26 83 b8 c0 05 00 00 00 75 1d 80 3d d0 36 3e c1 00
      75 14 68 94 63 12 c1 c6 05 d0 36 3e c1 01 e8 04 ee f8 ff <0f> 0b 58 fa bb a0
      e5 66 c1 e8 25 0f 04 00 64 03 1d 28 31 52 c1 8b
      EAX: 0000001c EBX: f26e7f8c ECX: 00000006 EDX: 00000007
      ESI: f26dd1c0 EDI: 00000000 EBP: f26e7f40 ESP: f26e7f38
      DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010296
      CR0: 80050033 CR2: 0813c6b0 CR3: 2f342000 CR4: 001406f0
      Call Trace:
       do_idle+0x33/0x202
       cpu_startup_entry+0x61/0x63
       start_secondary+0x18e/0x1ed
       startup_32_smp+0x164/0x168
      irq event stamp: 18773830
      hardirqs last  enabled at (18773829): [<c040150c>] trace_hardirqs_on_thunk+0xc/0x10
      hardirqs last disabled at (18773830): [<c040151c>] trace_hardirqs_off_thunk+0xc/0x10
      softirqs last  enabled at (18773824): [<c0ddaa6f>] __do_softirq+0x25f/0x2bf
      softirqs last disabled at (18773767): [<c0416bbe>] call_on_stack+0x45/0x4b
      ---[ end trace b7c64aa79e17954a ]---
      
      After a bit of debugging, I found what was happening. This would trigger
      when running "perf" with a high NMI interrupt rate, while enabling and
      disabling the function tracer. Ftrace uses breakpoints to convert the nops at
      the start of functions to calls to the function trampolines. The breakpoint
      traps disable interrupts and this makes calls into lockdep via the
      trace_hardirqs_off_thunk in the entry.S code. What happens is the following:
      
        do_idle {
      
          [interrupts enabled]
      
          <interrupt> [interrupts disabled]
      	TRACE_IRQS_OFF [lockdep says irqs off]
      	[...]
      	TRACE_IRQS_IRET
      	    test if pt_regs say return to interrupts enabled [yes]
      	    TRACE_IRQS_ON [lockdep says irqs are on]
      
      	    <nmi>
      		nmi_enter() {
      		    printk_nmi_enter() [traced by ftrace]
      		    [ hit ftrace breakpoint ]
      		    <breakpoint exception>
      			TRACE_IRQS_OFF [lockdep says irqs off]
      			[...]
      			TRACE_IRQS_IRET [return from breakpoint]
      			   test if pt_regs say interrupts enabled [no]
      			   [iret back to interrupt]
      	   [iret back to code]
      
          tick_nohz_idle_enter() {
      
      	lockdep_assert_irqs_enabled() [lockdep say no!]
      
      Although interrupts are indeed enabled, lockdep thinks they are not, and since
      we now do asserts via lockdep, it gives a false warning. The issue here is
      that printk_nmi_enter() is called before lockdep_off(), which disables
      lockdep (for this reason) in NMIs. By simply not allowing ftrace to see
      printk_nmi_enter() (via notrace annotation) we keep lockdep from getting
      confused.
      
      Cc: stable@vger.kernel.org
      Fixes: 42a0bb3f ("printk/nmi: generic solution for safe printk in NMI")
      Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Acked-by: Petr Mladek <pmladek@suse.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Steven Rostedt (VMware)'s avatar
      tracing/blktrace: Fix to allow setting same value · cbde057a
      Steven Rostedt (VMware) authored
      commit 757d9140 upstream.
      
      Masami Hiramatsu reported:
      
        Current trace-enable attribute in sysfs returns an error
        if user writes the same setting value as current one,
        e.g.
      
          # cat /sys/block/sda/trace/enable
          0
          # echo 0 > /sys/block/sda/trace/enable
          bash: echo: write error: Invalid argument
          # echo 1 > /sys/block/sda/trace/enable
          # echo 1 > /sys/block/sda/trace/enable
          bash: echo: write error: Device or resource busy
      
        But this is not the preferred behavior; the write should be
        ignored if the new setting is the same as the current one.
        This fixes the problem as below.
      
          # cat /sys/block/sda/trace/enable
          0
          # echo 0 > /sys/block/sda/trace/enable
          # echo 1 > /sys/block/sda/trace/enable
          # echo 1 > /sys/block/sda/trace/enable
      
      Link: http://lkml.kernel.org/r/20180816103802.08678002@gandalf.local.home
      
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: linux-block@vger.kernel.org
      Cc: stable@vger.kernel.org
      Fixes: cd649b8b ("blktrace: remove sysfs_blk_trace_enable_show/store()")
      Reported-by: Masami Hiramatsu <mhiramat@kernel.org>
      Tested-by: Masami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Steven Rostedt (VMware)'s avatar
      tracing: Do not call start/stop() functions when tracing_on does not change · 4c901675
      Steven Rostedt (VMware) authored
      commit f143641b upstream.
      
      Currently, when one echoes 1 into tracing_on, the current tracer's
      "start()" function is executed, even if tracing_on was already one. This can
      lead to strange side effects. One is that if the hwlat tracer is enabled
      and someone does "echo 1 > tracing_on", the hwlat tracer's start()
      function is called again, which creates another kernel thread and makes
      it impossible to remove the old one.
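      
      The intended behaviour can be sketched as a simple "only act on change"
      guard. The struct and function below are hypothetical; the counters
      merely stand in for calls to a tracer's start() and stop() callbacks.

```c
#include <stdbool.h>

/* Hypothetical tracer state; the counters record start()/stop() invocations. */
struct demo_tracer {
    bool on;
    int start_calls;
    int stop_calls;
};

static void demo_set_tracing_on(struct demo_tracer *t, bool val)
{
    if (t->on == val)
        return;              /* same value: skip start()/stop() entirely */

    t->on = val;
    if (val)
        t->start_calls++;    /* stands in for the tracer's start() */
    else
        t->stop_calls++;     /* stands in for the tracer's stop() */
}
```

      Writing the same value twice in a row then triggers start() or stop()
      only once, so no second worker thread would be created.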
      
      Link: http://lkml.kernel.org/r/1533120354-22923-1-git-send-email-erica.bugden@linutronix.de
      
      Cc: stable@vger.kernel.org
      Fixes: 2df8f8a6 ("tracing: Fix regression with irqsoff tracer and tracing_on file")
      Reported-by: Erica Bugden <erica.bugden@linutronix.de>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>