1. 02 Dec, 2018 2 commits
    • Dave Kleikamp's avatar
      nfs: don't dirty kernel pages read by direct-io · ad3cba22
      Dave Kleikamp authored
      When we use direct_IO with an NFS backing store, we can trigger a
      WARNING in __set_page_dirty(), as below, since we're dirtying the page
      unnecessarily in nfs_direct_read_completion().
      
      To fix, replicate the logic in commit 53cbf3b1 ("fs: direct-io:
      don't dirtying pages for ITER_BVEC/ITER_KVEC direct read").
      
      Other filesystems that implement direct_IO handle this; most use
      blockdev_direct_IO(). ceph and cifs have similar logic.
      
      mount 127.0.0.1:/export /nfs
      dd if=/dev/zero of=/nfs/image bs=1M count=200
      losetup --direct-io=on -f /nfs/image
      mkfs.btrfs /dev/loop0
      mount -t btrfs /dev/loop0 /mnt/
      
      kernel: WARNING: CPU: 0 PID: 8067 at fs/buffer.c:580 __set_page_dirty+0xaf/0xd0
      kernel: Modules linked in: loop(E) nfsv3(E) rpcsec_gss_krb5(E) nfsv4(E) dns_resolver(E) nfs(E) fscache(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) fuse(E) tun(E) ip6t_rpfilter(E) ipt_REJECT(E) nf_
      kernel:  snd_seq(E) snd_seq_device(E) snd_pcm(E) video(E) snd_timer(E) snd(E) soundcore(E) ip_tables(E) xfs(E) libcrc32c(E) sd_mod(E) sr_mod(E) cdrom(E) ata_generic(E) pata_acpi(E) crc32c_intel(E) ahci(E) li
      kernel: CPU: 0 PID: 8067 Comm: kworker/0:2 Tainted: G            E     4.20.0-rc1.master.20181111.ol7.x86_64 #1
      kernel: Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
      kernel: Workqueue: nfsiod rpc_async_release [sunrpc]
      kernel: RIP: 0010:__set_page_dirty+0xaf/0xd0
      kernel: Code: c3 48 8b 02 f6 c4 04 74 d4 48 89 df e8 ba 05 f7 ff 48 89 c6 eb cb 48 8b 43 08 a8 01 75 1f 48 89 d8 48 8b 00 a8 04 74 02 eb 87 <0f> 0b eb 83 48 83 e8 01 eb 9f 48 83 ea 01 0f 1f 00 eb 8b 48 83 e8
      kernel: RSP: 0000:ffffc1c8825b7d78 EFLAGS: 00013046
      kernel: RAX: 000fffffc0020089 RBX: fffff2b603308b80 RCX: 0000000000000001
      kernel: RDX: 0000000000000001 RSI: ffff9d11478115c8 RDI: ffff9d11478115d0
      kernel: RBP: ffffc1c8825b7da0 R08: 0000646f6973666e R09: 8080808080808080
      kernel: R10: 0000000000000001 R11: 0000000000000000 R12: ffff9d11478115d0
      kernel: R13: ffff9d11478115c8 R14: 0000000000003246 R15: 0000000000000001
      kernel: FS:  0000000000000000(0000) GS:ffff9d115ba00000(0000) knlGS:0000000000000000
      kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      kernel: CR2: 00007f408686f640 CR3: 0000000104d8e004 CR4: 00000000000606f0
      kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      kernel: Call Trace:
      kernel:  __set_page_dirty_buffers+0xb6/0x110
      kernel:  set_page_dirty+0x52/0xb0
      kernel:  nfs_direct_read_completion+0xc4/0x120 [nfs]
      kernel:  nfs_pgio_release+0x10/0x20 [nfs]
      kernel:  rpc_free_task+0x30/0x70 [sunrpc]
      kernel:  rpc_async_release+0x12/0x20 [sunrpc]
      kernel:  process_one_work+0x174/0x390
      kernel:  worker_thread+0x4f/0x3e0
      kernel:  kthread+0x102/0x140
      kernel:  ? drain_workqueue+0x130/0x130
      kernel:  ? kthread_stop+0x110/0x110
      kernel:  ret_from_fork+0x35/0x40
      kernel: ---[ end trace 01341980905412c9 ]---
      Signed-off-by: default avatarDave Kleikamp <dave.kleikamp@oracle.com>
      Signed-off-by: default avatarSantosh Shilimkar <santosh.shilimkar@oracle.com>
      
      [forward-ported to v4.20]
      Signed-off-by: default avatarCalum Mackay <calum.mackay@oracle.com>
      Reviewed-by: default avatarDave Kleikamp <dave.kleikamp@oracle.com>
      Reviewed-by: default avatarChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: default avatarTrond Myklebust <trond.myklebust@hammerspace.com>
      ad3cba22
    • Tigran Mkrtchyan's avatar
      flexfiles: enforce per-mirror stateid only for v4 DSes · 320f35b7
      Tigran Mkrtchyan authored
      Since commit bb21ce0a we always enforce per-mirror stateid.
      However, this makes sense only for v4+ servers.
      Signed-off-by: default avatarTigran Mkrtchyan <tigran.mkrtchyan@desy.de>
      Signed-off-by: default avatarTrond Myklebust <trond.myklebust@hammerspace.com>
      320f35b7
  2. 01 Dec, 2018 10 commits
    • Linus Torvalds's avatar
      Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 4b783176
      Linus Torvalds authored
      Pull STIBP fallout fixes from Thomas Gleixner:
       "The performance destruction department finally got it's act together
        and came up with a cure for the STIPB regression:
      
         - Provide a command line option to control the spectre v2 user space
           mitigations. Default is either seccomp or prctl (if seccomp is
           disabled in Kconfig). prctl allows mitigation opt-in, seccomp
           enables the migitation for sandboxed processes.
      
         - Rework the code to handle the conditional STIBP/IBPB control and
           remove the now unused ptrace_may_access_sched() optimization
           attempt
      
         - Disable STIBP automatically when SMT is disabled
      
         - Optimize the switch_to() logic to avoid MSR writes and invocations
           of __switch_to_xtra().
      
         - Make the asynchronous speculation TIF updates synchronous to
           prevent stale mitigation state.
      
        As a general cleanup this also makes retpoline directly depend on
        compiler support and removes the 'minimal retpoline' option which just
        pretended to provide some form of security while providing none"
      
      * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits)
        x86/speculation: Provide IBPB always command line options
        x86/speculation: Add seccomp Spectre v2 user space protection mode
        x86/speculation: Enable prctl mode for spectre_v2_user
        x86/speculation: Add prctl() control for indirect branch speculation
        x86/speculation: Prepare arch_smt_update() for PRCTL mode
        x86/speculation: Prevent stale SPEC_CTRL msr content
        x86/speculation: Split out TIF update
        ptrace: Remove unused ptrace_may_access_sched() and MODE_IBRS
        x86/speculation: Prepare for conditional IBPB in switch_mm()
        x86/speculation: Avoid __switch_to_xtra() calls
        x86/process: Consolidate and simplify switch_to_xtra() code
        x86/speculation: Prepare for per task indirect branch speculation control
        x86/speculation: Add command line control for indirect branch speculation
        x86/speculation: Unify conditional spectre v2 print functions
        x86/speculataion: Mark command line parser data __initdata
        x86/speculation: Mark string arrays const correctly
        x86/speculation: Reorder the spec_v2 code
        x86/l1tf: Show actual SMT state
        x86/speculation: Rework SMT state change
        sched/smt: Expose sched_smt_present static key
        ...
      4b783176
    • Linus Torvalds's avatar
      Merge tag 'for-linus-20181201' of git://git.kernel.dk/linux-block · 88058417
      Linus Torvalds authored
      Pull block layer fixes from Jens Axboe:
      
       - Single range elevator discard merge fix, that caused crashes (Ming)
      
       - Fix for a regression in O_DIRECT, where we could potentially lose the
         error value (Maximilian Heyne)
      
       - NVMe pull request from Christoph, with little fixes all over the map
         for NVMe.
      
      * tag 'for-linus-20181201' of git://git.kernel.dk/linux-block:
        block: fix single range discard merge
        nvme-rdma: fix double freeing of async event data
        nvme: flush namespace scanning work just before removing namespaces
        nvme: warn when finding multi-port subsystems without multipathing enabled
        fs: fix lost error code in dio_complete
        nvme-pci: fix surprise removal
        nvme-fc: initialize nvme_req(rq)->ctrl after calling __nvme_fc_init_request()
        nvme: Free ctrl device name on init failure
      88058417
    • Linus Torvalds's avatar
      Merge tag 'pci-v4.20-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci · c734b425
      Linus Torvalds authored
      Pull PCI fixes from Bjorn Helgaas:
      
       - Fix a link speed checking interface that broke PCIe gen3 cards in
         gen1 slots (Mikulas Patocka)
      
       - Fix an imx6 link training error (Trent Piepho)
      
       - Fix a layerscape outbound window accessor calling error (Hou
         Zhiqiang)
      
       - Fix a DesignWare endpoint MSI-X address calculation error (Gustavo
         Pimentel)
      
      * tag 'pci-v4.20-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
        PCI: Fix incorrect value returned from pcie_get_speed_cap()
        PCI: dwc: Fix MSI-X EP framework address calculation bug
        PCI: layerscape: Fix wrong invocation of outbound window disable accessor
        PCI: imx6: Fix link training status detection in link up check
      c734b425
    • Bjorn Helgaas's avatar
      Merge remote-tracking branch 'lorenzo/pci/controller-fixes' into for-linus · c74eadf8
      Bjorn Helgaas authored
        - Fix DesignWare endpoint MSI-X address calculation bug (Gustavo
          Pimentel)
      
        - Fix Layerscape outbound window disable usage (Hou Zhiqiang)
      
        - Fix imx6 link up detection (Trent Piepho)
      
      * lorenzo/pci/controller-fixes:
        PCI: dwc: Fix MSI-X EP framework address calculation bug
        PCI: layerscape: Fix wrong invocation of outbound window disable accessor
        PCI: imx6: Fix link training status detection in link up check
      c74eadf8
    • Mikulas Patocka's avatar
      PCI: Fix incorrect value returned from pcie_get_speed_cap() · f1f90e25
      Mikulas Patocka authored
      The macros PCI_EXP_LNKCAP_SLS_*GB are values, not bit masks.  We must mask
      the register and compare it against them.
      
      This fixes errors like this:
      
        amdgpu: [powerplay] failed to send message 261 ret is 0
      
      when a PCIe-v3 card is plugged into a PCIe-v1 slot, because the slot is
      being incorrectly reported as PCIe-v3 capable.
      
      6cf57be0, which appeared in v4.17, added pcie_get_speed_cap() with the
      incorrect test of PCI_EXP_LNKCAP_SLS as a bitmask.  5d9a6330, which
      appeared in v4.19, changed amdgpu to use pcie_get_speed_cap(), so the
      amdgpu bug reports below are regressions in v4.19.
      
      Fixes: 6cf57be0 ("PCI: Add pcie_get_speed_cap() to find max supported link speed")
      Fixes: 5d9a6330 ("drm/amdgpu: use pcie functions for link width and speed")
      Link: https://bugs.freedesktop.org/show_bug.cgi?id=108704
      Link: https://bugs.freedesktop.org/show_bug.cgi?id=108778Signed-off-by: default avatarMikulas Patocka <mpatocka@redhat.com>
      [bhelgaas: update comment, remove use of PCI_EXP_LNKCAP_SLS_8_0GB and
      PCI_EXP_LNKCAP_SLS_16_0GB since those should be covered by PCI_EXP_LNKCAP2,
      remove test of PCI_EXP_LNKCAP for zero, since that register is required]
      Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      Acked-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Cc: stable@vger.kernel.org	# v4.17+
      f1f90e25
    • Linus Torvalds's avatar
      Merge branch 'akpm' (patches from Andrew) · d8f190ee
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "31 fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (31 commits)
        ocfs2: fix potential use after free
        mm/khugepaged: fix the xas_create_range() error path
        mm/khugepaged: collapse_shmem() do not crash on Compound
        mm/khugepaged: collapse_shmem() without freezing new_page
        mm/khugepaged: minor reorderings in collapse_shmem()
        mm/khugepaged: collapse_shmem() remember to clear holes
        mm/khugepaged: fix crashes due to misaccounted holes
        mm/khugepaged: collapse_shmem() stop if punched or truncated
        mm/huge_memory: fix lockdep complaint on 32-bit i_size_read()
        mm/huge_memory: splitting set mapping+index before unfreeze
        mm/huge_memory: rename freeze_page() to unmap_page()
        initramfs: clean old path before creating a hardlink
        kernel/kcov.c: mark funcs in __sanitizer_cov_trace_pc() as notrace
        psi: make disabling/enabling easier for vendor kernels
        proc: fixup map_files test on arm
        debugobjects: avoid recursive calls with kmemleak
        userfaultfd: shmem: UFFDIO_COPY: set the page dirty if VM_WRITE is not set
        userfaultfd: shmem: add i_size checks
        userfaultfd: shmem/hugetlbfs: only allow to register VM_MAYWRITE vmas
        userfaultfd: shmem: allocate anonymous memory for MAP_PRIVATE shmem
        ...
      d8f190ee
    • Linus Torvalds's avatar
      Merge tag 'mips_fixes_4.20_4' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux · 6c7954b7
      Linus Torvalds authored
      Pull few more MIPS fixes from Paul Burton:
      
       - Fix mips_get_syscall_arg() to operate on the task specified when
         detecting o32 tasks running on MIPS64 kernels.
      
       - Fix some incorrect GPIO pin muxing for the MT7620 SoC.
      
       - Update the linux-mips mailing list address.
      
      * tag 'mips_fixes_4.20_4' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
        MAINTAINERS: Update linux-mips mailing list address
        MIPS: ralink: Fix mt7620 nd_sd pinmux
        mips: fix mips_get_syscall_arg o32 check
      6c7954b7
    • Linus Torvalds's avatar
      Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 868dda00
      Linus Torvalds authored
      Pull arm64 fixes from Catalin Marinas:
      
       - Cortex-A76 erratum workaround
      
       - ftrace fix to enable syscall events on arm64
      
       - Fix uninitialised pointer in iort_get_platform_device_domain()
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        ACPI/IORT: Fix iort_get_platform_device_domain() uninitialized pointer value
        arm64: ftrace: Fix to enable syscall events on arm64
        arm64: Add workaround for Cortex-A76 erratum 1286807
      868dda00
    • Linus Torvalds's avatar
      Merge tag 'gcc-plugins-v4.20-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · 1f817429
      Linus Torvalds authored
      Pull stackleak plugin fix from Kees Cook:
       "Fix crash by not allowing kprobing of stackleak_erase() (Alexander
        Popov)"
      
      * tag 'gcc-plugins-v4.20-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        stackleak: Disable function tracing and kprobes for stackleak_erase()
      1f817429
    • Linus Torvalds's avatar
      Merge tag 'fscache-fixes-20181130' of... · fd3b3e0e
      Linus Torvalds authored
      Merge tag 'fscache-fixes-20181130' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
      
      Pull fscache and cachefiles fixes from David Howells:
       "Misc fixes:
      
         - Fix an assertion failure at fs/cachefiles/xattr.c:138 caused by a
           race between a cache object lookup failing and someone attempting
           to reenable that object, thereby triggering an update of the
           object's attributes.
      
         - Fix an assertion failure at fs/fscache/operation.c:449 caused by a
           split atomic subtract and atomic read that allows a race to happen.
      
         - Fix a leak of backing pages when simultaneously reading the same
           page from the same object from two or more threads.
      
         - Fix a hang due to a race between a cache object being discarded and
           the corresponding cookie being reenabled.
      
        There are also some minor cleanups:
      
         - Cast an enum value to a different enum type to prevent clang from
           generating a warning. This shouldn't cause any sort of change in
           the emitted code.
      
         - Use ktime_get_real_seconds() instead of get_seconds(). This is just
           used to uniquify a filename for an object to be placed in the
           graveyard. Objects placed there are deleted by cachfilesd in
           userspace immediately thereafter.
      
         - Remove an initialised, but otherwise unused variable. This should
           have been entirely optimised away anyway"
      
      * tag 'fscache-fixes-20181130' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
        fscache, cachefiles: remove redundant variable 'cache'
        cachefiles: avoid deprecated get_seconds()
        cachefiles: Explicitly cast enumerated type in put_object
        fscache: fix race between enablement and dropping of object
        cachefiles: Fix page leak in cachefiles_read_backing_file while vmscan is active
        fscache: Fix race in fscache_op_complete() due to split atomic_sub & read
        cachefiles: Fix an assertion failure when trying to update a failed object
      fd3b3e0e
  3. 30 Nov, 2018 28 commits