1. 10 Aug, 2017 1 commit
  2. 26 Jul, 2017 1 commit
    • Scott Bauer's avatar
      nvme: validate admin queue before unquiesce · 7dd1ab16
      Scott Bauer authored
      With a misbehaving controller it's possible we'll never
      enter the live state and create an admin queue. When we
      fail out of reset work it's possible we failed out early
      enough without setting up the admin queue. We tear down
      queues after a failed reset, but needed to do some more
      sanitization.
      
      Fixes 443bd90f: "nvme: host: unquiesce queue in nvme_kill_queues()"
      
      [  189.650995] nvme nvme1: pci function 0000:0b:00.0
      [  317.680055] nvme nvme0: Device not ready; aborting reset
      [  317.680183] nvme nvme0: Removing after probe failure status: -19
      [  317.681258] kasan: GPF could be caused by NULL-ptr deref or user memory access
      [  317.681397] general protection fault: 0000 [#1] SMP KASAN
      [  317.682984] CPU: 3 PID: 477 Comm: kworker/3:2 Not tainted 4.13.0-rc1+ #5
      [  317.683112] Hardware name: Gigabyte Technology Co., Ltd. Z170X-UD5/Z170X-UD5-CF, BIOS F5 03/07/2016
      [  317.683284] Workqueue: events nvme_remove_dead_ctrl_work [nvme]
      [  317.683398] task: ffff8803b0990000 task.stack: ffff8803c2ef0000
      [  317.683516] RIP: 0010:blk_mq_unquiesce_queue+0x2b/0xa0
      [  317.683614] RSP: 0018:ffff8803c2ef7d40 EFLAGS: 00010282
      [  317.683716] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 1ffff1006fbdcde3
      [  317.683847] RDX: 0000000000000038 RSI: 1ffff1006f5a9245 RDI: 0000000000000000
      [  317.683978] RBP: ffff8803c2ef7d58 R08: 1ffff1007bcdc974 R09: 0000000000000000
      [  317.684108] R10: 1ffff1007bcdc975 R11: 0000000000000000 R12: 00000000000001c0
      [  317.684239] R13: ffff88037ad49228 R14: ffff88037ad492d0 R15: ffff88037ad492e0
      [  317.684371] FS:  0000000000000000(0000) GS:ffff8803de6c0000(0000) knlGS:0000000000000000
      [  317.684519] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  317.684627] CR2: 0000002d1860c000 CR3: 000000045b40d000 CR4: 00000000003406e0
      [  317.684758] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  317.684888] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  317.685018] Call Trace:
      [  317.685084]  nvme_kill_queues+0x4d/0x170 [nvme_core]
      [  317.685185]  nvme_remove_dead_ctrl_work+0x3a/0x90 [nvme]
      [  317.685289]  process_one_work+0x771/0x1170
      [  317.685372]  worker_thread+0xde/0x11e0
      [  317.685452]  ? pci_mmcfg_check_reserved+0x110/0x110
      [  317.685550]  kthread+0x2d3/0x3d0
      [  317.685617]  ? process_one_work+0x1170/0x1170
      [  317.685704]  ? kthread_create_on_node+0xc0/0xc0
      [  317.685785]  ret_from_fork+0x25/0x30
      [  317.685798] Code: 0f 1f 44 00 00 55 48 b8 00 00 00 00 00 fc ff df 48 89 e5 41 54 4c 8d a7 c0 01 00 00 53 48 89 fb 4c 89 e2 48 c1 ea 03 48 83 ec 08 <80> 3c 02 00 75 50 48 8b bb c0 01 00 00 e8 33 8a f9 00 0f ba b3
      [  317.685872] RIP: blk_mq_unquiesce_queue+0x2b/0xa0 RSP: ffff8803c2ef7d40
      [  317.685908] ---[ end trace a3f8704150b1e8b4 ]---
      Signed-off-by: default avatarScott Bauer <scott.bauer@intel.com>
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      7dd1ab16
  3. 25 Jul, 2017 5 commits
    • Christoph Hellwig's avatar
      nvme-pci: fix HMB size calculation · 50cdb7c6
      Christoph Hellwig authored
      It's possible the preferred HMB size may not be a multiple of the
      chunk_size. This patch moves len to function scope and uses that in
      the for loop increment so the last iteration doesn't cause the total
      size to exceed the allocated HMB size.
      
      Based on an earlier patch from Keith Busch.
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Reported-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Reviewed-by: default avatarKeith Busch <keith.busch@intel.com>
      Fixes: 87ad72a5 ("nvme-pci: implement host memory buffer support")
      50cdb7c6
    • James Smart's avatar
      nvme-fc: revise TRADDR parsing · 9c5358e1
      James Smart authored
      The FC-NVME spec hasn't locked down on the format string for TRADDR.
      Currently the spec is lobbying for "nn-<16hexdigits>:pn-<16hexdigits>"
      where the wwn's are hex values but not prefixed by 0x.
      
      Most implementations so far expect a string format of
      "nn-0x<16hexdigits>:pn-0x<16hexdigits>" to be used. The transport
      uses the match_u64 parser which requires a leading 0x prefix to set
      the base properly. If it's not there, a match will either fail or return
      a base 10 value.
      
      The resolution in T11 is pushing out. Therefore, to fix things now and
      to cover any eventuality and any implementations already in the field,
      this patch adds support for both formats.
      
      The change consists of replacing the token matching routine with a
      routine that validates the fixed string format, and then builds
      a local copy of the hex name with a 0x prefix before calling
      the system parser.
      
      Note: the same parser routine exists in both the initiator and target
      transports. Given this is about the only "shared" item, we chose to
      replicate rather than create an interdendency on some shared code.
      Signed-off-by: default avatarJames Smart <james.smart@broadcom.com>
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      9c5358e1
    • James Smart's avatar
      nvme-fc: address target disconnect race conditions in fcp io submit · 8b25f351
      James Smart authored
      There are cases where threads are in the process of submitting new
      io when the LLDD calls in to remove the remote port. In some cases,
      the next io actually goes to the LLDD, who knows the remoteport isn't
      present and rejects it. To properly recovery/restart these i/o's we
      don't want to hard fail them, we want to treat them as temporary
      resource errors in which a delayed retry will work.
      
      Add a couple more checks on remoteport connectivity and commonize the
      busy response handling when it's seen.
      Signed-off-by: default avatarJames Smart <james.smart@broadcom.com>
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      8b25f351
    • Jon Derrick's avatar
      nvme: fabrics commands should use the fctype field for data direction · 2fd4167f
      Jon Derrick authored
      Fabrics commands with opcode 0x7F use the fctype field to indicate data
      direction.
      Signed-off-by: default avatarJon Derrick <jonathan.derrick@intel.com>
      Reviewed-by: default avatarSagi Grimberg <sai@grmberg.me>
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Fixes: eb793e2c ("nvme.h: add NVMe over Fabrics definitions")
      2fd4167f
    • Johannes Thumshirn's avatar
      nvme: also provide a UUID in the WWID sysfs attribute · 6484f5d1
      Johannes Thumshirn authored
      The WWID sysfs attribute can provide multiple means of a World Wide ID
      for a NVMe device. It can either be a NGUID, a EUI-64 or a concatenation
      of VID, Serial Number, Model and the Namespace ID in this order of
      preference.
      
      If the target also sends us a UUID use the UUID for identification and
      give it the highest priority.
      
      This eases generation of /dev/disk/by-* symlinks.
      Signed-off-by: default avatarJohannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      6484f5d1
  4. 24 Jul, 2017 3 commits
    • Christoph Hellwig's avatar
      blk-mq: map queues to all present CPUs · 76451d79
      Christoph Hellwig authored
      We already do this for PCI mappings, and the higher level code now
      expects that CPU on/offlining doesn't have an affect on the queue
      mappings.
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Tested-by: default avatarMax Gurtovoy <maxg@mellanox.com>
      Reviewed-by: default avatarMax Gurtovoy <maxg@mellanox.com>
      Reviewed-by: default avatarJohannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      76451d79
    • Christoph Hellwig's avatar
      block: disable runtime-pm for blk-mq · 765e40b6
      Christoph Hellwig authored
      The blk-mq code lacks support for looking at the rpm_status field, tracking
      active requests and the RQF_PM flag.
      
      Due to the default switch to blk-mq for scsi people start to run into
      suspend / resume issue due to this fact, so make sure we disable the runtime
      PM functionality until it is properly implemented.
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarMing Lei <ming.lei@redhat.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      765e40b6
    • Bart Van Assche's avatar
      xen-blkfront: Fix handling of non-supported operations · 31c4ccc3
      Bart Van Assche authored
      This patch fixes the following sparse warnings:
      
      drivers/block/xen-blkfront.c:916:45: warning: incorrect type in argument 2 (different base types)
      drivers/block/xen-blkfront.c:916:45:    expected restricted blk_status_t [usertype] error
      drivers/block/xen-blkfront.c:916:45:    got int [signed] error
      drivers/block/xen-blkfront.c:1599:47: warning: incorrect type in assignment (different base types)
      drivers/block/xen-blkfront.c:1599:47:    expected int [signed] error
      drivers/block/xen-blkfront.c:1599:47:    got restricted blk_status_t [usertype] <noident>
      drivers/block/xen-blkfront.c:1607:55: warning: incorrect type in assignment (different base types)
      drivers/block/xen-blkfront.c:1607:55:    expected int [signed] error
      drivers/block/xen-blkfront.c:1607:55:    got restricted blk_status_t [usertype] <noident>
      drivers/block/xen-blkfront.c:1625:55: warning: incorrect type in assignment (different base types)
      drivers/block/xen-blkfront.c:1625:55:    expected int [signed] error
      drivers/block/xen-blkfront.c:1625:55:    got restricted blk_status_t [usertype] <noident>
      drivers/block/xen-blkfront.c:1628:62: warning: restricted blk_status_t degrades to integer
      
      Compile-tested only.
      
      Fixes: commit 2a842aca ("block: introduce new block status code type")
      Signed-off-by: default avatarBart Van Assche <bart.vanassche@wdc.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Roger Pau Monné <roger.pau@citrix.com>
      Cc: <xen-devel@lists.xenproject.org>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      31c4ccc3
  5. 22 Jul, 2017 4 commits
  6. 21 Jul, 2017 26 commits
    • Linus Torvalds's avatar
      Merge tag 'nfs-for-4.13-2' of git://git.linux-nfs.org/projects/anna/linux-nfs · 505d5c11
      Linus Torvalds authored
      Pull NFS client bugfixes from Anna Schumaker:
       "Stable bugfix:
         - Fix error reporting regression
      
        Bugfixes:
         - Fix setting filelayout ds address race
         - Fix subtle access bug when using ACLs
         - Fix setting mnt3_counts array size
         - Fix a couple of pNFS commit races"
      
      * tag 'nfs-for-4.13-2' of git://git.linux-nfs.org/projects/anna/linux-nfs:
        NFS/filelayout: Fix racy setting of fl->dsaddr in filelayout_check_deviceid()
        NFS: Be more careful about mapping file permissions
        NFS: Store the raw NFS access mask in the inode's access cache
        NFSv3: Convert nfs3_proc_access() to use nfs_access_set_mask()
        NFS: Refactor NFS access to kernel access mask calculation
        net/sunrpc/xprt_sock: fix regression in connection error reporting.
        nfs: count correct array for mnt3_counts array size
        Revert commit 722f0b89 ("pNFS: Don't send COMMITs to the DSes if...")
        pNFS/flexfiles: Handle expired layout segments in ff_layout_initiate_commit()
        NFS: Fix another COMMIT race in pNFS
        NFS: Fix a COMMIT race in pNFS
        mount: copy the port field into the cloned nfs_server structure.
        NFS: Don't run wake_up_bit() when nobody is waiting...
        nfs: add export operations
      505d5c11
    • Linus Torvalds's avatar
      Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs · 99313414
      Linus Torvalds authored
      Pull overlayfs fixes from Miklos Szeredi:
       "This fixes a crash with SELinux and several other old and new bugs"
      
      * 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
        ovl: check for bad and whiteout index on lookup
        ovl: do not cleanup directory and whiteout index entries
        ovl: fix xattr get and set with selinux
        ovl: remove unneeded check for IS_ERR()
        ovl: fix origin verification of index dir
        ovl: mark parent impure on ovl_link()
        ovl: fix random return value on mount
      99313414
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.dk/linux-block · 0151ef00
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "A small set of fixes for -rc2 - two fixes for BFQ, documentation and
        code, and a removal of an unused variable in nbd. Outside of that, a
        small collection of fixes from the usual crew on the nvme side"
      
      * 'for-linus' of git://git.kernel.dk/linux-block:
        nvmet: don't report 0-bytes in serial number
        nvmet: preserve controller serial number between reboots
        nvmet: Move serial number from controller to subsystem
        nvmet: prefix version configfs file with attr
        nvme-pci: Fix an error handling path in 'nvme_probe()'
        nvme-pci: Remove nvme_setup_prps BUG_ON
        nvme-pci: add another device ID with stripe quirk
        nvmet-fc: fix byte swapping in nvmet_fc_ls_create_association
        nvme: fix byte swapping in the streams code
        nbd: kill unused ret in recv_work
        bfq: dispatch request to prevent queue stalling after the request completion
        bfq: fix typos in comments about B-WF2Q+ algorithm
      0151ef00
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma · bb236dbe
      Linus Torvalds authored
      Pull more rdma fixes from Doug Ledford:
       "As per my previous pull request, there were two drivers that each had
        a rather large number of legitimate fixes still to be sent.
      
        As it turned out, I also missed a reasonably large set of fixes from
        one person across the stack that are all important fixes. All in all,
        the bnxt_re, i40iw, and Dan Carpenter are 3/4 to 2/3rds of this pull
        request.
      
        There were some other random fixes that I didn't send in the last pull
        request that I added to this one. This catches the rdma stack up to
        the fixes from up to about the beginning of this week. Any more fixes
        I'll wait and batch up later in the -rc cycle. This will give us a
        good base to start with for basing a for-next branch on -rc2.
      
        Summary:
      
         - i40iw fixes
      
         - bnxt_re fixes
      
         - Dan Carpenter bugfixes across stack
      
         - ten more random fixes, no more than two from any one person"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (37 commits)
        RDMA/core: Initialize port_num in qp_attr
        RDMA/uverbs: Fix the check for port number
        IB/cma: Fix reference count leak when no ipv4 addresses are set
        RDMA/iser: don't send an rkey if all data is written as immadiate-data
        rxe: fix broken receive queue draining
        RDMA/qedr: Prevent memory overrun in verbs' user responses
        iw_cxgb4: don't use WR keys/addrs for 0 byte reads
        IB/mlx4: Fix CM REQ retries in paravirt mode
        IB/rdmavt: Setting of QP timeout can overflow jiffies computation
        IB/core: Fix sparse warnings
        RDMA/bnxt_re: Fix the value reported for local ack delay
        RDMA/bnxt_re: Report MISSED_EVENTS in req_notify_cq
        RDMA/bnxt_re: Fix return value of poll routine
        RDMA/bnxt_re: Enable atomics only if host bios supports
        RDMA/bnxt_re: Specify RDMA component when allocating stats context
        RDMA/bnxt_re: Fixed the max_rd_atomic support for initiator and destination QP
        RDMA/bnxt_re: Report supported value to IB stack in query_device
        RDMA/bnxt_re: Do not free the ctx_tbl entry if delete GID fails
        RDMA/bnxt_re: Fix WQE Size posted to HW to prevent it from throwing error
        RDMA/bnxt_re: Free doorbell page index (DPI) during dealloc ucontext
        ...
      bb236dbe
    • Linus Torvalds's avatar
      Merge tag 'drm-fixes-for-v4.13-rc2' of git://people.freedesktop.org/~airlied/linux · 24a1635a
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "A bunch of fixes for rc2: two imx regressions, vc4 fix, dma-buf fix,
        some displayport mst fixes, and an amdkfd fix.
      
        Nothing too crazy, I assume we just haven't see much rc1 testing yet"
      
      * tag 'drm-fixes-for-v4.13-rc2' of git://people.freedesktop.org/~airlied/linux:
        drm/mst: Avoid processing partially received up/down message transactions
        drm/mst: Avoid dereferencing a NULL mstb in drm_dp_mst_handle_up_req()
        drm/mst: Fix error handling during MST sideband message reception
        drm/imx: parallel-display: Accept drm_of_find_panel_or_bridge failure
        drm/imx: fix typo in ipu_plane_formats[]
        drm/vc4: Fix VBLANK handling in crtc->enable() path
        dma-buf/fence: Avoid use of uninitialised timestamp
        drm/amdgpu: Remove unused field kgd2kfd_shared_resources.num_mec
        drm/radeon: Remove initialization of shared_resources.num_mec
        drm/amdkfd: Remove unused references to shared_resources.num_mec
        drm/amdgpu: Fix KFD oversubscription by tracking queues correctly
      24a1635a
    • Linus Torvalds's avatar
      Merge tag 'trace-v4.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · f79ec886
      Linus Torvalds authored
      Pull tracing fixes from Steven Rostedt:
       "Three minor updates
      
         - Use the new GFP_RETRY_MAYFAIL to be more aggressive in allocating
           memory for the ring buffer without causing OOMs
      
         - Fix a memory leak in adding and removing instances
      
         - Add __rcu annotation to be able to debug RCU usage of function
           tracing a bit better"
      
      * tag 'trace-v4.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        trace: fix the errors caused by incompatible type of RCU variables
        tracing: Fix kmemleak in instance_rmdir
        tracing/ring_buffer: Try harder to allocate
      f79ec886
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · b0a75281
      Linus Torvalds authored
      Pull KVM fixes from Radim Krčmář:
       "A bunch of small fixes for x86"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        kvm: x86: hyperv: avoid livelock in oneshot SynIC timers
        KVM: VMX: Fix invalid guest state detection after task-switch emulation
        x86: add MULTIUSER dependency for KVM
        KVM: nVMX: Disallow VM-entry in MOV-SS shadow
        KVM: nVMX: track NMI blocking state separately for each VMCS
        KVM: x86: masking out upper bits
      b0a75281
    • Linus Torvalds's avatar
      Merge tag 'powerpc-4.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 10fc9554
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
       "A handful of fixes, mostly for new code:
      
         - some reworking of the new STRICT_KERNEL_RWX support to make sure we
           also remove executable permission from __init memory before it's
           freed.
      
         - a fix to some recent optimisations to the hypercall entry where we
           were clobbering r12, this was breaking nested guests (PR KVM).
      
         - a fix for the recent patch to opal_configure_cores(). This could
           break booting on bare metal Power8 boxes if the kernel was built
           without CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG.
      
         - .. and finally a workaround for spurious PMU interrupts on Power9
           DD2.
      
        Thanks to: Nicholas Piggin, Anton Blanchard, Balbir Singh"
      
      * tag 'powerpc-4.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/mm: Mark __init memory no-execute when STRICT_KERNEL_RWX=y
        powerpc/mm/hash: Refactor hash__mark_rodata_ro()
        powerpc/mm/radix: Refactor radix__mark_rodata_ro()
        powerpc/64s: Fix hypercall entry clobbering r12 input
        powerpc/perf: Avoid spurious PMU interrupts after idle
        powerpc/powernv: Fix boot on Power8 bare metal due to opal_configure_cores()
      10fc9554
    • Linus Torvalds's avatar
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 4ec9f7a1
      Linus Torvalds authored
      Pull x86 fixes from Ingo Molnar:
       "Half of the fixes are for various build time warnings triggered by
        randconfig builds. Most (but not all...) were harmless.
      
        There's also:
      
         - ACPI boundary condition fixes
      
         - UV platform fixes
      
         - defconfig updates
      
         - an AMD K6 CPU init fix
      
         - a %pOF printk format related preparatory change
      
         - .. and a warning fix related to the tlb/PCID changes"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/devicetree: Convert to using %pOF instead of ->full_name
        x86/platform/uv/BAU: Disable BAU on single hub configurations
        x86/platform/intel-mid: Fix a format string overflow warning
        x86/platform: Add PCI dependency for PUNIT_ATOM_DEBUG
        x86/build: Silence the build with "make -s"
        x86/io: Add "memory" clobber to insb/insw/insl/outsb/outsw/outsl
        x86/fpu/math-emu: Avoid bogus -Wint-in-bool-context warning
        x86/fpu/math-emu: Fix possible uninitialized variable use
        perf/x86: Shut up false-positive -Wmaybe-uninitialized warning
        x86/defconfig: Remove stale, old Kconfig options
        x86/ioapic: Pass the correct data to unmask_ioapic_irq()
        x86/acpi: Prevent out of bound access caused by broken ACPI tables
        x86/mm, KVM: Fix warning when !CONFIG_PREEMPT_COUNT
        x86/platform/uv/BAU: Fix congested_response_us not taking effect
        x86/cpu: Use indirect call to measure performance in init_amd_k6()
      4ec9f7a1
    • Linus Torvalds's avatar
      Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e234b4a8
      Linus Torvalds authored
      Pull timer fix from Ingo Molnar:
       "A timer_irq_init() clocksource API robustness fix"
      
      * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        clocksource/drivers/timer-of: Handle of_irq_get_byname() result correctly
      e234b4a8
    • Linus Torvalds's avatar
      Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 5a77f025
      Linus Torvalds authored
      Pull scheduler fixes from Ingo Molnar:
       "A cputime fix and code comments/organization fix to the deadline
        scheduler"
      
      * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched/deadline: Fix confusing comments about selection of top pi-waiter
        sched/cputime: Don't use smp_processor_id() in preemptible context
      5a77f025
    • Linus Torvalds's avatar
      Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · bbcdea65
      Linus Torvalds authored
      Pull perf fixes from Ingo Molnar:
       "Two hw-enablement patches, two race fixes, three fixes for regressions
        of semantics, plus a number of tooling fixes"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf/x86/intel: Add proper condition to run sched_task callbacks
        perf/core: Fix locking for children siblings group read
        perf/core: Fix scheduling regression of pinned groups
        perf/x86/intel: Fix debug_store reset field for freq events
        perf/x86/intel: Add Goldmont Plus CPU PMU support
        perf/x86/intel: Enable C-state residency events for Apollo Lake
        perf symbols: Accept zero as the kernel base address
        Revert "perf/core: Drop kernel samples even though :u is specified"
        perf annotate: Fix broken arrow at row 0 connecting jmp instruction to its target
        perf evsel: State in the default event name if attr.exclude_kernel is set
        perf evsel: Fix attr.exclude_kernel setting for default cycles:p
      bbcdea65
    • Linus Torvalds's avatar
      Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 8b810a3a
      Linus Torvalds authored
      Pull locking fixlet from Ingo Molnar:
       "Remove an unnecessary priority adjustment in the rtmutex code"
      
      * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        locking/rtmutex: Remove unnecessary priority adjustment
      8b810a3a
    • Trond Myklebust's avatar
      NFS/filelayout: Fix racy setting of fl->dsaddr in filelayout_check_deviceid() · 1ebf9801
      Trond Myklebust authored
      We must set fl->dsaddr once, and once only, even if there are multiple
      processes calling filelayout_check_deviceid() for the same layout
      segment.
      Reported-by: default avatarOlga Kornievskaia <kolga@netapp.com>
      Signed-off-by: default avatarTrond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: default avatarAnna Schumaker <Anna.Schumaker@Netapp.com>
      1ebf9801
    • Linus Torvalds's avatar
      Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 34eddefe
      Linus Torvalds authored
      Pull irq fixes from Ingo Molnar:
       "A resume_irq() fix, plus a number of static declaration fixes"
      
      * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        irqchip/digicolor: Drop unnecessary static
        irqchip/mips-cpu: Drop unnecessary static
        irqchip/gic/realview: Drop unnecessary static
        irqchip/mips-gic: Remove population of irq domain names
        genirq/PM: Properly pretend disabled state when force resuming interrupts
      34eddefe
    • Linus Torvalds's avatar
      Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 0a6109fd
      Linus Torvalds authored
      Pull core fixes from Ingo Molnar:
       "A fix to WARN_ON_ONCE() done by modules, plus a MAINTAINERS update"
      
      * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        debug: Fix WARN_ON_ONCE() for modules
        MAINTAINERS: Update the PTRACE entry
      0a6109fd
    • Trond Myklebust's avatar
      NFS: Be more careful about mapping file permissions · ecbb903c
      Trond Myklebust authored
      When mapping a directory, we want the MAY_WRITE permissions to reflect
      whether or not we have permission to modify, add and delete the directory
      entries. MAY_EXEC must map to lookup permissions.
      
      On the other hand, for files, we want MAY_WRITE to reflect a permission
      to modify and extend the file.
      Signed-off-by: default avatarTrond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: default avatarAnna Schumaker <Anna.Schumaker@Netapp.com>
      ecbb903c
    • Trond Myklebust's avatar
    • Trond Myklebust's avatar
    • Trond Myklebust's avatar
    • NeilBrown's avatar
      net/sunrpc/xprt_sock: fix regression in connection error reporting. · 3ffbc1d6
      NeilBrown authored
      Commit 3d476263 ("tcp: remove poll() flakes when receiving
      RST") in v4.12 changed the order in which ->sk_state_change()
      and ->sk_error_report() are called when a socket is shut
      down - sk_state_change() is now called first.
      
      This causes xs_tcp_state_change() -> xs_sock_mark_closed() ->
      xprt_disconnect_done() to wake all pending tasked with -EAGAIN.
      When the ->sk_error_report() callback arrives, it is too late to
      pass the error on, and it is lost.
      
      As easy way to demonstrate the problem caused is to try to start
      rpc.nfsd while rcpbind isn't running.
      nfsd will attempt a tcp connection to rpcbind.  A ECONNREFUSED
      error is returned, but sunrpc code loses the error and keeps
      retrying.  If it saw the ECONNREFUSED, it would abort.
      
      To fix this, handle the sk->sk_err in the TCP_CLOSE branch of
      xs_tcp_state_change().
      
      Fixes: 3d476263 ("tcp: remove poll() flakes when receiving RST")
      Cc: stable@vger.kernel.org (v4.12)
      Signed-off-by: default avatarNeilBrown <neilb@suse.com>
      Signed-off-by: default avatarAnna Schumaker <Anna.Schumaker@Netapp.com>
      3ffbc1d6
    • Eryu Guan's avatar
      nfs: count correct array for mnt3_counts array size · ecc7b435
      Eryu Guan authored
      Array size of mnt3_counts should be the size of array
      mnt3_procedures, not mnt_procedures, though they're same in size
      right now. Found this by code inspection.
      
      Fixes: 1c5876dd ("sunrpc: move p_count out of struct rpc_procinfo")
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarEryu Guan <eguan@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarAnna Schumaker <Anna.Schumaker@Netapp.com>
      ecc7b435
    • Rob Herring's avatar
      x86/devicetree: Convert to using %pOF instead of ->full_name · db15e7f2
      Rob Herring authored
      Now that we have a custom printf format specifier, convert users of
      full_name to use %pOF instead. This is preparation to remove storing
      of the full path string for each device node.
      Signed-off-by: default avatarRob Herring <robh@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: devicetree@vger.kernel.org
      Link: http://lkml.kernel.org/r/20170718214339.7774-7-robh@kernel.org
      [ Clarify the error message while at it, as 'node' is ambiguous. ]
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      db15e7f2
    • Jiri Olsa's avatar
      perf/x86/intel: Add proper condition to run sched_task callbacks · df6c3db8
      Jiri Olsa authored
      We have 2 functions using the same sched_task callback:
      
        - PEBS drain for free running counters
        - LBR save/store
      
      Both of them are called from intel_pmu_sched_task() and
      either of them can be unwillingly triggered when the
      other one is configured to run.
      
      Let's say there's PEBS drain configured in sched_task
      callback for the event, but in the callback itself
      (intel_pmu_sched_task()) we will also run the code for
      LBR save/restore, which we did not ask for, but the
      code in intel_pmu_sched_task() does not check for that.
      
      This can lead to extra cycles in some perf monitoring,
      like when we monitor PEBS event without LBR data.
      
        # perf record --no-timestamp -c 10000 -e cycles:p ./perf bench sched pipe -l 1000000
      
        (We need PEBS, non freq/non timestamp event to enable
         the sched_task callback)
      
      The perf stat of cycles and msr:write_msr for above
      command before the change:
        ...
        Performance counter stats for './perf record --no-timestamp -c 10000 -e cycles:p \
                                       ./perf bench sched pipe -l 1000000' (5 runs):
      
          18,519,557,441      cycles:k
              91,195,527      msr:write_msr
      
            29.334476406 seconds time elapsed
      
      And after the change:
        ...
        Performance counter stats for './perf record --no-timestamp -c 10000 -e cycles:p \
                                       ./perf bench sched pipe -l 1000000' (5 runs):
      
          18,704,973,540      cycles:k
              27,184,720      msr:write_msr
      
            16.977875900 seconds time elapsed
      
      There's no affect on cycles:k because the sched_task happens
      with events switched off, however the msr:write_msr tracepoint
      counter together with almost 50% of time speedup show the
      improvement.
      
      Monitoring LBR event and having extra PEBS drain processing
      in sched_task callback showed just a little speedup, because
      the drain function does not do much extra work in case there
      is no PEBS data.
      
      Adding conditions to recognize the configured work that needs
      to be done in the x86_pmu's sched_task callback.
      Suggested-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Acked-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20170719075247.GA27506@kravaSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      df6c3db8
    • Andrew Banman's avatar
      x86/platform/uv/BAU: Disable BAU on single hub configurations · 2fe9a5c6
      Andrew Banman authored
      The BAU confers no benefit to a UV system running with only one hub/socket.
      Permanently disable the BAU driver if there are less than two hubs online
      to avoid BAU overhead. We have observed failed boots on single-socket UV4
      systems caused by BAU that are avoided with this patch.
      
      Also, while at it, consolidate initialization error blocks and fix a
      memory leak.
      Signed-off-by: default avatarAndrew Banman <abanman@hpe.com>
      Acked-by: default avatarRuss Anderson <rja@hpe.com>
      Acked-by: default avatarMike Travis <mike.travis@hpe.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: tony.ernst@hpe.com
      Link: http://lkml.kernel.org/r/1500588351-78016-1-git-send-email-abanman@hpe.com
      [ Minor cleanups. ]
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      2fe9a5c6
    • Jiri Olsa's avatar
      perf/core: Fix locking for children siblings group read · 2aeb1883
      Jiri Olsa authored
      We're missing ctx lock when iterating children siblings
      within the perf_read path for group reading. Following
      race and crash can happen:
      
      User space doing read syscall on event group leader:
      
      T1:
        perf_read
          lock event->ctx->mutex
          perf_read_group
            lock leader->child_mutex
            __perf_read_group_add(child)
              list_for_each_entry(sub, &leader->sibling_list, group_entry)
      
      ---->   sub might be invalid at this point, because it could
              get removed via perf_event_exit_task_context in T2
      
      Child exiting and cleaning up its events:
      
      T2:
        perf_event_exit_task_context
          lock ctx->mutex
          list_for_each_entry_safe(child_event, next, &child_ctx->event_list,...
            perf_event_exit_event(child)
              lock ctx->lock
              perf_group_detach(child)
              unlock ctx->lock
      
      ---->   child is removed from sibling_list without any sync
              with T1 path above
      
              ...
              free_event(child)
      
      Before the child is removed from the leader's child_list,
      (and thus is omitted from perf_read_group processing), we
      need to ensure that perf_read_group touches child's
      siblings under its ctx->lock.
      
      Peter further notes:
      
      | One additional note; this bug got exposed by commit:
      |
      |   ba5213ae ("perf/core: Correct event creation with PERF_FORMAT_GROUP")
      |
      | which made it possible to actually trigger this code-path.
      Tested-by: default avatarAndi Kleen <ak@linux.intel.com>
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Acked-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: ba5213ae ("perf/core: Correct event creation with PERF_FORMAT_GROUP")
      Link: http://lkml.kernel.org/r/20170720141455.2106-1-jolsa@kernel.orgSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      2aeb1883