1. 09 Feb, 2017 20 commits
  2. 08 Feb, 2017 6 commits
    • Bryant G. Ly's avatar
      ibmvscsis: Add SGL limit · b22bc278
      Bryant G. Ly authored
      This patch adds internal LIO sgl limit since the driver already
      sets a max transfer limit on transport layer of 1MB to the client.
      
      Cc: stable@vger.kernel.org
      Tested-by: default avatarSteven Royer <seroyer@linux.vnet.ibm.com>
      Signed-off-by: default avatarBryant G. Ly <bryantly@linux.vnet.ibm.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      b22bc278
    • Nicholas Bellinger's avatar
      target: Fix COMPARE_AND_WRITE ref leak for non GOOD status · 9b2792c3
      Nicholas Bellinger authored
      This patch addresses a long standing bug where the commit phase
      of COMPARE_AND_WRITE would result in a se_cmd->cmd_kref reference
      leak if se_cmd->scsi_status returned non SAM_STAT_GOOD.
      
      This would manifest first as a lost SCSI response, and eventual
      hung task during fabric driver logout or re-login, as existing
      shutdown logic waited for the COMPARE_AND_WRITE se_cmd->cmd_kref
      to reach zero.
      
      To address this bug, compare_and_write_post() has been changed
      to drop the incorrect !cmd->scsi_status conditional that was
      preventing *post_ret = 1 for being set during non SAM_STAT_GOOD
      status.
      
      This patch has been tested with SAM_STAT_CHECK_CONDITION status
      from normal target_complete_cmd() callback path, as well as the
      incoming __target_execute_cmd() submission failure path when
      se_cmd->execute_cmd() returns non zero status.
      Reported-by: default avatarDonald White <dew@datera.io>
      Cc: Donald White <dew@datera.io>
      Tested-by: default avatarGary Guo <ghg@datera.io>
      Cc: Gary Guo <ghg@datera.io>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Cc: <stable@vger.kernel.org> # v3.12+
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      9b2792c3
    • Nicholas Bellinger's avatar
      target: Fix multi-session dynamic se_node_acl double free OOPs · 01d4d673
      Nicholas Bellinger authored
      This patch addresses a long-standing bug with multi-session
      (eg: iscsi-target + iser-target) se_node_acl dynamic free
      withini transport_deregister_session().
      
      This bug is caused when a storage endpoint is configured with
      demo-mode (generate_node_acls = 1 + cache_dynamic_acls = 1)
      initiators, and initiator login creates a new dynamic node acl
      and attaches two sessions to it.
      
      After that, demo-mode for the storage instance is disabled via
      configfs (generate_node_acls = 0 + cache_dynamic_acls = 0) and
      the existing dynamic acl is never converted to an explicit ACL.
      
      The end result is dynamic acl resources are released twice when
      the sessions are shutdown in transport_deregister_session().
      
      If the storage instance is not changed to disable demo-mode,
      or the dynamic acl is converted to an explict ACL, or there
      is only a single session associated with the dynamic ACL,
      the bug is not triggered.
      
      To address this big, move the release of dynamic se_node_acl
      memory into target_complete_nacl() so it's only freed once
      when se_node_acl->acl_kref reaches zero.
      
      (Drop unnecessary list_del_init usage - HCH)
      Reported-by: default avatarRob Millner <rlm@daterainc.com>
      Tested-by: default avatarRob Millner <rlm@daterainc.com>
      Cc: Rob Millner <rlm@daterainc.com>
      Cc: stable@vger.kernel.org # 4.1+
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      01d4d673
    • Nicholas Bellinger's avatar
      target: Fix early transport_generic_handle_tmr abort scenario · c54eeffb
      Nicholas Bellinger authored
      This patch fixes a bug where incoming task management requests
      can be explicitly aborted during an active LUN_RESET, but who's
      struct work_struct are canceled in-flight before execution.
      
      This occurs when core_tmr_drain_tmr_list() invokes cancel_work_sync()
      for the incoming se_tmr_req->task_cmd->work, resulting in cmd->work
      for target_tmr_work() never getting invoked and the aborted TMR
      waiting indefinately within transport_wait_for_tasks().
      
      To address this case, perform a CMD_T_ABORTED check early in
      transport_generic_handle_tmr(), and invoke the normal path via
      transport_cmd_check_stop_to_fabric() to complete any TMR kthreads
      blocked waiting for CMD_T_STOP in transport_wait_for_tasks().
      
      Also, move the TRANSPORT_ISTATE_PROCESSING assignment earlier
      into transport_generic_handle_tmr() so the existing check in
      core_tmr_drain_tmr_list() avoids attempting abort the incoming
      se_tmr_req->task_cmd->work if it has already been queued into
      se_device->tmr_wq.
      Reported-by: default avatarRob Millner <rlm@daterainc.com>
      Tested-by: default avatarRob Millner <rlm@daterainc.com>
      Cc: Rob Millner <rlm@daterainc.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Cc: stable@vger.kernel.org # 3.14+
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      c54eeffb
    • Nicholas Bellinger's avatar
      target: Use correct SCSI status during EXTENDED_COPY exception · 0583c261
      Nicholas Bellinger authored
      This patch adds the missing target_complete_cmd() SCSI status
      parameter change in target_xcopy_do_work(), that was originally
      missing in commit 926317de.
      
      It correctly propigates up the correct SCSI status during
      EXTENDED_COPY exception cases, instead of always using the
      hardcoded SAM_STAT_CHECK_CONDITION from original code.
      
      This is required for ESX host environments that expect to
      hit SAM_STAT_RESERVATION_CONFLICT for certain scenarios,
      and SAM_STAT_CHECK_CONDITION results in non-retriable
      status for these cases.
      Reported-by: default avatarNixon Vincent <nixon.vincent@calsoftinc.com>
      Tested-by: default avatarNixon Vincent <nixon.vincent@calsoftinc.com>
      Cc: Nixon Vincent <nixon.vincent@calsoftinc.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Cc: stable@vger.kernel.org # 3.14+
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      0583c261
    • Nicholas Bellinger's avatar
      target: Don't BUG_ON during NodeACL dynamic -> explicit conversion · 391e2a6d
      Nicholas Bellinger authored
      After the v4.2+ RCU conversion to se_node_acl->lun_entry_hlist,
      a BUG_ON() was added in core_enable_device_list_for_node() to
      detect when the located orig->se_lun_acl contains an existing
      se_lun_acl pointer reference.
      
      However, this scenario can happen when a dynamically generated
      NodeACL is being converted to an explicit NodeACL, when the
      explicit NodeACL contains a different LUN mapping than the
      default provided by the WWN endpoint.
      
      So instead of triggering BUG_ON(), go ahead and fail instead
      following the original pre RCU conversion logic.
      Reported-by: default avatarBenjamin ESTRABAUD <ben.estrabaud@mpstor.com>
      Cc: Benjamin ESTRABAUD <ben.estrabaud@mpstor.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Cc: stable@vger.kernel.org # 4.2+
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      391e2a6d
  3. 05 Feb, 2017 1 commit
  4. 04 Feb, 2017 6 commits
  5. 03 Feb, 2017 7 commits
    • Linus Torvalds's avatar
      Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost · a49e6f58
      Linus Torvalds authored
      Pull virtio/vhost fixes from Michael S. Tsirkin:
       "Last minute fixes:
      
         - ARM DMA fix revert
      
         - vhost endian-ness fix
      
         - MAINTAINERS: email address change for Amit"
      
      * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
        MAINTAINERS: update email address for Amit Shah
        vhost: fix initialization for vq->is_le
        Revert "vring: Force use of DMA API for ARM-based systems with legacy devices"
      a49e6f58
    • Linus Torvalds's avatar
      Merge tag 'vfio-v4.10-rc7' of git://github.com/awilliam/linux-vfio · e9f7f17d
      Linus Torvalds authored
      Pull VFIO fix from Alex Williamson:
       "Fix an error path in SPAPR IOMMU backend (Alexey Kardashevskiy)"
      
      * tag 'vfio-v4.10-rc7' of git://github.com/awilliam/linux-vfio:
        vfio/spapr: Fix missing mutex unlock when creating a window
      e9f7f17d
    • Linus Torvalds's avatar
      Merge branch 'akpm' (patches from Andrew) · 7a92cc6b
      Linus Torvalds authored
      Merge fixes from Andrew Morton:
       "8 fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        mm, fs: check for fatal signals in do_generic_file_read()
        fs: break out of iomap_file_buffered_write on fatal signals
        base/memory, hotplug: fix a kernel oops in show_valid_zones()
        mm/memory_hotplug.c: check start_pfn in test_pages_in_a_zone()
        jump label: pass kbuild_cflags when checking for asm goto support
        shmem: fix sleeping from atomic context
        kasan: respect /proc/sys/kernel/traceoff_on_warning
        zswap: disable changing params if init fails
      7a92cc6b
    • Michal Hocko's avatar
      mm, fs: check for fatal signals in do_generic_file_read() · 5abf186a
      Michal Hocko authored
      do_generic_file_read() can be told to perform a large request from
      userspace.  If the system is under OOM and the reading task is the OOM
      victim then it has an access to memory reserves and finishing the full
      request can lead to the full memory depletion which is dangerous.  Make
      sure we rather go with a short read and allow the killed task to
      terminate.
      
      Link: http://lkml.kernel.org/r/20170201092706.9966-3-mhocko@kernel.orgSigned-off-by: default avatarMichal Hocko <mhocko@suse.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      5abf186a
    • Michal Hocko's avatar
      fs: break out of iomap_file_buffered_write on fatal signals · d1908f52
      Michal Hocko authored
      Tetsuo has noticed that an OOM stress test which performs large write
      requests can cause the full memory reserves depletion.  He has tracked
      this down to the following path
      
      	__alloc_pages_nodemask+0x436/0x4d0
      	alloc_pages_current+0x97/0x1b0
      	__page_cache_alloc+0x15d/0x1a0          mm/filemap.c:728
      	pagecache_get_page+0x5a/0x2b0           mm/filemap.c:1331
      	grab_cache_page_write_begin+0x23/0x40   mm/filemap.c:2773
      	iomap_write_begin+0x50/0xd0             fs/iomap.c:118
      	iomap_write_actor+0xb5/0x1a0            fs/iomap.c:190
      	? iomap_write_end+0x80/0x80             fs/iomap.c:150
      	iomap_apply+0xb3/0x130                  fs/iomap.c:79
      	iomap_file_buffered_write+0x68/0xa0     fs/iomap.c:243
      	? iomap_write_end+0x80/0x80
      	xfs_file_buffered_aio_write+0x132/0x390 [xfs]
      	? remove_wait_queue+0x59/0x60
      	xfs_file_write_iter+0x90/0x130 [xfs]
      	__vfs_write+0xe5/0x140
      	vfs_write+0xc7/0x1f0
      	? syscall_trace_enter+0x1d0/0x380
      	SyS_write+0x58/0xc0
      	do_syscall_64+0x6c/0x200
      	entry_SYSCALL64_slow_path+0x25/0x25
      
      the oom victim has access to all memory reserves to make a forward
      progress to exit easier.  But iomap_file_buffered_write and other
      callers of iomap_apply loop to complete the full request.  We need to
      check for fatal signals and back off with a short write instead.
      
      As the iomap_apply delegates all the work down to the actor we have to
      hook into those.  All callers that work with the page cache are calling
      iomap_write_begin so we will check for signals there.  dax_iomap_actor
      has to handle the situation explicitly because it copies data to the
      userspace directly.  Other callers like iomap_page_mkwrite work on a
      single page or iomap_fiemap_actor do not allocate memory based on the
      given len.
      
      Fixes: 68a9f5e7 ("xfs: implement iomap based buffered write path")
      Link: http://lkml.kernel.org/r/20170201092706.9966-2-mhocko@kernel.orgSigned-off-by: default avatarMichal Hocko <mhocko@suse.com>
      Reported-by: default avatarTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: <stable@vger.kernel.org>	[4.8+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      d1908f52
    • Toshi Kani's avatar
      base/memory, hotplug: fix a kernel oops in show_valid_zones() · a96dfddb
      Toshi Kani authored
      Reading a sysfs "memoryN/valid_zones" file leads to the following oops
      when the first page of a range is not backed by struct page.
      show_valid_zones() assumes that 'start_pfn' is always valid for
      page_zone().
      
       BUG: unable to handle kernel paging request at ffffea017a000000
       IP: show_valid_zones+0x6f/0x160
      
      This issue may happen on x86-64 systems with 64GiB or more memory since
      their memory block size is bumped up to 2GiB.  [1] An example of such
      systems is desribed below.  0x3240000000 is only aligned by 1GiB and
      this memory block starts from 0x3200000000, which is not backed by
      struct page.
      
       BIOS-e820: [mem 0x0000003240000000-0x000000603fffffff] usable
      
      Since test_pages_in_a_zone() already checks holes, fix this issue by
      extending this function to return 'valid_start' and 'valid_end' for a
      given range.  show_valid_zones() then proceeds with the valid range.
      
      [1] 'Commit bdee237c ("x86: mm: Use 2GB memory block size on
          large-memory x86-64 systems")'
      
      Link: http://lkml.kernel.org/r/20170127222149.30893-3-toshi.kani@hpe.comSigned-off-by: default avatarToshi Kani <toshi.kani@hpe.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Zhang Zhen <zhenzhang.zhang@huawei.com>
      Cc: Reza Arbab <arbab@linux.vnet.ibm.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: <stable@vger.kernel.org>	[4.4+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      a96dfddb
    • Toshi Kani's avatar
      mm/memory_hotplug.c: check start_pfn in test_pages_in_a_zone() · deb88a2a
      Toshi Kani authored
      Patch series "fix a kernel oops when reading sysfs valid_zones", v2.
      
      A sysfs memory file is created for each 2GiB memory block on x86-64 when
      the system has 64GiB or more memory.  [1] When the start address of a
      memory block is not backed by struct page, i.e.  a memory range is not
      aligned by 2GiB, reading its 'valid_zones' attribute file leads to a
      kernel oops.  This issue was observed on multiple x86-64 systems with
      more than 64GiB of memory.  This patch-set fixes this issue.
      
      Patch 1 first fixes an issue in test_pages_in_a_zone(), which does not
      test the start section.
      
      Patch 2 then fixes the kernel oops by extending test_pages_in_a_zone()
      to return valid [start, end).
      
      Note for stable kernels: The memory block size change was made by commit
      bdee237c ("x86: mm: Use 2GB memory block size on large-memory x86-64
      systems"), which was accepted to 3.9.  However, this patch-set depends
      on (and fixes) the change to test_pages_in_a_zone() made by commit
      5f0f2887 ("mm/memory_hotplug.c: check for missing sections in
      test_pages_in_a_zone()"), which was accepted to 4.4.
      
      So, I recommend that we backport it up to 4.4.
      
      [1] 'Commit bdee237c ("x86: mm: Use 2GB memory block size on
          large-memory x86-64 systems")'
      
      This patch (of 2):
      
      test_pages_in_a_zone() does not check 'start_pfn' when it is aligned by
      section since 'sec_end_pfn' is set equal to 'pfn'.  Since this function
      is called for testing the range of a sysfs memory file, 'start_pfn' is
      always aligned by section.
      
      Fix it by properly setting 'sec_end_pfn' to the next section pfn.
      
      Also make sure that this function returns 1 only when the range belongs
      to a zone.
      
      Link: http://lkml.kernel.org/r/20170127222149.30893-2-toshi.kani@hpe.comSigned-off-by: default avatarToshi Kani <toshi.kani@hpe.com>
      Cc: Andrew Banman <abanman@sgi.com>
      Cc: Reza Arbab <arbab@linux.vnet.ibm.com>
      Cc: Greg KH <greg@kroah.com>
      Cc: <stable@vger.kernel.org>	[4.4+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      deb88a2a