1. 05 Feb, 2019 4 commits
    • scsi: cxlflash: Prevent deadlock when adapter probe fails · bb61b843
      Vaibhav Jain authored
      Presently, when an error is encountered during probe of the cxlflash
      adapter, a deadlock occurs with a CPU thread stuck inside
      cxlflash_remove(). Below is the trace of the deadlock as logged by
      khungtaskd:
      
      cxlflash 0006:00:00.0: cxlflash_probe: init_afu failed rc=-16
      INFO: task kworker/80:1:890 blocked for more than 120 seconds.
             Not tainted 5.0.0-rc4-capi2-kexec+ #2
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      kworker/80:1    D    0   890      2 0x00000808
      Workqueue: events work_for_cpu_fn
      
      Call Trace:
       0x4d72136320 (unreliable)
       __switch_to+0x2cc/0x460
       __schedule+0x2bc/0xac0
       schedule+0x40/0xb0
       cxlflash_remove+0xec/0x640 [cxlflash]
       cxlflash_probe+0x370/0x8f0 [cxlflash]
       local_pci_probe+0x6c/0x140
       work_for_cpu_fn+0x38/0x60
       process_one_work+0x260/0x530
       worker_thread+0x280/0x5d0
       kthread+0x1a8/0x1b0
       ret_from_kernel_thread+0x5c/0x80
      INFO: task systemd-udevd:5160 blocked for more than 120 seconds.
      
      The deadlock occurs because cxlflash_remove() is called from
      cxlflash_probe() without first setting 'cxlflash_cfg->state' to
      STATE_PROBED, so the probe thread starts to wait on
      'cxlflash_cfg->reset_waitq'. Since the device was never successfully
      probed, 'cxlflash_cfg->state' never changes from STATE_PROBING and the
      wait never ends.
      
      Fix this deadlock by setting 'cxlflash_cfg->state' to STATE_PROBED when
      an error occurs during cxlflash_probe(), just before calling
      cxlflash_remove().
      
      Cc: stable@vger.kernel.org
      Fixes: c21e0bbf ("cxlflash: Base support for IBM CXL Flash Adapter")
      Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
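The cxlflash deadlock described above can be modelled in a few lines. This is a minimal user-space sketch, not the driver's code: the enum holds only the two states named in the commit message, and `remove_would_block()`, `probe_fail_old()` and `probe_fail_fixed()` are hypothetical helpers that stand in for the wait on `reset_waitq` and the two versions of the probe error path.

```c
#include <stdbool.h>

/* Hypothetical subset of the driver's states: only the two states named
 * in the commit message are modelled. */
enum cfg_state { STATE_PROBING, STATE_PROBED };

struct cfg { enum cfg_state state; };

/* Stand-in for the wait in the remove path: returns true if the caller
 * would sleep because the state is still STATE_PROBING. */
bool remove_would_block(const struct cfg *cfg)
{
    return cfg->state == STATE_PROBING;
}

/* Probe error path before the fix: the state is left at STATE_PROBING
 * when remove is called, so the same thread waits on itself. */
bool probe_fail_old(struct cfg *cfg)
{
    cfg->state = STATE_PROBING;
    /* ... initialization fails here ... */
    return remove_would_block(cfg);
}

/* Probe error path after the fix: mark the device as probed first. */
bool probe_fail_fixed(struct cfg *cfg)
{
    cfg->state = STATE_PROBING;
    /* ... initialization fails here ... */
    cfg->state = STATE_PROBED;
    return remove_would_block(cfg);
}
```

In this model `probe_fail_old()` reports that remove would block, while `probe_fail_fixed()` reports that it would proceed.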
    • Revert "scsi: libfc: Add WARN_ON() when deleting rports" · d8f6382a
      Ross Lagerwall authored
      This reverts commit bbc0f8bd.
      
      It added a warning whose intent was to check whether the rport was still
      linked into the peer list. It doesn't work as intended and gives false
      positive warnings for two reasons:
      
      1) If the rport is never linked into the peer list it will not be
      considered empty since the list_head is never initialized.
      
      2) If the rport is deleted from the peer list using list_del_rcu(), then
      the list_head is in an undefined state and it is not considered empty.
      
      Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
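The two false-positive cases in the revert above can be reproduced with a simplified user-space copy of the list primitives. This is a sketch under stated assumptions: the function names are hypothetical, the "never initialized" node is assumed to come from zeroed memory, and `list_del_rcu_like()` only mimics the property that matters here, namely that the removed node's `next` pointer is left in place for concurrent readers.

```c
#include <stddef.h>

/* Simplified user-space copy of a doubly linked list node. */
struct list_head { struct list_head *next, *prev; };

void init_list_head(struct list_head *h) { h->next = h; h->prev = h; }

/* A node counts as "empty" only if it points back at itself. */
int list_is_empty(const struct list_head *h) { return h->next == h; }

void list_add_front(struct list_head *n, struct list_head *head)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

/* Mimics an RCU-style delete: unlink from the neighbours but leave
 * n->next intact; NULL stands in for the poison value written to prev. */
void list_del_rcu_like(struct list_head *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->prev = NULL;
}

/* Case 1: node from zeroed memory (assumption), never initialized. */
int never_linked_looks_empty(void)
{
    struct list_head peers = { NULL, NULL };
    return list_is_empty(&peers);
}

/* Case 2: node linked, then removed with the RCU-style delete. */
int rcu_deleted_looks_empty(void)
{
    struct list_head head, peers;
    init_list_head(&head);
    list_add_front(&peers, &head);
    list_del_rcu_like(&peers);
    return list_is_empty(&peers);
}
```

Both helper functions return 0, so an emptiness check on such a node would fire a warning even though the node is not on any list.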
    • scsi: sd_zbc: Fix zone information messages · 88fc41c4
      Damien Le Moal authored
      Commit bf505456 ("block: Introduce blk_revalidate_disk_zones()")
      inadvertently broke the message output of sd_zbc_print_zones() because the
      zone information initialization of the scsi disk structure was moved to the
      second scan run while sd_zbc_print_zones() is called on the first
      scan. As a result, the following incorrect message is printed for any
      ZBC or ZAC zoned disk:
      
      "...[sdX] 4294967295 zones of 0 logical blocks + 1 runt zone"
      
      Fix this by initializing sdkp zone size and number of zones early on the
      first scan. This does not impact the execution of
      blk_revalidate_disk_zones(). This function is still called only once the
      block device capacity is set on the second revalidate run on boot, or if
      the disk zone configuration changed (i.e. the disk changed).
      
      Fixes: bf505456 ("block: Introduce blk_revalidate_disk_zones()")
      Cc: stable@vger.kernel.org
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
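The odd numbers in the sd_zbc message above are what unsigned 32-bit arithmetic produces when the zone fields are still zero. The sketch below is an illustration only: `format_zone_msg()` is a hypothetical helper, and the assumption that the message is built from a zone count minus one and a "capacity not a multiple of zone size" test is mine, not something stated in the commit message.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical formatter: assumes the zone count and zone size are
 * unsigned 32-bit fields and that a runt zone is reported when the
 * capacity is not a multiple of the (power-of-two) zone size. */
int format_zone_msg(char *buf, size_t len, uint64_t capacity,
                    uint32_t nr_zones, uint32_t zone_blocks)
{
    /* With zone_blocks == 0 the mask wraps to 0xffffffff. */
    if (capacity & (uint64_t)(uint32_t)(zone_blocks - 1))
        return snprintf(buf, len,
                        "%u zones of %u logical blocks + 1 runt zone",
                        (unsigned)(uint32_t)(nr_zones - 1), /* 0 - 1 wraps */
                        (unsigned)zone_blocks);
    return snprintf(buf, len, "%u zones of %u logical blocks",
                    (unsigned)nr_zones, (unsigned)zone_blocks);
}
```

Calling the helper with a non-zero capacity and both zone fields at zero yields the text quoted in the commit message, which is consistent with the fields not yet being initialized on the first scan.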
    • scsi: target: make the pi_prot_format ConfigFS path readable · b6cd7f34
      David Disseldorp authored
      The conversion of pi_prot_format to write-only caused userspace
      breakage. Make the ConfigFS path readable again and hardcode the "0\n"
      content, matching previous output.
      
      Fixes: 6baca760 ("scsi: target: drop unused pi_prot_format attribute storage")
      Link: https://bugzilla.redhat.com/show_bug.cgi?id=1667505
      Reported-by: Lee Duncan <lduncan@suse.com>
      Reported-by: Laura Abbott <labbott@redhat.com>
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: David Disseldorp <ddiss@suse.de>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  2. 01 Feb, 2019 1 commit
    • scsi: aic94xx: fix module loading · 42caa0ed
      James Bottomley authored
      The aic94xx driver is currently failing to load with errors like
      
      sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:03.0/0000:02:00.3/0000:07:02.0/revision'
      
      This happens because the PCI code recently added a file named
      'revision' to every PCI device.  Fix this by renaming the aic94xx
      revision file to aic_revision.  This is safe because, as far as I can
      tell, nothing in userspace relies on the current aic94xx revision file,
      so it can be renamed without breaking anything.
      
      Fixes: 702ed3be ("PCI: Create revision file in sysfs")
      Cc: stable@vger.kernel.org
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  3. 29 Jan, 2019 5 commits
    • scsi: 53c700: pass correct "dev" to dma_alloc_attrs() · 8437fcf1
      Dan Carpenter authored
      The "hostdata->dev" pointer is NULL here.  We set "hostdata->dev = dev;"
      later in the function and we also use "hostdata->dev" when we call
      dma_free_attrs() in NCR_700_release().
      
      This bug predates git version control.
      
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: bnx2fc: Fix error handling in probe() · b2d3492f
      Dan Carpenter authored
      There are two issues here.  First, if cmgr->hba is not set early enough,
      it leads to a NULL dereference.  Second, if we don't completely
      initialize cmgr->io_bdt_pool[], we end up dereferencing uninitialized
      pointers.
      
      Fixes: 853e2bd2 ("[SCSI] bnx2fc: Broadcom FCoE offload driver")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: scsi_debug: fix write_same with virtual_gb problem · 40d07b52
      Douglas Gilbert authored
      The WRITE SAME(10) and (16) implementations didn't take account of the
      buffer wrap required when the virtual_gb parameter is greater than 0.
      
      Fix that and rename the fake_store() function to lba2fake_store() to lessen
      confusion with the global fake_storep pointer. Bump version date.
      
      Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
      Reported-by: Bart Van Assche <bvanassche@acm.org>
      Tested-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
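The buffer wrap mentioned in the scsi_debug commit above can be pictured as mapping each logical block address onto a smaller backing store by taking it modulo the store size. The following is a minimal sketch, not the driver's code: the sector size, store size and the names `lba_to_store()` and `write_same_sketch()` are hypothetical and chosen small so the wrap is easy to see.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, tiny geometry for illustration. */
#define SKETCH_SECTOR_SIZE   4u
#define SKETCH_STORE_SECTORS 8u

/* Backing store that is smaller than the advertised capacity. */
unsigned char sketch_store[SKETCH_STORE_SECTORS * SKETCH_SECTOR_SIZE];

/* Map an LBA onto the store, wrapping around at the end of the store. */
unsigned char *lba_to_store(uint64_t lba)
{
    return sketch_store + (lba % SKETCH_STORE_SECTORS) * SKETCH_SECTOR_SIZE;
}

/* Write the same one-sector pattern to num consecutive LBAs. Each sector
 * is mapped separately, so a range that crosses the end of the store
 * continues at its start instead of running past the buffer. */
void write_same_sketch(uint64_t lba, uint32_t num, const unsigned char *pattern)
{
    for (uint32_t i = 0; i < num; i++)
        memcpy(lba_to_store(lba + i), pattern, SKETCH_SECTOR_SIZE);
}
```

For example, writing 4 sectors starting at LBA 6 touches store sectors 6, 7, 0 and 1 and leaves sectors 2 to 5 untouched.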
    • scsi: libfc: free skb when receiving invalid flogi resp · 5d8fc4a9
      Ming Lu authored
      When libfc receives an invalid FLOGI response from the FC switch, it
      returns without freeing the fc frame, which is just the skb data. This
      causes a memory leak if the FC switch keeps sending invalid FLOGI
      responses.
      
      Fix this by calling `fc_frame_free(fp)` before returning from
      `fc_lport_flogi_resp`.
      
      Signed-off-by: Ming Lu <ming.lu@citrix.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
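The libfc fix above follows a common pattern: every exit path of a response handler releases the buffer it was handed. The sketch below shows that pattern in user space. It is an illustration under assumptions: `struct frame`, `frame_alloc()`, `frame_free()`, `flogi_resp_sketch()` and the `live_frames` counter are hypothetical stand-ins, not libfc APIs.

```c
#include <stdlib.h>

/* Counts frames that were allocated but not yet freed. */
int live_frames;

/* Hypothetical frame: a flag says whether the response is valid. */
struct frame { int valid; };

struct frame *frame_alloc(int valid)
{
    struct frame *fp = malloc(sizeof(*fp));
    if (fp) {
        fp->valid = valid;
        live_frames++;
    }
    return fp;
}

void frame_free(struct frame *fp)
{
    if (fp) {
        free(fp);
        live_frames--;
    }
}

/* Response handler that owns fp. The invalid-response branch jumps to
 * the common exit instead of returning directly, so the frame is freed
 * on every path. */
int flogi_resp_sketch(struct frame *fp)
{
    int rc = 0;

    if (!fp)
        return -1;
    if (!fp->valid) {
        rc = -1;        /* a bare "return" here would leak fp */
        goto out;
    }
    /* ... process the valid response ... */
out:
    frame_free(fp);
    return rc;
}
```

After handling either a valid or an invalid frame, `live_frames` is back at zero, so repeated invalid responses no longer accumulate memory.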
    • scsi: zfcp: fix sysfs block queue limit output for max_segment_size · b6319569
      Steffen Maier authored
      Since v2.6.35 commit 68322984 ("[SCSI] zfcp: Report scatter-gather
      limits to SCSI and block layer"), zfcp set dma_parms.max_segment_size ==
      PAGE_SIZE (but without using the setter dma_set_max_seg_size()) and
      scsi_host_template.dma_boundary == PAGE_SIZE - 1.
      
      v5.0-rc1 commit 50c2e910 ("scsi: introduce a max_segment_size
      host_template parameters") introduced a new field
      scsi_host_template.max_segment_size. If an LLDD such as zfcp does not set
      it, scsi_host_alloc() uses BLK_MAX_SEGMENT_SIZE = 65536 for
      Scsi_Host.max_segment_size. __scsi_init_queue() announced the minimum of
      Scsi_Host.max_segment_size and dma_parms.max_segment_size to the block
      layer. For zfcp: min(65536, 4096) == 4096 which was still good.
      
      v5.0 commit a8cf59a6 ("scsi: communicate max segment size to the DMA
      mapping code") announces Scsi_Host.max_segment_size to the block layer and
      overwrites dma_parms.max_segment_size with Scsi_Host.max_segment_size.  For
      zfcp dma_parms.max_segment_size == Scsi_Host.max_segment_size == 65536
      which is also reflected in block queue limits.
      
      $ cd /sys/bus/ccw/drivers/zfcp
      $ cd 0.0.3c40/host5/rport-5:0-4/target5:0:4/5:0:4:10/block/sdi/queue
      $ cat max_segment_size
      65536
      
      Zfcp I/O still works because dma_boundary implicitly keeps the effective
      max segment size <= PAGE_SIZE.  However, dma_boundary does not seem to
      be visible to user space, whereas max_segment_size is visible and shows
      a misleading value.  Fix it and inherit the stable tag of a8cf59a6.
      
      Devices on our bus ccw support DMA but no DMA mapping. Of multiple device
      types on the ccw bus, only zfcp needs dma_parms for SCSI limits.  So, leave
      dma_parms setup in zfcp and do not move it to the bus.
      
      Signed-off-by: Steffen Maier <maier@linux.ibm.com>
      Fixes: 50c2e910 ("scsi: introduce a max_segment_size host_template parameters")
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
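The arithmetic in the zfcp commit above can be written out as three small functions. This is a sketch of the reasoning only: the function names are hypothetical, 4096 is the page size used in the commit message's own example, and passing a page-sized template value in the last case is my assumption about one way a driver could restore the smaller limit.

```c
/* Default the commit message quotes for BLK_MAX_SEGMENT_SIZE. */
#define SKETCH_BLK_MAX_SEGMENT_SIZE 65536u

/* Host value: the template's value if set, otherwise the default. */
unsigned host_max_segment_size(unsigned template_value)
{
    return template_value ? template_value : SKETCH_BLK_MAX_SEGMENT_SIZE;
}

/* Earlier behaviour: announce the minimum of the host and DMA limits. */
unsigned announced_before(unsigned host, unsigned dma_parms)
{
    return host < dma_parms ? host : dma_parms;
}

/* Later behaviour: announce the host value (which also overwrites the
 * DMA limit), so an unset template value surfaces as the default. */
unsigned announced_after(unsigned host)
{
    return host;
}
```

With an unset template value, `announced_before(host_max_segment_size(0), 4096)` gives 4096 while `announced_after(host_max_segment_size(0))` gives 65536, matching the sysfs output shown in the commit message.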
  4. 23 Jan, 2019 6 commits
  5. 12 Jan, 2019 9 commits
  6. 09 Jan, 2019 9 commits
  7. 07 Jan, 2019 3 commits
    • Linux 5.0-rc1 · bfeffd15
      Linus Torvalds authored
    • Merge tag 'kbuild-v4.21-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild · 85e1ffbd
      Linus Torvalds authored
      Pull more Kbuild updates from Masahiro Yamada:
      
       - improve boolinit.cocci and use_after_iter.cocci semantic patches
      
       - fix alignment for kallsyms
      
       - move 'asm goto' compiler test to Kconfig and clean up jump_label
         CONFIG option
      
       - generate asm-generic wrappers automatically if arch does not
         implement mandatory UAPI headers
      
       - remove redundant generic-y defines
      
       - misc cleanups
      
      * tag 'kbuild-v4.21-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
        kconfig: rename generated .*conf-cfg to *conf-cfg
        kbuild: remove unnecessary stubs for archheader and archscripts
        kbuild: use assignment instead of define ... endef for filechk_* rules
        arch: remove redundant UAPI generic-y defines
        kbuild: generate asm-generic wrappers if mandatory headers are missing
        arch: remove stale comments "UAPI Header export list"
        riscv: remove redundant kernel-space generic-y
        kbuild: change filechk to surround the given command with { }
        kbuild: remove redundant target cleaning on failure
        kbuild: clean up rule_dtc_dt_yaml
        kbuild: remove UIMAGE_IN and UIMAGE_OUT
        jump_label: move 'asm goto' support test to Kconfig
        kallsyms: lower alignment on ARM
        scripts: coccinelle: boolinit: drop warnings on named constants
        scripts: coccinelle: check for redeclaration
        kconfig: remove unused "file" field of yylval union
        nds32: remove redundant kernel-space generic-y
        nios2: remove unneeded HAS_DMA define
    • Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · ac5eed2b
      Linus Torvalds authored
      Pull perf tooling updates from Ingo Molnar:
       "A final batch of perf tooling changes: mostly fixes and small
        improvements"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (29 commits)
        perf session: Add comment for perf_session__register_idle_thread()
        perf thread-stack: Fix thread stack processing for the idle task
        perf thread-stack: Allocate an array of thread stacks
        perf thread-stack: Factor out thread_stack__init()
        perf thread-stack: Allow for a thread stack array
        perf thread-stack: Avoid direct reference to the thread's stack
        perf thread-stack: Tidy thread_stack__bottom() usage
        perf thread-stack: Simplify some code in thread_stack__process()
        tools gpio: Allow overriding CFLAGS
        tools power turbostat: Override CFLAGS assignments and add LDFLAGS to build command
        tools thermal tmon: Allow overriding CFLAGS assignments
        tools power x86_energy_perf_policy: Override CFLAGS assignments and add LDFLAGS to build command
        perf c2c: Increase the HITM ratio limit for displayed cachelines
        perf c2c: Change the default coalesce setup
        perf trace beauty ioctl: Beautify USBDEVFS_ commands
        perf trace beauty: Export function to get the files for a thread
        perf trace: Wire up ioctl's USBDEVFS_ cmd table generator
        perf beauty ioctl: Add generator for USBDEVFS_ ioctl commands
        tools headers uapi: Grab a copy of usbdevice_fs.h
        perf trace: Store the major number for a file when storing its pathname
        ...
  8. 06 Jan, 2019 3 commits
    • Change mincore() to count "mapped" pages rather than "cached" pages · 574823bf
      Linus Torvalds authored
      The semantics of what "in core" means for the mincore() system call are
      somewhat unclear, but Linux has always (since 2.3.52, which is when
      mincore() was initially done) treated it as "page is available in page
      cache" rather than "page is mapped in the mapping".
      
      The problem with that traditional semantic is that it exposes a lot of
      system cache state that it really probably shouldn't, and that users
      shouldn't really even care about.
      
      So let's try to avoid that information leak by simply changing the
      semantics to be that mincore() counts actual mapped pages, not pages
      that might be cheaply mapped if they were faulted (note the "might be"
      part of the old semantics: being in the cache doesn't actually guarantee
      that you can access them without IO anyway, since things like network
      filesystems may have to revalidate the cache before use).
      
      In many ways the old semantics were somewhat insane even aside from the
      information leak issue.  From the very beginning (and that beginning is
      a long time ago: 2.3.52 was released in March 2000, I think), the code
      had a comment saying
      
        Later we can get more picky about what "in core" means precisely.
      
      and this is that "later".  Admittedly it is much later than is really
      comfortable.
      
      NOTE! This is a real semantic change, and it is for example known to
      change the output of "fincore", since that program literally does an
      mmap without populating it, and then does "mincore()" on that mapping,
      which doesn't actually have any pages in it.
      
      I'm hoping that nobody actually has any workflow that cares, and the
      info leak is real.
      
      We may have to do something different if it turns out that people have
      valid reasons to want the old semantics, and if we can limit the
      information leak sanely.
      
      Cc: Kevin Easton <kevin@guarana.org>
      Cc: Jiri Kosina <jikos@kernel.org>
      Cc: Masatake YAMATO <yamato@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Fix 'access_ok()' on alpha and SH · 94bd8a05
      Linus Torvalds authored
      Commit 594cc251 ("make 'user_access_begin()' do 'access_ok()'")
      broke both alpha and SH booting in qemu, as noticed by Guenter Roeck.
      
      It turns out that the bug wasn't actually in that commit itself (which
      would have been surprising: it was mostly a no-op), but in how the
      addition of access_ok() to the strncpy_from_user() and strnlen_user()
      functions now triggered the case where those functions would test the
      access of the very last byte of the user address space.
      
      The string functions actually did that user range test before too, but
      they did it manually by just comparing against user_addr_max().  But
      with user_access_begin() doing the check (using "access_ok()"), it now
      exposed problems in the architecture implementations of that function.
      
      For example, on alpha, the access_ok() helper macro looked like this:
      
        #define __access_ok(addr, size) \
              ((get_fs().seg & (addr | size | (addr+size))) == 0)
      
      and what it basically tests is whether any of the high bits get set (the
      USER_DS masking value is 0xfffffc0000000000).
      
      And that's completely wrong for the "addr+size" check.  Because it's
      off-by-one for the case where we check to the very end of the user
      address space, which is exactly what the strn*_user() functions do.
      
      Why? Because "addr+size" will be exactly the size of the address space,
      so trying to access the last byte of the user address space will fail
      the __access_ok() check, even though it shouldn't.  As a result, the
      user string accessor functions failed consistently - because they
      literally don't know how long the string is going to be, and the max
      access is going to be that last byte of the user address space.
      
      Side note: that alpha macro is buggy for another reason too - it re-uses
      the arguments twice.
      
      And SH has another version of almost the exact same bug:
      
        #define __addr_ok(addr) \
              ((unsigned long __force)(addr) < current_thread_info()->addr_limit.seg)
      
      so far so good: yes, a user address must be below the limit.  But then:
      
        #define __access_ok(addr, size)         \
              (__addr_ok((addr) + (size)))
      
      is wrong with the exact same off-by-one case: the case when "addr+size"
      is exactly _equal_ to the limit is actually perfectly fine (think "one
      byte access at the last address of the user address space")
      
      The SH version is actually seriously buggy in another way: it doesn't
      actually check for overflow, even though it did copy the _comment_ that
      talks about overflow.
      
      So it turns out that both SH and alpha actually have completely buggy
      implementations of access_ok(), but they happened to work in practice
      (although the SH overflow one is a serious serious security bug, not
      that anybody likely cares about SH security).
      
      This fixes the problems by using a similar macro on both alpha and SH.
      It isn't trying to be clever, the end address is based on this logic:
      
              unsigned long __ao_end = __ao_a + __ao_b - !!__ao_b;
      
      which basically says "add start and length, and then subtract one unless
      the length was zero".  We can't subtract one for a zero length, or we'd
      just hit an underflow instead.
      
      For a lot of access_ok() users the length is a constant, so this isn't
      actually as expensive as it initially looks.
      
      Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
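The off-by-one and the overflow problem described in the access_ok() commit above can be demonstrated with a toy address space. This is a sketch, not the kernel macros: `USER_LIMIT` is a hypothetical, tiny limit, `access_ok_old()` imitates the SH-style "addr + size below the limit" test, and `access_ok_new()` applies the end-address formula quoted in the commit message together with an explicit wrap-around check that I am assuming for illustration.

```c
#include <stdbool.h>

/* Hypothetical, tiny user address space: valid addresses are
 * 0 .. USER_LIMIT - 1. Real limits are architecture specific. */
#define USER_LIMIT 0x1000UL

/* Old-style check: rejects an access that ends exactly at the limit and
 * silently accepts ranges whose addr + size wraps around. */
bool access_ok_old(unsigned long addr, unsigned long size)
{
    return addr + size < USER_LIMIT;
}

/* New-style check: compute the address of the last byte accessed
 * ("add start and length, then subtract one unless the length is zero"),
 * reject wrap-around, then compare the last byte against the limit. */
bool access_ok_new(unsigned long addr, unsigned long size)
{
    unsigned long end = addr + size - !!size;

    return end >= addr && end < USER_LIMIT;
}
```

A one-byte access at the last valid address (`0xfff`) is rejected by the old check and accepted by the new one, while a range that wraps past the top of the address space is accepted by the old check and rejected by the new one.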
    • Merge tag 'fscrypt_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/fscrypt · baa67073
      Linus Torvalds authored
      Pull fscrypt updates from Ted Ts'o:
       "Add Adiantum support for fscrypt"
      
      * tag 'fscrypt_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/fscrypt:
        fscrypt: add Adiantum support