1. 29 Mar, 2022 6 commits
    • Merge tag 'nvme-5.18-2022-03-29' of git://git.infradead.org/nvme into for-5.18/drivers · 1e06b3e7
      Jens Axboe authored
      Pull NVMe fixes from Christoph:
      
      "- fix multipath hang when disk goes live over reconnect (Anton Eidelman)
       - fix RCU hole that allowed for endless looping in multipath round robin
         (Chris Leech)
       - remove redundant assignment after left shift (Colin Ian King)
       - add quirks for Samsung X5 SSDs (Monish Kumar R)
       - fix the read-only state for zoned namespaces with unsupposed features
         (Pankaj Raghav)
       - use a private workqueue instead of the system workqueue in nvmet
         (Sagi Grimberg)
       - allow duplicate NSIDs for private namespaces (Sungup Moon)
       - expose use_threaded_interrupts read-only in sysfs (Xin Hao)"
      
      * tag 'nvme-5.18-2022-03-29' of git://git.infradead.org/nvme:
        nvme-multipath: fix hang when disk goes live over reconnect
        nvme: fix RCU hole that allowed for endless looping in multipath round robin
        nvme: allow duplicate NSIDs for private namespaces
        nvmet: remove redundant assignment after left shift
        nvmet: use a private workqueue instead of the system workqueue
        nvme-pci: add quirks for Samsung X5 SSDs
        nvme-pci: expose use_threaded_interrupts read-only in sysfs
        nvme: fix the read-only state for zoned namespaces with unsupposed features
    • nvme-multipath: fix hang when disk goes live over reconnect · a4a6f3c8
      Anton Eidelman authored
      nvme_mpath_init_identify() invoked from nvme_init_identify() fetches a
      fresh ANA log from the ctrl.  This is essential to have up-to-date
      path states for both existing namespaces and for those scan_work may
      discover once the ctrl is up.
      
      This happens in the following cases:
        1) A new ctrl is being connected.
        2) An existing ctrl is successfully reconnected.
        3) An existing ctrl is being reset.
      
      While in (1) ctrl->namespaces is empty, (2 & 3) may have namespaces, and
      nvme_read_ana_log() may call nvme_update_ns_ana_state().
      
      This results in a hang when the ANA state of an existing namespace changes
      and makes the disk live: nvme_mpath_set_live() issues IO to the namespace
      through the ctrl, which does NOT have IO queues yet.
      
      See sample hang below.
      
      Solution:
      - nvme_update_ns_ana_state() to call set_live only if ctrl is live
      - nvme_read_ana_log() call from nvme_mpath_init_identify()
        therefore only fetches and parses the ANA log;
        any errors in this process will fail the ctrl setup as appropriate;
      - a separate function nvme_mpath_update()
        is called in nvme_start_ctrl();
        this parses the ANA log without fetching it.
        At this point the ctrl is live,
        therefore, disks can be set live normally.
      
      Sample failure:
          nvme nvme0: starting error recovery
          nvme nvme0: Reconnecting in 10 seconds...
          block nvme0n6: no usable path - requeuing I/O
          INFO: task kworker/u8:3:312 blocked for more than 122 seconds.
                Tainted: G            E     5.14.5-1.el7.elrepo.x86_64 #1
          Workqueue: nvme-wq nvme_tcp_reconnect_ctrl_work [nvme_tcp]
          Call Trace:
           __schedule+0x2a2/0x7e0
           schedule+0x4e/0xb0
           io_schedule+0x16/0x40
           wait_on_page_bit_common+0x15c/0x3e0
           do_read_cache_page+0x1e0/0x410
           read_cache_page+0x12/0x20
           read_part_sector+0x46/0x100
           read_lba+0x121/0x240
           efi_partition+0x1d2/0x6a0
           bdev_disk_changed.part.0+0x1df/0x430
           bdev_disk_changed+0x18/0x20
           blkdev_get_whole+0x77/0xe0
           blkdev_get_by_dev+0xd2/0x3a0
           __device_add_disk+0x1ed/0x310
           device_add_disk+0x13/0x20
           nvme_mpath_set_live+0x138/0x1b0 [nvme_core]
           nvme_update_ns_ana_state+0x2b/0x30 [nvme_core]
           nvme_update_ana_state+0xca/0xe0 [nvme_core]
           nvme_parse_ana_log+0xac/0x170 [nvme_core]
           nvme_read_ana_log+0x7d/0xe0 [nvme_core]
           nvme_mpath_init_identify+0x105/0x150 [nvme_core]
           nvme_init_identify+0x2df/0x4d0 [nvme_core]
           nvme_init_ctrl_finish+0x8d/0x3b0 [nvme_core]
           nvme_tcp_setup_ctrl+0x337/0x390 [nvme_tcp]
           nvme_tcp_reconnect_ctrl_work+0x24/0x40 [nvme_tcp]
           process_one_work+0x1bd/0x360
           worker_thread+0x50/0x3d0
      Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvme: fix RCU hole that allowed for endless looping in multipath round robin · d6d67427
      Chris Leech authored
      Make nvme_ns_remove match the assumptions elsewhere.
      
      1) !NVME_NS_READY needs to be srcu synchronized to make sure nothing is
         running in __nvme_find_path or nvme_round_robin_path that will
         re-assign this ns to current_path.
      
      2) Any matching current_path entries need to be cleared before removing
         from the siblings list, to prevent calling nvme_round_robin_path with
         an "old" ns that's off list.
      
      3) Finally the list_del_rcu can happen, and then synchronize again
         before releasing any reference counts.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvme: allow duplicate NSIDs for private namespaces · 5974ea7c
      Sungup Moon authored
      An NVMe subsystem with multiple controllers can have private namespaces
      that use the same NSID under some conditions:
      
       "If Namespace Management, ANA Reporting, or NVM Sets are supported, the
        NSIDs shall be unique within the NVM subsystem. If the Namespace
        Management, ANA Reporting, and NVM Sets are not supported, then NSIDs:
         a) for shared namespace shall be unique; and
         b) for private namespace are not required to be unique."
      
      Reference: Section 6.1.6 NSID and Namespace Usage; NVM Express 1.4c spec.
      
      Make sure this specific setup is supported in Linux.
      
      Fixes: 9ad1927a ("nvme: always search for namespace head")
      Signed-off-by: Sungup Moon <sungup.moon@samsung.com>
      [hch: refactored and fixed the controller vs subsystem based naming
            conflict]
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
    • nvmet: remove redundant assignment after left shift · 63bc732c
      Colin Ian King authored
      The left shift is followed by a re-assignment back to cc_css; the
      assignment is redundant.  Fix this by replacing the "<<=" operator with
      "<<" instead.
      
      This cleans up the clang scan build warning:
      
      drivers/nvme/target/core.c:1124:10: warning: Although the value stored to 'cc_css' is used in the enclosing expression, the value is never actually read from 'cc_css' [deadcode.DeadStores]
      Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
      Reviewed-by: Keith Busch <kbusch@kernel.org>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
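The dead store described in the commit above can be illustrated with a small userspace sketch. This is not the kernel code: the constant names and values (CSS_SHIFT, CSS_NVM, CSS_CSI) and the function names are assumptions chosen for illustration, and the sketch assumes the selector is a 3-bit value (0..7), so the shifted result always fits in 8 bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative constants; names and values are assumptions for this sketch. */
#define CSS_SHIFT 4
#define CSS_NVM   (0u << CSS_SHIFT)
#define CSS_CSI   (6u << CSS_SHIFT)

/*
 * Before: "<<=" stores the shifted value back into the parameter.
 * The parameter is never read again, so the store is dead.
 */
bool css_supported_before(uint8_t cc_css)
{
	switch (cc_css <<= CSS_SHIFT) {
	case CSS_NVM:
	case CSS_CSI:
		return true;
	default:
		return false;
	}
}

/* After: "<<" only computes the shifted value used by the switch. */
bool css_supported_after(uint8_t cc_css)
{
	switch (cc_css << CSS_SHIFT) {
	case CSS_NVM:
	case CSS_CSI:
		return true;
	default:
		return false;
	}
}

/* Both variants agree for every 3-bit selector value (0..7). */
void check_equivalence(void)
{
	for (uint8_t v = 0; v < 8; v++)
		assert(css_supported_before(v) == css_supported_after(v));
}
```

Calling check_equivalence() from a test program exercises both variants. For wider inputs the two forms could differ, because "<<=" truncates the result to the 8-bit parameter while "<<" yields a promoted int.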
    • nvmet: use a private workqueue instead of the system workqueue · 8832cf92
      Sagi Grimberg authored
      Any attempt to flush kernel-global WQs can deadlock, so we should
      simply stop using them. Instead, introduce nvmet_wq, the generic
      nvmet workqueue for work elements that don't explicitly require a
      dedicated workqueue (by the mere fact that they are using the
      system_wq).
      
      Changes were done using the following replaces:
      
       - s/schedule_work(/queue_work(nvmet_wq, /g
       - s/schedule_delayed_work(/queue_delayed_work(nvmet_wq, /g
       - s/flush_scheduled_work()/flush_workqueue(nvmet_wq)/g
      Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
  2. 23 Mar, 2022 3 commits
  3. 21 Mar, 2022 3 commits
  4. 18 Mar, 2022 1 commit
    • Merge tag 'nvme-5.18-2022-03-17' of git://git.infradead.org/nvme into for-5.18/drivers · ae53aea6
      Jens Axboe authored
      Pull NVMe updates from Christoph:
      
      "Second round of nvme updates for Linux 5.18
      
       - add lockdep annotations for in-kernel sockets (Chris Leech)
       - use vmalloc for ANA log buffer (Hannes Reinecke)
       - kerneldoc fixes (Chaitanya Kulkarni)
       - cleanups (Guoqing Jiang, Chaitanya Kulkarni, me)
       - warn about shared namespaces without multipathing (me)"
      
      * tag 'nvme-5.18-2022-03-17' of git://git.infradead.org/nvme:
        nvme: warn about shared namespaces without CONFIG_NVME_MULTIPATH
        nvme: remove nvme_alloc_request and nvme_alloc_request_qid
        nvme: cleanup how disk->disk_name is assigned
        nvmet: move the call to nvmet_ns_changed out of nvmet_ns_revalidate
        nvmet: use snprintf() with PAGE_SIZE in configfs
        nvmet: don't fold lines
        nvmet-rdma: fix kernel-doc warning for nvmet_rdma_device_removal
        nvmet-fc: fix kernel-doc warning for nvmet_fc_unregister_targetport
        nvmet-fc: fix kernel-doc warning for nvmet_fc_register_targetport
        nvme-tcp: lockdep: annotate in-kernel sockets
        nvme-tcp: don't fold the line
        nvme-tcp: don't initialize ret variable
        nvme-multipath: call bio_io_error in nvme_ns_head_submit_bio
        nvme-multipath: use vmalloc for ANA log buffer
  5. 17 Mar, 2022 1 commit
  6. 16 Mar, 2022 3 commits
  7. 15 Mar, 2022 1 commit
  8. 14 Mar, 2022 10 commits
  9. 11 Mar, 2022 1 commit
  10. 10 Mar, 2022 1 commit
  11. 09 Mar, 2022 4 commits
  12. 08 Mar, 2022 6 commits
    • Merge branch 'md-next' of... · a2daeab5
      Jens Axboe authored
      Merge branch 'md-next' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md into for-5.18/drivers
      
      Pull MD fixes from Song:
      
      "Most of these changes are minor fixes and clean-ups."
      
      * 'md-next' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md:
        md: use msleep() in md_notify_reboot()
        lib/raid6: Include <asm/ppc-opcode.h> for VPERMXOR
        lib/raid6/test/Makefile: Use $(pound) instead of \# for Make 4.3
        lib/raid6/test: fix multiple definition linking error
        md: raid1/raid10: drop pending_cnt
    • md: use msleep() in md_notify_reboot() · 7d959f6e
      Eric Dumazet authored
      Calling mdelay(1000) from process context, even while a reboot
      is in progress, does not make sense.
      
      Using msleep() allows other threads to make progress.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: linux-raid@vger.kernel.org
      Signed-off-by: Song Liu <song@kernel.org>
    • lib/raid6: Include <asm/ppc-opcode.h> for VPERMXOR · 5b401e4e
      Paul Menzel authored
      On Ubuntu 21.10 (ppc64le) building raid6test with gcc (Ubuntu
      11.2.0-7ubuntu2) 11.2.0 fails with the error below.
      
          gcc -I.. -I ../../../include -g -O2                       \
                   -I../../../arch/powerpc/include -DCONFIG_ALTIVEC \
                   -c -o vpermxor1.o vpermxor1.c
          vpermxor1.c: In function ‘raid6_vpermxor1_gen_syndrome_real’:
          vpermxor1.c:64:29: error: expected string literal before ‘VPERMXOR’
             64 |   asm(VPERMXOR(%0,%1,%2,%3):"=v"(wq0):"v"(gf_high), "v"(gf_low), "v"(wq0));
                |       ^~~~~~~~
          make: *** [Makefile:58: vpermxor1.o] Error 1
      
      So, include the header asm/ppc-opcode.h defining this macro also when
      not building the Linux kernel but only this tool.
      
      Cc: Matt Brown <matthew.brown.dev@gmail.com>
      Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
      Signed-off-by: Song Liu <song@kernel.org>
    • lib/raid6/test/Makefile: Use $(pound) instead of \# for Make 4.3 · 633174a7
      Paul Menzel authored
      Building raid6test on Ubuntu 21.10 (ppc64le) with GNU Make 4.3 shows the
      errors below:
      
          $ cd lib/raid6/test/
          $ make
          <stdin>:1:1: error: stray ‘\’ in program
          <stdin>:1:2: error: stray ‘#’ in program
          <stdin>:1:11: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ \
              before ‘<’ token
      
          [...]
      
      The errors come from the HAS_ALTIVEC test, which fails, and the POWER
      optimized versions are not built. That’s also the reason nobody noticed on the
      other architectures.
      
      GNU Make 4.3 does not remove the backslash anymore. From the 4.3 release
      announcement:
      
      > * WARNING: Backward-incompatibility!
      >   Number signs (#) appearing inside a macro reference or function invocation
      >   no longer introduce comments and should not be escaped with backslashes:
      >   thus a call such as:
      >     foo := $(shell echo '#')
      >   is legal.  Previously the number sign needed to be escaped, for example:
      >     foo := $(shell echo '\#')
      >   Now this latter will resolve to "\#".  If you want to write makefiles
      >   portable to both versions, assign the number sign to a variable:
      >     H := \#
      >     foo := $(shell echo '$H')
      >   This was claimed to be fixed in 3.81, but wasn't, for some reason.
      >   To detect this change search for 'nocomment' in the .FEATURES variable.
      
      So, do the same as commit 9564a8cf ("Kbuild: fix # escaping in .cmd
      files for future Make") and commit 929bef46 ("bpf: Use $(pound) instead
      of \# in Makefiles") and define and use a $(pound) variable.
      
      Reference for the change in make:
      https://git.savannah.gnu.org/cgit/make.git/commit/?id=c6966b323811c37acedff05b57
      
      Cc: Matt Brown <matthew.brown.dev@gmail.com>
      Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
      Signed-off-by: Song Liu <song@kernel.org>
    • lib/raid6/test: fix multiple definition linking error · a5359ddd
      Dirk Müller authored
      GCC 10+ defaults to -fno-common, which enforces proper declaration of
      external references using "extern". Without this change a link would
      fail with:
      
        lib/raid6/test/algos.c:28: multiple definition of `raid6_call';
        lib/raid6/test/test.c:22: first defined here
      
      The pq.h header that is included already includes an extern declaration,
      so we can just remove the redundant one here.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Dirk Müller <dmueller@suse.de>
      Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
      Signed-off-by: Song Liu <song@kernel.org>
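The declaration pattern that the commit above relies on can be sketched in a single file. This is only an illustration of "one extern declaration in a header, one definition in one source file"; the struct, variable, and file names are hypothetical and are not the ones used in lib/raid6. A single translation unit cannot reproduce the link error itself, so the sketch shows the corrected layout only.

```c
#include <assert.h>
#include <stddef.h>

/* --- would live in a shared header (hypothetical "calls.h") --- */
struct calls {
	const char *name;
};
extern struct calls active_call;	/* declaration only, reserves no storage */

/* --- would live in exactly one source file (hypothetical "algos.c") --- */
struct calls active_call = { .name = "generic" };	/* the single definition */

/* --- would live in another source file (hypothetical "test.c") --- */
/*
 * No second "struct calls active_call;" here. Under -fno-common that
 * tentative definition is emitted as a real definition, and linking it
 * against the definition above fails with a multiple-definition error.
 */
const char *active_call_name(void)
{
	return active_call.name;
}

/* Simple self-check: the symbol resolves to the one definition above. */
void check_single_definition(void)
{
	assert(active_call_name() != NULL);
	assert(active_call_name()[0] == 'g');
}
```

With the older -fcommon default, duplicate tentative definitions in separate files were silently merged by the linker, which is why the redundant definition went unnoticed before GCC 10.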
    • md: raid1/raid10: drop pending_cnt · daae161f
      Mariusz Tkaczyk authored
      Those counters are not necessary after commit 11bb45e8aaf6 ("md: drop queue
      limitation for RAID1 and RAID10"). Remove them from all code (conf and
      plug structs). raid1_plug_cb and raid10_plug_cb are identical, so move
      definition of raid1_plug_cb to common raid1-10 definitions and use it for
      RAID10 too.
      Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
      Signed-off-by: Song Liu <song@kernel.org>