1. 28 Jun, 2024 15 commits
  2. 27 Jun, 2024 4 commits
  3. 26 Jun, 2024 9 commits
  4. 24 Jun, 2024 2 commits
  5. 23 Jun, 2024 1 commit
  6. 21 Jun, 2024 4 commits
  7. 20 Jun, 2024 5 commits
    • nvme: Atomic write support · 5f9bbea0
      Alan Adamson authored
      Add support to set block layer request_queue atomic write limits. The
      limits will be derived from either the namespace or controller atomic
      parameters.
      
      NVMe atomic-related parameters are grouped into "normal" and "power-fail"
      (or PF) classes of parameters. For atomic write support, only PF parameters
      are of interest. The "normal" parameters are concerned with racing reads
      and writes (which also applies to PF). See NVM Command Set Specification
      Revision 1.0d section 2.1.4 for reference.
      
      Whether to use per namespace or controller atomic parameters is decided by
      NSFEAT bit 1 - see Figure 97: Identify – Identify Namespace Data
      Structure, NVM Command Set.
      
      NVMe namespaces may define an atomic boundary, whereby no atomic guarantees
      are provided for a write which straddles this per-lba space boundary. The
      block layer merging policy is such that no merges may occur in which the
      resultant request would straddle such a boundary.
      
      Unlike SCSI, NVMe specifies no granularity or alignment rules, apart from
      the atomic boundary rule. In addition, again unlike SCSI, there is no
      dedicated atomic write command - a write which adheres to the atomic size
      limit and boundary is implicitly atomic.
      
      If NSFEAT bit 1 is set, the following parameters are of interest:
      - NAWUPF (Namespace Atomic Write Unit Power Fail)
      - NABSPF (Namespace Atomic Boundary Size Power Fail)
      - NABO (Namespace Atomic Boundary Offset)
      
      and we set request_queue limits as follows:
      - atomic_write_unit_max = rounddown_pow_of_two(NAWUPF)
      - atomic_write_max_bytes = NAWUPF
      - atomic_write_boundary = NABSPF
      
      In the unlikely scenario that NABO is non-zero, atomic writes will not be
      supported at all, as dealing with this adds extra complexity. This
      policy may change in future.
      
      In all cases, atomic_write_unit_min is set to the logical block size.
      
      If NSFEAT bit 1 is unset, the following parameter is of interest:
      - AWUPF (Atomic Write Unit Power Fail)
      
      and we set request_queue limits as follows:
      - atomic_write_unit_max = rounddown_pow_of_two(AWUPF)
      - atomic_write_max_bytes = AWUPF
      - atomic_write_boundary = 0
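      
      The two mappings above can be sketched in user-space C as follows. This
      is only an illustration: struct atomic_limits, rounddown_pow2() and
      nvme_sketch_limits() are hypothetical names, not the driver's code, and
      all sizes are assumed to have already been converted to bytes. The NABO
      check is applied regardless of NSFEAT bit 1, which is what the note on
      NABSPF at the end of this message says nvme_update_disk_info() does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the request_queue limits named in the text. */
struct atomic_limits {
	uint32_t unit_min;	/* atomic_write_unit_min */
	uint32_t unit_max;	/* atomic_write_unit_max */
	uint32_t max_bytes;	/* atomic_write_max_bytes */
	uint32_t boundary;	/* atomic_write_boundary */
};

/* Largest power of two that is <= n; n must be non-zero. */
uint32_t rounddown_pow2(uint32_t n)
{
	uint32_t p = 1;

	assert(n != 0);
	while (p <= n / 2)
		p *= 2;
	return p;
}

/*
 * Assumption: nawupf, nabspf, awupf and lbs (logical block size) are byte
 * values. Returns false when atomic writes are left unsupported; *lim is
 * only written when true is returned.
 */
bool nvme_sketch_limits(bool nsfeat_bit1, uint32_t nawupf, uint32_t nabspf,
			uint32_t nabo, uint32_t awupf, uint32_t lbs,
			struct atomic_limits *lim)
{
	uint32_t unit = nsfeat_bit1 ? nawupf : awupf;

	/* Non-zero NABO disables support; a zero size is treated the same. */
	if (nabo != 0 || unit == 0)
		return false;

	lim->unit_min = lbs;
	lim->unit_max = rounddown_pow2(unit);
	lim->max_bytes = unit;
	lim->boundary = nsfeat_bit1 ? nabspf : 0;
	return true;
}
```

      Note that unit_max can be smaller than max_bytes whenever the reported
      size is not a power of two.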
      
      A new function, nvme_valid_atomic_write(), is also called from the
      submission path to verify that a request which has been submitted to the
      driver will actually be executed atomically. This is needed because, as
      mentioned, there is no dedicated NVMe atomic write command (which might
      error for a command which exceeds the controller atomic write limits).
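      
      The kind of check described here can be modelled in user space as below.
      This is a sketch under assumptions, not the body of
      nvme_valid_atomic_write(): write_is_atomic() and its parameters are
      hypothetical, offsets and lengths are in bytes, and a boundary of zero
      means that no atomic boundary applies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model: a write is treated as atomic when it does not exceed
 * the atomic size limit and does not straddle an atomic boundary.
 */
bool write_is_atomic(uint64_t start, uint64_t len, uint64_t size_limit,
		     uint64_t boundary)
{
	assert(len != 0);

	if (len > size_limit)
		return false;

	if (boundary) {
		uint64_t end = start + len - 1;

		/* First and last byte must fall in the same boundary window. */
		if (start / boundary != end / boundary)
			return false;
	}

	return true;
}
```

      With a 64 KiB boundary, for example, a 4 KiB write starting at offset
      61440 stays inside the first window, while an 8 KiB write from the same
      offset crosses into the second window and is rejected.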
      
      Note on NABSPF:
      There seems to be some vagueness in the spec as to whether NABSPF applies
      when NSFEAT bit 1 is unset. Figure 97 does not explicitly mention NABSPF
      and how it is affected by bit 1. However, Figure 4 does say to check
      Figure 97 for info about per-namespace parameters, which NABSPF is, so it
      is implied. However, currently nvme_update_disk_info() does check
      namespace parameter NABO regardless of this bit.
      Signed-off-by: Alan Adamson <alan.adamson@oracle.com>
      Reviewed-by: Keith Busch <kbusch@kernel.org>
      Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
      jpg: total rewrite
      Signed-off-by: John Garry <john.g.garry@oracle.com>
      Link: https://lore.kernel.org/r/20240620125359.2684798-11-john.g.garry@oracle.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • scsi: scsi_debug: Atomic write support · 84f3a3c0
      John Garry authored
      Add initial support for atomic writes.
      
      As is the standard method, feed device properties via module params,
      those being:
      - atomic_max_size_blks
      - atomic_alignment_blks
      - atomic_granularity_blks
      - atomic_max_size_with_boundary_blks
      - atomic_max_boundary_blks
      
      These just match sbc4r22 section 6.6.4 - Block limits VPD page.
      
      We just support ATOMIC WRITE (16).
      
      The major change in the driver is how we lock the device for RW accesses.
      
      Currently the driver uses a per-device lock for accessing device metadata
      and "media" data (calls to do_device_access()) atomically for the duration
      of the whole read/write command.
      
      This does not suit verifying atomic writes, the reason being that
      currently all reads/writes are atomic, so using atomic writes does not
      prove anything.
      
      Change the device access model so that regular writes are only atomic on
      a per-sector basis, while reads and atomic writes are fully atomic.
      
      As mentioned, since accessing metadata and device media is atomic,
      continue to have regular writes involving metadata - like discard or PI -
      as atomic. We can improve this later.
      
      Currently we only support the model where an atomic write waits for
      overlapping ongoing reads or writes to complete before commencing.
      This is described in section 4.29.3.2 of the SBC. However, we simplify
      things and wait for all accesses to complete (when issuing an atomic
      write).
      Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: John Garry <john.g.garry@oracle.com>
      Acked-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
      Link: https://lore.kernel.org/r/20240620125359.2684798-10-john.g.garry@oracle.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • scsi: sd: Atomic write support · bf4ae8f2
      John Garry authored
      Support is divided into two main areas:
      - reading VPD pages and setting sdev request_queue limits
      - support WRITE ATOMIC (16) command and tracing
      
      The relevant block limits VPD page needs to be read to allow the block
      layer request_queue atomic write limits to be set. These VPD page limits are
      described in sbc4r22 section 6.6.4 - Block limits VPD page.
      
      There are five limits of interest:
      - MAXIMUM ATOMIC TRANSFER LENGTH
      - ATOMIC ALIGNMENT
      - ATOMIC TRANSFER LENGTH GRANULARITY
      - MAXIMUM ATOMIC TRANSFER LENGTH WITH BOUNDARY
      - MAXIMUM ATOMIC BOUNDARY SIZE
      
      MAXIMUM ATOMIC TRANSFER LENGTH is the maximum length for a WRITE ATOMIC
      (16) command. It will not be greater than the device MAXIMUM TRANSFER
      LENGTH.
      
      ATOMIC ALIGNMENT and ATOMIC TRANSFER LENGTH GRANULARITY are the minimum
      alignment and length values for an atomic write in terms of logical blocks.
      
      Unlike NVMe, SCSI does not specify an LBA space boundary, but does specify
      a per-IO boundary granularity. The maximum boundary size is specified in
      MAXIMUM ATOMIC BOUNDARY SIZE. When used, this boundary value is set in the
      WRITE ATOMIC (16) ATOMIC BOUNDARY field - layout for the WRITE_ATOMIC_16
      command can be found in sbc4r22 section 5.48. This boundary value is the
      granularity size at which the device may atomically write the data. A value
      of zero in WRITE ATOMIC (16) ATOMIC BOUNDARY field means that all data must
      be atomically written together.
      
      MAXIMUM ATOMIC TRANSFER LENGTH WITH BOUNDARY is the maximum atomic write
      length if a non-zero boundary value is set.
      
      For atomic write support, the WRITE ATOMIC (16) boundary is not of much
      interest, as the block layer expects each request submitted to be executed
      atomically. However, the SCSI spec does leave itself open to a quirky
      scenario where MAXIMUM ATOMIC TRANSFER LENGTH is zero, yet MAXIMUM ATOMIC
      TRANSFER LENGTH WITH BOUNDARY and MAXIMUM ATOMIC BOUNDARY SIZE are both
      non-zero. This case will be supported.
      
      To set the block layer request_queue atomic write capabilities, sanitize
      the VPD page limits and set limits as follows:
      - atomic_write_unit_min is derived from granularity and alignment values.
        If no granularity value is set, use physical block size
      - atomic_write_unit_max is derived from MAXIMUM ATOMIC TRANSFER LENGTH. In
        the scenario where MAXIMUM ATOMIC TRANSFER LENGTH is zero and boundary
        limits are non-zero, use MAXIMUM ATOMIC BOUNDARY SIZE for
        atomic_write_unit_max. New flag scsi_disk.use_atomic_write_boundary is
        set for this scenario.
      - atomic_write_boundary_bytes is set to zero always
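      
      As a rough user-space illustration of the three rules above (not the
      driver's actual sanitizing code), the sketch below works in units of
      logical blocks. struct vpd_atomic, struct sd_sketch_limits and
      sd_sketch_config() are hypothetical names. Rounding to a power of two
      and rejecting an alignment value that does not divide unit_min are
      simplifying assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of the five VPD limits, all in logical blocks. */
struct vpd_atomic {
	uint32_t max_atomic;			/* MAXIMUM ATOMIC TRANSFER LENGTH */
	uint32_t alignment;			/* ATOMIC ALIGNMENT */
	uint32_t granularity;			/* ATOMIC TRANSFER LENGTH GRANULARITY */
	uint32_t max_atomic_with_boundary;	/* ... LENGTH WITH BOUNDARY */
	uint32_t max_boundary;			/* MAXIMUM ATOMIC BOUNDARY SIZE */
};

struct sd_sketch_limits {
	uint32_t unit_min;		/* atomic_write_unit_min, in blocks */
	uint32_t unit_max;		/* atomic_write_unit_max, in blocks */
	uint32_t boundary;		/* always zero */
	bool use_atomic_write_boundary;	/* set only for the quirky scenario */
};

/* Largest power of two that is <= n; n must be non-zero. */
uint32_t rounddown_pow2(uint32_t n)
{
	uint32_t p = 1;

	assert(n != 0);
	while (p <= n / 2)
		p *= 2;
	return p;
}

/* Returns false when no limits are set; *lim is only valid on true. */
bool sd_sketch_config(const struct vpd_atomic *vpd, uint32_t phys_blocks,
		      bool type2_protection, struct sd_sketch_limits *lim)
{
	/* Quirky scenario: zero max length, non-zero boundary limits. */
	bool quirk = !vpd->max_atomic && vpd->max_atomic_with_boundary &&
		     vpd->max_boundary;

	assert(phys_blocks != 0);

	/* WRITE ATOMIC (32) is not supported, so skip type 2 protection. */
	if (type2_protection || (!vpd->max_atomic && !quirk))
		return false;

	lim->unit_min = rounddown_pow2(vpd->granularity ? vpd->granularity
							 : phys_blocks);
	lim->unit_max = rounddown_pow2(quirk ? vpd->max_boundary
					     : vpd->max_atomic);

	/* Simplification (assumption): reject a mismatched alignment. */
	if (vpd->alignment > 1 && lim->unit_min % vpd->alignment)
		return false;

	lim->boundary = 0;
	lim->use_atomic_write_boundary = quirk;
	return true;
}
```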
      
      SCSI also supports a WRITE ATOMIC (32) command, which is for type 2
      protection enabled. This is not going to be supported now, so check for
      T10_PI_TYPE2_PROTECTION when setting any request_queue limits.
      
      To handle an atomic write request, add support for WRITE ATOMIC (16)
      command in handler sd_setup_atomic_cmnd(). Flag use_atomic_write_boundary
      is checked here for encoding ATOMIC BOUNDARY field.
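      
      For illustration, a 16-byte CDB for this command could be filled in
      roughly as below. This is a sketch, not sd_setup_atomic_cmnd() itself:
      build_write_atomic_16() and the put_be*() helpers are hypothetical, and
      the opcode value (0x9c) and field offsets (LBA at bytes 2-9, ATOMIC
      BOUNDARY at bytes 10-11, transfer length at bytes 12-13) are stated
      from memory and should be checked against sbc4r22 section 5.48.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_WRITE_ATOMIC_16	0x9c	/* assumed opcode value */

/* Store a 16-bit value in big-endian byte order. */
void put_be16(uint16_t v, uint8_t *p)
{
	p[0] = (uint8_t)(v >> 8);
	p[1] = (uint8_t)v;
}

/* Store a 64-bit value in big-endian byte order. */
void put_be64(uint64_t v, uint8_t *p)
{
	for (int i = 0; i < 8; i++)
		p[i] = (uint8_t)(v >> (56 - 8 * i));
}

/*
 * boundary_blocks is zero unless the use_atomic_write_boundary case
 * applies; zero asks for all data to be written atomically together.
 */
void build_write_atomic_16(uint8_t cdb[16], uint64_t lba, uint16_t nr_blocks,
			   uint16_t boundary_blocks)
{
	assert(cdb != NULL);

	memset(cdb, 0, 16);
	cdb[0] = SKETCH_WRITE_ATOMIC_16;
	put_be64(lba, &cdb[2]);			/* logical block address */
	put_be16(boundary_blocks, &cdb[10]);	/* ATOMIC BOUNDARY */
	put_be16(nr_blocks, &cdb[12]);		/* transfer length */
}
```

      Byte 1 (protection and FUA flags), the group number and the control
      byte are left as zero here to keep the example short.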
      
      Trace info is also added for WRITE_ATOMIC_16 command.
      Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: John Garry <john.g.garry@oracle.com>
      Acked-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      Link: https://lore.kernel.org/r/20240620125359.2684798-9-john.g.garry@oracle.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • block: Add fops atomic write support · caf336f8
      John Garry authored
      Support atomic writes by submitting a single BIO with REQ_ATOMIC set.
      
      It must be ensured that the atomic write adheres to its rules, like
      naturally aligned offset, so call blkdev_dio_invalid() ->
      blkdev_atomic_write_valid() [with renaming blkdev_dio_unaligned() to
      blkdev_dio_invalid()] for this purpose. The BIO submission path currently
      checks for atomic writes which are too large, so no need to check here.
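      
      A user-space model of such a validity check might look like the
      following. It is a sketch, not blkdev_atomic_write_valid():
      atomic_write_args_valid() is a hypothetical name, and the
      power-of-two length rule is an assumption added for the example (the
      text above only names the naturally aligned offset). In line with the
      text, no upper size limit is checked here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* pos and len are in bytes. */
bool atomic_write_args_valid(uint64_t pos, uint64_t len)
{
	/* Assumed rule: the length is a non-zero power of two. */
	if (len == 0 || (len & (len - 1)) != 0)
		return false;

	/* Naturally aligned: the offset is a multiple of the length. */
	if (pos % len != 0)
		return false;

	return true;
}

/* A few self-checks showing how the rules play out. */
void atomic_write_args_selfcheck(void)
{
	assert(atomic_write_args_valid(8192, 4096));	/* aligned */
	assert(!atomic_write_args_valid(2048, 4096));	/* misaligned offset */
	assert(!atomic_write_args_valid(0, 12288));	/* not a power of two */
}
```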
      
      In blkdev_direct_IO(), if the nr_pages exceeds BIO_MAX_VECS, then we cannot
      produce a single BIO, so error in this case.
      
      Finally set FMODE_CAN_ATOMIC_WRITE when the bdev can support atomic writes
      and the associated file flag is for O_DIRECT.
      Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: John Garry <john.g.garry@oracle.com>
      Reviewed-by: Keith Busch <kbusch@kernel.org>
      Acked-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      Link: https://lore.kernel.org/r/20240620125359.2684798-8-john.g.garry@oracle.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • block: Add atomic write support for statx · 9abcfbd2
      Prasad Singamsetty authored
      Extend the statx system call to return additional info for atomic write
      support if the specified file is a block device.
      Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Prasad Singamsetty <prasad.singamsetty@oracle.com>
      Signed-off-by: John Garry <john.g.garry@oracle.com>
      Reviewed-by: Keith Busch <kbusch@kernel.org>
      Acked-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
      Link: https://lore.kernel.org/r/20240620125359.2684798-7-john.g.garry@oracle.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>