1. 17 Jul, 2020 3 commits
    • block: change REQ_OP_ZONE_RESET and REQ_OP_ZONE_RESET_ALL to be odd numbers · ecdef9f4
      Coly Li authored
      Currently REQ_OP_ZONE_RESET and REQ_OP_ZONE_RESET_ALL are defined as
      the even numbers 6 and 8, so zone reset bios are treated as READ bios
      by bio_data_dir(), which is misleading.
      
      The macro bio_data_dir() is defined in include/linux/bio.h as:
          #define bio_data_dir(bio) \
                  (op_is_write(bio_op(bio)) ? WRITE : READ)
      
      And op_is_write() is defined in include/linux/blk_types.h as:
          static inline bool op_is_write(unsigned int op)
          {
                  return (op & 1);
          }
      
      The convention behind op_is_write() is that an op which transfers data
      to the device has an odd op code and is treated as a write op.
      bio_data_dir() reports the bio direction as WRITE if op_is_write()
      returns true, and as READ otherwise.
      
      REQ_OP_ZONE_RESET and REQ_OP_ZONE_RESET_ALL don't transfer data, but
      because they are even numbers bio_data_dir() reports them as READ bios,
      which is misleading and might be wrong. These two commands reset the
      write pointers of the affected zones, and all content after the reset
      write pointer becomes invalid and inaccessible, so they are not READ
      bios by any means.
      
      This patch changes REQ_OP_ZONE_RESET from 6 to 15, and changes
      REQ_OP_ZONE_RESET_ALL from 8 to 17. Bios with these two op codes are
      now treated as WRITE by bio_data_dir(). Although they don't transfer
      data, this keeps them consistent with REQ_OP_DISCARD and
      REQ_OP_WRITE_ZEROES, following the intuition that they change on-media
      content and should be WRITE requests.
      Signed-off-by: Coly Li <colyli@suse.de>
      Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
      Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Hannes Reinecke <hare@suse.de>
      Cc: Jens Axboe <axboe@fb.com>
      Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com>
      Cc: Keith Busch <kbusch@kernel.org>
      Cc: Shaun Tancheff <shaun.tancheff@seagate.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
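      The parity rule described in this commit message can be sketched in
      user-space C. This is only an illustration, not the kernel code: every
      demo_* and DEMO_* name is hypothetical, and the numeric values are the
      ones quoted in the message above (6 and 8 before the patch, 15 and 17
      after it).

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the op codes quoted in the commit message. */
enum demo_op {
	DEMO_OP_READ               = 0,
	DEMO_OP_WRITE              = 1,
	DEMO_OP_ZONE_RESET_OLD     = 6,   /* value before the patch */
	DEMO_OP_ZONE_RESET_ALL_OLD = 8,   /* value before the patch */
	DEMO_OP_ZONE_RESET_NEW     = 15,  /* value after the patch */
	DEMO_OP_ZONE_RESET_ALL_NEW = 17,  /* value after the patch */
};

enum demo_dir { DEMO_READ = 0, DEMO_WRITE = 1 };

/* Same test as the quoted op_is_write(): bit 0 decides the direction. */
static inline bool demo_op_is_write(unsigned int op)
{
	return (op & 1);
}

/* Same shape as the quoted bio_data_dir(), applied directly to an op code. */
static inline enum demo_dir demo_data_dir(unsigned int op)
{
	return demo_op_is_write(op) ? DEMO_WRITE : DEMO_READ;
}
```

      With the old even values demo_data_dir() yields DEMO_READ for both
      zone reset ops; with the new odd values it yields DEMO_WRITE.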
    • block: defer flush request no matter whether we have elevator · b5718d6c
      Yufen Yu authored
      Commit 7520872c ("block: don't defer flushes on blk-mq + scheduling")
      tried to fix a deadlock caused by a circular wait between flush
      requests and the data requests on flush_data_in_flight. The former held
      all driver tags and waited for the data requests to complete, but the
      latter could not complete because they were waiting for free driver
      tags.
      
      After commit 923218f6 ("blk-mq: don't allocate driver tag upfront
      for flush rq"), flush requests no longer get a driver tag before being
      queued on the flush queue.
      
      * With an elevator, a flush request only gets a sched_tag before it is
        inserted into the flush queue. It does not get a driver tag until it
        is issued to the driver, so the data requests on the list
        fq->flush_data_in_flight will eventually complete.
      
      * Without an elevator, each flush request gets a driver tag when the
        request is allocated, so the data requests on
        fq->flush_data_in_flight need not worry about lacking a driver tag.
      
      In both of these cases, the circular wait cannot occur, so we may allow
      flush requests to be deferred.
      Signed-off-by: Yufen Yu <yuyufen@huawei.com>
      Reviewed-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • block: make blk_timeout_init() static · 943c4d90
      Wei Yongjun authored
      The sparse tool complains as follows:
      
      block/blk-timeout.c:93:12: warning:
       symbol 'blk_timeout_init' was not declared. Should it be static?
      
      Function blk_timeout_init() is not used outside of blk-timeout.c, so
      mark it static.
      
      Fixes: 9054650f ("block: relax jiffies rounding for timeouts")
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  2. 16 Jul, 2020 6 commits
  3. 15 Jul, 2020 3 commits
    • Revert "blk-rq-qos: remove redundant finish_wait to rq_qos_wait." · e791ee68
      Jens Axboe authored
      This reverts commit 826f2f48.
      
      Qian Cai reports that this commit causes stalls with swap. Revert until
      the reason can be figured out.
      Reported-by: Qian Cai <cai@lca.pw>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • block: always remove partitions from blk_drop_partitions() · d0f0f1b4
      Ming Lei authored
      In theory, when GENHD_FL_NO_PART_SCAN is set, no partitions can be created
      on the disk. However, ioctl(BLKPG, BLKPG_ADD_PARTITION) doesn't check
      GENHD_FL_NO_PART_SCAN, so partitions can still be added even though
      GENHD_FL_NO_PART_SCAN is set.
      
      So far, blk_drop_partitions() only removes partitions when
      disk_part_scan_enabled() returns true. This can leave ghost partitions on
      a loop device after its FD is changed or cleared while PARTSCAN is
      disabled, because partitions can be added via 'parted' on a loop disk
      even though GENHD_FL_NO_PART_SCAN is set.
      
      Fix this issue by always removing partitions in blk_drop_partitions().
      This is correct because the current code assumes that no partitions
      can be added when GENHD_FL_NO_PART_SCAN is set.
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • block: relax jiffies rounding for timeouts · 9054650f
      Jens Axboe authored
      In doing high IOPS testing, blk-mq is generally pretty well optimized.
      There are a few things that stuck out as using more CPU than what is
      really warranted, and one thing is the round_jiffies_up() that we do
      twice for each request. That accounts for about 0.8% of the CPU in
      my testing.
      
      We can make this cheaper by avoiding an integer division, by just adding
      a rough HZ mask that we can AND with instead. The timeouts only have
      second granularity already, so we don't have to be that accurate here,
      and this patch barely changes that. All we care about is nice grouping.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
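      The idea of replacing a division-based round-up with an AND against a
      rough HZ mask can be sketched as follows. This is a user-space
      illustration under stated assumptions, not the code from the patch:
      the demo_* helpers are hypothetical, hz stands in for the kernel's HZ,
      and the mask is assumed to be the next power of two at or above hz,
      minus one.

```c
/* Smallest power of two >= n; assumes n is small (a tick rate such as 250). */
static unsigned long demo_roundup_pow_of_two(unsigned long n)
{
	unsigned long p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/* Rough "one second" mask: power of two at or above hz, minus one. */
static unsigned long demo_timeout_mask(unsigned long hz)
{
	return demo_roundup_pow_of_two(hz) - 1;
}

/*
 * Round j up to the next multiple of (mask + 1) using only an add and an
 * AND, so no integer division is needed. Values already on a boundary are
 * left unchanged.
 */
static unsigned long demo_round_jiffies_up(unsigned long j, unsigned long mask)
{
	return (j + mask) & ~mask;
}
```

      With hz = 250 the mask is 255, so timeouts are grouped on multiples of
      256 ticks rather than exactly 250. That is slightly coarser than one
      second, which matches the message's point that only the grouping
      matters.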
  4. 10 Jul, 2020 2 commits
  5. 08 Jul, 2020 12 commits
  6. 07 Jul, 2020 1 commit
  7. 02 Jul, 2020 2 commits
  8. 01 Jul, 2020 11 commits