1. 09 Feb, 2015 5 commits
  2. 05 Feb, 2015 8 commits
  3. 28 Jan, 2015 4 commits
  4. 23 Jan, 2015 2 commits
    • Shaohua Li's avatar
      blk-mq: add tag allocation policy · 24391c0d
      Shaohua Li authored
      This is the blk-mq part to support tag allocation policy. The default
      allocation policy isn't changed (though it's not a strict FIFO). The new
      policy is round-robin for libata. But it's a try-best implementation. If
      multiple tasks are competing, the tags returned will be mixed (which is
      unavoidable even with !mq, as requests from different tasks can be
      mixed in queue)
      
      Cc: Jens Axboe <axboe@fb.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: default avatarShaohua Li <shli@fb.com>
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      24391c0d
    • Shaohua Li's avatar
      block: support different tag allocation policy · ee1b6f7a
      Shaohua Li authored
      The libata tag allocation is using a round-robin policy. Next patch will
      make libata use block generic tag allocation, so let's add a policy to
      tag allocation.
      
      Currently two policies: FIFO (default) and round-robin.
      
      Cc: Jens Axboe <axboe@fb.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: default avatarShaohua Li <shli@fb.com>
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      ee1b6f7a
  5. 22 Jan, 2015 1 commit
  6. 21 Jan, 2015 2 commits
    • Martin K. Petersen's avatar
      block: Add discard flag to blkdev_issue_zeroout() function · d93ba7a5
      Martin K. Petersen authored
      blkdev_issue_discard() will zero a given block range. This is done by
      way of explicit writing, thus provisioning or allocating the blocks on
      disk.
      
      There are use cases where the desired behavior is to zero the blocks but
      unprovision them if possible. The blocks must deterministically contain
      zeroes when they are subsequently read back.
      
      This patch adds a flag to blkdev_issue_zeroout() that provides this
      variant. If the discard flag is set and a block device guarantees
      discard_zeroes_data we will use REQ_DISCARD to clear the block range. If
      the device does not support discard_zeroes_data or if the discard
      request fails we will fall back to first REQ_WRITE_SAME and then a
      regular REQ_WRITE.
      
      Also update the callers of blkdev_issue_zero() to reflect the new flag
      and make sb_issue_zeroout() prefer the discard approach.
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      d93ba7a5
    • Jeff Moyer's avatar
      cfq-iosched: fix incorrect filing of rt async cfqq · c6ce1943
      Jeff Moyer authored
      Hi,
      
      If you can manage to submit an async write as the first async I/O from
      the context of a process with realtime scheduling priority, then a
      cfq_queue is allocated, but filed into the wrong async_cfqq bucket.  It
      ends up in the best effort array, but actually has realtime I/O
      scheduling priority set in cfqq->ioprio.
      
      The reason is that cfq_get_queue assumes the default scheduling class and
      priority when there is no information present (i.e. when the async cfqq
      is created):
      
      static struct cfq_queue *
      cfq_get_queue(struct cfq_data *cfqd, bool is_sync, struct cfq_io_cq *cic,
      	      struct bio *bio, gfp_t gfp_mask)
      {
      	const int ioprio_class = IOPRIO_PRIO_CLASS(cic->ioprio);
      	const int ioprio = IOPRIO_PRIO_DATA(cic->ioprio);
      
      cic->ioprio starts out as 0, which is "invalid".  So, class of 0
      (IOPRIO_CLASS_NONE) is passed to cfq_async_queue_prio like so:
      
      		async_cfqq = cfq_async_queue_prio(cfqd, ioprio_class, ioprio);
      
      static struct cfq_queue **
      cfq_async_queue_prio(struct cfq_data *cfqd, int ioprio_class, int ioprio)
      {
              switch (ioprio_class) {
              case IOPRIO_CLASS_RT:
                      return &cfqd->async_cfqq[0][ioprio];
              case IOPRIO_CLASS_NONE:
                      ioprio = IOPRIO_NORM;
                      /* fall through */
              case IOPRIO_CLASS_BE:
                      return &cfqd->async_cfqq[1][ioprio];
              case IOPRIO_CLASS_IDLE:
                      return &cfqd->async_idle_cfqq;
              default:
                      BUG();
              }
      }
      
      Here, instead of returning a class mapped from the process' scheduling
      priority, we get back the bucket associated with IOPRIO_CLASS_BE.
      
      Now, there is no queue allocated there yet, so we create it:
      
      		cfqq = cfq_find_alloc_queue(cfqd, is_sync, cic, bio, gfp_mask);
      
      That function ends up doing this:
      
      			cfq_init_cfqq(cfqd, cfqq, current->pid, is_sync);
      			cfq_init_prio_data(cfqq, cic);
      
      cfq_init_cfqq marks the priority as having changed.  Then, cfq_init_prio
      data does this:
      
      	ioprio_class = IOPRIO_PRIO_CLASS(cic->ioprio);
      	switch (ioprio_class) {
      	default:
      		printk(KERN_ERR "cfq: bad prio %x\n", ioprio_class);
      	case IOPRIO_CLASS_NONE:
      		/*
      		 * no prio set, inherit CPU scheduling settings
      		 */
      		cfqq->ioprio = task_nice_ioprio(tsk);
      		cfqq->ioprio_class = task_nice_ioclass(tsk);
      		break;
      
      So we basically have two code paths that treat IOPRIO_CLASS_NONE
      differently, which results in an RT async cfqq filed into a best effort
      bucket.
      
      Attached is a patch which fixes the problem.  I'm not sure how to make
      it cleaner.  Suggestions would be welcome.
      Signed-off-by: default avatarJeff Moyer <jmoyer@redhat.com>
      Tested-by: default avatarHidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
      Cc: stable@kernel.org
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      c6ce1943
  7. 14 Jan, 2015 2 commits
    • Jens Axboe's avatar
      blk-mq: fix false negative out-of-tags condition · 0bf36498
      Jens Axboe authored
      The blk-mq tagging tries to maintain some locality between CPUs and
      the tags issued. The tags are split into groups of words, and the
      words may not be fully populated. When searching for a new free tag,
      blk-mq may look at partial words, hence it passes in an offset/size
      to find_next_zero_bit(). However, it does that wrong, the size must
      always be the full length of the number of tags in that word,
      otherwise we'll potentially miss some near the end.
      
      Another issue is when __bt_get() goes from one word set to the next.
      It bumps the index, but not the last_tag associated with the
      previous index. Bump that to be in the range of the new word.
      
      Finally, clean up __bt_get() and __bt_get_word() a bit and get
      rid of the goto in there, and the unnecessary 'wrap' variable.
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      0bf36498
    • Matthew Wilcox's avatar
      block: Change direct_access calling convention · dd22f551
      Matthew Wilcox authored
      In order to support accesses to larger chunks of memory, pass in a
      'size' parameter (counted in bytes), and return the amount available at
      that address.
      
      Add a new helper function, bdev_direct_access(), to handle common
      functionality including partition handling, checking the length requested
      is positive, checking for the sector being page-aligned, and checking
      the length of the request does not pass the end of the partition.
      Signed-off-by: default avatarMatthew Wilcox <matthew.r.wilcox@intel.com>
      Reviewed-by: default avatarJan Kara <jack@suse.cz>
      Reviewed-by: default avatarBoaz Harrosh <boaz@plexistor.com>
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      dd22f551
  8. 02 Jan, 2015 3 commits
  9. 31 Dec, 2014 1 commit
  10. 29 Dec, 2014 1 commit
  11. 28 Dec, 2014 4 commits
  12. 27 Dec, 2014 4 commits
  13. 26 Dec, 2014 3 commits