- 09 Aug, 2018 10 commits
-
-
Shenghui Wang authored
Remove the tailing backslash in macro BTREE_FLAG in btree.h Signed-off-by: Shenghui Wang <shhuiw@foxmail.com> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Shenghui Wang authored
The pr_err statement in the code for sysfs_attatch section would run for various error codes, which maybe confusing. E.g, Run the command twice: echo 796b5c05-b03c-4bc7-9cbd-a8df5e8be891 > \ /sys/block/bcache0/bcache/attach [the backing dev got attached on the first run] echo 796b5c05-b03c-4bc7-9cbd-a8df5e8be891 > \ /sys/block/bcache0/bcache/attach In dmesg, after the command run twice, we can get: bcache: bch_cached_dev_attach() Can't attach sda6: already attached bcache: __cached_dev_store() Can't attach 796b5c05-b03c-4bc7-9cbd-\ a8df5e8be891 : cache set not found The first statement in the message was right, but the second was confusing. bch_cached_dev_attach has various pr_ statements for various error codes, except ENOENT. After the change, rerun above command twice: echo 796b5c05-b03c-4bc7-9cbd-a8df5e8be891 > \ /sys/block/bcache0/bcache/attach echo 796b5c05-b03c-4bc7-9cbd-a8df5e8be891 > \ /sys/block/bcache0/bcache/attach In dmesg we only got: bcache: bch_cached_dev_attach() Can't attach sda6: already attached No confusing "cache set not found" message anymore. And for some not exist SET-UUID: echo 796b5c05-b03c-4bc7-9cbd-a8df5e8be898 > \ /sys/block/bcache0/bcache/attach In dmesg we can get: bcache: __cached_dev_store() Can't attach 796b5c05-b03c-4bc7-9cbd-\ a8df5e8be898 : cache set not found Signed-off-by: Shenghui Wang <shhuiw@foxmail.com> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Coly Li authored
Commit b1092c9a ("bcache: allow quick writeback when backing idle") allows the writeback rate to be faster if there is no I/O request on a bcache device. It works well if there is only one bcache device attached to the cache set. If there are many bcache devices attached to a cache set, it may introduce performance regression because multiple faster writeback threads of the idle bcache devices will compete the btree level locks with the bcache device who have I/O requests coming. This patch fixes the above issue by only permitting fast writebac when all bcache devices attached on the cache set are idle. And if one of the bcache devices has new I/O request coming, minimized all writeback throughput immediately and let PI controller __update_writeback_rate() to decide the upcoming writeback rate for each bcache device. Also when all bcache devices are idle, limited wrieback rate to a small number is wast of thoughput, especially when backing devices are slower non-rotation devices (e.g. SATA SSD). This patch sets a max writeback rate for each backing device if the whole cache set is idle. A faster writeback rate in idle time means new I/Os may have more available space for dirty data, and people may observe a better write performance then. Please note bcache may change its cache mode in run time, and this patch still works if the cache mode is switched from writeback mode and there is still dirty data on cache. Fixes: Commit b1092c9a ("bcache: allow quick writeback when backing idle") Cc: stable@vger.kernel.org #4.16+ Signed-off-by: Coly Li <colyli@suse.de> Tested-by: Kai Krakow <kai@kaishome.de> Tested-by: Stefan Priebe <s.priebe@profihost.ag> Cc: Michael Lyle <mlyle@lyle.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Coly Li authored
This patch tries to add code comments in bset.c, to make some tricky code and designment to be more comprehensible. Most information of this patch comes from the discussion between Kent and I, he offers very informative details. If there is any mistake of the idea behind the code, no doubt that's from me misrepresentation. Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Coly Li authored
This patch updates code comment in bch_keylist_realloc() by fixing incorrected function names, to make the code to be more comprehennsible. Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Coly Li authored
This patch updates the code comment in struct cache with correct array names, to make the code to be more comprehensible. Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Coly Li authored
This patch adds a line of code comment in super.c:register_bdev(), to make code to be more comprehensible. Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Coly Li authored
In bch_btree_node_get() the read-in btree node will be partially prefetched into L1 cache for following bset iteration (if there is). But if the btree node read is failed, the perfetch operations will waste L1 cache space. This patch checkes whether read operation and only does cache prefetch when read I/O succeeded. Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Coly Li authored
When writeback is not running, writeback rate should be 0, other value is misleading. And the following dyanmic writeback rate debug parameters should be 0 too, rate, proportional, integral, change otherwise they are misleading when writeback is not running. Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Coly Li authored
Greg KH suggests that normal code should not care about debugfs. Therefore no matter successful or failed of debugfs_create_dir() execution, it is unncessary to check its return value. There are two functions called debugfs_create_dir() and check the return value, which are bch_debug_init() and closure_debug_init(). This patch changes these two functions from int to void type, and ignore return values of debugfs_create_dir(). This patch does not fix exact bug, just makes things work as they should. Signed-off-by: Coly Li <colyli@suse.de> Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: stable@vger.kernel.org Cc: Kai Krakow <kai@kaishome.de> Cc: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 08 Aug, 2018 3 commits
-
-
zhong jiang authored
kmem_cache_destroy has taken null pointer into account. So it is safe to drop the null check before calling the function. Signed-off-by: zhong jiang <zhongjiang@huawei.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
zhong jiang authored
mempool_destroy has taken the null pointer into account. So it is safe to remove the null check. Signed-off-by: zhong jiang <zhongjiang@huawei.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
zhong jiang authored
debugfs_remove_recursive has taken null pointer into account. So it is safe to drop the null check before calling the function. Signed-off-by: zhong jiang <zhongjiang@huawei.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 07 Aug, 2018 4 commits
-
-
Bart Van Assche authored
This patch does not change any functionality but avoids that gcc reports the following warnings when building with W=1: block/cfq-iosched.c: In function ?cfq_back_seek_max_store?: block/cfq-iosched.c:4741:13: warning: comparison of unsigned expression < 0 is always false [-Wtype-limits] if (__data < (MIN)) \ ^ block/cfq-iosched.c:4756:1: note: in expansion of macro ?STORE_FUNCTION? STORE_FUNCTION(cfq_back_seek_max_store, &cfqd->cfq_back_max, 0, UINT_MAX, 0); ^~~~~~~~~~~~~~ block/cfq-iosched.c: In function ?cfq_slice_idle_store?: block/cfq-iosched.c:4741:13: warning: comparison of unsigned expression < 0 is always false [-Wtype-limits] if (__data < (MIN)) \ ^ block/cfq-iosched.c:4759:1: note: in expansion of macro ?STORE_FUNCTION? STORE_FUNCTION(cfq_slice_idle_store, &cfqd->cfq_slice_idle, 0, UINT_MAX, 1); ^~~~~~~~~~~~~~ block/cfq-iosched.c: In function ?cfq_group_idle_store?: block/cfq-iosched.c:4741:13: warning: comparison of unsigned expression < 0 is always false [-Wtype-limits] if (__data < (MIN)) \ ^ block/cfq-iosched.c:4760:1: note: in expansion of macro ?STORE_FUNCTION? STORE_FUNCTION(cfq_group_idle_store, &cfqd->cfq_group_idle, 0, UINT_MAX, 1); ^~~~~~~~~~~~~~ block/cfq-iosched.c: In function ?cfq_low_latency_store?: block/cfq-iosched.c:4741:13: warning: comparison of unsigned expression < 0 is always false [-Wtype-limits] if (__data < (MIN)) \ ^ block/cfq-iosched.c:4765:1: note: in expansion of macro ?STORE_FUNCTION? STORE_FUNCTION(cfq_low_latency_store, &cfqd->cfq_latency, 0, 1, 0); ^~~~~~~~~~~~~~ block/cfq-iosched.c: In function ?cfq_slice_idle_us_store?: block/cfq-iosched.c:4775:13: warning: comparison of unsigned expression < 0 is always false [-Wtype-limits] if (__data < (MIN)) \ ^ block/cfq-iosched.c:4782:1: note: in expansion of macro ?USEC_STORE_FUNCTION? USEC_STORE_FUNCTION(cfq_slice_idle_us_store, &cfqd->cfq_slice_idle, 0, UINT_MAX); ^~~~~~~~~~~~~~~~~~~ block/cfq-iosched.c: In function ?cfq_group_idle_us_store?: block/cfq-iosched.c:4775:13: warning: comparison of unsigned expression < 0 is always false [-Wtype-limits] if (__data < (MIN)) \ ^ block/cfq-iosched.c:4783:1: note: in expansion of macro ?USEC_STORE_FUNCTION? USEC_STORE_FUNCTION(cfq_group_idle_us_store, &cfqd->cfq_group_idle, 0, UINT_MAX); ^~~~~~~~~~~~~~~~~~~ Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Bart Van Assche authored
This patch avoids that gcc complains about fall-through when building with W=1. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Anchal Agarwal authored
I am currently running a large bare metal instance (i3.metal) on EC2 with 72 cores, 512GB of RAM and NVME drives, with a 4.18 kernel. I have a workload that simulates a database workload and I am running into lockup issues when writeback throttling is enabled,with the hung task detector also kicking in. Crash dumps show that most CPUs (up to 50 of them) are all trying to get the wbt wait queue lock while trying to add themselves to it in __wbt_wait (see stack traces below). [ 0.948118] CPU: 45 PID: 0 Comm: swapper/45 Not tainted 4.14.51-62.38.amzn1.x86_64 #1 [ 0.948119] Hardware name: Amazon EC2 i3.metal/Not Specified, BIOS 1.0 10/16/2017 [ 0.948120] task: ffff883f7878c000 task.stack: ffffc9000c69c000 [ 0.948124] RIP: 0010:native_queued_spin_lock_slowpath+0xf8/0x1a0 [ 0.948125] RSP: 0018:ffff883f7fcc3dc8 EFLAGS: 00000046 [ 0.948126] RAX: 0000000000000000 RBX: ffff887f7709ca68 RCX: ffff883f7fce2a00 [ 0.948128] RDX: 000000000000001c RSI: 0000000000740001 RDI: ffff887f7709ca68 [ 0.948129] RBP: 0000000000000002 R08: 0000000000b80000 R09: 0000000000000000 [ 0.948130] R10: ffff883f7fcc3d78 R11: 000000000de27121 R12: 0000000000000002 [ 0.948131] R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000000 [ 0.948132] FS: 0000000000000000(0000) GS:ffff883f7fcc0000(0000) knlGS:0000000000000000 [ 0.948134] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 0.948135] CR2: 000000c424c77000 CR3: 0000000002010005 CR4: 00000000003606e0 [ 0.948136] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 0.948137] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 0.948138] Call Trace: [ 0.948139] <IRQ> [ 0.948142] do_raw_spin_lock+0xad/0xc0 [ 0.948145] _raw_spin_lock_irqsave+0x44/0x4b [ 0.948149] ? __wake_up_common_lock+0x53/0x90 [ 0.948150] __wake_up_common_lock+0x53/0x90 [ 0.948155] wbt_done+0x7b/0xa0 [ 0.948158] blk_mq_free_request+0xb7/0x110 [ 0.948161] __blk_mq_complete_request+0xcb/0x140 [ 0.948166] nvme_process_cq+0xce/0x1a0 [nvme] [ 0.948169] nvme_irq+0x23/0x50 [nvme] [ 0.948173] __handle_irq_event_percpu+0x46/0x300 [ 0.948176] handle_irq_event_percpu+0x20/0x50 [ 0.948179] handle_irq_event+0x34/0x60 [ 0.948181] handle_edge_irq+0x77/0x190 [ 0.948185] handle_irq+0xaf/0x120 [ 0.948188] do_IRQ+0x53/0x110 [ 0.948191] common_interrupt+0x87/0x87 [ 0.948192] </IRQ> .... [ 0.311136] CPU: 4 PID: 9737 Comm: run_linux_amd64 Not tainted 4.14.51-62.38.amzn1.x86_64 #1 [ 0.311137] Hardware name: Amazon EC2 i3.metal/Not Specified, BIOS 1.0 10/16/2017 [ 0.311138] task: ffff883f6e6a8000 task.stack: ffffc9000f1ec000 [ 0.311141] RIP: 0010:native_queued_spin_lock_slowpath+0xf5/0x1a0 [ 0.311142] RSP: 0018:ffffc9000f1efa28 EFLAGS: 00000046 [ 0.311144] RAX: 0000000000000000 RBX: ffff887f7709ca68 RCX: ffff883f7f722a00 [ 0.311145] RDX: 0000000000000035 RSI: 0000000000d80001 RDI: ffff887f7709ca68 [ 0.311146] RBP: 0000000000000202 R08: 0000000000140000 R09: 0000000000000000 [ 0.311147] R10: ffffc9000f1ef9d8 R11: 000000001a249fa0 R12: ffff887f7709ca68 [ 0.311148] R13: ffffc9000f1efad0 R14: 0000000000000000 R15: ffff887f7709ca00 [ 0.311149] FS: 000000c423f30090(0000) GS:ffff883f7f700000(0000) knlGS:0000000000000000 [ 0.311150] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 0.311151] CR2: 00007feefcea4000 CR3: 0000007f7016e001 CR4: 00000000003606e0 [ 0.311152] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 0.311153] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 0.311154] Call Trace: [ 0.311157] do_raw_spin_lock+0xad/0xc0 [ 0.311160] _raw_spin_lock_irqsave+0x44/0x4b [ 0.311162] ? prepare_to_wait_exclusive+0x28/0xb0 [ 0.311164] prepare_to_wait_exclusive+0x28/0xb0 [ 0.311167] wbt_wait+0x127/0x330 [ 0.311169] ? finish_wait+0x80/0x80 [ 0.311172] ? generic_make_request+0xda/0x3b0 [ 0.311174] blk_mq_make_request+0xd6/0x7b0 [ 0.311176] ? blk_queue_enter+0x24/0x260 [ 0.311178] ? generic_make_request+0xda/0x3b0 [ 0.311181] generic_make_request+0x10c/0x3b0 [ 0.311183] ? submit_bio+0x5c/0x110 [ 0.311185] submit_bio+0x5c/0x110 [ 0.311197] ? __ext4_journal_stop+0x36/0xa0 [ext4] [ 0.311210] ext4_io_submit+0x48/0x60 [ext4] [ 0.311222] ext4_writepages+0x810/0x11f0 [ext4] [ 0.311229] ? do_writepages+0x3c/0xd0 [ 0.311239] ? ext4_mark_inode_dirty+0x260/0x260 [ext4] [ 0.311240] do_writepages+0x3c/0xd0 [ 0.311243] ? _raw_spin_unlock+0x24/0x30 [ 0.311245] ? wbc_attach_and_unlock_inode+0x165/0x280 [ 0.311248] ? __filemap_fdatawrite_range+0xa3/0xe0 [ 0.311250] __filemap_fdatawrite_range+0xa3/0xe0 [ 0.311253] file_write_and_wait_range+0x34/0x90 [ 0.311264] ext4_sync_file+0x151/0x500 [ext4] [ 0.311267] do_fsync+0x38/0x60 [ 0.311270] SyS_fsync+0xc/0x10 [ 0.311272] do_syscall_64+0x6f/0x170 [ 0.311274] entry_SYSCALL_64_after_hwframe+0x42/0xb7 In the original patch, wbt_done is waking up all the exclusive processes in the wait queue, which can cause a thundering herd if there is a large number of writer threads in the queue. The original intention of the code seems to be to wake up one thread only however, it uses wake_up_all() in __wbt_done(), and then uses the following check in __wbt_wait to have only one thread actually get out of the wait loop: if (waitqueue_active(&rqw->wait) && rqw->wait.head.next != &wait->entry) return false; The problem with this is that the wait entry in wbt_wait is define with DEFINE_WAIT, which uses the autoremove wakeup function. That means that the above check is invalid - the wait entry will have been removed from the queue already by the time we hit the check in the loop. Secondly, auto-removing the wait entries also means that the wait queue essentially gets reordered "randomly" (e.g. threads re-add themselves in the order they got to run after being woken up). Additionally, new requests entering wbt_wait might overtake requests that were queued earlier, because the wait queue will be (temporarily) empty after the wake_up_all, so the waitqueue_active check will not stop them. This can cause certain threads to starve under high load. The fix is to leave the woken up requests in the queue and remove them in finish_wait() once the current thread breaks out of the wait loop in __wbt_wait. This will ensure new requests always end up at the back of the queue, and they won't overtake requests that are already in the wait queue. With that change, the loop in wbt_wait is also in line with many other wait loops in the kernel. Waking up just one thread drastically reduces lock contention, as does moving the wait queue add/remove out of the loop. A significant drop in lockdep's lock contention numbers is seen when running the test application on the patched kernel. Signed-off-by: Anchal Agarwal <anchalag@amazon.com> Signed-off-by: Frank van der Linden <fllinden@amazon.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
The target loopback driver is a low-level driver for the SCSI subsystem, and as such needs to depend on it. Fixes: 8a39a047 ("target: don't depend on SCSI") Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 06 Aug, 2018 4 commits
-
-
Gustavo A. R. Silva authored
Return statements in functions returning bool should use true or false instead of an integer value. This code was detected with the help of Coccinelle. Acked-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Matias Bjørling authored
A minor version number increase should not break backwards compatibility. Fixes: 3cb98f84 ("lightnvm: add minor version to generic geometry") Reviewed-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
git://git.infradead.org/nvmeJens Axboe authored
Pull NVMe changes from Christoph: "This contains the support for TP4004, Asymmetric Namespace Access, which makes NVMe multipathing usable in practice." * 'nvme-4.19' of git://git.infradead.org/nvme: nvmet: use Retain Async Event bit to clear AEN nvmet: support configuring ANA groups nvmet: add minimal ANA support nvmet: track and limit the number of namespaces per subsystem nvmet: keep a port pointer in nvmet_ctrl nvme: add ANA support nvme: remove nvme_req_needs_failover nvme: simplify the API for getting log pages nvme.h: add ANA definitions nvme.h: add support for the log specific field Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Jens Axboe authored
Pull in 4.18-rc6 to get the NVMe core AEN change to avoid a merge conflict down the line. Signed-of-by: Jens Axboe <axboe@kernel.dk>
-
- 02 Aug, 2018 13 commits
-
-
Kees Cook authored
To avoid introducing problems like those fixed in commit f7068114 ("sr: pass down correctly sized SCSI sense buffer"), this creates a macro wrapper for scsi_execute() that verifies the size of the sense buffer similar to what was done for command string sizes in commit 3756f640 ("exec: avoid gcc-8 warning for get_task_comm"). Another solution could be to add a length argument to scsi_execute(), but this function already takes a lot of arguments and Jens was not fond of that approach. Additionally, this moves the SCSI_SENSE_BUFFERSIZE definition into scsi_device.h, and removes a redundant include for scsi_device.h from scsi_cmnd.h. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Kees Cook authored
To support future compile-time sizeof() checks that will be able to validate the length of sense buffers, this removes the only dynamically allocated sense buffers in the tree by putting the 96 byte sense buffers on the stack. Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Kees Cook authored
This removes more casts of struct request_sense and uses the standard struct scsi_sense_hdr instead. This also fixes any possible stale values since the prior code did not check the sense length. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Kees Cook authored
This is already able to process the sense buffer, so remove the redundant parsing during the failure path. This also fixes any possible stale values since the prior code did not check the sense length. Acked-by: David S. Miller <davem@davemloft.net> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Kees Cook authored
There is a lot of needless struct request_sense usage in the CDROM code. These can all be struct scsi_sense_hdr instead, to avoid any confusion over their respective structure sizes. This patch is a lot of noise changing "sense" to "sshdr", but the final code is more readable to distinguish between "sense" meaning "struct request_sense" and "sshdr" meaning "struct scsi_sense_hdr". Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
The core target code only needs code from scsi_common.c, which is now separately selectable. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
Split scsi_common.o out of SCSI so that non-SCSI users can pull it in easily for future sense buffer helper usage. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Kees Cook authored
This removes the unused sense buffer in read_cap16() and write_same16(). Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Kees Cook authored
This drops unused sense buffers from: cdrom_eject() cdrom_read_capacity() cdrom_read_tocentry() ide_cd_lockdoor() ide_cd_read_toc() Acked-by: David S. Miller <davem@davemloft.net> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Ming Lei authored
The passed 'nr' from userspace represents the total depth, meantime inside 'struct blk_mq_tags', 'nr_tags' stores the total tag depth, and 'nr_reserved_tags' stores the reserved part. There are two issues in blk_mq_tag_update_depth() now: 1) for growing tags, we should have used the passed 'nr', and keep the number of reserved tags not changed. 2) the passed 'nr' should have been used for checking against 'tags->nr_tags', instead of number of the normal part. This patch fixes the above two cases, and avoids kernel crash caused by wrong resizing sbitmap queue. Cc: "Ewan D. Milne" <emilne@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Omar Sandoval <osandov@fb.com> Tested by: Marco Patalano <mpatalan@redhat.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Ming Lei authored
Runtime PM isn't ready for blk-mq yet, and commit 765e40b6 ("block: disable runtime-pm for blk-mq") tried to disable it. Unfortunately, it can't take effect in that way since user space still can switch it on via 'echo auto > /sys/block/sdN/device/power/control'. This patch disables runtime-pm for blk-mq really by pm_runtime_disable() and fixes all kinds of PM related kernel crash. Cc: Tomas Janousek <tomi@nomi.cz> Cc: Przemek Socha <soprwa@gmail.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: <stable@vger.kernel.org> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Gustavo A. R. Silva authored
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Addresses-Coverity-ID: 114722 ("Missing break in switch") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Dennis Zhou (Facebook) authored
Currently, avg_lat is calculated by accumulating the mean of every window in a long running cumulative average. As time goes on, the metric becomes less and less useful due to the accumulated history. This patch reuses the same calculation done in load averages to make the avg_lat metric more lively. Unlike load averages, the avg only advances when a window elapses (due to an io). Idle periods extend the most recent window. Bucketing is used to limit the history of avg_lat by binding it to the window size. So, the window range for 1/exp (decay rate) is [1 min, 2.5 min) when windows elapse immediately. The current sample window size is exposed in the debug info to enable calculation of the window range. Signed-off-by: Dennis Zhou <dennisszhou@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 01 Aug, 2018 4 commits
-
-
Josef Bacik authored
We were hitting a panic in production where we put too many times on the request queue. This is because we'd get the throttle_queue of the parent if we fork()'ed while we needed to be throttled, but we didn't have a reference on it. Instead just clear these flags on fork so the child doesn't pay for the sins of its father. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Josef Bacik authored
The blkg lifetime is protected by the queue lifetime, so we need to put the queue _after_ we're done using the blkg. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Josef Bacik authored
At this point we have a ref on the blkg, we need to drop it if we don't have a iolat. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
zhong jiang authored
Simplify the code by using the PTR_ERR_OR_ZERO, instead of the open code. It is better. Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: zhong jiang <zhongjiang@huawei.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 31 Jul, 2018 1 commit
-
-
Jens Axboe authored
Fixes a link failure whtn BLK_DEV_INTEGRITY isn't defined. Fixes: 10c41ddd ("block: move dif_prepare/dif_complete functions to block layer") Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 30 Jul, 2018 1 commit
-
-
xiao jin authored
We find the memory use-after-free issue in __blk_drain_queue() on the kernel 4.14. After read the latest kernel 4.18-rc6 we think it has the same problem. Memory is allocated for q->fq in the blk_init_allocated_queue(). If the elevator init function called with error return, it will run into the fail case to free the q->fq. Then the __blk_drain_queue() uses the same memory after the free of the q->fq, it will lead to the unpredictable event. The patch is to set q->fq as NULL in the fail case of blk_init_allocated_queue(). Fixes: commit 7c94e1c1 ("block: introduce blk_flush_queue to drive flush machinery") Cc: <stable@vger.kernel.org> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: xiao jin <jin.xiao@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-