- 14 Oct, 2014 33 commits
-
-
John Spray authored
MDS session state and client global ID is useful instrumentation when testing. Signed-off-by: John Spray <john.spray@redhat.com>
-
John Spray authored
...so that it can be used from the ceph debugfs code when dumping session info. Signed-off-by: John Spray <john.spray@redhat.com>
-
Yan, Zheng authored
Current code set new file/directory's initial ACL in a non-atomic manner. Client first sends request to MDS to create new file/directory, then set the initial ACL after the new file/directory is successfully created. The fix is include the initial ACL in create/mkdir/mknod MDS requests. So MDS can handle creating file/directory and setting the initial ACL in one request. Signed-off-by: Yan, Zheng <zyan@redhat.com> Reviewed-by: Sage Weil <sage@redhat.com>
-
Yan, Zheng authored
Current code uses page array to present MDS request data. Pages in the array are allocated/freed by caller of ceph_mdsc_do_request(). If request is interrupted, the pages can be freed while they are still being used by the request message. The fix is use pagelist to present MDS request data. Pagelist is reference counted. Signed-off-by: Yan, Zheng <zyan@redhat.com> Reviewed-by: Sage Weil <sage@redhat.com>
-
Yan, Zheng authored
this allow pagelist to present data that may be sent multiple times. Signed-off-by: Yan, Zheng <zyan@redhat.com> Reviewed-by: Sage Weil <sage@redhat.com>
-
Yan, Zheng authored
only regular file and directory have vxattrs. Signed-off-by: Yan, Zheng <zyan@redhat.com>
-
John Spray authored
Implement version 2 of CEPH_MSG_CLIENT_SESSION syntax, which includes additional client metadata to allow the MDS to report on clients by user-sensible names like hostname. Signed-off-by: John Spray <john.spray@redhat.com> Reviewed-by: Yan, Zheng <zyan@redhat.com>
-
Chao Yu authored
Both ceph_update_writeable_page and ceph_setattr will verify file size with max size ceph supported. There are two caller for ceph_update_writeable_page, ceph_write_begin and ceph_page_mkwrite. For ceph_write_begin, we have already verified the size in generic_write_checks of ceph_write_iter; for ceph_page_mkwrite, we have no chance to change file size when mmap. Likewise we have already verified the size in inode_change_ok when we call ceph_setattr. So let's remove the redundant code for max file size verification. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Reviewed-by: Yan, Zheng <zyan@redhat.com>
-
Yan, Zheng authored
ceph_sync_read and generic_file_read_iter() have already advanced the IO iterator. Signed-off-by: Yan, Zheng <zyan@redhat.com>
-
Yan, Zheng authored
ceph_find_inode() may wait on freeing inode, using it inside the s_mutex may cause deadlock. (the freeing inode is waiting for OSD read reply, but dispatch thread is blocked by the s_mutex) Signed-off-by: Yan, Zheng <zyan@redhat.com> Reviewed-by: Sage Weil <sage@redhat.com>
-
Yan, Zheng authored
Following sequence of events can happen. - Client releases an inode, queues cap release message. - A 'lookup' reply brings the same inode back, but the reply doesn't contain xattrs because MDS didn't receive the cap release message and thought client already has up-to-data xattrs. The fix is force sending a getattr request to MDS if xattrs_version is 0. The getattr mask is set to CEPH_STAT_CAP_XATTR, so MDS knows client does not have xattr. Signed-off-by: Yan, Zheng <zyan@redhat.com>
-
Josh Durgin authored
max_discard_sectors must be set for the queue to support discard. Operations implementing discard for rbd zero data, so report that. Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
-
Josh Durgin authored
Only allocate two osd ops for discard requests, since the preallocation hint is only added for regular writes. Use rbd_img_obj_request_fill() to recreate the original write or discard osd operations, isolating that logic to one place, and change the assert in rbd_osd_req_create_copyup() to accept discard requests as well. Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
-
Josh Durgin authored
rbd_img_request_fill() creates a ceph_osd_request and has logic for adding the appropriate osd ops to it based on the request type and image properties. For layered images, the original rbd_obj_request is resent with a copyup operation in front, using a new ceph_osd_request. The logic for adding the original operations should be the same as when first sending them, so move it to a helper function. op_type only needs to be checked once, so create a helper for that as well and call it outside the loop in rbd_img_request_fill(). Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
-
Josh Durgin authored
Discard requests are a form of write, so they should go through the same process as plain write requests and trigger copy-on-write for layered images. Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
-
Josh Durgin authored
Discard may try to delete an object from a non-layered image that does not exist. If this occurs, the image already has no data in that range, so change the result to success. Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
-
Josh Durgin authored
Discards take a reference to the snapshot context of an image when they are created. This reference needs to be cleaned up when the request is done just as it is for regular writes. Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
-
Josh Durgin authored
In rbd_img_request_fill() the image size is only checked to determine whether we can truncate an object instead of zeroing it for discard requests. Take rbd_dev->header_rwsem while reading the image size, and move this read into the discard check, so that non-discard ops don't need to take the semaphore in this function. Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
-
Guangliang Zhao authored
This patch add the discard support for rbd driver. There are three types operation in the driver: 1. The objects would be removed if they completely contained within the discard range. 2. The objects would be truncated if they partly contained within the discard range, and align with their boundary. 3. Others would be zeroed. A discard request from blkdev_issue_discard() is defined which REQ_WRITE and REQ_DISCARD both marked and no data, so we must check the REQ_DISCARD first when getting the request type. This resolve: http://tracker.ceph.com/issues/190 [ Ilya Dryomov: This is incomplete and somewhat buggy, see follow up commits by Josh Durgin for refinements and fixes which weren't folded in to preserve authorship. ] Signed-off-by: Guangliang Zhao <lucienchao@gmail.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com> Reviewed-by: Alex Elder <elder@linaro.org>
-
Guangliang Zhao authored
It could only handle the read and write operations now, extend it for the coming discard support. Signed-off-by: Guangliang Zhao <lucienchao@gmail.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com> Reviewed-by: Alex Elder <elder@linaro.org>
-
Guangliang Zhao authored
It need to copyup the parent's content when layered writing, but an entire object write would overwrite it, so skip it. Signed-off-by: Guangliang Zhao <lucienchao@gmail.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com> Reviewed-by: Alex Elder <elder@linaro.org>
-
Ilya Dryomov authored
To clarify the conditions and make it easier to add new ones. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
-
Josh Durgin authored
These fields may both change while the image is mapped if a snapshot is created or deleted or the image is resized. They are guarded by rbd_dev->header_rwsem, so hold that while reading them, and store a local copy to refer to outside of the critical section. The local copy will stay consistent since the snapshot context is reference counted, and the mapping size is just a u64. This prevents torn loads from giving us inconsistent values. Move reading header.snapc into the caller of rbd_img_request_create() so that we only need to take the semaphore once. The read-only caller, rbd_parent_request_create() can just pass NULL for snapc, since the snapshot context is only relevant for writes. Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
-
Ilya Dryomov authored
Trying to map an image out of a pool for which we don't have an 'x' permission bit fails with -ERANGE from ceph_extract_encoded_string() due to an unsigned vs signed bug. Fix it and get rid of the -EINVAL sink, thus propagating rbd::get_id cls method errors. (I've seen a bunch of unexplained -ERANGE reports, I bet this is it). Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Alex Elder <elder@linaro.org>
-
Ilya Dryomov authored
queue_work() doesn't "fail to queue", it returns false if work was already on a queue, which can't happen here since we allocate event_work right before we queue it. So don't bother at all. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Alex Elder <elder@linaro.org>
-
Yan, Zheng authored
we may corrupt waiting list if a request in the waiting list is kicked. Signed-off-by: Yan, Zheng <zyan@redhat.com> Reviewed-by: Sage Weil <sage@redhat.com>
-
Yan, Zheng authored
Signed-off-by: Yan, Zheng <zyan@redhat.com> Reviewed-by: Sage Weil <sage@redhat.com>
-
Joe Perches authored
Use the more common pr_warn. Other miscellanea: o Coalesce formats o Realign arguments Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
-
Yan, Zheng authored
So the recovering MDS does not need to fetch these ununsed inodes during cache rejoin. This may reduce MDS recovery time. Signed-off-by: Yan, Zheng <zyan@redhat.com>
-
Li RongQing authored
If the state variable is krealloced successfully, map->osd_state will be freed, once following two reallocation failed, and exit the function without resetting map->osd_state, map->osd_state become a wild pointer. fix it by resetting them after krealloc successfully. Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
-
Ilya Dryomov authored
We want "cbc(aes)" algorithm, so select CRYPTO_CBC too, not just CRYPTO_AES. Otherwise on !CRYPTO_CBC kernels we fail rbd map/mount with libceph: error -2 building auth method x request Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
-
Ilya Dryomov authored
Both not yet registered (r_linger && list_empty(&r_linger_item)) and registered linger requests should use the new tid on resend to avoid the dup op detection logic on the OSDs, yet we were doing this only for "registered" case. Factor out and simplify the "registered" logic and use the new helper for "not registered" case as well. Fixes: http://tracker.ceph.com/issues/8806Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Alex Elder <elder@linaro.org>
-
Ilya Dryomov authored
Introduce __enqueue_request() and switch to it. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Alex Elder <elder@linaro.org>
-
- 05 Oct, 2014 2 commits
-
-
Linus Torvalds authored
-
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds authored
Pull SCSI fixes from James Bottomley: "This is a set of two small fixes, both to code which went in during the merge window: cxgb4i has a scheduling in atomic bug in its new ipv6 code and uas fails to work properly with the new scsi-mq code" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: [SCSI] uas: disable use of blk-mq I/O path [SCSI] cxgb4i: avoid holding mutex in interrupt context
-
- 04 Oct, 2014 1 commit
-
-
https://git.kernel.org/pub/scm/linux/kernel/git/josh/linuxLinus Torvalds authored
Pull kconfig fixes for tiny setups from Josh Triplett: "Two Kconfig bugfixes for 3.17 related to tinification. These fixes make the Kconfig "General Setup" menu much more usable" * tag 'tiny/kconfig-for-3.17' of https://git.kernel.org/pub/scm/linux/kernel/git/josh/linux: init/Kconfig: Fix HAVE_FUTEX_CMPXCHG to not break up the EXPERT menu init/Kconfig: Hide printk log config if CONFIG_PRINTK=n
-
- 03 Oct, 2014 4 commits
-
-
Josh Triplett authored
commit 03b8c7b6 ("futex: Allow architectures to skip futex_atomic_cmpxchg_inatomic() test") added the HAVE_FUTEX_CMPXCHG symbol right below FUTEX. This placed it right in the middle of the options for the EXPERT menu. However, HAVE_FUTEX_CMPXCHG does not depend on EXPERT or FUTEX, so Kconfig stops placing items in the EXPERT menu, and displays the remaining several EXPERT items (starting with EPOLL) directly in the General Setup menu. Since both users of HAVE_FUTEX_CMPXCHG only select it "if FUTEX", make HAVE_FUTEX_CMPXCHG itself depend on FUTEX. With this change, the subsequent items display as part of the EXPERT menu again; the EMBEDDED menu now appears as the next top-level item in the General Setup menu, which makes General Setup much shorter and more usable. Signed-off-by: Josh Triplett <josh@joshtriplett.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> Cc: stable <stable@vger.kernel.org>
-
Josh Triplett authored
The buffers sized by CONFIG_LOG_BUF_SHIFT and CONFIG_LOG_CPU_MAX_BUF_SHIFT do not exist if CONFIG_PRINTK=n, so don't ask about their size at all. Signed-off-by: Josh Triplett <josh@joshtriplett.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> Cc: stable <stable@vger.kernel.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linuxLinus Torvalds authored
Pull i2c fixes from Wolfram Sang: "Two i2c driver bugfixes" * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: qup: Fix order of runtime pm initialization i2c: rk3x: fix 0 length write transfers
-
Linus Torvalds authored
Merge tag 'trace-fixes-v3.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull trace ring buffer iterator fix from Steven Rostedt: "While testing some new changes for 3.18, I kept hitting a bug every so often in the ring buffer. At first I thought it had to do with some of the changes I was working on, but then testing something else I realized that the bug was in 3.17 itself. I ran several bisects as the bug was not very reproducible, and finally came up with the commit that I could reproduce easily within a few minutes, and without the change I could run the tests over an hour without issue. The change fit the bug and I figured out a fix. That bad commit was: Commit 651e22f2 "ring-buffer: Always reset iterator to reader page" This commit fixed a bug, but in the process created another one. It used the wrong value as the cached value that is used to see if things changed while an iterator was in use. This made it look like a change always happened, and could cause the iterator to go into an infinite loop" * tag 'trace-fixes-v3.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: ring-buffer: Fix infinite spin in reading buffer
-