1. 13 Dec, 2012 7 commits
  2. 05 Nov, 2012 1 commit
    • ceph: Fix i_size update race · 22cddde1
      Sage Weil authored
      ceph_aio_write() has an optimization that marks the CEPH_CAP_FILE_WR
      cap dirty before data is copied to the page cache and the inode size
      is updated. If ceph_check_caps() flushes the dirty cap before the
      inode size is updated, the MDS can miss the new inode size. The fix
      is to move ceph_{get,put}_cap_refs() into ceph_write_{begin,end}()
      and call __ceph_mark_dirty_caps() after the inode size is updated.
      Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
      Signed-off-by: Sage Weil <sage@inktank.com>
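The ordering problem described in this commit can be illustrated with a small single-threaded simulation. This is only a sketch: struct inode_sketch and every function name below are hypothetical stand-ins, not the ceph code, and the "concurrent" flush is simulated by calling it at the point where the race window would be.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the relevant inode state. */
struct inode_sketch {
	long size;          /* in-memory inode size */
	bool cap_dirty;     /* stand-in for a dirty CEPH_CAP_FILE_WR cap */
	long flushed_size;  /* size last reported to the (imaginary) MDS */
};

/* Report the current size if the cap is dirty, then clear the flag. */
static void flush_if_dirty(struct inode_sketch *in)
{
	if (in->cap_dirty) {
		in->flushed_size = in->size;
		in->cap_dirty = false;
	}
}

/* Old ordering: the cap is marked dirty before the size is updated. */
long write_dirty_first(struct inode_sketch *in, long new_size)
{
	in->cap_dirty = true;
	flush_if_dirty(in);     /* simulated flush landing in the window */
	in->size = new_size;    /* too late: the flush saw the old size */
	return in->flushed_size;
}

/* New ordering: update the size first, then mark the cap dirty. */
long write_size_first(struct inode_sketch *in, long new_size)
{
	in->size = new_size;
	in->cap_dirty = true;
	flush_if_dirty(in);     /* any flush now sees the new size */
	return in->flushed_size;
}
```

With the first ordering the flush clears the dirty flag while still carrying the old size, so nothing later prompts a second report; with the second ordering the flushed size always matches.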
  3. 04 Nov, 2012 1 commit
  4. 01 Nov, 2012 11 commits
  5. 30 Oct, 2012 14 commits
    • rbd: define image specification structure · 0d7dbfce
      Alex Elder authored
      Group the fields that uniquely specify an rbd image into a new
      reference-counted rbd_spec structure.  This structure will be used
      to describe the desired image when mapping an image, and when
      probing parent images in layered rbd devices.  Replace the set of
      fields in the rbd device structure with a pointer to a dynamically
      allocated rbd_spec.
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
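A reference-counted specification structure of the kind described here might look roughly like the following userspace sketch. The field names, the plain integer counter, and the spec_alloc()/spec_get()/spec_put() helpers are all assumptions for illustration; kernel code would normally use struct kref, and this version is not thread-safe.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical image specification; fields are illustrative only. */
struct image_spec {
	uint64_t pool_id;
	char *image_name;
	char *snap_name;
	unsigned int refcount;
};

static char *dup_str(const char *s)
{
	size_t n = strlen(s) + 1;
	char *p = malloc(n);

	if (p)
		memcpy(p, s, n);
	return p;
}

/* Allocate a spec holding one reference, or return NULL on failure. */
struct image_spec *spec_alloc(uint64_t pool_id, const char *image,
			      const char *snap)
{
	struct image_spec *spec = calloc(1, sizeof(*spec));

	if (!spec)
		return NULL;
	spec->pool_id = pool_id;
	spec->image_name = dup_str(image);
	spec->snap_name = dup_str(snap);
	if (!spec->image_name || !spec->snap_name) {
		free(spec->image_name);
		free(spec->snap_name);
		free(spec);
		return NULL;
	}
	spec->refcount = 1;
	return spec;
}

/* Take an additional reference. */
struct image_spec *spec_get(struct image_spec *spec)
{
	spec->refcount++;
	return spec;
}

/* Drop a reference; returns 1 if this was the last one and the spec was freed. */
int spec_put(struct image_spec *spec)
{
	if (--spec->refcount)
		return 0;
	free(spec->image_name);
	free(spec->snap_name);
	free(spec);
	return 1;
}
```

A device structure would then hold only a pointer to the spec, and a parent probe could share the same object by taking its own reference.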
    • rbd: have rbd_add_parse_args() return error · dc79b113
      Alex Elder authored
      Change the interface to rbd_add_parse_args() so it returns an
      error code rather than a pointer.  Return the ceph_options result
      via a pointer whose address is passed as an argument.
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
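The interface shape described here (an error code as the return value, with the result handed back through a pointer argument) can be sketched as below. The struct, the function name parse_args_sketch(), and the trivial "ro" parsing rule are hypothetical; only the calling convention is the point.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parse result. */
struct parsed_opts {
	int read_only;
};

/*
 * Returns 0 and stores a newly allocated result in *out on success.
 * Returns a negative errno on failure and leaves *out untouched.
 */
int parse_args_sketch(const char *buf, struct parsed_opts **out)
{
	struct parsed_opts *opts;

	if (!buf || !*buf)
		return -EINVAL;         /* nothing to parse */

	opts = calloc(1, sizeof(*opts));
	if (!opts)
		return -ENOMEM;

	opts->read_only = strcmp(buf, "ro") == 0;
	*out = opts;
	return 0;
}
```

Compared with returning a pointer, this lets the caller tell different failure causes apart without encoding them into the pointer value.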
    • rbd: pass and populate rbd_options structure · 4e9afeba
      Alex Elder authored
      Have the caller pass the address of an rbd_options structure to
      rbd_add_parse_args(), to be initialized with the information
      gleaned as a result of the parse.
      
      I know, this is another near-reversal of a recent change...
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
    • rbd: remove snap_name arg from rbd_add_parse_args() · 819d52bf
      Alex Elder authored
      The snapshot name returned by rbd_add_parse_args() just gets saved
      in the rbd_dev eventually.  So just do that inside that function and
      do away with the snap_name argument, both in rbd_add_parse_args()
      and rbd_dev_set_mapping().
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
    • rbd: remove options args from rbd_add_parse_args() · f28e565a
      Alex Elder authored
      The "options" argument to rbd_add_parse_args() (and its partner
      options_size) is now only needed within the function, so there's no
      need to have the caller allocate and pass the options buffer.  Just
      allocate the options buffer within the function using dup_token().
      
      Also distinguish between failures due to failed memory allocation
      and failing because a required argument was missing.
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
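The two ideas in this commit, duplicating a token into a buffer owned by the parser and reporting allocation failure separately from a missing argument, might be sketched in userspace as follows. dup_token_sketch() is only loosely modeled on the dup_token() named above; its signature and whitespace rule, and the parse_required() helper, are assumptions.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copy the next whitespace-delimited token from *buf into a newly
 * allocated string and advance *buf past it.  Returns NULL only on
 * allocation failure; returns an empty string if no token remains.
 */
char *dup_token_sketch(const char **buf)
{
	const char *start = *buf + strspn(*buf, " \t\n");
	size_t len = strcspn(start, " \t\n");
	char *dup = malloc(len + 1);

	if (!dup)
		return NULL;
	memcpy(dup, start, len);
	dup[len] = '\0';
	*buf = start + len;
	return dup;
}

/* Fetch one required token, distinguishing the two failure causes. */
int parse_required(const char *buf, char **token_out)
{
	char *tok = dup_token_sketch(&buf);

	if (!tok)
		return -ENOMEM;         /* allocation failed */
	if (!*tok) {
		free(tok);
		return -EINVAL;         /* required argument missing */
	}
	*token_out = tok;
	return 0;
}
```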
    • rbd: get rid of snap_name_len · e5c35534
      Alex Elder authored
      The value returned in the "snap_name_len" argument to
      rbd_add_parse_args() is never actually used, so get rid of it.
      
      The snap_name_len recorded in rbd_dev_v2_snap_name() is not
      useful either, so get rid of that too.
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
    • rbd: do all argument parsing in one place · 0ddebc0c
      Alex Elder authored
      This patch makes rbd_add_parse_args() the single place where all
      argument parsing occurs for an image map request:
          - Move the ceph_parse_options() call into that function
          - Use local variables rather than parameters to hold the list
            of monitor addresses supplied
          - Rather than returning it, pass the snapshot name (and its
            length) back via parameters
          - Have the function return a ceph_options structure pointer
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
    • rbd: move ceph_parse_options() call up · 78cea76e
      Alex Elder authored
      Move option parsing out of rbd_get_client() and into its caller.
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
    • rbd: rename snap_exists field · daba5fdb
      Alex Elder authored
      A Boolean field "snap_exists" in an rbd mapping is used to indicate
      whether a mapped snapshot has been removed from an image's snapshot
      context, to stop sending requests for that snapshot as soon as we
      know it's gone.
      
      Generalize the interpretation of this field so it also applies to
      non-snapshot (i.e. "head") mappings.  That is, define its value
      to be false until the mapping has been set, and then define it to be
      true for both snapshot mappings and head mappings.
      
      Rename the field "exists" to reflect the broader interpretation.
      The rbd_mapping structure is on its way out, so move the field
      back into the rbd_device structure.
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
    • rbd: move snap info out of rbd_mapping struct · 971f839a
      Alex Elder authored
      Moving the snap_id and snap_name fields into the separate
      rbd_mapping structure was misguided.  (And in time, perhaps
      we'll do away with that structure altogether...)
      
      Move these fields back into struct rbd_device.
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
    • rbd: make pool_id a 64 bit value · 86992098
      Alex Elder authored
      If a format 2 image has a parent, its pool id will be specified
      using a 64-bit value.  Change the pool id we save for an image to
      match that.
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
    • rbd: remove snapshots on error in rbd_add() · 41f38c2b
      Alex Elder authored
      If rbd_dev_snaps_update() has ever been called for an rbd device
      structure there could be snapshot structures on its snaps list.
      In rbd_add(), this function is called but a subsequent error
      path neglected to clean up any of these snapshots.
      
      Add a call to rbd_remove_all_snaps() in the appropriate spot to
      remedy this.  Change a couple of error labels to be a little
      clearer while there.
      
      Drop the leading underscores from the function name; there's nothing
      special about that function that they might signify.  As suggested
      in review, the leading underscores in __rbd_remove_snap_dev() have
      been removed as well.
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
    • rbd: simplify rbd_rq_fn() · f7760dad
      Alex Elder authored
      When processing a request, rbd_rq_fn() makes clones of the bio's in
      the request's bio chain and submits the results to osd's to be
      satisfied.  If a request bio straddles the boundary between objects
      backing the rbd image, it must be represented by two cloned bio's,
      one for the first part (at the end of one object) and one for the
      second (at the beginning of the next object).
      
      This has been handled by a function bio_chain_clone(), which
      includes an interface only a mother could love, and which has
      been found to have other problems.
      
      This patch defines two new fairly generic bio functions (one which
      replaces bio_chain_clone()) to help out the situation, and then
      revises rbd_rq_fn() to make use of them.
      
      First, bio_clone_range() clones a portion of a single bio, starting
      at a given offset within the bio and including only as many bytes
      as requested.  As a convenience, a request to clone the entire bio
      is passed directly to bio_clone().
      
      Second, bio_chain_clone_range() performs a similar function,
      producing a chain of cloned bio's covering a sub-range of the
      source chain.  No bio_pair structures are used, and if successful
      the result will represent exactly the specified range.
      
      Using bio_chain_clone_range() makes rbd_rq_fn() a little easier
      to understand, because it avoids the need to pass very much
      state information between consecutive calls.  By avoiding the need
      to track a bio_pair structure, it also eliminates the problem
      described here:  http://tracker.newdream.net/issues/2933
      
      Note that the length of a block request (and therefore the complete
      length of a bio chain processed in rbd_rq_fn()) is an unsigned int,
      while the result of rbd_segment_length() is u64.  This change makes
      this range truncation explicit, and trips a bug if the
      segment boundary is too far off.
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
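The boundary arithmetic behind this commit (how much of a request fits in the current backing object, and how many cloned ranges a request therefore needs) can be sketched without any bio machinery. The functions below are hypothetical userspace helpers, not the kernel's rbd_segment_length() or bio_chain_clone_range(); objects are assumed to be 1 << order bytes.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/*
 * Number of bytes of [offset, offset + length) that lie within the
 * object containing offset, for objects of size 1 << order.
 */
uint64_t segment_length(uint64_t offset, uint64_t length, unsigned int order)
{
	uint64_t segment_size = (uint64_t)1 << order;
	uint64_t in_obj = offset & (segment_size - 1);

	if (length > segment_size - in_obj)
		length = segment_size - in_obj;
	return length;
}

/* Make the u64 -> unsigned int narrowing explicit and checked. */
unsigned int to_request_length(uint64_t seg_len)
{
	assert(seg_len <= UINT_MAX);
	return (unsigned int)seg_len;
}

/* Count the per-object ranges needed to cover a request. */
unsigned int count_ranges(uint64_t offset, uint64_t length, unsigned int order)
{
	unsigned int n = 0;

	while (length) {
		uint64_t chunk = segment_length(offset, length, order);

		offset += chunk;
		length -= chunk;
		n++;
	}
	return n;
}
```

For example, with order 22 (4 MiB objects) a 1024-byte request starting 512 bytes before an object boundary splits into two 512-byte ranges, one at the end of the first object and one at the start of the next.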
    • libceph: fix osdmap decode error paths · 0ed7285e
      Sage Weil authored
      Ensure that we set the err value correctly so that we do not pass a 0
      value to ERR_PTR and confuse the calling code.  (In particular,
      osd_client.c handle_map() will BUG_ON(!newmap)).
      Signed-off-by: Sage Weil <sage@inktank.com>
      Reviewed-by: Alex Elder <elder@inktank.com>
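Why a zero passed to ERR_PTR confuses callers can be shown with a userspace imitation of the kernel's error-pointer helpers. The lowercase err_ptr()/ptr_err()/is_err() functions below are re-implementations written for this sketch, and the two decode functions and the choice of -EINVAL are hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace imitations of ERR_PTR(), PTR_ERR() and IS_ERR(). */
static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct map_sketch {
	int epoch;
};

static struct map_sketch dummy_map;

/* Failure path that forgot to set err: returns err_ptr(0), i.e. NULL. */
void *decode_err_unset(int fail)
{
	int err = 0;

	if (fail)
		return err_ptr(err);
	return &dummy_map;
}

/* Failure path with err set to a real (assumed) error code. */
void *decode_err_set(int fail)
{
	int err = -EINVAL;

	if (fail)
		return err_ptr(err);
	return &dummy_map;
}
```

In the first variant the caller's is_err() check passes, and it is then left holding a NULL map, which is the situation a subsequent non-NULL assertion would trip on.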
  6. 26 Oct, 2012 6 commits
    • rbd: kill rbd_device->rbd_opts · 069a4b56
      Alex Elder authored
      The rbd_device structure has an embedded rbd_options structure.
      Such a structure is needed to work with the generic ceph argument
      parsing code, but there's no need to keep it around once argument
      parsing is done.
      
      Use a local variable to hold the rbd options used in parsing in
      rbd_get_client(), and just transfer its content (it's just a
      read_only flag) into the field in the rbd_mapping sub-structure
      that requires that information.
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
      Reviewed-by: Dan Mick <dan.mick@inktank.com>
    • rbd: simplify rbd_merge_bvec() · e5cfeed2
      Alex Elder authored
      The aim of this patch is to make what's going on in rbd_merge_bvec()
      a bit more obvious than it was before.  This was an issue when a
      recent btrfs bug led us to question whether the merge function was
      working correctly.
      
      Use "obj" rather than "chunk" in names, because the units whose
      boundaries we care about are what we call (rados) "objects".
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Dan Mick <dan.mick@inktank.com>
    • rbd: increase maximum snapshot name length · d4b125e9
      Alex Elder authored
      Change RBD_MAX_SNAP_NAME_LEN to be based on NAME_MAX.  That is a
      practical limit for the length of a snapshot name (based on the
      presence of a directory using the name under /sys/bus/rbd to
      represent the snapshot).
      
      The /sys entry is created by prefixing it with "snap_"; define that
      prefix symbolically, and take its length into account in defining
      the snapshot name length limit.
      
      Enforce the limit in rbd_add_parse_args().  Also delete a dout()
      call in that function that was not meant to be committed.
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Dan Mick <dan.mick@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
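The limit described here can be written as a compile-time expression: the filesystem's NAME_MAX minus the length of the "snap_" prefix. In the sketch below, the prefix macro name, the check_snap_name() helper, and the choice of -ENAMETOOLONG are assumptions; only RBD_MAX_SNAP_NAME_LEN, NAME_MAX and the "snap_" prefix come from the commit message.

```c
#include <errno.h>
#include <limits.h>
#include <string.h>

#ifndef NAME_MAX
#define NAME_MAX 255    /* fallback assumption where limits.h lacks it */
#endif

/* Prefix used for the per-snapshot directory name (macro name assumed). */
#define SNAP_DEV_NAME_PREFIX    "snap_"

/* Longest snapshot name that still fits once the prefix is prepended. */
#define RBD_MAX_SNAP_NAME_LEN \
	(NAME_MAX - (sizeof(SNAP_DEV_NAME_PREFIX) - 1))

/* Returns 0 if the name fits, or a negative errno if it is too long. */
int check_snap_name(const char *name)
{
	if (strlen(name) > RBD_MAX_SNAP_NAME_LEN)
		return -ENAMETOOLONG;
	return 0;
}
```

Using sizeof on the string literal (minus one for the terminating NUL) keeps the limit in step with the prefix if the prefix ever changes.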
    • rbd: verify rbd image order value · db2388b6
      Alex Elder authored
      This adds a verification that an rbd image's object order is
      within the upper and lower bounds supported by this implementation.
      
      It must be at least 9 (SECTOR_SHIFT), because the Linux bio system
      assumes that minimum granularity.
      
      It also must be less than 32 (at the moment anyway) because there
      exist spots in the code that store the size of a "segment" (object
      backing an rbd image) in a signed int variable, which can be 32 bits
      including the sign.  We should be able to relax this limit once
      we've verified the code uses 64-bit types where needed.
      
      Note that the CLI tool already limits the order to the range 12-25.
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
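The bounds stated in this commit (at least SECTOR_SHIFT, and less than the bit width of an int) could be checked along the following lines. image_order_valid() is a hypothetical helper, and the upper bound is derived from sizeof(int) on the build platform rather than hard-coded, which is an assumption about how such a check might be expressed.

```c
#include <limits.h>
#include <stdbool.h>

#define SECTOR_SHIFT 9  /* 512-byte sectors */

/* True if an image's object order is within the supported range. */
bool image_order_valid(unsigned int order)
{
	/* Objects smaller than a sector cannot be addressed by bios. */
	if (order < SECTOR_SHIFT)
		return false;

	/* Segment sizes are still held in int variables in places. */
	if (order > sizeof(int) * CHAR_BIT - 1)
		return false;

	return true;
}
```

On a platform with 32-bit int this accepts orders 9 through 31, which comfortably contains the 12-25 range the CLI tool allows.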
    • rbd: consolidate rbd_do_op() calls · 4634246d
      Alex Elder authored
      The two calls to rbd_do_op() from rbd_rq_fn() differ only in the
      value passed for the snapshot id and the snapshot context.
      
      For reads the snapshot always comes from the mapping, and for writes
      the snapshot id is always CEPH_NOSNAP.
      
      The snapshot context is always null for reads.  For writes, the
      snapshot context always comes from the rbd header, but it is
      acquired under protection of the header semaphore and could change
      thereafter, so we can't simply use what's available inside
      rbd_do_op().
      
      Eliminate the snapid parameter from rbd_do_op(), and set it
      based on the I/O direction inside that function instead.  Always
      pass the snapshot context acquired in the caller, but reset it
      to a null pointer inside rbd_do_op() if the operation is a read.
      
      As a result, there is no difference in the read and write calls
      to rbd_do_op() made in rbd_rq_fn(), so just call it unconditionally.
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
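The selection rule this commit moves inside rbd_do_op() can be summarized in a few lines. Everything below is a hypothetical userspace sketch: the types, the function name, and the NOSNAP_ID constant (a stand-in for CEPH_NOSNAP whose value is assumed here) are not taken from the driver.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NOSNAP_ID ((uint64_t)-2)        /* stand-in for CEPH_NOSNAP; value assumed */

struct snap_context_sketch {
	uint64_t seq;
};

struct op_snap_params {
	uint64_t snapid;
	struct snap_context_sketch *snapc;
};

/*
 * Reads use the mapped snapshot id and no snapshot context.
 * Writes use the "no snapshot" id and the context the caller acquired.
 */
struct op_snap_params choose_snap_params(bool is_write,
					 uint64_t mapped_snapid,
					 struct snap_context_sketch *snapc)
{
	struct op_snap_params p;

	if (is_write) {
		p.snapid = NOSNAP_ID;
		p.snapc = snapc;
	} else {
		p.snapid = mapped_snapid;
		p.snapc = NULL;
	}
	return p;
}
```

Because the caller always passes the same arguments regardless of direction, its read and write call sites become identical.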
    • rbd: drop rbd_do_op() opcode and flags · ff2e4bb5
      Alex Elder authored
      The only callers of rbd_do_op() are in rbd_rq_fn(), where one call
      is used for writes and the other for reads.  The request passed
      to rbd_do_op() already encodes the I/O direction, and that
      information can be used inside the function to set the opcode and
      flags value (rather than passing them in as arguments).
      
      So get rid of the opcode and flags arguments to rbd_do_op().
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
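Deriving the opcode and flags from the I/O direction, instead of passing them in, might look like the sketch below. The enum names and numeric values are placeholders invented for this example and do not correspond to the real CEPH_OSD_OP_* or CEPH_OSD_FLAG_* definitions.

```c
#include <stdbool.h>

/* Placeholder opcodes and flags; values are illustrative only. */
enum { OP_READ = 1, OP_WRITE = 2 };
enum { FLAG_READ = 1 << 0, FLAG_WRITE = 1 << 1, FLAG_ONDISK = 1 << 2 };

/* Fill in the opcode and flags implied by the request direction. */
void op_for_direction(bool is_write, int *opcode, int *flags)
{
	if (is_write) {
		*opcode = OP_WRITE;
		*flags = FLAG_WRITE | FLAG_ONDISK;
	} else {
		*opcode = OP_READ;
		*flags = FLAG_READ;
	}
}
```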