1. 01 Mar, 2013 30 commits
    • dm: add cache target · c6b4fcba
      Joe Thornber authored
      Add a target that allows a fast device such as an SSD to be used as a
      cache for a slower device such as a disk.
      
      A plug-in architecture was chosen so that the decisions about which data
      to migrate and when are delegated to interchangeable tunable policy
      modules.  The first general purpose module we have developed, called
      "mq" (multiqueue), follows in the next patch.  Other modules are
      under development.
      Signed-off-by: Joe Thornber <ejt@redhat.com>
      Signed-off-by: Heinz Mauelshagen <mauelshagen@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
    • dm persistent data: add bitset · 7a87edfe
      Joe Thornber authored
      Add a persistent bitset as a wrapper around dm-array.
      Signed-off-by: Joe Thornber <ejt@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
    • dm persistent data: add transactional array · 6513c29f
      Joe Thornber authored
      Add a transactional array.
      Signed-off-by: Joe Thornber <ejt@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
    • dm thin: remove cells from stack · 025b9685
      Joe Thornber authored
      This patch takes advantage of the new bio-prison interface where the
      memory is now passed in rather than using a mempool in bio-prison.
      This allows the map function to avoid performing potentially-blocking
      allocations that could lead to deadlocks: We want to avoid the cell
      allocation that is done in bio_detain.
      
      (The potential for mempool deadlocks still remains in other functions
      that use bio_detain.)
      Signed-off-by: Joe Thornber <ejt@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
    • dm bio prison: pass cell memory in · 6beca5eb
      Joe Thornber authored
      Change the dm_bio_prison interface so that instead of allocating memory
      internally, dm_bio_detain is supplied with a pre-allocated cell each
      time it is called.
      
      This enables a subsequent patch to move the allocation of the struct
      dm_bio_prison_cell outside the thin target's mapping function so it can
      no longer block there.
      Signed-off-by: Joe Thornber <ejt@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
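The interface change described in "dm bio prison: pass cell memory in" can be illustrated with a minimal userspace sketch. All names here (`cell_sketch`, `prison_sketch`, `detain_sketch`) are hypothetical and the data structure is a plain linked list; this is not the kernel's dm_bio_prison code, only the caller-supplies-the-memory pattern the message describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct dm_bio_prison_cell. */
struct cell_sketch {
    unsigned long key;
    struct cell_sketch *next;
};

/* Hypothetical stand-in for the prison: a list of occupied cells. */
struct prison_sketch {
    struct cell_sketch *cells;
};

/*
 * Detain `key` using memory supplied by the caller.
 * Returns true if a cell for `key` already existed (the preallocated
 * cell is left unused), false if the preallocated cell was inserted.
 * Nothing is allocated here, so this function cannot block on memory.
 */
static bool detain_sketch(struct prison_sketch *p, unsigned long key,
                          struct cell_sketch *prealloc,
                          struct cell_sketch **result)
{
    assert(prealloc != NULL);
    for (struct cell_sketch *c = p->cells; c != NULL; c = c->next) {
        if (c->key == key) {
            *result = c;
            return true;
        }
    }
    prealloc->key = key;
    prealloc->next = p->cells;
    p->cells = prealloc;
    *result = prealloc;
    return false;
}
```

Because the caller owns the allocation, it can obtain the cell at a point where blocking is acceptable and call the detain function later from a context where it is not.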
    • dm persistent data: add btree_walk · 4e7f1f90
      Joe Thornber authored
      Add dm_btree_walk to iterate through the contents of a btree.
      This will be used by the dm cache target.
      Signed-off-by: Joe Thornber <ejt@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
    • dm: add target num_write_bios fn · b0d8ed4d
      Alasdair G Kergon authored
      Add a num_write_bios function to struct target.
      
      If an instance of a target sets this, it will be queried before the
      target's mapping function is called on a write bio, and the response
      controls the number of copies of the write bio that the target will
      receive.
      
      This provides a convenient way for a target to send the same data to
      more than one device.  The new cache target uses this in writethrough
      mode, to send the data both to the cache and the backing device.
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
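As a rough illustration of the num_write_bios idea, the sketch below models a target with an optional callback that is asked how many copies of a write it wants. The types and the simplified callback signature are hypothetical, and the default of one copy when the callback is unset is an assumption; the real device-mapper structures differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct target_sketch;

/* Hypothetical, simplified callback type: number of write copies wanted. */
typedef unsigned (*num_write_bios_fn)(const struct target_sketch *ti);

/* Hypothetical stand-in for a target instance. */
struct target_sketch {
    bool writethrough;                /* models the cache target's mode */
    num_write_bios_fn num_write_bios; /* optional; NULL when not set */
};

/* In writethrough mode, ask for two copies: cache and backing device. */
static unsigned cache_num_write_bios(const struct target_sketch *ti)
{
    return ti->writethrough ? 2 : 1;
}

/* Core side: query the callback before mapping a write, if one is set. */
static unsigned copies_for_write(const struct target_sketch *ti)
{
    if (ti->num_write_bios != NULL)
        return ti->num_write_bios(ti);
    return 1; /* assumed default: a single copy */
}
```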
    • dm kcopyd: introduce configurable throttling · df5d2e90
      Mikulas Patocka authored
      This patch allows the administrator to reduce the rate at which kcopyd
      issues I/O.
      
      Each module that uses kcopyd acquires a throttle parameter that can be
      set in /sys/module/*/parameters.
      
      We maintain a history of kcopyd usage by each module in the variables
      io_period and total_period in struct dm_kcopyd_throttle. The actual
      kcopyd activity is calculated as a percentage of time equal to
      "(100 * io_period / total_period)".  This is compared with the user-defined
      throttle percentage threshold and if it is exceeded, we sleep.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
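The throttling check described in the kcopyd message can be sketched in plain C. The struct below is a simplified, hypothetical stand-in for struct dm_kcopyd_throttle; only the field names io_period and total_period and the formula come from the message, while the time units and the handling of an empty history are assumptions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for struct dm_kcopyd_throttle. */
struct throttle_sketch {
    unsigned throttle;     /* user-defined percentage threshold */
    unsigned io_period;    /* time during which kcopyd was issuing I/O */
    unsigned total_period; /* total time covered by the history */
};

/* Activity as a percentage of time: 100 * io_period / total_period. */
static unsigned activity_percent(const struct throttle_sketch *t)
{
    assert(t->io_period <= t->total_period);
    if (t->total_period == 0)
        return 0; /* assumption: no history means no activity */
    return 100 * t->io_period / t->total_period;
}

/* The caller would sleep when the activity exceeds the threshold. */
static bool should_sleep(const struct throttle_sketch *t)
{
    return activity_percent(t) > t->throttle;
}
```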
    • dm ioctl: allow message to return data · a2606241
      Mikulas Patocka authored
      This patch introduces enhanced message support that allows the
      device-mapper core to recognise messages that are common to all devices,
      and for messages to return data to userspace.
      
      Core messages are processed by the function "message_for_md".  If the
      device mapper doesn't support the message, it is passed to the target
      driver.
      
      If the message returns data, the kernel sets the flag
      DM_MESSAGE_OUT_FLAG.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
    • dm ioctl: optimize functions without variable params · 02cde50b
      Mikulas Patocka authored
      Device-mapper ioctls receive and send data in a buffer supplied
      by userspace.  The buffer has two parts.  The first part contains
      a 'struct dm_ioctl' and has a fixed size.  The second part depends
      on the ioctl and has a variable size.
      
      This patch recognises the specific ioctls that do not use the variable
      part of the buffer and skips allocating memory for it.
      
      In particular, when a device is suspended and a resume ioctl is sent,
      this now avoids memory allocation completely.
      
      The variable "struct dm_ioctl tmp" is moved from the function
      copy_params to its caller ctl_ioctl and renamed to param_kernel.
      It is used directly when the ioctl function doesn't need any arguments.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
    • dm ioctl: introduce ioctl_flags · e2914cc2
      Mikulas Patocka authored
      This patch introduces flags for each ioctl function.
      
      So far, one flag is defined, IOCTL_FLAGS_NO_PARAMS.  It is set if the
      function processing the ioctl doesn't take or produce any parameters in
      the section of the data buffer that has a variable size.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
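The two dm ioctl messages above describe a two-part buffer (fixed header plus variable part) and a per-ioctl flag that marks functions which ignore the variable part. A minimal sketch of how such a flag could be used to skip allocation follows. Only the name IOCTL_FLAGS_NO_PARAMS comes from the message; its value, the table, the command numbers, and the header struct are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define IOCTL_FLAGS_NO_PARAMS 1 /* name from the message; value assumed */

/* Hypothetical fixed-size header, standing in for struct dm_ioctl. */
struct header_sketch {
    size_t data_size; /* total buffer size: header plus variable part */
};

/* Hypothetical per-command table entry. */
struct ioctl_entry_sketch {
    int cmd;
    int flags;
};

static const struct ioctl_entry_sketch table_sketch[] = {
    { 1, 0 },                     /* uses the variable part */
    { 2, IOCTL_FLAGS_NO_PARAMS }, /* ignores the variable part */
};

/* Look up the flags for a command; unknown commands get no flags. */
static int lookup_flags(int cmd)
{
    for (size_t i = 0; i < sizeof(table_sketch) / sizeof(table_sketch[0]); i++)
        if (table_sketch[i].cmd == cmd)
            return table_sketch[i].flags;
    return 0;
}

/* Bytes to allocate for the variable part; zero when it is unused. */
static size_t variable_part_alloc_size(int cmd, const struct header_sketch *h)
{
    assert(h->data_size >= sizeof(*h));
    if (lookup_flags(cmd) & IOCTL_FLAGS_NO_PARAMS)
        return 0;
    return h->data_size - sizeof(*h);
}
```

When the size is zero the caller can work directly on the fixed-size header it already copied, which is the effect the message attributes to using param_kernel directly.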
    • dm: merge io_pool and tio_pool · 5f015204
      Jun'ichi Nomura authored
      This patch merges io_pool and tio_pool into io_pool and cleans up
      related functions.
      
      Though device-mapper used to have 2 pools of objects for each dm device,
      the use of bioset front_pad for per-bio data has shrunk the number of
      pools to 1 for both bio-based and request-based device types.
      (See c0820cf5 "dm: introduce per_bio_data" and
       94818742 "dm: Use bioset's front_pad for dm_rq_clone_bio_info")
      
      So dm no longer has to maintain 2 different pointers.
      
      No functional changes.
      Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
    • dm: remove unused _rq_bio_info_cache · 23e5083b
      Jun'ichi Nomura authored
      Remove _rq_bio_info_cache, which is no longer used.
      No functional changes.
      Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
    • dm: fix limits initialization when there are no data devices · 87eb5b21
      Mike Christie authored
      dm_calculate_queue_limits will first reset the provided limits to
      defaults using blk_set_stacking_limits, thereby defeating the purpose of
      retaining the original live table's limits -- as was intended via commit
      3ae70656 ("dm: retain table limits when
      swapping to new table with no devices").
      
      Fix this improper limits initialization (in the no data devices case) by
      avoiding the call to dm_calculate_queue_limits.
      
      [patch header revised by Mike Snitzer]
      Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Cc: stable@vger.kernel.org # v3.6+
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
    • dm snapshot: add missing module aliases · 23cb2109
      Mikulas Patocka authored
      Add module aliases so that autoloading works correctly if the user
      tries to activate "snapshot-origin" or "snapshot-merge" targets.
      
      Reference: https://bugzilla.redhat.com/889973
      Reported-by: Chao Yang <chyang@redhat.com>
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
    • dm persistent data: set some btree fn parms const · 018cede9
      Mike Snitzer authored
      Mark some constant parameters constant in some dm-btree functions.
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
    • dm: refactor bio cloning · e4c93811
      Alasdair G Kergon authored
      Refactor part of the bio splitting and cloning code to try to make it
      easier to understand.
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
    • dm: rename bio cloning functions · 14fe594d
      Alasdair G Kergon authored
      Rename functions involved in splitting and cloning bios.
      
      The sequence of functions is now:
        (1) __split_and_process* - entry point that selects the processing strategy
        (2) __send* - prepares the details for each bio needed and loops through them
        (3) __clone_and_map* - creates a clone and maps it
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
    • dm: rename request variables to bios · 55a62eef
      Alasdair G Kergon authored
      Use 'bio' in the name of variables and functions that deal with
      bios rather than 'request' to avoid confusion with the normal
      block layer use of 'request'.
      
      No functional changes.
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
    • dm: clean up clone_bio · bd2a49b8
      Alasdair G Kergon authored
      Remove the no-longer-used struct bio_set argument from clone_bio and split_bvec.
      Use tio->ti in __map_bio() instead of passing in ti.
      Factor out some code for setting up cloned bios.
      Take target_request_nr as a parameter to alloc_tio().
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Cc: Joe Thornber <ejt@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
    • dm persistent data: remove CONFIG_EXPERIMENTAL · 88ae4c52
      Kees Cook authored
      The CONFIG_EXPERIMENTAL config item has not carried much meaning for a
      while now and is almost always enabled by default. As agreed during the
      Linux kernel summit, remove it from any "depends on" lines in Kconfigs.
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
    • dm: remove CONFIG_EXPERIMENTAL · d57916a0
      Alasdair G Kergon authored
      Remove EXPERIMENTAL from all existing device-mapper targets.
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
    • dm thin: use block_size_is_power_of_two · 58f77a21
      Mike Snitzer authored
      Use block_size_is_power_of_two() rather than checking
      sectors_per_block_shift directly.  Also introduce local pool variable in
      get_bio_block() to eliminate redundant tc->pool dereferences.
      
      No functional change.
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
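To show why a helper such as block_size_is_power_of_two() is useful, here is a small sketch of mapping a sector to a block with a shift when the block size is a power of two and a division otherwise. The struct, the `sector_to_block` function, and the convention that a negative shift marks a non-power-of-two block size are assumptions made for illustration; the message itself only names the helper, sectors_per_block_shift, and get_bio_block().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the thin pool structure. */
struct pool_sketch {
    uint32_t sectors_per_block;
    int sectors_per_block_shift; /* log2 of block size, or -1 (assumed) */
};

/* Wraps the direct check of sectors_per_block_shift in one place. */
static bool block_size_is_power_of_two(const struct pool_sketch *pool)
{
    return pool->sectors_per_block_shift >= 0;
}

/* Map a sector to its block: shift when possible, divide otherwise. */
static uint64_t sector_to_block(const struct pool_sketch *pool, uint64_t sector)
{
    assert(pool->sectors_per_block > 0);
    if (block_size_is_power_of_two(pool))
        return sector >> pool->sectors_per_block_shift;
    return sector / pool->sectors_per_block;
}
```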
    • dm bufio: use WRITE_FLUSH instead of REQ_FLUSH · 3daec3b4
      Mikulas Patocka authored
      Use WRITE_FLUSH instead of REQ_FLUSH for submitted requests to make it
      consistent with the rest of the kernel. There is no functional change.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
    • dm table: remove superfluous variable reset · d2ce70a1
      Wang Sheng-Hui authored
      If allocation fails, the local var *t is not used any more after kfree.
      There is no need to reset it to NULL. Remove the unnecessary NULL assignment here.
      Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
    • dm thin: support a non power of 2 discard_granularity · f13945d7
      Mike Snitzer authored
      Support a non-power-of-2 discard granularity in dm-thin, now that the block
      layer supports this (via 8dd2cb7e "block:
      discard granularity might not be power of 2" and
      59771079 "blk: avoid divide-by-zero with zero
      discard granularity").
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
    • dm: fix truncated status strings · fd7c092e
      Mikulas Patocka authored
      Avoid returning a truncated table or status string instead of setting
      the DM_BUFFER_FULL_FLAG when the last target of a table fills the
      buffer.
      
      When processing a table or status request, the function retrieve_status
      calls ti->type->status. If ti->type->status returns non-zero,
      retrieve_status assumes that the buffer overflowed and sets
      DM_BUFFER_FULL_FLAG.
      
      However, targets don't return non-zero values from their status method
      on overflow. Most targets always return zero.
      
      If a buffer overflow happens in a target that is not the last in the
      table, it gets noticed during the next iteration of the loop in
      retrieve_status; but if a buffer overflow happens in the last target, it
      goes unnoticed and erroneously truncated data is returned.
      
      In the current code, the targets behave in the following way:
      * dm-crypt returns -ENOMEM if there is not enough space to store the
        key, but it returns 0 on all other overflows.
      * dm-thin returns errors from the status method if a disk error happened.
        This is incorrect because retrieve_status doesn't check the error
        code, it assumes that all non-zero values mean buffer overflow.
      * all the other targets always return 0.
      
      This patch changes the ti->type->status function to return void (because
      most targets don't use the return code). Overflow is detected in
      retrieve_status: if the status method fills up the remaining space
      completely, it is assumed that buffer overflow happened.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
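The overflow rule described in "dm: fix truncated status strings" (the status method returns void, and a completely filled buffer is treated as overflow) can be sketched as below. The function names and the use of snprintf are assumptions chosen for a userspace illustration, not the code in retrieve_status.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical status method: returns void and writes at most
 * maxlen bytes, including the terminating NUL.
 */
static void status_sketch(const char *text, char *result, size_t maxlen)
{
    snprintf(result, maxlen, "%s", text);
}

/*
 * Caller-side check: if the output (with its NUL) occupies all the
 * remaining space, assume the buffer overflowed. An exact fit is
 * also reported as overflow, which errs on the safe side.
 */
static bool status_filled_buffer(const char *text, char *buf, size_t remaining)
{
    assert(remaining > 0);
    status_sketch(text, buf, remaining);
    return strlen(buf) + 1 == remaining;
}
```

In this scheme the caller would set DM_BUFFER_FULL_FLAG whenever the check returns true, so the last target in a table is covered in the same way as the others.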
    • dm: do not replace bioset for request based dm · 16245bdc
      Jun'ichi Nomura authored
      This patch fixes a regression introduced in v3.8, which causes oops
      like this when dm-multipath is used:
      
      general protection fault: 0000 [#1] SMP
      RIP: 0010:[<ffffffff810fe754>]  [<ffffffff810fe754>] mempool_free+0x24/0xb0
      Call Trace:
        <IRQ>
        [<ffffffff81187417>] bio_put+0x97/0xc0
        [<ffffffffa02247a5>] end_clone_bio+0x35/0x90 [dm_mod]
        [<ffffffff81185efd>] bio_endio+0x1d/0x30
        [<ffffffff811f03a3>] req_bio_endio.isra.51+0xa3/0xe0
        [<ffffffff811f2f68>] blk_update_request+0x118/0x520
        [<ffffffff811f3397>] blk_update_bidi_request+0x27/0xa0
        [<ffffffff811f343c>] blk_end_bidi_request+0x2c/0x80
        [<ffffffff811f34d0>] blk_end_request+0x10/0x20
        [<ffffffffa000b32b>] scsi_io_completion+0xfb/0x6c0 [scsi_mod]
        [<ffffffffa000107d>] scsi_finish_command+0xbd/0x120 [scsi_mod]
        [<ffffffffa000b12f>] scsi_softirq_done+0x13f/0x160 [scsi_mod]
        [<ffffffff811f9fd0>] blk_done_softirq+0x80/0xa0
        [<ffffffff81044551>] __do_softirq+0xf1/0x250
        [<ffffffff8142ee8c>] call_softirq+0x1c/0x30
        [<ffffffff8100420d>] do_softirq+0x8d/0xc0
        [<ffffffff81044885>] irq_exit+0xd5/0xe0
        [<ffffffff8142f3e3>] do_IRQ+0x63/0xe0
        [<ffffffff814257af>] common_interrupt+0x6f/0x6f
        <EOI>
        [<ffffffffa021737c>] srp_queuecommand+0x8c/0xcb0 [ib_srp]
        [<ffffffffa0002f18>] scsi_dispatch_cmd+0x148/0x310 [scsi_mod]
        [<ffffffffa000a38e>] scsi_request_fn+0x31e/0x520 [scsi_mod]
        [<ffffffff811f1e57>] __blk_run_queue+0x37/0x50
        [<ffffffff811f1f69>] blk_delay_work+0x29/0x40
        [<ffffffff81059003>] process_one_work+0x1c3/0x5c0
        [<ffffffff8105b22e>] worker_thread+0x15e/0x440
        [<ffffffff8106164b>] kthread+0xdb/0xe0
        [<ffffffff8142db9c>] ret_from_fork+0x7c/0xb0
      
      The regression was introduced by the change
      c0820cf5 "dm: introduce per_bio_data", where dm started to replace
      bioset during table replacement.
      For bio-based dm, it is good because clone bios do not exist during the
      table replacement.
      For request-based dm, however, (not-yet-mapped) clone bios may stay in
      request queue and survive during the table replacement.
      So freeing the old bioset could cause the oops in bio_put().
      
      Since the size of front_pad may change only with bio-based dm,
      it is not necessary to replace bioset for request-based dm.
      Reported-by: Bart Van Assche <bvanassche@acm.org>
      Tested-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
      Acked-by: Mikulas Patocka <mpatocka@redhat.com>
      Acked-by: Mike Snitzer <snitzer@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
    • Merge branch 'for-3.9' of git://linux-nfs.org/~bfields/linux · b6669737
      Linus Torvalds authored
      Pull nfsd changes from J Bruce Fields:
       "Miscellaneous bugfixes, plus:
      
         - An overhaul of the DRC cache by Jeff Layton.  The main effect is
           just to make it larger.  This decreases the chances of intermittent
           errors especially in the UDP case.  But we'll need to watch for any
           reports of performance regressions.
      
         - Containerized nfsd: with some limitations, we now support
           per-container nfs-service, thanks to extensive work from Stanislav
           Kinsbursky over the last year."
      
      Some notes about conflicts, since there were *two* non-data semantic
      conflicts here:
      
       - idr_remove_all() had been added by a memory leak fix, but has since
         become deprecated since idr_destroy() does it for us now.
      
       - xs_local_connect() had been added by this branch to make AF_LOCAL
         connections be synchronous, but in the meantime Trond had changed the
         calling convention in order to avoid a RCU dereference.
      
      There were a couple of more obvious actual source-level conflicts due to
      the hlist traversal changes and one just due to code changes next to
      each other, but those were trivial.
      
      * 'for-3.9' of git://linux-nfs.org/~bfields/linux: (49 commits)
        SUNRPC: make AF_LOCAL connect synchronous
        nfsd: fix compiler warning about ambiguous types in nfsd_cache_csum
        svcrpc: fix rpc server shutdown races
        svcrpc: make svc_age_temp_xprts enqueue under sv_lock
        lockd: nlmclnt_reclaim(): avoid stack overflow
        nfsd: enable NFSv4 state in containers
        nfsd: disable usermode helper client tracker in container
        nfsd: use proper net while reading "exports" file
        nfsd: containerize NFSd filesystem
        nfsd: fix comments on nfsd_cache_lookup
        SUNRPC: move cache_detail->cache_request callback call to cache_read()
        SUNRPC: remove "cache_request" argument in sunrpc_cache_pipe_upcall() function
        SUNRPC: rework cache upcall logic
        SUNRPC: introduce cache_detail->cache_request callback
        NFS: simplify and clean cache library
        NFS: use SUNRPC cache creation and destruction helper for DNS cache
        nfsd4: free_stid can be static
        nfsd: keep a checksum of the first 256 bytes of request
        sunrpc: trim off trailing checksum before returning decrypted or integrity authenticated buffer
        sunrpc: fix comment in struct xdr_buf definition
        ...
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client · 1cf0209c
      Linus Torvalds authored
      Pull Ceph updates from Sage Weil:
       "A few groups of patches here.  Alex has been hard at work improving
        the RBD code, layout groundwork for understanding the new formats and
        doing layering.  Most of the infrastructure is now in place for the
        final bits that will come with the next window.
      
        There are a few changes to the data layout.  Jim Schutt's patch fixes
        some non-ideal CRUSH behavior, and a set of patches from me updates
        the client to speak a newer version of the protocol and implement an
        improved hashing strategy across storage nodes (when the server side
        supports it too).
      
        A pair of patches from Sam Lang fix the atomicity of open+create
        operations.  Several patches from Yan, Zheng fix various mds/client
        issues that turned up during multi-mds torture tests.
      
        A final set of patches expose file layouts via virtual xattrs, and
        allow the policies to be set on directories via xattrs as well
        (avoiding the awkward ioctl interface and providing a consistent
        interface for both kernel mount and ceph-fuse users)."
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (143 commits)
        libceph: add support for HASHPSPOOL pool flag
        libceph: update osd request/reply encoding
        libceph: calculate placement based on the internal data types
        ceph: update support for PGID64, PGPOOL3, OSDENC protocol features
        ceph: update "ceph_features.h"
        libceph: decode into cpu-native ceph_pg type
        libceph: rename ceph_pg -> ceph_pg_v1
        rbd: pass length, not op for osd completions
        rbd: move rbd_osd_trivial_callback()
        libceph: use a do..while loop in con_work()
        libceph: use a flag to indicate a fault has occurred
        libceph: separate non-locked fault handling
        libceph: encapsulate connection backoff
        libceph: eliminate sparse warnings
        ceph: eliminate sparse warnings in fs code
        rbd: eliminate sparse warnings
        libceph: define connection flag helpers
        rbd: normalize dout() calls
        rbd: barriers are hard
        rbd: ignore zero-length requests
        ...
  2. 28 Feb, 2013 10 commits
    • Merge tag 'writeback-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux · de1a2262
      Linus Torvalds authored
      Pull writeback fixes from Wu Fengguang:
       "Two writeback fixes
      
         - fix negative (setpoint - dirty) in 32bit archs
      
         - use down_read_trylock() in writeback_inodes_sb(_nr)_if_idle()"
      
      * tag 'writeback-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux:
        Negative (setpoint-dirty) in bdi_position_ratio()
        vfs: re-implement writeback_inodes_sb(_nr)_if_idle() and rename them
    • Merge branch 'for-3.9/drivers' of git://git.kernel.dk/linux-block · f042fea0
      Linus Torvalds authored
      Pull block driver bits from Jens Axboe:
       "After the block IO core bits are in, please grab the driver updates
        from below as well.  It contains:
      
         - Fix ancient regression in dac960.  Nobody must be using that
           anymore...
      
         - Some good fixes from Guo Ghao for loop, fixing both potential
           oopses and deadlocks.
      
         - Improve mtip32xx for NUMA systems, by being a bit more clever in
           distributing work.
      
         - Add IBM RamSan 70/80 driver.  A second round of fixes for that is
           pending, that will come in through for-linus during the 3.9 cycle
           as per usual.
      
         - A few xen-blk{back,front} fixes from Konrad and Roger.
      
         - Other minor fixes and improvements."
      
      * 'for-3.9/drivers' of git://git.kernel.dk/linux-block:
        loopdev: ignore negative offset when calculate loop device size
        loopdev: remove an user triggerable oops
        loopdev: move common code into loop_figure_size()
        loopdev: update block device size in loop_set_status()
        loopdev: fix a deadlock
        xen-blkback: use balloon pages for persistent grants
        xen-blkfront: drop the use of llist_for_each_entry_safe
        xen/blkback: Don't trust the handle from the frontend.
        xen-blkback: do not leak mode property
        block: IBM RamSan 70/80 driver fixes
        rsxx: add slab.h include to dma.c
        drivers/block/mtip32xx: add missing GENERIC_HARDIRQS dependency
        block: remove new __devinit/exit annotations on ramsam driver
        block: IBM RamSan 70/80 device driver
        drivers/block/mtip32xx/mtip32xx.c:1726:5: sparse: symbol 'mtip_send_trim' was not declared. Should it be static?
        drivers/block/mtip32xx/mtip32xx.c:4029:1: sparse: symbol 'mtip_workq_sdbf0' was not declared. Should it be static?
        dac960: return success instead of -ENOTTY
        mtip32xx: add trim support
        mtip32xx: Add workqueue and NUMA support
        block: delete super ancient PC-XT driver for 1980's hardware
    • Merge branch 'for-3.9/core' of git://git.kernel.dk/linux-block · ee89f812
      Linus Torvalds authored
      Pull block IO core bits from Jens Axboe:
       "Below are the core block IO bits for 3.9.  It was delayed a few days
        since my workstation kept crashing every 2-8h after pulling it into
        current -git, but turns out it is a bug in the new pstate code (divide
        by zero, will report separately).  In any case, it contains:
      
         - The big cfq/blkcg update from Tejun and Vivek.
      
         - Additional block and writeback tracepoints from Tejun.
      
         - Improvement of the should sort (based on queues) logic in the plug
           flushing.
      
         - _io() variants of the wait_for_completion() interface, using
           io_schedule() instead of schedule() to contribute to io wait
           properly.
      
         - Various little fixes.
      
        You'll get two trivial merge conflicts, which should be easy enough to
        fix up"
      
      Fix up the trivial conflicts due to hlist traversal cleanups (commit
      b67bfe0d: "hlist: drop the node parameter from iterators").
      
      * 'for-3.9/core' of git://git.kernel.dk/linux-block: (39 commits)
        block: remove redundant check to bd_openers()
        block: use i_size_write() in bd_set_size()
        cfq: fix lock imbalance with failed allocations
        drivers/block/swim3.c: fix null pointer dereference
        block: don't select PERCPU_RWSEM
        block: account iowait time when waiting for completion of IO request
        sched: add wait_for_completion_io[_timeout]
        writeback: add more tracepoints
        block: add block_{touch|dirty}_buffer tracepoint
        buffer: make touch_buffer() an exported function
        block: add @req to bio_{front|back}_merge tracepoints
        block: add missing block_bio_complete() tracepoint
        block: Remove should_sort judgement when flush blk_plug
        block,elevator: use new hashtable implementation
        cfq-iosched: add hierarchical cfq_group statistics
        cfq-iosched: collect stats from dead cfqgs
        cfq-iosched: separate out cfqg_stats_reset() from cfq_pd_reset_stats()
        blkcg: make blkcg_print_blkgs() grab q locks instead of blkcg lock
        block: RCU free request_queue
        blkcg: implement blkg_[rw]stat_recursive_sum() and blkg_[rw]stat_merge()
        ...
    • Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 21f3b24d
      Linus Torvalds authored
      Pull first round of SCSI updates from James Bottomley:
       "The patch set is mostly driver updates (bnx2fc, ipr, lpfc, qla4) and a
        few bug fixes"
      
      Pull delayed because google hates James, and sneakily considers his pull
      requests spam. Why, google, why?
      
      * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (60 commits)
        [SCSI] aacraid: 1024 max outstanding command support for Series 7 and above
        [SCSI] bnx2fc: adjust duplicate test
        [SCSI] qla4xxx: Update driver version to 5.03.00-k4
        [SCSI] qla4xxx: Fix return code for qla4xxx_session_get_param.
        [SCSI] qla4xxx: wait for boot target login response during probe.
        [SCSI] qla4xxx: Added support for force firmware dump
        [SCSI] qla4xxx: Re-register IRQ handler while retrying initialize of adapter
        [SCSI] qla4xxx: Throttle active IOCBs to firmware limits
        [SCSI] qla4xxx: Remove unnecessary code from qla4xxx_init_local_data
        [SCSI] qla4xxx: Quiesce driver activities while loopback
        [SCSI] qla4xxx: Rename MBOX_ASTS_IDC_NOTIFY to MBOX_ASTS_IDC_REQUEST_NOTIFICATION
        [SCSI] qla4xxx: Add spurious interrupt messages under debug level 2
        [SCSI] cxgb4i: Remove the scsi host device when removing device
        [SCSI] bfa: fix strncpy() limiter in bfad_start_ops()
        [SCSI] qla4xxx: Update driver version to 5.03.00-k3
        [SCSI] qla4xxx: Correct the validation to check in get_sys_info mailbox
        [SCSI] qla4xxx: Pass correct function param to qla4_8xxx_rd_direct
        [SCSI] lpfc 8.3.37: Update lpfc version for 8.3.37 driver release
        [SCSI] lpfc 8.3.37: Fixed infinite loop in lpfc_sli4_fcf_rr_next_index_get.
        [SCSI] lpfc 8.3.37: Fixed crash due to SLI Port invalid resource count
        ...
    • SUNRPC: make AF_LOCAL connect synchronous · dc107402
      J. Bruce Fields authored
      It doesn't appear that anyone actually needs to connect asynchronously.
      
      Also, using a workqueue for the connect means we lose the namespace
      information from the original process.  This is a problem since there's
      no way to explicitly pass in a filesystem namespace for resolution of an
      AF_LOCAL address.
      Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    • Merge branch 'akpm' (final batch from Andrew) · 2a7d2b96
      Linus Torvalds authored
      Merge third patch-bomb from Andrew Morton:
       "This wraps me up for -rc1.
         - Lots of misc stuff and things which were deferred/missed from
           patchbombings 1 & 2.
         - ocfs2 things
         - lib/scatterlist
         - hfsplus
         - fatfs
         - documentation
         - signals
         - procfs
         - lockdep
         - coredump
         - seqfile core
         - kexec
         - Tejun's large IDR tree reworkings
         - ipmi
         - partitions
         - nbd
         - random() things
         - kfifo
         - tools/testing/selftests updates
         - Sasha's large and pointless hlist cleanup"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (163 commits)
        hlist: drop the node parameter from iterators
        kcmp: make it depend on CHECKPOINT_RESTORE
        selftests: add a simple doc
        tools/testing/selftests/Makefile: rearrange targets
        selftests/efivarfs: add create-read test
        selftests/efivarfs: add empty file creation test
        selftests: add tests for efivarfs
        kfifo: fix kfifo_alloc() and kfifo_init()
        kfifo: move kfifo.c from kernel/ to lib/
        arch Kconfig: centralise CONFIG_ARCH_NO_VIRT_TO_BUS
        w1: add support for DS2413 Dual Channel Addressable Switch
        memstick: move the dereference below the NULL test
        drivers/pps/clients/pps-gpio.c: use devm_kzalloc
        Documentation/DMA-API-HOWTO.txt: fix typo
        include/linux/eventfd.h: fix incorrect filename is a comment
        mtd: mtd_stresstest: use prandom_bytes()
        mtd: mtd_subpagetest: convert to use prandom library
        mtd: mtd_speedtest: use prandom_bytes
        mtd: mtd_pagetest: convert to use prandom library
        mtd: mtd_oobtest: convert to use prandom library
        ...
    • hlist: drop the node parameter from iterators · b67bfe0d
      Sasha Levin authored
      I'm not sure why, but the hlist for-each-entry iterators were conceived
      differently from the list ones, which look like this:
      
              list_for_each_entry(pos, head, member)
      
      The hlist ones were greedy and wanted an extra parameter:
      
              hlist_for_each_entry(tpos, pos, head, member)
      
      Why did they need an extra pos parameter? I'm not quite sure. Not only
      do they not really need it, it also prevents the iterator from looking
      exactly like the list iterator, which is unfortunate.
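
      To illustrate the new calling convention, here is a minimal userspace
      sketch, not the kernel's actual list.h. The simplified hlist_node (the
      kernel's also carries a pprev pointer), the ternary-based
      hlist_entry_safe(), struct item, and sum_items() are assumptions made
      for this example only. It relies on the GNU __typeof__ extension.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel's hlist types (illustrative only). */
struct hlist_node { struct hlist_node *next; };
struct hlist_head { struct hlist_node *first; };

/* Map a node pointer back to its containing object; NULL marks the end. */
#define hlist_entry_safe(ptr, type, member) \
	((ptr) ? (type *)((char *)(ptr) - offsetof(type, member)) : NULL)

/* New-style iterator: the entry pointer is the only cursor, so the
 * caller no longer declares a separate struct hlist_node variable. */
#define hlist_for_each_entry(pos, head, member) \
	for ((pos) = hlist_entry_safe((head)->first, __typeof__(*(pos)), member); \
	     (pos); \
	     (pos) = hlist_entry_safe((pos)->member.next, __typeof__(*(pos)), member))

/* Hypothetical payload type that embeds a list node. */
struct item {
	int value;
	struct hlist_node link;
};

/* Push a node onto the front of the list. */
static void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
	assert(n && h);
	n->next = h->first;
	h->first = n;
}

/* Walk the list with the three-argument iterator and sum the values. */
static int sum_items(struct hlist_head *head)
{
	struct item *it;
	int total = 0;

	hlist_for_each_entry(it, head, link)
		total += it->value;
	return total;
}
```

      With the old four-argument form, sum_items() would also have needed a
      "struct hlist_node *node;" declaration that was passed to the macro and
      never otherwise used.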
      
      Besides the semantic patch, there was some manual work required:
      
       - Fix up the actual hlist iterators in linux/list.h
       - Fix up the declaration of other iterators based on the hlist ones.
       - A very small number of places were using the 'node' parameter; these
         were modified to use 'obj->member' instead.
       - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
         properly, so those had to be fixed up manually.
      
      The semantic patch, which is mostly the work of Peter Senna Tschudin, is here:
      
      @@
      iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;
      
      type T;
      expression a,c,d,e;
      identifier b;
      statement S;
      @@
      
      -T b;
          <+... when != b
      (
      hlist_for_each_entry(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_continue(a,
      - b,
      c) S
      |
      hlist_for_each_entry_from(a,
      - b,
      c) S
      |
      hlist_for_each_entry_rcu(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_rcu_bh(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_continue_rcu_bh(a,
      - b,
      c) S
      |
      for_each_busy_worker(a, c,
      - b,
      d) S
      |
      ax25_uid_for_each(a,
      - b,
      c) S
      |
      ax25_for_each(a,
      - b,
      c) S
      |
      inet_bind_bucket_for_each(a,
      - b,
      c) S
      |
      sctp_for_each_hentry(a,
      - b,
      c) S
      |
      sk_for_each(a,
      - b,
      c) S
      |
      sk_for_each_rcu(a,
      - b,
      c) S
      |
      sk_for_each_from
      -(a, b)
      +(a)
      S
      + sk_for_each_from(a) S
      |
      sk_for_each_safe(a,
      - b,
      c, d) S
      |
      sk_for_each_bound(a,
      - b,
      c) S
      |
      hlist_for_each_entry_safe(a,
      - b,
      c, d, e) S
      |
      hlist_for_each_entry_continue_rcu(a,
      - b,
      c) S
      |
      nr_neigh_for_each(a,
      - b,
      c) S
      |
      nr_neigh_for_each_safe(a,
      - b,
      c, d) S
      |
      nr_node_for_each(a,
      - b,
      c) S
      |
      nr_node_for_each_safe(a,
      - b,
      c, d) S
      |
      - for_each_gfn_sp(a, c, d, b) S
      + for_each_gfn_sp(a, c, d) S
      |
      - for_each_gfn_indirect_valid_sp(a, c, d, b) S
      + for_each_gfn_indirect_valid_sp(a, c, d) S
      |
      for_each_host(a,
      - b,
      c) S
      |
      for_each_host_safe(a,
      - b,
      c, d) S
      |
      for_each_mesh_entry(a,
      - b,
      c, d) S
      )
          ...+>
      
      [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
      [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
      [akpm@linux-foundation.org: checkpatch fixes]
      [akpm@linux-foundation.org: fix warnings]
      [akpm@linux-foundation.org: redo intrusive kvm changes]
      Tested-by: Peter Senna Tschudin <peter.senna@gmail.com>
      Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kcmp: make it depend on CHECKPOINT_RESTORE · 1e142b29
      Cyrill Gorcunov authored
      Since the kcmp syscall was implemented (initially on the x86
      architecture), a number of other archs have wired it up as well: xtensa,
      sparc, sh, s390, mips, microblaze, m68k (not counting those that use
      <asm-generic/unistd.h> for syscall number definitions).
      
      But the Makefile rule that enables building kcmp.o still depends on the
      former config-x86.  Thus get rid of this limitation and make kcmp.o
      depend on the CHECKPOINT_RESTORE option.
      Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Andrey Vagin <avagin@openvz.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • selftests: add a simple doc · 80d03428
      Jeremy Kerr authored
      This change adds a little documentation to the tests under
      tools/testing/selftests/, based on akpm's explanation.
      
      [akpm@linux-foundation.org: move from Documentation to tools/testing/selftests/README.txt]
      Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
      Cc: Dave Young <dyoung@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • tools/testing/selftests/Makefile: rearrange targets · 66a01b96
      Andrew Morton authored
      Do it one-per-line to reduce patch conflict pain.
      
      Cc: Dave Young <dyoung@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>