Commits · 5e901a2b95db709c5e40599ff4df6029be1e2a12 · Kirill Smelkov / linux

01 Oct, 2010 8 commits

blkio-throttle: There is no need to convert jiffies to milli seconds · 5e901a2b

Vivek Goyal authored Oct 01, 2010

o Do not convert jiffies to mili seconds as it is not required. Just work
  with jiffies and HZ.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

5e901a2b

blkio-throttle: Fix link failure failure on i386 · 3aad5d3e

Vivek Goyal authored Oct 01, 2010

o Randy Dunlap reported following linux-next failure. This patch fixes it.

on i386:

blk-throttle.c:(.text+0x1abb8): undefined reference to `__udivdi3'
blk-throttle.c:(.text+0x1b1dc): undefined reference to `__udivdi3'

o bytes_per_second interface is 64bit and I was continuing to do 64 bit
  division even on 32bit platform without help of special macros/functions
  hence the failure.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

3aad5d3e

blkio: Recalculate the throttled bio dispatch time upon throttle limit change · fe071437

Vivek Goyal authored Oct 01, 2010

o Currently any cgroup throttle limit changes are processed asynchronousy and
the change does not take affect till a new bio is dispatched from same group.

o It might happen that a user sets a redicuously low limit on throttling.
Say 1 bytes per second on reads. In such cases simple operations like mount
a disk can wait for a very long time.

o Once bio is throttled, there is no easy way to come out of that wait even if
user increases the read limit later.

o This patch fixes it. Now if a user changes the cgroup limits, we recalculate
the bio dispatch time according to new limits.

o Can't take queueu lock under blkcg_lock, hence after the change I wake
up the dispatch thread again which recalculates the time. So there are some
variables being synchronized across two threads without lock and I had to
make use of barriers. Hoping I have used barriers correctly. Any review of
memory barrier code especially will help.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

fe071437

blkio: Add root group to td->tg_list · 02977e4a

Vivek Goyal authored Oct 01, 2010

o Currently all the dynamically allocated groups, except root grp is added
  to td->tg_list. This was not a problem so far but in next patch I will
  travel through td->tg_list to process any updates of limits on the group.
  If root group is not in tg_list, then root group's updates are not
  processed.

o It is better to root group also to tg_list instead of doing special
  processing for it during limit updates.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

02977e4a

blkio: deletion of a cgroup was causes oops · 61014e96

Vivek Goyal authored Oct 01, 2010

o Now a cgroup list of blkg elements can contain blkg from multiple policies.
Before sending an unlink event, make sure blkg belongs to they policy. If
policy does not own the blkg, do not send update for this blkg.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

61014e96

blkio: Do not export throttle files if CONFIG_BLK_DEV_THROTTLING=n · 13f98250

Vivek Goyal authored Oct 01, 2010

Currently throttling related files were visible even if user had disabled
throttling using config options. It was switching off background throttling
of bio but not the cgroup files. This patch fixes it.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

13f98250

block: set the bounce_pfn to the actual DMA limit rather than to max memory · efb012b3

Malahal Naineni authored Oct 01, 2010

The bounce_pfn of the request queue in 64 bit systems is set to the
current max_low_pfn. Adding more memory later makes this incorrect.
Memory allocated beyond this boot time max_low_pfn appear to require
bounce buffers (bounce buffers are actually not allocated but used in
calculating segments that may result in "over max segments limit"
errors).
Signed-off-by: Malahal Naineni <malahal@us.ibm.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

efb012b3

block: revert bad fix for memory hotplug causing bounces · 260a67a9

Jens Axboe authored Oct 01, 2010

Revert "block: set the bounce_pfn to the actual DMA limit rather than to max memory"

This reverts commit c49825fa.
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

260a67a9

25 Sep, 2010 1 commit

Fix compile error in blk-exec.c for !CONFIG_DETECT_HUNG_TASK · e4ecda1b

Mark Lord authored Sep 25, 2010

Ensure that 'sysctl_hung_task_timeout_secs' is defined
even when CONFIG_DETECT_HUNG_TASK is not set.
This way we can safely reference it without need for
ifdefs in the code elsewhere.  eg. in block/blk-exec.c
Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

e4ecda1b

24 Sep, 2010 2 commits

block: set the bounce_pfn to the actual DMA limit rather than to max memory · c49825fa

Malahal Naineni authored Sep 24, 2010

The bounce_pfn of the request queue in 64 bit systems is set to the
current max_low_pfn. Adding more memory later makes this incorrect.
Memory allocated beyond this boot time max_low_pfn appear to require
bounce buffers (bounce buffers are actually not allocated but used in
calculating segments that may result in "over max segments limit"
errors).
Signed-off-by: Malahal Naineni <malahal@us.ibm.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

c49825fa

block: Prevent hang_check firing during long I/O · 4b197769

Mark Lord authored Sep 24, 2010

During long I/O operations, the hang_check timer may fire,
trigger stack dumps that unnecessarily alarm the user.

Eg.  hdparm --security-erase NULL /dev/sdb  ## can take *hours* to complete

So, if hang_check is armed, we should wake up periodically
to prevent it from triggering.  This patch uses a wake-up interval
equal to half the hang_check timer period, which keeps overhead low enough.
Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

4b197769

20 Sep, 2010 1 commit

cfq: improve fsync performance for small files · 749ef9f8

Corrado Zoccolo authored Sep 20, 2010

Fsync performance for small files achieved by cfq on high-end disks is
lower than what deadline can achieve, due to idling introduced between
the sync write happening in process context and the journal commit.

Moreover, when competing with a sequential reader, a process writing
small files and fsync-ing them is starved.

This patch fixes the two problems by:
- marking journal commits as WRITE_SYNC, so that they get the REQ_NOIDLE
  flag set,
- force all queues that have REQ_NOIDLE requests to be put in the noidle
  tree.

Having the queue associated to the fsync-ing process and the one associated
 to journal commits in the noidle tree allows:
- switching between them without idling,
- fairness vs. competing idling queues, since they will be serviced only
  after the noidle tree expires its slice.
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Tested-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Corrado Zoccolo <czoccolo@gmail.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

749ef9f8

17 Sep, 2010 1 commit

do_mounts: only enable PARTUUID for CONFIG_BLOCK · 6d0aed7a

Jens Axboe authored Sep 17, 2010

When CONFIG_BLOCK is not enabled:

init/do_mounts.c:71: error: implicit declaration of function 'dev_to_part'
init/do_mounts.c:71: warning: initialization makes pointer from integer without a cast
init/do_mounts.c:73: error: dereferencing pointer to incomplete type
init/do_mounts.c:76: error: dereferencing pointer to incomplete type
init/do_mounts.c:76: error: dereferencing pointer to incomplete type
init/do_mounts.c:102: error: implicit declaration of function 'part_pack_uuid'
init/do_mounts.c:104: error: 'block_class' undeclared (first use in this function)
Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

6d0aed7a

16 Sep, 2010 10 commits

block: Fix race during disk initialization · 01ea5063

Signed-off-by: Jan Kara authored Sep 16, 2010

When a new disk is being discovered, add_disk() first ties the bdev to gendisk
(via register_disk()->blkdev_get()) and only after that calls
bdi_register_bdev(). Because register_disk() also creates disk's kobject, it
can happen that userspace manages to open and modify the device's data (or
inode) before its BDI is properly initialized leading to a warning in
__mark_inode_dirty().

Fix the problem by registering BDI early enough.

This patch addresses https://bugzilla.kernel.org/show_bug.cgi?id=16312

Cc: stable@kernel.org
Reported-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

01ea5063

blkio: Documentation Update · 2786c4e5

Vivek Goyal authored Sep 15, 2010

o Documentation update
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

2786c4e5

blkio: Implementation of IOPS limit logic · 8e89d13f

Vivek Goyal authored Sep 15, 2010

o core logic of implementing IOPS throttling.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

8e89d13f

blk-cgroup: cgroup changes for IOPS limit support · 7702e8f4

Vivek Goyal authored Sep 15, 2010

o cgroup changes for IOPS throttling rules.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

7702e8f4

blkio: Core implementation of throttle policy · e43473b7

Vivek Goyal authored Sep 15, 2010

o Actual implementation of throttling policy in block layer. Currently it
  implements READ and WRITE bytes per second throttling logic. IOPS throttling
  comes in later patches.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

e43473b7

blk-cgroup: Introduce cgroup changes for throttling policy · 4c9eefa1

Vivek Goyal authored Sep 15, 2010

o cgroup chagnes for throttle policy.

o Introduces READ and WRITE bytes per second throttling rules.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

4c9eefa1

blk-cgroup: Prepare the base for supporting more than one IO control policies · 062a644d

Vivek Goyal authored Sep 15, 2010

o This patch prepares the base for introducing new IO control policies.
  Currently all the code is written knowing there is only one policy
  and that is proportional bandwidth. Creating infrastructure for newer
  policies to come in.

o Also there were many functions which were generated using macro. It was
  very confusing. Got rid of those.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

062a644d

blk-cgroup: Kill the header printed at the start of blkio.weight_device file · af41d7bd

Vivek Goyal authored Sep 15, 2010

o Kill extra "dev weight" header which is printed when somebody reads
  blkio.weight_device file. This really seems to be out of convention. No other
  blkio files are printing any header at the start of file. I think it is ok
  to just print values and how to interpret values should be part of
  documentation.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

af41d7bd

core: match_dev_by_uuid() should not be marked __init · 38b6f45a

Jens Axboe authored Sep 16, 2010

It is also called outside the scope of init functions. Stephen
reports:

WARNING: init/mounts.o(.text+0x21a): Section mismatch in reference from the function name_to_dev_t() to the function .init.text:match_dev_by_uuid()
The function name_to_dev_t() references
the function __init match_dev_by_uuid().
This is often because name_to_dev_t lacks a __init
annotation or the annotation of match_dev_by_uuid is wrong.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

38b6f45a

sg: fix a warning in blk_rq_aligned() call · 2610a254

Namhyung Kim authored Sep 16, 2010

2nd argument of blk_rq_aligned() has changed to 'unsigned long' by
the previous commit 'block: fix an address space warning in blk-map.c'.
That commit neglected to update a user of that function.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

2610a254

15 Sep, 2010 4 commits

init: add support for root devices specified by partition UUID · b5af921e

Will Drewry authored Aug 31, 2010

This is the third patch in a series which adds support for
storing partition metadata, optionally, off of the hd_struct.

One major use for that data is being able to resolve partition
by other identities than just the index on a block device.  Device
enumeration varies by platform and there's a benefit to being able
to use something like EFI GPT's GUIDs to determine the correct
block device and partition to mount as the root.

This change adds that support to root= by adding support for
the following syntax:

  root=PARTUUID=hex-uuid
Signed-off-by: Will Drewry <wad@chromium.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

b5af921e

genhd, efi: add efi partition metadata to hd_structs · eec7ecfe

Will Drewry authored Aug 31, 2010

This change extends the partition_meta_info structure to
support EFI GPT-specific metadata and ensures that data
is copied in on partition scanning.
Signed-off-by: Will Drewry <wad@chromium.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

eec7ecfe

block, partition: add partition_meta_info to hd_struct · 6d1d8050

Will Drewry authored Aug 31, 2010

I'm reposting this patch series as v4 since there have been no additional
comments, and I cleaned up one extra bit of unneeded code (in 3/3). The patches
are against Linus's tree: 2bfc96a1
(2.6.36-rc3).

Would this patchset be suitable for inclusion in an mm branch?

This changes adds a partition_meta_info struct which itself contains a
union of structures that provide partition table specific metadata.

This change leaves the union empty. The subsequent patch includes an
implementation for CONFIG_EFI_PARTITION-based metadata.
Signed-off-by: Will Drewry <wad@chromium.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

6d1d8050

block: fix an address space warning in blk-map.c · 14417799

Namhyung Kim authored Sep 15, 2010

Change type of 2nd parameter of blk_rq_aligned() into unsigned long
and remove unnecessary casting. Now we can call it with 'uaddr'
instead of 'ubuf' in __blk_rq_map_user() so that it can remove
following warnings from sparse:

 block/blk-map.c:57:31: warning: incorrect type in argument 2 (different address spaces)
 block/blk-map.c:57:31:    expected void *addr
 block/blk-map.c:57:31:    got void [noderef] <asn:1>*ubuf

However blk_rq_map_kern() needs one more local variable to handle it.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

14417799

14 Sep, 2010 1 commit

block: block_dump: Add number of sectors to debug output · 8dcbdc74

San Mehat authored Sep 14, 2010

Signed-off-by: San Mehat <san@android.com>
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

8dcbdc74

10 Sep, 2010 3 commits

zfcp: Report scatter gather limit for DIX protection information · 175b79f0

Christof Schmitt authored Sep 10, 2010

When sending DIX integrity segments with an I/O request, the
restriction for the maximum number of segments is still the same for
the zfcp hardware. Report the new sg_prot_tablesize for the SCSI host,
so that the number of integrity segments plus the number of data
segments is not larger than the hardware limit. This results in using
half of the hardware segments for integrity data and the other half
for regular data.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Jens Axboe <axboe@carl.home.kernel.dk>

175b79f0

block/scsi: Provide a limit on the number of integrity segments · 13f05c8d

Martin K. Petersen authored Sep 10, 2010

Some controllers have a hardware limit on the number of protection
information scatter-gather list segments they can handle.

Introduce a max_integrity_segments limit in the block layer and provide
a new scsi_host_template setting that allows HBA drivers to provide a
value suitable for the hardware.

Add support for honoring the integrity segment limit when merging both
bios and requests.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@carl.home.kernel.dk>

13f05c8d

Consolidate min_not_zero · c8bf1336

Martin K. Petersen authored Sep 10, 2010

We have several users of min_not_zero, each of them using their own
definition.  Move the define to kernel.h.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@carl.home.kernel.dk>

c8bf1336

23 Aug, 2010 1 commit
- Linux 2.6.36-rc2 · 76be97c1
  Linus Torvalds authored Aug 22, 2010
  
  76be97c1
22 Aug, 2010 8 commits

Merge branch 'kvm-updates/2.6.36' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 3dc8d7f0

Linus Torvalds authored Aug 22, 2010

* 'kvm-updates/2.6.36' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: PIT: free irq source id in handling error path
  KVM: destroy workqueue on kvm_create_pit() failures
  KVM: fix poison overwritten caused by using wrong xstate size

3dc8d7f0

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel · 4238a417

Linus Torvalds authored Aug 22, 2010

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel: (58 commits)
  drm/i915,intel_agp: Add support for Sandybridge D0
  drm/i915: fix render pipe control notify on sandybridge
  agp/intel: set 40-bit dma mask on Sandybridge
  drm/i915: Remove the conflicting BUG_ON()
  drm/i915/suspend: s/IS_IRONLAKE/HAS_PCH_SPLIT/
  drm/i915/suspend: Flush register writes before busy-waiting.
  i915: disable DAC on Ironlake also when doing CRT load detection.
  drm/i915: wait for actual vblank, not just 20ms
  drm/i915: make sure eDP PLL is enabled at the right time
  drm/i915: fix VGA plane disable for Ironlake+
  drm/i915: eDP mode set sequence corrections
  drm/i915: add panel reset workaround
  drm/i915: Enable RC6 on Ironlake.
  drm/i915/sdvo: Only set is_lvds if we have a valid fixed mode.
  drm/i915: Set up a render context on Ironlake
  drm/i915 invalidate indirect state pointers at end of ring exec
  drm/i915: Wake-up wait_request() from elapsed hang-check (v2)
  drm/i915: Apply i830 errata for cursor alignment
  drm/i915: Only update i845/i865 CURBASE when disabled (v2)
  drm/i915: FBC is updated within set_base() so remove second call in mode_set()
  ...

4238a417

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6 · bc584c51

Linus Torvalds authored Aug 22, 2010

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6:
  slab: fix object alignment
  slub: add missing __percpu markup in mm/slub_def.h

bc584c51

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2 · a28e0852
Linus Torvalds authored Aug 22, 2010
```
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2:
  nilfs2: wait for discard to finish
```
a28e0852

drm/i915,intel_agp: Add support for Sandybridge D0 · 4fefe435

Zhenyu Wang authored Aug 19, 2010

Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Eric Anholt <eric@anholt.net>

4fefe435

drm/i915: fix render pipe control notify on sandybridge · 3fdef020

Zhenyu Wang authored Aug 19, 2010

This one is missed in last pipe control fix for sandybridge,
that really unmask interrupt bit for notify in render engine IMR.
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Eric Anholt <eric@anholt.net>

3fdef020

agp/intel: set 40-bit dma mask on Sandybridge · 877fdacf

Zhenyu Wang authored Aug 19, 2010

Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Eric Anholt <eric@anholt.net>

877fdacf

drm/i915: Remove the conflicting BUG_ON() · 156dadc1

Chris Wilson authored Aug 15, 2010

We now attempt to free "active" objects following a GPU hang as either
the GPU will be reset or the hang is permenant. In either case, the GPU
writes will not be flushed to main memory and it should be safe to
return that memory back to the system.

The BUG_ON(active) is thus overkill and can erroneously fire after a
EIO.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>

156dadc1