1. 12 Feb, 2011 2 commits
    • ext4: serialize unaligned asynchronous DIO · e9e3bcec
      Eric Sandeen authored
      ext4 has a data corruption case when doing non-block-aligned
      asynchronous direct IO into a sparse file, as demonstrated
      by xfstest 240.
      
      The root cause is that while ext4 preallocates space in the
      hole, mappings of that space still look "new" and 
      dio_zero_block() will zero out the unwritten portions.  When
      more than one AIO thread is going, they both find this "new"
      block and race to zero out their portion; this is uncoordinated
      and causes data corruption.
      
      Dave Chinner fixed this for xfs by simply serializing all
      unaligned asynchronous direct IO.  I've done the same here.
      The difference is that we only wait on conversions, not all IO.
      This is a very big hammer, and I'm not very pleased with
      stuffing this into ext4_file_write().  But since ext4 is
      DIO_LOCKING, we need to serialize it at this high level.
      
      I tried to move this into ext4_ext_direct_IO, but by then
      we have the i_mutex already, and we will wait on the
      work queue to do conversions - which must also take the
      i_mutex.  So that won't work.
      
      This was originally exposed by qemu-kvm installing to
      a raw disk image with a normal sector-63 alignment.  I've
      tested a backport of this patch with qemu, and it does
      avoid the corruption.  It is also quite a lot slower
      (14 min for package installs, vs. 8 min for well-aligned)
      but I'll take slow correctness over fast corruption any day.
      
      Mingming suggested that we can track outstanding
      conversions, and wait on those so that non-sparse
      files won't be affected, and I've implemented that here;
      unaligned AIO to nonsparse files won't take a perf hit.
      
      [tytso@mit.edu: Keep the mutex as a hashed array instead
       of bloating the ext4 inode]
      
      [tytso@mit.edu: Fix up namespace issues so that global
       variables are protected with an "ext4_" prefix.]
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
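The commit above combines two ideas: detecting a request that is not block-aligned, and picking a mutex from a small hashed array instead of storing one in every inode. The following is a minimal user-space sketch of both in C, not the patch itself. The names `is_unaligned_io`, `aio_mutex_slot`, and `AIO_MUTEX_HASH_SZ`, and the table size, are assumptions made for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical size of the shared mutex table (an assumption). */
#define AIO_MUTEX_HASH_SZ 37

/*
 * Return true when the byte range [pos, pos + len) does not both start
 * and end on a block boundary. blocksize must be a power of two.
 */
static bool is_unaligned_io(uint64_t pos, size_t len, unsigned blocksize)
{
    uint64_t blockmask = (uint64_t)blocksize - 1;

    return (pos & blockmask) != 0 || ((pos + len) & blockmask) != 0;
}

/*
 * Map an inode address to a slot in a fixed-size mutex table. Different
 * inodes may share a slot, which costs some extra serialization but
 * avoids growing every inode by one mutex.
 */
static size_t aio_mutex_slot(const void *inode)
{
    return (size_t)((uintptr_t)inode % AIO_MUTEX_HASH_SZ);
}
```

With 4096-byte blocks, a 4096-byte write starting at sector 63 (byte offset 63 * 512) is reported as unaligned, which matches the qemu-kvm disk image case described in the commit message.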
    • ext4: make grpinfo slab cache names static · 2892c15d
      Eric Sandeen authored
      In 2.6.37 I was running into oopses with repeated module
      loads & unloads.  I tracked this down to:
      
      fb1813f4 ext4: use dedicated slab caches for group_info structures
      
      (this was in addition to the features advert unload problem)
      
      The kstrdup & subsequent kfree of the cache name was causing
      a double free.  In slub, at least, if I read it right it allocates
      & frees the name itself, slab seems to do something different...
      so in slub I think we were leaking -our- cachep->name, and double
      freeing the one allocated by slub.
      
      After getting lost in slab/slub/slob a bit, I just looked at other
      sized-caches that get allocated.  jbd2, biovec, sgpool all do it
      more or less the way jbd2 does.  Below patch follows the jbd2
      method of dynamically allocating a cache at mount time from
      a list of static names.
      
      (This might also possibly fix a race creating the caches with
      parallel mounts running).
      
      [Folded in a fix from Dan Carpenter which fixed an off-by-one error in
      the original patch]
      
      Cc: stable@kernel.org
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
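The "list of static names" approach described above can be sketched in user-space C as a constant table indexed by block-size order. This is only an illustration: the table contents, `MIN_BLOCK_LOG_SIZE`, `NR_GRPINFO_CACHES`, and `grpinfo_cache_name` are hypothetical names chosen for the example, not identifiers taken from the patch. Because the strings are static, nothing is duplicated or freed, so the double free described in the commit message cannot occur for them.

```c
#include <stddef.h>

/* Hypothetical bounds: 1 KiB (2^10) up to 128 KiB (2^17) blocks. */
#define MIN_BLOCK_LOG_SIZE 10
#define NR_GRPINFO_CACHES  8

/* Static names: never allocated with kstrdup(), never freed. */
static const char *const grpinfo_cache_names[NR_GRPINFO_CACHES] = {
    "demo_groupinfo_1k",  "demo_groupinfo_2k",
    "demo_groupinfo_4k",  "demo_groupinfo_8k",
    "demo_groupinfo_16k", "demo_groupinfo_32k",
    "demo_groupinfo_64k", "demo_groupinfo_128k",
};

/* Return the cache name for a block size of 2^blocksize_bits, or NULL. */
static const char *grpinfo_cache_name(unsigned blocksize_bits)
{
    unsigned idx;

    if (blocksize_bits < MIN_BLOCK_LOG_SIZE)
        return NULL;

    idx = blocksize_bits - MIN_BLOCK_LOG_SIZE;
    /* Use >= here: comparing with > would read one entry past the end. */
    if (idx >= NR_GRPINFO_CACHES)
        return NULL;

    return grpinfo_cache_names[idx];
}
```

The bounds check is the kind of place where an off-by-one error can appear, since the last valid index is `NR_GRPINFO_CACHES - 1`.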
  2. 07 Feb, 2011 1 commit
    • ext4: Fix data corruption with multi-block writepages support · d50bdd5a
      Curt Wohlgemuth authored
      This fixes a corruption problem with the multi-block
      writepages submittal change for ext4, from commit
      bd2d0210 ("ext4: use bio layer instead of buffer layer
      in mpage_da_submit_io").
      
      (Note that this corruption is not present in 2.6.37 on
      ext4, because the corruption was detected after the
      feature was merged in 2.6.37-rc1, and so it was turned
      off by adding a non-default mount option,
      mblk_io_submit.  With this commit, which hopefully
      fixes the last of the bugs with this feature, we'll be
      able to turn on this performance feature by default in
      2.6.38, and remove the mblk_io_submit option.)
      
      The ext4 code path to bundle multiple pages for
      writeback in ext4_bio_write_page() had a bug: we should
      be clearing buffer head dirty flags *before* we submit
      the bio, not in the completion routine.
      
      The patch below was tested on 2.6.37 under KVM with the
      postgresql script which was submitted by Jon Nelson as
      documented in commit 1449032b.
      
      Without the patch, I'd hit the corruption problem about
      50-70% of the time.  With the patch, I executed the
      script > 100 times with no corruption seen.
      
      I also fixed a bug to make sure ext4_end_bio() doesn't
      dereference the bio after the bio_put() call.
      Reported-by: Jon Nelson <jnelson@jamponi.net>
      Reported-by: Matthias Bayer <jackdachef@gmail.com>
      Signed-off-by: Curt Wohlgemuth <curtw@google.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@kernel.org
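The two ordering rules in the commit above (clear the dirty flag before submitting, and do not touch the bio after dropping the reference) can be shown with a small user-space model in C. This is a simplified sketch and not kernel code: `demo_buffer`, `demo_bio`, `demo_submit`, `demo_bio_put`, and `demo_end_bio` are hypothetical stand-ins for the buffer head, the bio, and their helpers.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a buffer head with a dirty flag. */
struct demo_buffer {
    bool dirty;
};

/* Hypothetical stand-in for a reference-counted bio. */
struct demo_bio {
    int refcount;
    int error;
    struct demo_buffer *buf;
};

/*
 * Clear the dirty flag before the I/O is issued. If the buffer is
 * dirtied again while the I/O is in flight, that new dirty state
 * survives, because completion never clears the flag.
 */
static struct demo_bio *demo_submit(struct demo_buffer *buf)
{
    struct demo_bio *bio = malloc(sizeof(*bio));

    if (!bio)
        return NULL;
    buf->dirty = false;
    bio->refcount = 1;
    bio->error = 0;
    bio->buf = buf;
    return bio;
}

/* Drop one reference; the bio is freed when the count reaches zero. */
static void demo_bio_put(struct demo_bio *bio)
{
    if (--bio->refcount == 0)
        free(bio);
}

/* Completion: copy out what is needed, then put, then stop using bio. */
static int demo_end_bio(struct demo_bio *bio)
{
    int error = bio->error; /* read before the put */

    demo_bio_put(bio);      /* bio may be freed here */
    return error;           /* bio is not dereferenced again */
}
```

If the flag were cleared in `demo_end_bio` instead, a write that dirtied the buffer during the I/O would be marked clean at completion and never written back, which is one way the described corruption could arise in this simplified model.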
  3. 03 Feb, 2011 3 commits
  4. 22 Jan, 2011 2 commits
    • Linux 2.6.38-rc2 · 1bae4ce2
      Linus Torvalds authored
    • Merge branch 'media_fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6 · 13a3cec8
      Linus Torvalds authored
      * 'media_fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6: (101 commits)
        [media] staging/lirc: fix mem leaks and ptr err usage
        [media] hdpvr: reduce latency of i2c read/write w/recycled buffer
        [media] hdpvr: enable IR part
        [media] rc/mceusb: timeout should be in ns, not us
        [media] v4l2-device: fix 'use-after-freed' oops
        [media] v4l2-dev: don't memset video_device.dev
        [media] zoran: use video_device_alloc instead of kmalloc
        [media] w9966: zero device state after a detach
        [media] v4l: Fix a use-before-set in the control framework
        [media] v4l: Include linux/videodev2.h in media/v4l2-ctrls.h
        [media] DocBook/v4l: update V4L2 revision and update copyright years
        [media] DocBook/v4l: fix validation error in dev-rds.xml
        [media] v4l2-ctrls: queryctrl shouldn't attempt to replace V4L2_CID_PRIVATE_BASE IDs
        [media] v4l2-ctrls: fix missing 'read-only' check
        [media] pvrusb2: Provide more information about IR units to lirc_zilog and ir-kbd-i2c
        [media] ir-kbd-i2c: Add back defaults setting for Zilog Z8's at addr 0x71
        [media] lirc_zilog: Update TODO.lirc_zilog
        [media] lirc_zilog: Add Andy Walls to copyright notice and authors list
        [media] lirc_zilog: Remove useless struct i2c_driver.command function
        [media] lirc_zilog: Remove unneeded tests for existence of the IR Tx function
        ...
  5. 21 Jan, 2011 32 commits