1. 02 Dec, 2011 1 commit
    • cfq-iosched: fix cfq_cic_link() race condition · 5eb46851
      Yasuaki Ishimatsu authored
      cfq_cic_link() has a race condition. When several processes that share an ioc
      issue I/O to the same block device simultaneously, cfq_cic_link() sometimes
      returns -EEXIST. The race condition can stall I/O through the following steps:
      
      step  1: Process A: Issue an I/O to /dev/sda
      step  2: Process A: Get an ioc (iocA here) in get_io_context() which is not
      		    linked with a cic for the device
      step  3: Process A: Get a new cic for the device (cicA here) in
      		    cfq_alloc_io_context()
      
      step  4: Process B: Issue an I/O to /dev/sda
      step  5: Process B: Get iocA in get_io_context() since process A and B share the
      		    same ioc
      step  6: Process B: Get a new cic for the device (cicB here) in
      		    cfq_alloc_io_context() since iocA has not been linked with a
      		    cic for the device yet
      
      step  7: Process A: Link cicA to iocA in cfq_cic_link()
      step  8: Process A: Dispatch I/O to driver and finish it
      
      step  9: Process B: Try to link cicB to iocA in cfq_cic_link()
      		    But it fails and prints the "cfq: cic link failed!" kernel
      		    message, since iocA was already linked with cicA at step 7.
      step 10: Process B: Wait for I/O to finish in get_request_wait()
      		    The function never wakes up when there is no I/O to the
      		    device.
      
      When cfq_cic_link() returns -EEXIST, it means the ioc has already been linked
      with a cic. So when cfq_cic_link() returns -EEXIST, retry cfq_cic_lookup().
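
The retry-on-EEXIST idea can be sketched in userspace C. This is only an illustration under stated assumptions, not the kernel patch: toy_ioc, toy_cic, toy_lookup(), toy_link(), toy_get() and toy_racer() are hypothetical stand-ins, and the "racer" callback merely simulates another process linking its cic between our allocation and our link attempt.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical toy types; they are NOT the kernel's io_context structures. */
struct toy_cic { int owner; };
struct toy_ioc { struct toy_cic *cic; };

/* Return the cic already linked to the ioc, or NULL if there is none. */
static struct toy_cic *toy_lookup(struct toy_ioc *ioc)
{
    return ioc->cic;
}

/* Link a cic to the ioc; fail with -EEXIST if one is already linked. */
static int toy_link(struct toy_ioc *ioc, struct toy_cic *cic)
{
    if (ioc->cic)
        return -EEXIST;
    ioc->cic = cic;
    return 0;
}

/* Simulated competitor: links its own cic (owner 1) to the shared ioc. */
static void toy_racer(struct toy_ioc *ioc)
{
    static struct toy_cic winner = { 1 };
    toy_link(ioc, &winner);
}

/*
 * Lookup-or-create with retry. If the link step loses the race and
 * reports -EEXIST, drop our unused cic and look up the winner's cic
 * instead of treating the situation as a hard failure.
 */
static struct toy_cic *toy_get(struct toy_ioc *ioc, int owner,
                               void (*racer)(struct toy_ioc *))
{
    for (;;) {
        struct toy_cic *cic = toy_lookup(ioc);
        int err;

        if (cic)
            return cic;

        cic = malloc(sizeof(*cic));
        if (!cic)
            return NULL;
        cic->owner = owner;

        if (racer) {            /* let the competitor run exactly once */
            racer(ioc);
            racer = NULL;
        }

        err = toy_link(ioc, cic);
        if (err == 0)
            return cic;

        free(cic);
        if (err != -EEXIST)
            return NULL;        /* a real error: give up */
        assert(toy_lookup(ioc) != NULL);  /* -EEXIST implies a linked cic */
    }
}
```

Calling toy_get(&ioc, 2, toy_racer) on an empty ioc returns the competitor's cic rather than failing, which mirrors the behaviour the commit message asks for.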
      Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: stable@kernel.org
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  2. 30 Nov, 2011 1 commit
  3. 28 Nov, 2011 2 commits
  4. 25 Nov, 2011 1 commit
  5. 23 Nov, 2011 1 commit
    • block: initialize request_queue's numa node during · 5151412d
      Mike Snitzer authored
      struct request_queue is allocated with __GFP_ZERO so its "node" field is
      zero before initialization.  This causes an oops if node 0 is offline in
      the page allocator because its zonelists are not initialized.  From Dave
      Young's dmesg:
      
      	SRAT: Node 1 PXM 2 0-d0000000
      	SRAT: Node 1 PXM 2 100000000-330000000
      	SRAT: Node 0 PXM 1 330000000-630000000
      	Initmem setup node 1 0000000000000000-000000000affb000
      	...
      	Built 1 zonelists in Node order, mobility grouping on.
      	...
      	BUG: unable to handle kernel paging request at 0000000000001c08
      	IP: [<ffffffff8111c355>] __alloc_pages_nodemask+0xb5/0x870
      
      and __alloc_pages_nodemask+0xb5 translates to a NULL pointer on
      zonelist->_zonerefs.
      
      The fix is to initialize q->node at the time of allocation so the correct
      node is passed to the slab allocator later.
      
      Since blk_init_allocated_queue_node() is no longer needed, merge it with
      blk_init_allocated_queue().
      
      [rientjes@google.com: changelog, initializing q->node]
      Cc: stable@vger.kernel.org [2.6.37+]
      Reported-by: Dave Young <dyoung@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: David Rientjes <rientjes@google.com>
      Tested-by: Dave Young <dyoung@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  6. 22 Nov, 2011 13 commits
  7. 21 Nov, 2011 9 commits
  8. 20 Nov, 2011 12 commits
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 6fe4c6d4
      Linus Torvalds authored
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (86 commits)
        ipv4: fix redirect handling
        ping: dont increment ICMP_MIB_INERRORS
        sky2: fix hang in napi_disable
        sky2: enforce minimum ring size
        bonding: Don't allow mode change via sysfs with slaves present
        f_phonet: fix page offset of first received fragment
        stmmac: fix pm functions avoiding sleep on spinlock
        stmmac: remove spin_lock in stmmac_ioctl.
        stmmac: parameters auto-tuning through HW cap reg
        stmmac: fix advertising 1000Base capabilities for non GMII iface
        stmmac: use mdelay on timeout of sw reset
        sky2: version 1.30
        sky2: used fixed RSS key
        sky2: reduce default Tx ring size
        sky2: rename up/down functions
        sky2: pci posting issues
        sky2: fix hang on shutdown (and other irq issues)
        r6040: fix check against MCRO_HASHEN bit in r6040_multicast_list
        MAINTAINERS: change email address for shemminger
        pch_gbe: Move #include of module.h
        ...
    • Merge branch 'kvm-updates/3.2' of git://git.kernel.org/pub/scm/virt/kvm/kvm · a4cc3889
      Linus Torvalds authored
      * 'kvm-updates/3.2' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM guest: prevent tracing recursion with kvmclock
        Revert "KVM: PPC: Add support for explicit HIOR setting"
        KVM: VMX: Check for automatic switch msr table overflow
        KVM: VMX: Add support for guest/host-only profiling
        KVM: VMX: add support for switching of PERF_GLOBAL_CTRL
        KVM: s390: announce SYNC_MMU
        KVM: s390: Fix tprot locking
        KVM: s390: handle SIGP sense running intercepts
        KVM: s390: Fix RUNNING flag misinterpretation
    • Merge branch 'fixes' of http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-arm · bb893d15
      Linus Torvalds authored
      * 'fixes' of http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-arm:
        ARM: wire up process_vm_writev and process_vm_readv syscalls
        ARM: 7160/1: setup: avoid overflowing {elf,arch}_name from proc_info_list
        ARM: 7158/1: add new MFP implement for NUC900
        ARM: 7157/1: fix a building WARNING for nuc900
        ARM: 7156/1: l2x0: fix compile error on !CONFIG_USE_OF
        ARM: 7155/1: arch.h: Declare 'pt_regs' locally
        ARM: 7154/1: mach-bcmring: fix build error in dma.c
        ARM: 7153/1: mach-bcmring: fix build error in core.c
        ARM: 7152/1: distclean: Remove generated .dtb files
        ARM: 7150/1: Allow kernel unaligned accesses on ARMv6+ processors
        ARM: 7149/1: spi/pl022: Enable clock in probe
        Revert "ARM: 7098/1: kdump: copy kernel relocation code at the kexec prepare stage"
    • Merge branch 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 2d360fcb
      Linus Torvalds authored
      * 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        PM / Suspend: Fix bug in suspend statistics update
        PM / Hibernate: Fix the early termination of test modes
        PM / shmobile: Fix build of sh7372_pm_init() for CONFIG_PM unset
        PM Sleep: Do not extend wakeup paths to devices with ignore_children set
        PM / driver core: disable device's runtime PM during shutdown
        PM / devfreq: correct Kconfig dependency
        PM / devfreq: fix use after free in devfreq_remove_device
        PM / shmobile: Avoid restoring the INTCS state during initialization
        PM / devfreq: Remove compiler error after irq.h update
        PM / QoS: Properly use the WARN() macro in dev_pm_qos_add_request()
        PM / Clocks: Only disable enabled clocks in pm_clk_suspend()
        ARM: mach-shmobile: sh7372 A3SP no_suspend_console fix
        PM / shmobile: Don't skip debugging output in pd_power_up()
    • Btrfs: sectorsize align offsets in fiemap · 4d479cf0
      Josef Bacik authored
      We've been hitting BUG()'s in btrfs_cont_expand and btrfs_fallocate and anywhere
      else that calls btrfs_get_extent while running xfstests 13 in a loop.  This is
      because fiemap is calling btrfs_get_extent with non-sectorsize aligned offsets,
      which will end up adding mappings that are not sectorsize aligned, which will
      cause problems in some cases for subsequent calls to btrfs_get_extent for
      similar areas that are sectorsize aligned.  With this patch I ran xfstests 13 in
      a loop for a couple of hours and didn't hit the problem that I could previously
      hit in at most 20 minutes.  Thanks,
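
One way to picture sectorsize alignment of a byte range is the sketch below. It is an assumption-laden illustration, not the code from the patch: byte_range, align_down(), align_up() and align_range() are hypothetical helpers, and the sector size is assumed to be a power of two.

```c
#include <stdint.h>

/* Hypothetical helper type: a byte range described by start and length. */
struct byte_range {
    uint64_t start;
    uint64_t len;
};

/* Round x down to a multiple of a (a must be a power of two). */
static uint64_t align_down(uint64_t x, uint64_t a)
{
    return x & ~(a - 1);
}

/* Round x up to a multiple of a (a must be a power of two). */
static uint64_t align_up(uint64_t x, uint64_t a)
{
    return (x + a - 1) & ~(a - 1);
}

/*
 * Widen [start, start + len) so that both ends fall on sector
 * boundaries. The widened range always covers the requested one.
 */
static struct byte_range align_range(uint64_t start, uint64_t len,
                                     uint64_t sectorsize)
{
    struct byte_range r;
    uint64_t end = align_up(start + len, sectorsize);

    r.start = align_down(start, sectorsize);
    r.len = end - r.start;
    return r;
}
```

With a 4096-byte sector size, a request for 100 bytes at offset 5000 widens to the single sector starting at 4096, so any mapping derived from it is sectorsize aligned.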
      Signed-off-by: Josef Bacik <josef@redhat.com>
    • Btrfs: clear pages dirty for io and set them extent mapped · f7d61dcd
      Josef Bacik authored
      When doing the io_ctl helpers to clean up the free space cache stuff I stopped
      using our normal prepare_pages stuff, which means I of course forgot to do
      things like set the pages extent mapped, which will cause us all sorts of
      wonderful problems.  Thanks,
      Signed-off-by: Josef Bacik <josef@redhat.com>
    • Btrfs: wait on caching if we're loading the free space cache · 291c7d2f
      Josef Bacik authored
      We've been hitting panics when running xfstest 13 in a loop for long periods of
      time.  And actually this problem has always existed so we've been hitting these
      things randomly for a while.  Basically what happens is we get a thread coming
      into the allocator and reading the space cache off of disk and adding the
      entries to the free space cache as we go.  Then we get another thread that comes
      in and tries to allocate from that block group.  Since block_group->cached !=
      BTRFS_CACHE_NO it goes ahead and tries to do the allocation.  We do this because
      if we're doing the old slow way of caching we don't want to hold people up and
      wait for everything to finish.  The problem with this is we could end up
      discarding the space cache at some arbitrary point in the future, which means we
      could very well end up allocating space that is either bad, or when the real
      caching happens it could end up thinking the space isn't in use when it really
      is and cause all sorts of other problems.
      
      The solution is to add a new flag to indicate we are loading the free space
      cache from disk, and always try to cache the block group if cache->cached !=
      BTRFS_CACHE_FINISHED.  That way if we are loading the space cache anybody else
      who tries to allocate from the block group will have to wait until it's finished
      to make sure it completes successfully.  Thanks,
      Signed-off-by: Josef Bacik <josef@redhat.com>
    • Btrfs: prefix resize related printks with btrfs: · 5bb14682
      Arnd Hannemann authored
      For the user it is confusing to find something like:
      [10197.627710] new size for /dev/mapper/vg0-usr_share is 3221225472
      in kernel log, because it doesn't point directly to btrfs.
      
      This patch prefixes those messages with "btrfs:" like other btrfs
      related printks.
      Signed-off-by: Arnd Hannemann <arnd@arndnet.de>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
    • btrfs: fix stat blocks accounting · fadc0d8b
      David Sterba authored
      Round inode bytes and delalloc bytes up to real blocksize before
      converting to sector size. Otherwise, e.g., files smaller than 512 bytes
      are reported with zero blocks due to incorrect rounding.
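
The rounding issue can be shown with a small worked sketch. The function names naive_blocks(), round_up_to() and stat_blocks() are hypothetical, and the 4096-byte block size used in the note below is an assumption for illustration; stat reports blocks in 512-byte units, hence the shift by 9.

```c
#include <stdint.h>

/* Round x up to the next multiple of a (a must be non-zero). */
static uint64_t round_up_to(uint64_t x, uint64_t a)
{
    return ((x + a - 1) / a) * a;
}

/* Problematic variant: convert raw byte counts straight to 512-byte units. */
static uint64_t naive_blocks(uint64_t inode_bytes, uint64_t delalloc_bytes)
{
    return (inode_bytes + delalloc_bytes) >> 9;
}

/*
 * Variant described in the commit message: round each byte count up to
 * the real block size first, then convert to 512-byte units.
 */
static uint64_t stat_blocks(uint64_t inode_bytes, uint64_t delalloc_bytes,
                            uint64_t blocksize)
{
    uint64_t bytes = round_up_to(inode_bytes, blocksize) +
                     round_up_to(delalloc_bytes, blocksize);

    return bytes >> 9;
}
```

For a 100-byte file and a 4096-byte block size, naive_blocks() yields 0 while stat_blocks() yields 8, i.e. one full block expressed in 512-byte units.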
      Signed-off-by: David Sterba <dsterba@suse.cz>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
    • Btrfs: avoid unnecessary bitmap search for cluster setup · 52621cb6
      Li Zefan authored
      setup_cluster_no_bitmap() searches all the extents and bitmaps starting
      from offset. Therefore if it returns -ENOSPC, all the bitmaps starting
      from offset are in the bitmaps list, so it's sufficient to search from
      this list in setup_cluster_bitmap().
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
    • Btrfs: fix to search one more bitmap for cluster setup · 0f0fbf1d
      Li Zefan authored
      Suppose there are two bitmaps [0, 256], [256, 512] and one extent
      [100, 120] in the free space cache, and we want to setup a cluster
      with offset=100, bytes=50.
      
      In this case, there will be only one bitmap [256, 512] in the temporary
      bitmaps list, and then setup_cluster_bitmap() won't search bitmap [0, 256].
      
      The cause is that the list is constructed in setup_cluster_no_bitmap(),
      and only bitmaps with bitmap_entry->offset >= offset are added
      to the list, whereas the very bitmap that covers offset has
      bitmap_entry->offset <= offset.
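
The missed bitmap in the example above can be reproduced with a toy model. This sketch does not use the btrfs free space cache structures: toy_bitmap, collect_from_offset() and covering_bitmap() are hypothetical names, and each bitmap is assumed to cover the half-open range [offset, offset + span).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a bitmap entry in the free space cache. */
struct toy_bitmap {
    uint64_t offset;  /* first byte covered by the bitmap */
    uint64_t span;    /* number of bytes covered */
};

/*
 * Collect bitmaps using the filter from the commit message:
 * only entries with entry->offset >= offset are kept.
 * Returns how many entries were stored in out[].
 */
static size_t collect_from_offset(const struct toy_bitmap *maps, size_t n,
                                  uint64_t offset,
                                  const struct toy_bitmap **out)
{
    size_t i, count = 0;

    for (i = 0; i < n; i++)
        if (maps[i].offset >= offset)
            out[count++] = &maps[i];
    return count;
}

/* Find the bitmap that actually covers offset, or NULL if none does. */
static const struct toy_bitmap *covering_bitmap(const struct toy_bitmap *maps,
                                                size_t n, uint64_t offset)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (maps[i].offset <= offset &&
            offset < maps[i].offset + maps[i].span)
            return &maps[i];
    return NULL;
}
```

With bitmaps starting at 0 and 256 and offset=100, the filter keeps only the second bitmap, while covering_bitmap() returns the first one, which is the entry the cluster setup would otherwise never search.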
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
    • btrfs: mirror_num should be int, not u64 · 32240a91
      Jan Schmidt authored
      My previous patch introduced some u64 for failed_mirror variables, this one
      makes it consistent again.
      Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>