1. 20 Jul, 2012 1 commit
    • dm raid1: fix crash with mirror recovery and discard · 751f188d
      Mikulas Patocka authored
      This patch fixes a crash when a discard request is sent during mirror
      recovery.
      
      Firstly, some background.  Generally, the following sequence happens during
      mirror synchronization:
      - function do_recovery is called
      - do_recovery calls dm_rh_recovery_prepare
      - dm_rh_recovery_prepare uses a semaphore to limit the number of
        simultaneously recovered regions (by default the semaphore value is 1,
        so only one region at a time is recovered)
      - dm_rh_recovery_prepare calls __rh_recovery_prepare, which asks the
        log driver for the next region to recover. Then, it sets the region
        state to DM_RH_RECOVERING. If there
        are no pending I/Os on this region, the region is added to
        quiesced_regions list. If there are pending I/Os, the region is not
        added to any list. It is added to the quiesced_regions list later (by
        dm_rh_dec function) when all I/Os finish.
      - when the region is on the quiesced_regions list, there are no I/Os in
        flight on this region. The region is popped from the list in
        dm_rh_recovery_start function. Then, a kcopyd job is started in the
        recover function.
      - when the kcopyd job finishes, recovery_complete is called. It calls
        dm_rh_recovery_end. dm_rh_recovery_end adds the region to
        recovered_regions or failed_recovered_regions list (depending on
        whether the copy operation was successful or not).
      
      The above mechanism assumes that if the region is in DM_RH_RECOVERING
      state, no new I/Os are started on this region. When I/O is started,
      dm_rh_inc_pending is called, which increases reg->pending count. When
      I/O is finished, dm_rh_dec is called. It decreases reg->pending count.
      If the count is zero and the region was in DM_RH_RECOVERING state,
      dm_rh_dec adds it to the quiesced_regions list.
      
      Consequently, if we call dm_rh_inc_pending/dm_rh_dec while the region is
      in DM_RH_RECOVERING state, it could be added to the quiesced_regions list
      multiple times or it could be added to this list while kcopyd is copying
      data (it is assumed that the region is not on any list while kcopyd does
      its job). This results in memory corruption and a crash.
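
      The failure mode described above can be illustrated with a toy model.
      This is only a sketch: the toy_* names and the times_quiesced counter
      are hypothetical stand-ins and are not the kernel's data structures.
      The counter records how often the region would have been put on the
      quiesced_regions list.

```c
#include <assert.h>

/* Toy model only: names echo the commit message, not the kernel's structs. */
enum toy_state { TOY_CLEAN, TOY_RECOVERING };

struct toy_region {
    enum toy_state state;
    int pending;        /* I/Os in flight on this region */
    int times_quiesced; /* how often the region was queued as quiesced */
};

/* Plays the role of dm_rh_inc_pending: account for a newly started I/O. */
static void toy_inc_pending(struct toy_region *reg)
{
    reg->pending++;
}

/* Plays the role of dm_rh_dec: the last I/O to finish on a recovering
 * region queues that region as quiesced. */
static void toy_dec(struct toy_region *reg)
{
    if (--reg->pending == 0 && reg->state == TOY_RECOVERING)
        reg->times_quiesced++;
}

/* Plays the role of __rh_recovery_prepare: mark the region as recovering
 * and queue it immediately if it has no I/O in flight. */
static void toy_recovery_prepare(struct toy_region *reg)
{
    reg->state = TOY_RECOVERING;
    if (reg->pending == 0)
        reg->times_quiesced++;
}

/* One stray I/O on a recovering region: returns how many times the
 * region ended up being queued. */
static int toy_stray_io_during_recovery(void)
{
    struct toy_region reg = { TOY_CLEAN, 0, 0 };

    toy_recovery_prepare(&reg); /* idle region: queued once */
    toy_inc_pending(&reg);      /* I/O started despite RECOVERING state */
    toy_dec(&reg);              /* I/O finished: queued a second time */
    return reg.times_quiesced;
}
```

      In this model a single accounted I/O on a recovering region queues the
      region twice, which corresponds to the double list insertion described
      above.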
      
      There already exist bypasses for REQ_FLUSH requests: REQ_FLUSH requests
      do not belong to any region, so they are always added to the sync list
      in do_writes. dm_rh_inc_pending does not increase count for REQ_FLUSH
      requests. In mirror_end_io, dm_rh_dec is never called for REQ_FLUSH
      requests. These bypasses avoid the crash possibility described above.
      
      These bypasses were improperly implemented for REQ_DISCARD when
      the mirror target gained discard support in commit
      5fc2ffea (dm raid1: support discard).
      
      In do_writes, REQ_DISCARD requests are always added to the sync queue and
      immediately dispatched (even if the region is in DM_RH_RECOVERING).  However,
      dm_rh_inc and dm_rh_dec are called for REQ_DISCARD requests.  So it violates the
      rule that no I/Os are started on DM_RH_RECOVERING regions, and causes the list
      corruption described above.
      
      This patch changes it so that REQ_DISCARD requests follow the same path
      as REQ_FLUSH. This avoids the crash.
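
      The idea of treating both request types alike can be sketched as a
      single predicate. This is a minimal illustration, not the patch
      itself: the TOY_REQ_* flag values and the helper name are hypothetical
      and only stand in for the kernel's request flags.

```c
#include <assert.h>

/* Hypothetical flag values for illustration; the kernel defines its own. */
#define TOY_REQ_FLUSH   (1u << 0)
#define TOY_REQ_DISCARD (1u << 1)

/* Returns nonzero if a request with these flags should take part in the
 * per-region pending accounting. Flush and discard requests both bypass
 * it, so they can never re-queue a recovering region. */
static int toy_counts_against_region(unsigned int flags)
{
    return !(flags & (TOY_REQ_FLUSH | TOY_REQ_DISCARD));
}
```

      Using one predicate on both the increment and the decrement side keeps
      the two paths symmetric, so a bypassed request is never decremented
      without having been counted.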
      
      Reference: https://bugzilla.redhat.com/837607
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Cc: stable@kernel.org
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
  2. 14 Jul, 2012 11 commits
  3. 13 Jul, 2012 19 commits
  4. 12 Jul, 2012 5 commits
  5. 11 Jul, 2012 4 commits
    • Merge tag 'fbdev-fixes-for-3.5-2' of git://github.com/schandinat/linux-2.6 · 918227bb
      Linus Torvalds authored
      Pull fbdev fixes from Florian Tobias Schandinat:
       "Two fixes for OMAPDSS by Tomi Valkeinen:
         - one to avoid warnings when runtime PM is not enabled
         - one workaround for dependency issues during suspend/resume"
      
      * tag 'fbdev-fixes-for-3.5-2' of git://github.com/schandinat/linux-2.6:
        OMAPDSS: fix warnings if CONFIG_PM_RUNTIME=n
        OMAPDSS: Use PM notifiers for system suspend
    • Merge branch 'akpm' (Andrew's patch-bomb) · 00c3e276
      Linus Torvalds authored
      Merge random patches from Andrew Morton.
      
      * Merge emailed patches from Andrew Morton <akpm@linux-foundation.org>: (32 commits)
        memblock: free allocated memblock_reserved_regions later
        mm: sparse: fix usemap allocation above node descriptor section
        mm: sparse: fix section usemap placement calculation
        xtensa: fix incorrect memset
        shmem: cleanup shmem_add_to_page_cache
        shmem: fix negative rss in memcg memory.stat
        tmpfs: revert SEEK_DATA and SEEK_HOLE
        drivers/rtc/rtc-twl.c: fix threaded IRQ to use IRQF_ONESHOT
        fat: fix non-atomic NFS i_pos read
        MAINTAINERS: add OMAP CPUfreq driver to OMAP Power Management section
        sgi-xp: nested calls to spin_lock_irqsave()
        fs: ramfs: file-nommu: add SetPageUptodate()
        drivers/rtc/rtc-mxc.c: fix irq enabled interrupts warning
        mm/memory_hotplug.c: release memory resources if hotadd_new_pgdat() fails
        h8300/uaccess: add mising __clear_user()
        h8300/uaccess: remove assignment to __gu_val in unhandled case of get_user()
        h8300/time: add missing #include <asm/irq_regs.h>
        h8300/signal: fix typo "statis"
        h8300/pgtable: add missing #include <asm-generic/pgtable.h>
        drivers/rtc/rtc-ab8500.c: ensure correct probing of the AB8500 RTC when Device Tree is enabled
        ...
    • memblock: free allocated memblock_reserved_regions later · 29f67386
      Yinghai Lu authored
      memblock_free_reserved_regions() calls memblock_free(), but
      memblock_free() may itself double reserved.regions, so we could end up
      freeing the old range that held reserved.regions.
      
      Also tj said there is another bug which could be related to this.
      
      | I don't think we're saving any noticeable
      | amount by doing this "free - give it to page allocator - reserve
      | again" dancing.  We should just allocate regions aligned to page
      | boundaries and free them later when memblock is no longer in use.
      
      In that case, with DEBUG_PAGEALLOC enabled, the kernel panics:
      
           memblock_free: [0x0000102febc080-0x0000102febf080] memblock_free_reserved_regions+0x37/0x39
        BUG: unable to handle kernel paging request at ffff88102febd948
        IP: [<ffffffff836a5774>] __next_free_mem_range+0x9b/0x155
        PGD 4826063 PUD cf67a067 PMD cf7fa067 PTE 800000102febd160
        Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
        CPU 0
        Pid: 0, comm: swapper Not tainted 3.5.0-rc2-next-20120614-sasha #447
        RIP: 0010:[<ffffffff836a5774>]  [<ffffffff836a5774>] __next_free_mem_range+0x9b/0x155
      
      See the discussion at https://lkml.org/lkml/2012/6/13/469
      
      So try to allocate with PAGE_SIZE alignment and free it later.
      Reported-by: Sasha Levin <levinsasha928@gmail.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: sparse: fix usemap allocation above node descriptor section · 99ab7b19
      Yinghai Lu authored
      After commit f5bf18fa ("bootmem/sparsemem: remove limit constraint
      in alloc_bootmem_section"), usemap allocations may easily be placed
      outside the optimal section that holds the node descriptor, even if
      there is space available in that section.  This results in unnecessary
      hotplug dependencies that need to have the node unplugged before the
      section holding the usemap.
      
      The reason is that the bootmem allocator doesn't guarantee a linear
      search starting from the passed allocation goal but may start out at a
      much higher address absent an upper limit.
      
      Fix this by trying the allocation with the limit at the section end,
      then retry without if that fails.  This keeps the fix from f5bf18fa
      of not panicking if the allocation does not fit in the section, but
      still makes sure to try to stay within the section at first.
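
      The "try with a limit, then retry without" pattern can be sketched as
      follows. This is a simplified illustration under stated assumptions:
      toy_alloc is a hypothetical stand-in for the bootmem allocator, it
      treats a limit of 0 as "no upper bound" and returns 0 on failure, and
      it does not model the allocator's non-linear search.

```c
#include <assert.h>

typedef unsigned long toy_addr;

/* Hypothetical allocator: place `size` bytes at the first free address
 * that is not below `goal`. A `limit` of 0 means no upper bound.
 * Returns 0 if the block would cross a non-zero limit. */
static toy_addr toy_alloc(toy_addr next_free, toy_addr size,
                          toy_addr goal, toy_addr limit)
{
    toy_addr start = next_free > goal ? next_free : goal;

    if (limit && start + size > limit)
        return 0;
    return start;
}

/* First try to stay inside [section_start, section_end); if that fails,
 * retry without the limit instead of giving up. */
static toy_addr toy_alloc_in_section(toy_addr next_free, toy_addr size,
                                     toy_addr section_start,
                                     toy_addr section_end)
{
    toy_addr addr = toy_alloc(next_free, size, section_start, section_end);

    if (!addr)
        addr = toy_alloc(next_free, size, section_start, 0);
    return addr;
}
```

      The first attempt expresses the placement preference, and the fallback
      preserves the earlier behaviour of not failing when the section is
      full.
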
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: <stable@vger.kernel.org>	[3.3.x, 3.4.x]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>