1. 12 Jun, 2013 34 commits
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.dk/linux-block · b2cc9c19
      Linus Torvalds authored
      Pull block layer fixes from Jens Axboe:
       "Outside of bcache (which really isn't super big), these are all
        few-liners.  There are a few important fixes in here:
      
         - Fix blk pm sleeping when holding the queue lock
      
         - A small collection of bcache fixes that have been done and tested
           since bcache was included in this merge window.
      
         - A fix for a raid5 regression introduced with the bio changes.
      
         - Two important fixes for mtip32xx, fixing an oops and potential data
           corruption (or hang) due to wrong bio iteration on stacked devices."
      
      * 'for-linus' of git://git.kernel.dk/linux-block:
        scatterlist: sg_set_buf() argument must be in linear mapping
        raid5: Initialize bi_vcnt
        pktcdvd: silence static checker warning
        block: remove refs to XD disks from documentation
        blkpm: avoid sleep when holding queue lock
        mtip32xx: Correctly handle bio->bi_idx != 0 conditions
        mtip32xx: Fix NULL pointer dereference during module unload
        bcache: Fix error handling in init code
        bcache: clarify free/available/unused space
        bcache: drop "select CLOSURES"
        bcache: Fix incompatible pointer type warning
      b2cc9c19
    • Linus Torvalds's avatar
      Merge branch 'akpm' (updates from Andrew Morton) · a568fa1c
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "Bunch of fixes and one little addition to math64.h"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (27 commits)
        include/linux/math64.h: add div64_ul()
        mm: memcontrol: fix lockless reclaim hierarchy iterator
        frontswap: fix incorrect zeroing and allocation size for frontswap_map
        kernel/audit_tree.c:audit_add_tree_rule(): protect `rule' from kill_rules()
        mm: migration: add migrate_entry_wait_huge()
        ocfs2: add missing lockres put in dlm_mig_lockres_handler
        mm/page_alloc.c: fix watermark check in __zone_watermark_ok()
        drivers/misc/sgi-gru/grufile.c: fix info leak in gru_get_config_info()
        aio: fix io_destroy() regression by using call_rcu()
        rtc-at91rm9200: use shadow IMR on at91sam9x5
        rtc-at91rm9200: add shadow interrupt mask
        rtc-at91rm9200: refactor interrupt-register handling
        rtc-at91rm9200: add configuration support
        rtc-at91rm9200: add match-table compile guard
        fs/ocfs2/namei.c: remove unecessary ERROR when removing non-empty directory
        swap: avoid read_swap_cache_async() race to deadlock while waiting on discard I/O completion
        drivers/rtc/rtc-twl.c: fix missing device_init_wakeup() when booted with device tree
        cciss: fix broken mutex usage in ioctl
        audit: wait_for_auditd() should use TASK_UNINTERRUPTIBLE
        drivers/rtc/rtc-cmos.c: fix accidentally enabling rtc channel
        ...
      a568fa1c
    • Alex Shi's avatar
      include/linux/math64.h: add div64_ul() · c2853c8d
      Alex Shi authored
      There is div64_long() to handle the s64/long division, but no mocro do
      u64/ul division.  It is necessary in some scenarios, so add this
      function.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: default avatarAlex Shi <alex.shi@intel.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      c2853c8d
    • Johannes Weiner's avatar
      mm: memcontrol: fix lockless reclaim hierarchy iterator · 89dc991f
      Johannes Weiner authored
      The lockless reclaim hierarchy iterator currently has a misplaced
      barrier that can lead to use-after-free crashes.
      
      The reclaim hierarchy iterator consist of a sequence count and a
      position pointer that are read and written locklessly, with memory
      barriers enforcing ordering.
      
      The write side sets the position pointer first, then updates the
      sequence count to "publish" the new position.  Likewise, the read side
      must read the sequence count first, then the position.  If the sequence
      count is up to date, it's guaranteed that the position is up to date as
      well:
      
        writer:                         reader:
        iter->position = position       if iter->sequence == expected:
        smp_wmb()                           smp_rmb()
        iter->sequence = sequence           position = iter->position
      
      However, the read side barrier is currently misplaced, which can lead to
      dereferencing stale position pointers that no longer point to valid
      memory.  Fix this.
      Signed-off-by: default avatarJohannes Weiner <hannes@cmpxchg.org>
      Reported-by: default avatarTejun Heo <tj@kernel.org>
      Reviewed-by: default avatarTejun Heo <tj@kernel.org>
      Acked-by: default avatarMichal Hocko <mhocko@suse.cz>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Glauber Costa <glommer@parallels.com>
      Cc: <stable@kernel.org>		[3.10+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      89dc991f
    • Akinobu Mita's avatar
      frontswap: fix incorrect zeroing and allocation size for frontswap_map · 7b57976d
      Akinobu Mita authored
      The bitmap accessed by bitops must have enough size to hold the required
      numbers of bits rounded up to a multiple of BITS_PER_LONG.  And the
      bitmap must not be zeroed by memset() if the number of bits cleared is
      not a multiple of BITS_PER_LONG.
      
      This fixes incorrect zeroing and allocation size for frontswap_map.  The
      incorrect zeroing part doesn't cause any problem because frontswap_map
      is freed just after zeroing.  But the wrongly calculated allocation size
      may cause the problem.
      
      For 32bit systems, the allocation size of frontswap_map is about twice
      as large as required size.  For 64bit systems, the allocation size is
      smaller than requeired if the number of bits is not a multiple of
      BITS_PER_LONG.
      Signed-off-by: default avatarAkinobu Mita <akinobu.mita@gmail.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      7b57976d
    • Chen Gang's avatar
      kernel/audit_tree.c:audit_add_tree_rule(): protect `rule' from kill_rules() · 736f3203
      Chen Gang authored
      audit_add_tree_rule() must set 'rule->tree = NULL;' firstly, to protect
      the rule itself freed in kill_rules().
      
      The reason is when it is killed, the 'rule' itself may have already
      released, we should not access it.  one example: we add a rule to an
      inode, just at the same time the other task is deleting this inode.
      
      The work flow for adding a rule:
      
          audit_receive() -> (need audit_cmd_mutex lock)
            audit_receive_skb() ->
              audit_receive_msg() ->
                audit_receive_filter() ->
                  audit_add_rule() ->
                    audit_add_tree_rule() -> (need audit_filter_mutex lock)
                      ...
                      unlock audit_filter_mutex
                      get_tree()
                      ...
                      iterate_mounts() -> (iterate all related inodes)
                        tag_mount() ->
                          tag_trunk() ->
                            create_trunk() -> (assume it is 1st rule)
                              fsnotify_add_mark() ->
                                fsnotify_add_inode_mark() ->  (add mark to inode->i_fsnotify_marks)
                              ...
                              get_tree(); (each inode will get one)
                      ...
                      lock audit_filter_mutex
      
      The work flow for deleting an inode:
      
          __destroy_inode() ->
           fsnotify_inode_delete() ->
             __fsnotify_inode_delete() ->
              fsnotify_clear_marks_by_inode() ->  (get mark from inode->i_fsnotify_marks)
                fsnotify_destroy_mark() ->
                 fsnotify_destroy_mark_locked() ->
                   audit_tree_freeing_mark() ->
                     evict_chunk() ->
                       ...
                       tree->goner = 1
                       ...
                       kill_rules() ->   (assume current->audit_context == NULL)
                         call_rcu() ->   (rule->tree != NULL)
                           audit_free_rule_rcu() ->
                             audit_free_rule()
                       ...
                       audit_schedule_prune() ->  (assume current->audit_context == NULL)
                         kthread_run() ->    (need audit_cmd_mutex and audit_filter_mutex lock)
                           prune_one() ->    (delete it from prue_list)
                             put_tree(); (match the original get_tree above)
      Signed-off-by: default avatarChen Gang <gang.chen@asianux.com>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      736f3203
    • Naoya Horiguchi's avatar
      mm: migration: add migrate_entry_wait_huge() · 30dad309
      Naoya Horiguchi authored
      When we have a page fault for the address which is backed by a hugepage
      under migration, the kernel can't wait correctly and do busy looping on
      hugepage fault until the migration finishes.  As a result, users who try
      to kick hugepage migration (via soft offlining, for example) occasionally
      experience long delay or soft lockup.
      
      This is because pte_offset_map_lock() can't get a correct migration entry
      or a correct page table lock for hugepage.  This patch introduces
      migration_entry_wait_huge() to solve this.
      Signed-off-by: default avatarNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Reviewed-by: default avatarRik van Riel <riel@redhat.com>
      Reviewed-by: default avatarWanpeng Li <liwanp@linux.vnet.ibm.com>
      Reviewed-by: default avatarMichal Hocko <mhocko@suse.cz>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: <stable@vger.kernel.org>	[2.6.35+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      30dad309
    • Xue jiufei's avatar
      ocfs2: add missing lockres put in dlm_mig_lockres_handler · 27749f2f
      Xue jiufei authored
      dlm_mig_lockres_handler() is missing a dlm_lockres_put() on an error path.
      Signed-off-by: default avatarjoyce <xuejiufei@huawei.com>
      Reviewed-by: default avatarshencanquan <shencanquan@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      27749f2f
    • Tomasz Stanislawski's avatar
      mm/page_alloc.c: fix watermark check in __zone_watermark_ok() · 026b0814
      Tomasz Stanislawski authored
      The watermark check consists of two sub-checks.  The first one is:
      
      	if (free_pages <= min + lowmem_reserve)
      		return false;
      
      The check assures that there is minimal amount of RAM in the zone.  If
      CMA is used then the free_pages is reduced by the number of free pages
      in CMA prior to the over-mentioned check.
      
      	if (!(alloc_flags & ALLOC_CMA))
      		free_pages -= zone_page_state(z, NR_FREE_CMA_PAGES);
      
      This prevents the zone from being drained from pages available for
      non-movable allocations.
      
      The second check prevents the zone from getting too fragmented.
      
      	for (o = 0; o < order; o++) {
      		free_pages -= z->free_area[o].nr_free << o;
      		min >>= 1;
      		if (free_pages <= min)
      			return false;
      	}
      
      The field z->free_area[o].nr_free is equal to the number of free pages
      including free CMA pages.  Therefore the CMA pages are subtracted twice.
      This may cause a false positive fail of __zone_watermark_ok() if the CMA
      area gets strongly fragmented.  In such a case there are many 0-order
      free pages located in CMA.  Those pages are subtracted twice therefore
      they will quickly drain free_pages during the check against
      fragmentation.  The test fails even though there are many free non-cma
      pages in the zone.
      
      This patch fixes this issue by subtracting CMA pages only for a purpose of
      (free_pages <= min + lowmem_reserve) check.
      
      Laura said:
      
        We were observing allocation failures of higher order pages (order 5 =
        128K typically) under tight memory conditions resulting in driver
        failure.  The output from the page allocation failure showed plenty of
        free pages of the appropriate order/type/zone and mostly CMA pages in
        the lower orders.
      
        For full disclosure, we still observed some page allocation failures
        even after applying the patch but the number was drastically reduced and
        those failures were attributed to fragmentation/other system issues.
      Signed-off-by: default avatarTomasz Stanislawski <t.stanislaws@samsung.com>
      Signed-off-by: default avatarKyungmin Park <kyungmin.park@samsung.com>
      Tested-by: default avatarLaura Abbott <lauraa@codeaurora.org>
      Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
      Acked-by: default avatarMinchan Kim <minchan@kernel.org>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Tested-by: default avatarMarek Szyprowski <m.szyprowski@samsung.com>
      Cc: <stable@vger.kernel.org>	[3.7+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      026b0814
    • Dan Carpenter's avatar
      drivers/misc/sgi-gru/grufile.c: fix info leak in gru_get_config_info() · 282c4c0e
      Dan Carpenter authored
      The "info.fill" array isn't initialized so it can leak uninitialized stack
      information to user space.
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Acked-by: default avatarRobin Holt <holt@sgi.com>
      Acked-by: default avatarDimitri Sivanich <sivanich@sgi.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      282c4c0e
    • Kent Overstreet's avatar
      aio: fix io_destroy() regression by using call_rcu() · 4fcc712f
      Kent Overstreet authored
      There was a regression introduced by 36f55889 ("aio: refcounting
      cleanup"), reported by Jens Axboe - the refcounting cleanup switched to
      using RCU in the shutdown path, but the synchronize_rcu() was done in
      the context of the io_destroy() syscall greatly increasing the time it
      could block.
      
      This patch switches it to call_rcu() and makes shutdown asynchronous
      (more asynchronous than it was originally; before the refcount changes
      io_destroy() would still wait on pending kiocbs).
      
      Note that there's a global quota on the max outstanding kiocbs, and that
      quota must be manipulated synchronously; otherwise io_setup() could
      return -EAGAIN when there isn't quota available, and userspace won't
      have any way of waiting until shutdown of the old kioctxs has finished
      (besides busy looping).
      
      So we release our quota before kioctx shutdown has finished, which
      should be fine since the quota never corresponded to anything real
      anyways.
      Signed-off-by: default avatarKent Overstreet <koverstreet@google.com>
      Cc: Zach Brown <zab@redhat.com>
      Cc: Felipe Balbi <balbi@ti.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Reported-by: default avatarJens Axboe <axboe@kernel.dk>
      Tested-by: default avatarJens Axboe <axboe@kernel.dk>
      Cc: Asai Thambi S P <asamymuthupa@micron.com>
      Cc: Selvan Mani <smani@micron.com>
      Cc: Sam Bradshaw <sbradshaw@micron.com>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarBenjamin LaHaise <bcrl@kvack.org>
      Tested-by: default avatarBenjamin LaHaise <bcrl@kvack.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      4fcc712f
    • Johan Hovold's avatar
      rtc-at91rm9200: use shadow IMR on at91sam9x5 · bba00e59
      Johan Hovold authored
      Add support for the at91sam9x5-family which must use the shadow
      interrupt mask due to a hardware issue (causing RTC_IMR to always be
      zero).
      Signed-off-by: default avatarJohan Hovold <jhovold@gmail.com>
      Acked-by: default avatarNicolas Ferre <nicolas.ferre@atmel.com>
      Cc: Douglas Gilbert <dgilbert@interlog.com>
      Cc: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
      Cc: Ludovic Desroches <ludovic.desroches@atmel.com>
      Cc: Robert Nelson <Robert.Nelson@digikey.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      bba00e59
    • Johan Hovold's avatar
      rtc-at91rm9200: add shadow interrupt mask · e9f08bbe
      Johan Hovold authored
      Add shadow interrupt-mask register which can be used on SoCs where the
      actual hardware register is broken.
      
      Note that some care needs to be taken to make sure the shadow mask
      corresponds to the actual hardware state.  The added overhead is not an
      issue for the non-broken SoCs due to the relatively infrequent
      interrupt-mask updates.  We do, however, only use the shadow mask value
      as a fall-back when it actually needed as there is still a theoretical
      possibility that the mask is incorrect (see the code for details).
      Signed-off-by: default avatarJohan Hovold <jhovold@gmail.com>
      Acked-by: default avatarNicolas Ferre <nicolas.ferre@atmel.com>
      Cc: Douglas Gilbert <dgilbert@interlog.com>
      Cc: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
      Cc: Ludovic Desroches <ludovic.desroches@atmel.com>
      Cc: Robert Nelson <Robert.Nelson@digikey.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      e9f08bbe
    • Johan Hovold's avatar
      rtc-at91rm9200: refactor interrupt-register handling · e304fcd0
      Johan Hovold authored
      Add accessors for the interrupt register.
      
      This will allow us to easily add a shadow interrupt-mask register to use
      on SoCs where the interrupt-mask register cannot be used.
      Signed-off-by: default avatarJohan Hovold <jhovold@gmail.com>
      Acked-by: default avatarNicolas Ferre <nicolas.ferre@atmel.com>
      Cc: Douglas Gilbert <dgilbert@interlog.com>
      Cc: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
      Cc: Ludovic Desroches <ludovic.desroches@atmel.com>
      Cc: Robert Nelson <Robert.Nelson@digikey.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      e304fcd0
    • Johan Hovold's avatar
      rtc-at91rm9200: add configuration support · de645475
      Johan Hovold authored
      Add configuration support which can be used to implement SoC-specific
      workarounds for broken hardware.
      Signed-off-by: default avatarJohan Hovold <jhovold@gmail.com>
      Acked-by: default avatarNicolas Ferre <nicolas.ferre@atmel.com>
      Cc: Douglas Gilbert <dgilbert@interlog.com>
      Cc: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
      Cc: Ludovic Desroches <ludovic.desroches@atmel.com>
      Cc: Robert Nelson <Robert.Nelson@digikey.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      de645475
    • Johan Hovold's avatar
      rtc-at91rm9200: add match-table compile guard · 558c61e5
      Johan Hovold authored
      The members of Atmel's at91sam9x5 family (9x5) have a broken RTC
      interrupt mask register (AT91_RTC_IMR).  It does not reflect enabled
      interrupts but instead always returns zero.
      
      The kernel's rtc-at91rm9200 driver handles the RTC for the 9x5 family.
      Currently when the date/time is set, an interrupt is generated and this
      driver neglects to handle the interrupt.  The kernel complains about the
      un-handled interrupt and disables it henceforth.  This not only breaks
      the RTC function, but since that interrupt is shared (Atmel's SYS
      interrupt) then other things break as well (e.g.  the debug port no
      longer accepts characters).
      
      Tested on the at91sam9g25.  Bug confirmed by Atmel.
      
      This patch (of 5):
      
      Add missing match-table compile guard.
      Signed-off-by: default avatarJohan Hovold <jhovold@gmail.com>
      Acked-by: default avatarNicolas Ferre <nicolas.ferre@atmel.com>
      Cc: Douglas Gilbert <dgilbert@interlog.com>
      Cc: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
      Cc: Ludovic Desroches <ludovic.desroches@atmel.com>
      Cc: Robert Nelson <Robert.Nelson@digikey.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      558c61e5
    • Goldwyn Rodrigues's avatar
      fs/ocfs2/namei.c: remove unecessary ERROR when removing non-empty directory · e0991271
      Goldwyn Rodrigues authored
      While removing a non-empty directory, the kernel dumps a message:
      
        (rmdir,21743,1):ocfs2_unlink:953 ERROR: status = -39
      
      Suppress the error message from being printed in the dmesg so users
      don't panic.
      Signed-off-by: default avatarGoldwyn Rodrigues <rgoldwyn@suse.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Acked-by: default avatarSunil Mushran <sunil.mushran@gmail.com>
      Reviewed-by: default avatarJie Liu <jeff.liu@oracle.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      e0991271
    • Rafael Aquini's avatar
      swap: avoid read_swap_cache_async() race to deadlock while waiting on discard I/O completion · cbab0e4e
      Rafael Aquini authored
      read_swap_cache_async() can race against get_swap_page(), and stumble
      across a SWAP_HAS_CACHE entry in the swap map whose page wasn't brought
      into the swapcache yet.
      
      This transient swap_map state is expected to be transitory, but the
      actual placement of discard at scan_swap_map() inserts a wait for I/O
      completion thus making the thread at read_swap_cache_async() to loop
      around its -EEXIST case, while the other end at get_swap_page() is
      scheduled away at scan_swap_map().  This can leave the system deadlocked
      if the I/O completion happens to be waiting on the CPU waitqueue where
      read_swap_cache_async() is busy looping and !CONFIG_PREEMPT.
      
      This patch introduces a cond_resched() call to make the aforementioned
      read_swap_cache_async() busy loop condition to bail out when necessary,
      thus avoiding the subtle race window.
      Signed-off-by: default avatarRafael Aquini <aquini@redhat.com>
      Acked-by: default avatarJohannes Weiner <hannes@cmpxchg.org>
      Acked-by: default avatarKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Acked-by: default avatarHugh Dickins <hughd@google.com>
      Cc: Shaohua Li <shli@kernel.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      cbab0e4e
    • Tony Lindgren's avatar
      drivers/rtc/rtc-twl.c: fix missing device_init_wakeup() when booted with device tree · 24b8256a
      Tony Lindgren authored
      When booted in legacy mode device_init_wakeup() gets called by
      drivers/mfd/twl-core.c when the children are initialized.  However, when
      booted using device tree, the children are created with
      of_platform_populate() instead add_children().
      
      This means that the RTC driver will not have device_init_wakeup() set,
      and we need to call it from the driver probe like RTC drivers typically
      do.
      
      Without this we cannot test PM wake-up events on omaps for cases where
      there may not be any physical wake-up event.
      Signed-off-by: default avatarTony Lindgren <tony@atomide.com>
      Reported-by: default avatarKevin Hilman <khilman@linaro.org>
      Cc: Alessandro Zummo <a.zummo@towertech.it>
      Cc: Jingoo Han <jg1.han@samsung.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      24b8256a
    • Stephen M. Cameron's avatar
      cciss: fix broken mutex usage in ioctl · 03f47e88
      Stephen M. Cameron authored
      If a new logical drive is added and the CCISS_REGNEWD ioctl is invoked
      (as is normal with the Array Configuration Utility) the process will
      hang as below.  It attempts to acquire the same mutex twice, once in
      do_ioctl() and once in cciss_unlocked_open().  The BKL was recursive,
      the mutex isn't.
      
        Linux version 3.10.0-rc2 (scameron@localhost.localdomain) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-3) (GCC) ) #1 SMP Fri May 24 14:32:12 CDT 2013
        [...]
        acu             D 0000000000000001     0  3246   3191 0x00000080
        Call Trace:
          schedule+0x29/0x70
          schedule_preempt_disabled+0xe/0x10
          __mutex_lock_slowpath+0x17b/0x220
          mutex_lock+0x2b/0x50
          cciss_unlocked_open+0x2f/0x110 [cciss]
          __blkdev_get+0xd3/0x470
          blkdev_get+0x5c/0x1e0
          register_disk+0x182/0x1a0
          add_disk+0x17c/0x310
          cciss_add_disk+0x13a/0x170 [cciss]
          cciss_update_drive_info+0x39b/0x480 [cciss]
          rebuild_lun_table+0x258/0x370 [cciss]
          cciss_ioctl+0x34f/0x470 [cciss]
          do_ioctl+0x49/0x70 [cciss]
          __blkdev_driver_ioctl+0x28/0x30
          blkdev_ioctl+0x200/0x7b0
          block_ioctl+0x3c/0x40
          do_vfs_ioctl+0x89/0x350
          SyS_ioctl+0xa1/0xb0
          system_call_fastpath+0x16/0x1b
      
      This mutex usage was added into the ioctl path when the big kernel lock
      was removed.  As it turns out, these paths are all thread safe anyway
      (or can easily be made so) and we don't want ioctl() to be single
      threaded in any case.
      Signed-off-by: default avatarStephen M. Cameron <scameron@beardog.cce.hp.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Mike Miller <mike.miller@hp.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      03f47e88
    • Oleg Nesterov's avatar
      audit: wait_for_auditd() should use TASK_UNINTERRUPTIBLE · f000cfdd
      Oleg Nesterov authored
      audit_log_start() does wait_for_auditd() in a loop until
      audit_backlog_wait_time passes or audit_skb_queue has a room.
      
      If signal_pending() is true this becomes a busy-wait loop, schedule() in
      TASK_INTERRUPTIBLE won't block.
      
      Thanks to Guy for fully investigating and explaining the problem.
      
      (akpm: that'll cause the system to lock up on a non-preemptible
      uniprocessor kernel)
      
      (Guy: "Our customer was in fact running a uniprocessor machine, and they
      reported a system hang.")
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      Reported-by: default avatarGuy Streeter <streeter@redhat.com>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      f000cfdd
    • Derek Basehore's avatar
      drivers/rtc/rtc-cmos.c: fix accidentally enabling rtc channel · ebf8d6c8
      Derek Basehore authored
      During resume, we call hpet_rtc_timer_init after masking an irq bit in
      hpet.  This will cause the call to hpet_disable_rtc_channel to be undone
      if RTC_AIE is the only bit not masked.
      
      Allowing the cmos interrupt handler to run before resuming caused some
      issues where the timer for the alarm was not removed.  This would cause
      other, later timers to not be cleared, so utilities such as hwclock
      would time out when waiting for the update interrupt.
      
      [akpm@linux-foundation.org: coding-style tweak]
      Signed-off-by: default avatarDerek Basehore <dbasehore@chromium.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      ebf8d6c8
    • Dmitry Osipenko's avatar
      drivers/rtc/rtc-tps6586x.c: device wakeup flags correction · 5a280844
      Dmitry Osipenko authored
      Use device_init_wakeup() instead of device_set_wakeup_capable() and move
      it before rtc dev registering.  This fixes alarmtimer not registered
      when tps6586x rtc is the only wakeup compatible rtc in the system.
      Signed-off-by: default avatarDmitry Osipenko <digetx@gmail.com>
      Cc: Laxman Dewangan <ldewangan@nvidia.com>
      Cc: Jingoo Han <jg1.han@samsung.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      5a280844
    • Andrey Vagin's avatar
      memcg: don't initialize kmem-cache destroying work for root caches · f101a946
      Andrey Vagin authored
      struct memcg_cache_params has a union.  Different parts of this union
      are used for root and non-root caches.  A part with destroying work is
      used only for non-root caches.
      
        BUG: unable to handle kernel paging request at 0000000fffffffe0
        IP: kmem_cache_alloc+0x41/0x1f0
        Modules linked in: netlink_diag af_packet_diag udp_diag tcp_diag inet_diag unix_diag ip6table_filter ip6_tables i2c_piix4 virtio_net virtio_balloon microcode i2c_core pcspkr floppy
        CPU: 0 PID: 1929 Comm: lt-vzctl Tainted: G      D      3.10.0-rc1+ #2
        Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
        RIP: kmem_cache_alloc+0x41/0x1f0
        Call Trace:
         getname_flags.part.34+0x30/0x140
         getname+0x38/0x60
         do_sys_open+0xc5/0x1e0
         SyS_open+0x22/0x30
         system_call_fastpath+0x16/0x1b
        Code: f4 53 48 83 ec 18 8b 05 8e 53 b7 00 4c 8b 4d 08 21 f0 a8 10 74 0d 4c 89 4d c0 e8 1b 76 4a 00 4c 8b 4d c0 e9 92 00 00 00 4d 89 f5 <4d> 8b 45 00 65 4c 03 04 25 48 cd 00 00 49 8b 50 08 4d 8b 38 49
        RIP  [<ffffffff8116b641>] kmem_cache_alloc+0x41/0x1f0
      Signed-off-by: default avatarAndrey Vagin <avagin@openvz.org>
      Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
      Cc: Glauber Costa <glommer@parallels.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Balbir Singh <bsingharora@gmail.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Reviewed-by: default avatarMichal Hocko <mhocko@suse.cz>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: <stable@vger.kernel.org>	[3.9.x]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      f101a946
    • Xiaowei.Hu's avatar
      ocfs2: ocfs2_prep_new_orphaned_file() should return ret · 7869e590
      Xiaowei.Hu authored
      If an error occurs, for example an EIO in __ocfs2_prepare_orphan_dir,
      ocfs2_prep_new_orphaned_file will release the inode_ac, then when the
      caller of ocfs2_prep_new_orphaned_file gets a 0 return, it will refer to
      a NULL ocfs2_alloc_context struct in the following functions.  A kernel
      panic happens.
      Signed-off-by: default avatar"Xiaowei.Hu" <xiaowei.hu@oracle.com>
      Reviewed-by: default avatarshencanquan <shencanquan@huawei.com>
      Acked-by: default avatarSunil Mushran <sunil.mushran@gmail.com>
      Cc: Joe Jin <joe.jin@oracle.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      7869e590
    • Chen Gang's avatar
      lib/mpi/mpicoder.c: looping issue, need stop when equal to zero, found by 'EXTRA_FLAGS=-W'. · 5402b804
      Chen Gang authored
      For 'while' looping, need stop when 'nbytes == 0', or will cause issue.
      ('nbytes' is size_t which is always bigger or equal than zero).
      
      The related warning: (with EXTRA_CFLAGS=-W)
      
        lib/mpi/mpicoder.c:40:2: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits]
      Signed-off-by: default avatarChen Gang <gang.chen@asianux.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: David Howells <dhowells@redhat.com>
      Cc: James Morris <james.l.morris@oracle.com>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Acked-by: default avatarDavid Howells <dhowells@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      5402b804
    • Kees Cook's avatar
      kmsg: honor dmesg_restrict sysctl on /dev/kmsg · 637241a9
      Kees Cook authored
      The dmesg_restrict sysctl currently covers the syslog method for access
      dmesg, however /dev/kmsg isn't covered by the same protections.  Most
      people haven't noticed because util-linux dmesg(1) defaults to using the
      syslog method for access in older versions.  With util-linux dmesg(1)
      defaults to reading directly from /dev/kmsg.
      
      To fix /dev/kmsg, let's compare the existing interfaces and what they
      allow:
      
       - /proc/kmsg allows:
        - open (SYSLOG_ACTION_OPEN) if CAP_SYSLOG since it uses a destructive
          single-reader interface (SYSLOG_ACTION_READ).
        - everything, after an open.
      
       - syslog syscall allows:
        - anything, if CAP_SYSLOG.
        - SYSLOG_ACTION_READ_ALL and SYSLOG_ACTION_SIZE_BUFFER, if
          dmesg_restrict==0.
        - nothing else (EPERM).
      
      The use-cases were:
       - dmesg(1) needs to do non-destructive SYSLOG_ACTION_READ_ALLs.
       - sysklog(1) needs to open /proc/kmsg, drop privs, and still issue the
         destructive SYSLOG_ACTION_READs.
      
      AIUI, dmesg(1) is moving to /dev/kmsg, and systemd-journald doesn't
      clear the ring buffer.
      
      Based on the comments in devkmsg_llseek, it sounds like actions besides
      reading aren't going to be supported by /dev/kmsg (i.e.
      SYSLOG_ACTION_CLEAR), so we have a strict subset of the non-destructive
      syslog syscall actions.
      
      To this end, move the check as Josh had done, but also rename the
      constants to reflect their new uses (SYSLOG_FROM_CALL becomes
      SYSLOG_FROM_READER, and SYSLOG_FROM_FILE becomes SYSLOG_FROM_PROC).
      SYSLOG_FROM_READER allows non-destructive actions, and SYSLOG_FROM_PROC
      allows destructive actions after a capabilities-constrained
      SYSLOG_ACTION_OPEN check.
      
       - /dev/kmsg allows:
        - open if CAP_SYSLOG or dmesg_restrict==0
        - reading/polling, after open
      
      Addresses https://bugzilla.redhat.com/show_bug.cgi?id=903192
      
      [akpm@linux-foundation.org: use pr_warn_once()]
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Reported-by: default avatarChristian Kujau <lists@nerdbynature.de>
      Tested-by: default avatarJosh Boyer <jwboyer@redhat.com>
      Cc: Kay Sievers <kay@vrfy.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      637241a9
    • Robin Holt's avatar
      reboot: rigrate shutdown/reboot to boot cpu · cf7df378
      Robin Holt authored
      We recently noticed that reboot of a 1024 cpu machine takes approx 16
      minutes of just stopping the cpus.  The slowdown was tracked to commit
      f96972f2 ("kernel/sys.c: call disable_nonboot_cpus() in
      kernel_restart()").
      
      The current implementation does all the work of hot removing the cpus
      before halting the system.  We are switching to just migrating to the
      boot cpu and then continuing with shutdown/reboot.
      
      This also has the effect of not breaking x86's command line parameter
      for specifying the reboot cpu.  Note, this code was shamelessly copied
      from arch/x86/kernel/reboot.c with bits removed pertaining to the
      reboot_cpu command line parameter.
      Signed-off-by: default avatarRobin Holt <holt@sgi.com>
      Tested-by: default avatarShawn Guo <shawn.guo@linaro.org>
      Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Russ Anderson <rja@sgi.com>
      Cc: Robin Holt <holt@sgi.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      cf7df378
    • Srivatsa S. Bhat's avatar
      CPU hotplug: provide a generic helper to disable/enable CPU hotplug · 16e53dbf
      Srivatsa S. Bhat authored
      There are instances in the kernel where we would like to disable CPU
      hotplug (from sysfs) during some important operation.  Today the freezer
      code depends on this and the code to do it was kinda tailor-made for
      that.
      
      Restructure the code and make it generic enough to be useful for other
      usecases too.
      Signed-off-by: default avatarSrivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: default avatarRobin Holt <holt@sgi.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Russ Anderson <rja@sgi.com>
      Cc: Robin Holt <holt@sgi.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Shawn Guo <shawn.guo@linaro.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      16e53dbf
    • Linus Torvalds's avatar
      Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus · 1a9c3d68
      Linus Torvalds authored
      Pull MIPS fixes from Ralf Baechle:
       "Resurrect Alchemy platforms by invoking the WAIT instructions with
        interrupts enabled.  This still leaves the race condition between
        testing TIF_NEED_RESCHED and the WAIT instruction for Alchemy
        platforms which need a different fix than other MIPS platforms.  But
        at least it gets MIPS platforms flying again.
      
        There are also fixes for two build errors (CONFIG_FTRACE=y with
        CONFIG_DYNAMIC_FTRACE=n) and CONFIG_VIRTUALIZATION without CONFIG_KVM"
      
      * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
        MIPS: ftrace: Add missing CONFIG_DYNAMIC_FTRACE
        MIPS: include: mmu_context.h: Replace VIRTUALIZATION with KVM
        MIPS: Alchemy: fix wait function
      1a9c3d68
    • Linus Torvalds's avatar
      Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · 6673de0e
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Just some GMA500 memory leaks and i915 regression fix due to a
        regression fix"
      
      * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
        drm/i915: prefer VBT modes for SVDO-LVDS over EDID
        drm/i915: Enable hotplug interrupts after querying hw capabilities.
        drm/i915: Fix hotplug interrupt enabling for SDVOC
        drm/gma500/cdv: Fix cursor gem obj referencing on cdv
        drm/gma500/psb: Fix cursor gem obj referencing on psb
        drm/gma500/cdv: Unpin framebuffer on crtc disable
        drm/gma500/psb: Unpin framebuffer on crtc disable
        drm/gma500: Add fb gtt offset to fb base
      6673de0e
    • Linus Torvalds's avatar
      Merge tag 'trace-fixes-v3.10-rc5' of... · 45d53766
      Linus Torvalds authored
      Merge tag 'trace-fixes-v3.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
      
      Pull tracing fix from Steven Rostedt:
       "Yoshihiro Yunomae fixed a regression in the output format when using
        one of the counter clocks.
      
        The new multibuffer code changed the trace_clock file to update the
        trace instances tr->clock_id but the actual traces still used the
        value from the obsolete global variable trace_clock_id"
      
      * tag 'trace-fixes-v3.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        tracing: Fix outputting formats of x86-tsc and counter when use trace_clock
      45d53766
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client · 8d7a8fe2
      Linus Torvalds authored
      Pull ceph fixes from Sage Weil:
       "There is a pair of fixes for double-frees in the recent bundle for
        3.10, a couple of fixes for long-standing bugs (sleep while atomic and
        an endianness fix), and a locking fix that can be triggered when osds
        are going down"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
        rbd: fix cleanup in rbd_add()
        rbd: don't destroy ceph_opts in rbd_add()
        ceph: ceph_pagelist_append might sleep while atomic
        ceph: add cpu_to_le32() calls when encoding a reconnect capability
        libceph: must hold mutex for reset_changed_osds()
      8d7a8fe2
    • Linus Torvalds's avatar
      Merge branch 'fixes-3.10' of git://git.infradead.org/users/willy/linux-nvme · 77293e21
      Linus Torvalds authored
      Pull NVMe fixes from Matthew Wilcox.
      
      * 'fixes-3.10' of git://git.infradead.org/users/willy/linux-nvme:
        NVMe: Add MSI support
        NVMe: Use dma_set_mask() correctly
        Return the result from user admin command IOCTL even in case of failure
        NVMe: Do not cancel command multiple times
        NVMe: fix error return code in nvme_submit_bio_queue()
        NVMe: check for integer overflow in nvme_map_user_pages()
        MAINTAINERS: update NVM EXPRESS DRIVER file list
        NVMe: Fix a signedness bug in nvme_trans_modesel_get_mp
        NVMe: Remove redundant version.h header include
      77293e21
  2. 11 Jun, 2013 6 commits