1. 14 May, 2015 8 commits
    • mm: zone_reclaim: compaction: don't depend on kswapd to invoke reset_isolation_suitable · af24516e
      Andrea Arcangeli authored
      If kswapd never needs to run (only __GFP_NO_KSWAPD allocations and
      plenty of free memory), compaction is otherwise crippled and stops
      running for a while after the free/isolation cursors meet. After that,
      allocations can fail for a full cycle of compaction_deferred, until
      compaction_restarting finally resets it again.
      
      Stopping compaction for a full cycle after the cursors meet, even if
      it never failed and it's not going to fail, doesn't make sense.
      
      We already throttle compaction CPU utilization using
      defer_compaction. We shouldn't prevent compaction from running after
      each pass completes when the cursors meet, unless it failed.
      
      This makes direct compaction functional again. The throttling of
      direct compaction is still controlled by the defer_compaction
      logic.
      
      kswapd still won't risk resetting compaction, and it will wait for
      direct compaction to do so. Not sure if this is ideal but it at least
      decreases the risk of kswapd doing too much work. kswapd will only run
      one pass of compaction until some allocation invokes compaction again.
      
      This decreased reliability of compaction was introduced in commit
      62997027.
      Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
      Reviewed-by: Rik van Riel <riel@redhat.com>
      Acked-by: Rafael Aquini <aquini@redhat.com>
      Acked-by: Mel Gorman <mgorman@suse.de>
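The throttling that the commit above relies on (defer_compaction and compaction_deferred) can be pictured with a small user-space model. This is only a sketch under stated assumptions: `struct zone_model`, the `model_*` functions and the cap `MAX_DEFER_SHIFT` are hypothetical names for illustration, not the kernel's code. The point it shows is that back-off is driven by an actual failure, not by a pass merely completing.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of per-zone compaction deferral state. */
#define MAX_DEFER_SHIFT 6	/* assumed cap: skip at most (1 << 6) attempts */

struct zone_model {
	unsigned int considered;	/* attempts seen since the last failure */
	unsigned int defer_shift;	/* log2 of the number of attempts to skip */
};

/* Called only when compaction actually failed: back off exponentially. */
static void model_defer_compaction(struct zone_model *z)
{
	z->considered = 0;
	if (z->defer_shift < MAX_DEFER_SHIFT)
		z->defer_shift++;
}

/* Returns true while the caller should skip compaction in this zone. */
static bool model_compaction_deferred(struct zone_model *z)
{
	unsigned int limit = 1U << z->defer_shift;

	assert(z->defer_shift <= MAX_DEFER_SHIFT);
	if (z->considered < limit)
		z->considered++;
	return z->considered < limit;
}
```

In this model a zone that has never failed is never deferred, while each recorded failure doubles the number of attempts that are skipped before compaction is tried again.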
    • mm: zone_reclaim: compaction: scan all memory with /proc/sys/vm/compact_memory · 02eaa78b
      Andrea Arcangeli authored
      Reset the stats so /proc/sys/vm/compact_memory will scan all memory.
      Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
      Reviewed-by: Rik van Riel <riel@redhat.com>
      Acked-by: Rafael Aquini <aquini@redhat.com>
      Acked-by: Mel Gorman <mgorman@suse.de>
    • mm: zone_reclaim: remove ZONE_RECLAIM_LOCKED · 43dc77b0
      Andrea Arcangeli authored
      Zone reclaim locked breaks zone_reclaim_mode=1. If more than one
      thread allocates memory at the same time, it forces a premature
      allocation into remote NUMA nodes even when there's plenty of clean
      cache to reclaim in the local nodes.
      Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
      Reviewed-by: Rik van Riel <riel@redhat.com>
      Acked-by: Rafael Aquini <aquini@redhat.com>
      Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
    • mm: fix the theoretical compound_lock() vs prep_new_page() race · 93585352
      Oleg Nesterov authored
      get/put_page(thp_tail) paths do get_page_unless_zero(page_head) +
      compound_lock(). In theory this page_head can be already freed and
      reallocated as alloc_pages(__GFP_COMP, smaller_order). In this case
      get_page_unless_zero() can succeed right after set_page_refcounted(),
      and compound_lock() can race with the non-atomic __SetPageHead().
      
      Perhaps we should rework the thp locking (under discussion), but
      until then this patch moves set_page_refcounted() and adds wmb()
      to ensure that page->_count != 0 comes as the last change.
      
      I am not sure about other callers of set_page_refcounted(), but at
      first glance they look fine to me.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
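The ordering idea in the commit above (finish all non-atomic setup, then make the refcount non-zero as the last visible change) can be sketched in user space with C11 atomics. Everything here is an assumption for illustration: `struct page_model`, `MODEL_PG_HEAD` and the `model_*` helpers are hypothetical stand-ins, and the release store only plays the role that wmb() plays in the kernel patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: a flag word plus a refcount. */
struct page_model {
	unsigned long flags;	/* written non-atomically during setup */
	atomic_int count;	/* 0 means "free, do not touch" */
};

#define MODEL_PG_HEAD 0x1UL

/* Initialise every field first, then publish the refcount last. */
static void model_prep_new_page(struct page_model *p, bool compound)
{
	assert(atomic_load_explicit(&p->count, memory_order_relaxed) == 0);
	p->flags = compound ? MODEL_PG_HEAD : 0;	/* plain store, like __SetPageHead() */
	/* Release ordering stands in for the wmb() named in the commit message. */
	atomic_store_explicit(&p->count, 1, memory_order_release);
}

/* Like get_page_unless_zero(): take a reference only if count != 0. */
static bool model_get_page_unless_zero(struct page_model *p)
{
	int old = atomic_load_explicit(&p->count, memory_order_acquire);

	while (old != 0) {
		if (atomic_compare_exchange_weak_explicit(&p->count, &old, old + 1,
							  memory_order_acquire,
							  memory_order_acquire))
			return true;
	}
	return false;
}
```

A reader that succeeds in `model_get_page_unless_zero()` is then guaranteed to observe the flag setup that preceded the refcount store, which is the property the patch is after.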
    • oom: allow !__GFP_FS allocations access emergency reserves like __GFP_NOFAIL · fa175d10
      Andrea Arcangeli authored
      With the previous two commits I cannot reproduce any ext4 related
      livelocks anymore, however I hit ext4 memory corruption. ext4 thinks
      it can handle alloc_pages failing and it doesn't use __GFP_NOFAIL in
      some places, but it actually cannot. No surprise, as those error paths
      couldn't ever run so they're likely untested.
      
      I logged all the stack traces of all ext4 failures that lead to the
      ext4 final corruption; at least one of them should be the culprit (the
      last ones are more probable). The actual bug in the error paths
      should be found by code review (or the error paths should be deleted
      and __GFP_NOFAIL should be added to the gfp_mask).
      
      Until ext4 is fixed, it is safer to treat !__GFP_FS like __GFP_NOFAIL
      if TIF_MEMDIE is not set (so we cannot exercise any new allocation
      error path in kernel threads, because they're never picked as OOM
      killer victims and TIF_MEMDIE never gets set on them).
      
      I assume other filesystems may have become complacent of this
      accommodating allocator behavior that cannot fail an allocation if
      invoked by a kernel thread too, but the longer we keep the
      __GFP_NOFAIL behavior in should_alloc_retry for small order
      allocations, the less robust these error paths will become and the
      harder it will be to remove this livelock prone assumption in
      should_alloc_retry. In fact we should remove that assumption not just
      for !__GFP_FS allocations.
      
      In practice with this fix there's no regression and all livelocks are
      still gone. The only risk in this approach is to exhaust the
      emergency reserves earlier than before, but only during OOM (during
      normal runtime GFP_ATOMIC allocation or other __GFP_MEMALLOC
      allocation reliability is not affected). Clearly this actually reduces
      the livelock risk (verified in practice too) so it is a low risk net
      improvement to the OOM handling with no risk of regression because
      this way no new allocation error path is exercised.
    • oom: fix ext4 __GFP_NOFAIL livelock · 47fb3887
      Andrea Arcangeli authored
      The previous commit fixed an ext4 livelock by not making !__GFP_FS
      allocations behave similarly to __GFP_NOFAIL and I mentioned how
      __GFP_NOFAIL is livelock prone.
      
      After letting the trinity load run for a while I actually hit the
      very __GFP_NOFAIL livelock too:
      
       #0  get_page_from_freelist (gfp_mask=0x20858, nodemask=0x0 <irq_stack_union>, order=0x0, zonelist=0xffff88007fffc100, high_zoneidx=0x2, alloc_flags=0xc0, preferred_zone=0xffff88007fffa840, classzone_idx=classzone_idx@entry=0x1, migratetype=migratetype@entry=0x2) at mm/page_alloc.c:1953
       #1  0xffffffff81178e88 in __alloc_pages_slowpath (migratetype=0x2, classzone_idx=0x1, preferred_zone=0xffff88007fffa840, nodemask=0x0 <irq_stack_union>, high_zoneidx=ZONE_NORMAL, zonelist=0xffff88007fffc100, order=0x0, gfp_mask=0x20858) at mm/page_alloc.c:2597
       #2  __alloc_pages_nodemask (gfp_mask=<optimized out>, order=0x0, zonelist=0xffff87fffffffffa, nodemask=0x0 <irq_stack_union>) at mm/page_alloc.c:2832
       #3  0xffffffff811becab in alloc_pages_current (gfp=0x20858, order=0x0) at mm/mempolicy.c:2100
       #4  0xffffffff8116e450 in alloc_pages (order=0x0, gfp_mask=0x20858) at include/linux/gfp.h:336
       #5  __page_cache_alloc (gfp=0x20858) at mm/filemap.c:663
       #6  0xffffffff8116f03c in pagecache_get_page (mapping=0xffff88007cc03908, offset=0xc920f, fgp_flags=0x7, cache_gfp_mask=0x20858, radix_gfp_mask=0x850) at mm/filemap.c:1096
       #7  0xffffffff812160f4 in find_or_create_page (mapping=<optimized out>, gfp_mask=<optimized out>, offset=0xc920f) at include/linux/pagemap.h:336
       #8  grow_dev_page (sizebits=0x0, size=0x1000, index=0xc920f, block=0xc920f, bdev=0xffff88007cc03580) at fs/buffer.c:1022
       #9  grow_buffers (size=<optimized out>, block=<optimized out>, bdev=<optimized out>) at fs/buffer.c:1095
       #10 __getblk_slow (size=0x1000, block=0xc920f, bdev=0xffff88007cc03580) at fs/buffer.c:1121
       #11 __getblk (bdev=0xffff88007cc03580, block=0xc920f, size=0x1000) at fs/buffer.c:1395
       #12 0xffffffff8125c8ed in sb_getblk (block=0xc920f, sb=<optimized out>) at include/linux/buffer_head.h:310
       #13 ext4_read_block_bitmap_nowait (sb=0xffff88007c579000, block_group=0x2f) at fs/ext4/balloc.c:407
       #14 0xffffffff8125ced4 in ext4_read_block_bitmap (sb=0xffff88007c579000, block_group=0x2f) at fs/ext4/balloc.c:489
       #15 0xffffffff8167963b in ext4_mb_discard_group_preallocations (sb=0xffff88007c579000, group=0x2f, needed=0x38) at fs/ext4/mballoc.c:3798
       #16 0xffffffff8129ddbd in ext4_mb_discard_preallocations (needed=0x38, sb=0xffff88007c579000) at fs/ext4/mballoc.c:4346
       #17 ext4_mb_new_blocks (handle=0xffff88003305ee98, ar=0xffff88001f50b890, errp=0xffff88001f50b880) at fs/ext4/mballoc.c:4479
       #18 0xffffffff81290fd3 in ext4_ext_map_blocks (handle=0xffff88003305ee98, inode=0xffff88007b85b178, map=0xffff88001f50ba50, flags=0x25) at fs/ext4/extents.c:4453
       #19 0xffffffff81265688 in ext4_map_blocks (handle=0xffff88003305ee98, inode=0xffff88007b85b178, map=0xffff88001f50ba50, flags=0x25) at fs/ext4/inode.c:648
       #20 0xffffffff8126af77 in mpage_map_one_extent (mpd=0xffff88001f50ba28, handle=0xffff88003305ee98) at fs/ext4/inode.c:2164
       #21 mpage_map_and_submit_extent (give_up_on_write=<synthetic pointer>, mpd=0xffff88001f50ba28, handle=0xffff88003305ee98) at fs/ext4/inode.c:2219
       #22 ext4_writepages (mapping=0xffff88007b85b350, wbc=0xffff88001f50bb60) at fs/ext4/inode.c:2557
       #23 0xffffffff8117ce81 in do_writepages (mapping=0xffff88007b85b350, wbc=0xffff88001f50bb60) at mm/page-writeback.c:2046
       #24 0xffffffff812096c0 in __writeback_single_inode (inode=0xffff88007b85b178, wbc=0xffff88001f50bb60) at fs/fs-writeback.c:460
       #25 0xffffffff8120b311 in writeback_sb_inodes (sb=0xffff88007c579000, wb=0xffff88007bceb060, work=0xffff8800130f9d80) at fs/fs-writeback.c:687
       #26 0xffffffff8120b68f in __writeback_inodes_wb (wb=0xffff88007bceb060, work=0xffff8800130f9d80) at fs/fs-writeback.c:732
       #27 0xffffffff8120b94b in wb_writeback (wb=0xffff88007bceb060, work=0xffff8800130f9d80) at fs/fs-writeback.c:863
       #28 0xffffffff8120befc in wb_do_writeback (wb=0xffff88007bceb060) at fs/fs-writeback.c:998
       #29 bdi_writeback_workfn (work=0xffff88007bceb078) at fs/fs-writeback.c:1043
       #30 0xffffffff81092cf5 in process_one_work (worker=0xffff88002c555e80, work=0xffff88007bceb078) at kernel/workqueue.c:2081
       #31 0xffffffff8109376b in worker_thread (__worker=0xffff88002c555e80) at kernel/workqueue.c:2212
       #32 0xffffffff8109ba54 in kthread (_create=0xffff88007bf2e2c0) at kernel/kthread.c:207
       #33 <signal handler called>
       #34 0x0000000000000000 in irq_stack_union ()
       #35 0x0000000000000000 in ?? ()
      
      To solve this I set manually with gdb ALLOC_NO_WATERMARKS in
      alloc_flags, and the livelock resolved itself.
      
      The fix simply allows __GFP_NOFAIL allocations to get access to the
      emergency reserves in the buddy allocator if __GFP_NOFAIL triggers a
      reclaim failure signaling an out of memory condition. Worst case it'll
      deadlock because we run out of emergency reserves, but not giving it
      access to the emergency reserves after __GFP_NOFAIL hits an out of
      memory condition may actually result in a livelock even though there
      are still ~50 MB free! So this is safer. After applying this OOM
      livelock fix I cannot reproduce the livelock anymore in __GFP_NOFAIL.
    • mm: ext4 livelock during OOM · 7636db0a
      Andrea Arcangeli authored
      I can easily reproduce a livelock with some trinity load on a 2GB
      guest running in parallel:
      
      	./trinity -X -c remap_anon_pages -q
      	./trinity -X -c userfaultfd -q
      
      The last OOM killer invocation selects this task:
      
      Out of memory: Kill process 6537 (trinity-c6) score 106 or sacrifice child
      Killed process 6537 (trinity-c6) total-vm:414772kB, anon-rss:186744kB, file-rss:560kB
      
      Shortly afterwards the hangcheck timer detects the victim task stuck
      in uninterruptible state for too long:
      
      INFO: task trinity-c6:6537 blocked for more than 120 seconds.
            Not tainted 3.16.0-rc1+ #4
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      trinity-c6      D ffff88004ec37b50 11080  6537   5530 0x00100004
       ffff88004ec37aa8 0000000000000082 0000000000000000 ffff880039174910
       ffff88004ec37fd8 0000000000004000 ffff88007c8de3d0 ffff880039174910
       0000000000000000 0000000000000000 0000000000000000 0000000000000000
      Call Trace:
       [<ffffffff8167c759>] ? schedule+0x29/0x70
       [<ffffffff8112ae42>] ? __delayacct_blkio_start+0x22/0x30
       [<ffffffff8116e290>] ? __lock_page+0x70/0x70
       [<ffffffff8167c759>] schedule+0x29/0x70
       [<ffffffff8167c82f>] io_schedule+0x8f/0xd0
       [<ffffffff8116e29e>] sleep_on_page+0xe/0x20
       [<ffffffff8167cd33>] __wait_on_bit_lock+0x73/0xb0
       [<ffffffff8116e287>] __lock_page+0x67/0x70
       [<ffffffff810c34d0>] ? wake_atomic_t_function+0x40/0x40
       [<ffffffff8116f125>] pagecache_get_page+0x165/0x1f0
       [<ffffffff8116f3d4>] grab_cache_page_write_begin+0x34/0x50
       [<ffffffff81268f82>] ext4_da_write_begin+0x92/0x380
       [<ffffffff8116d717>] generic_perform_write+0xc7/0x1d0
       [<ffffffff8116fee3>] __generic_file_write_iter+0x173/0x350
       [<ffffffff8125e6ad>] ext4_file_write_iter+0x10d/0x3c0
       [<ffffffff811db72b>] ? vfs_write+0x1bb/0x1f0
       [<ffffffff811da581>] new_sync_write+0x81/0xb0
       [<ffffffff811db62f>] vfs_write+0xbf/0x1f0
       [<ffffffff811dbb72>] SyS_write+0x52/0xc0
       [<ffffffff816816d2>] system_call_fastpath+0x16/0x1b
      3 locks held by trinity-c6/6537:
       #0:  (&f->f_pos_lock){+.+.+.}, at: [<ffffffff811fc0ee>] __fdget_pos+0x3e/0x50
       #1:  (sb_writers#3){.+.+.+}, at: [<ffffffff811db72b>] vfs_write+0x1bb/0x1f0
       #2:  (&sb->s_type->i_mutex_key#11){+.+.+.}, at: [<ffffffff8125e62d>] ext4_file_write_iter+0x8d/0x3c0
      
      The task that holds the page lock is likely the below one that never
      returns from __alloc_pages_slowpath.
      
      ck_union>, high_zoneidx=ZONE_NORMAL, zonelist=0xffff88007fffc100, order=0x0, gfp_mask=0x50) at mm/page_alloc.c:2661
      ion>) at mm/page_alloc.c:2821
      0, radix_gfp_mask=0x50) at mm/filemap.c:1096
      emap.h:336
       at fs/ext4/mballoc.c:4442
      50, flags=0x25) at fs/ext4/extents.c:4453
      flags=0x25) at fs/ext4/inode.c:648
      64
      
      (full stack trace of the !__GFP_FS allocation in the kernel thread
      holding the page lock below)
      
      gfp_mask in __alloc_pages_slowpath is gfp_mask=0x50, so
      ___GFP_IO|___GFP_WAIT. ext4_writepages, run from a kworker kernel
      thread, is holding the page lock that the OOM victim task is waiting
      on.
      
      If alloc_pages returned NULL the whole livelock would resolve itself
      (-ENOMEM would be returned all the way up, ext4 thinks it can handle
      it, in reality it cannot but that's for a later patch in this series).
      ext4_writepages would return and the kworker would try again later to
      flush the dirty pages in the dirty inodes.
      
      To verify, I set a breakpoint in should_alloc_retry and added
      __GFP_NORETRY to the gfp_mask just before the __GFP_NORETRY check.
      
      gdb> b mm/page_alloc.c:2185
      Breakpoint 8 at 0xffffffff81179122: file mm/page_alloc.c, line 2185.
      gdb> c
      Continuing.
      [Switching to Thread 1]
      _______________________________________________________________________________
           eax:00000000 ebx:00000050  ecx:00000001  edx:00000001     eflags:00000206
           esi:0000196A edi:2963AA50  esp:4EDEF448  ebp:4EDEF568     eip:Error while running hook_stop:
      Value can't be converted to integer.
      
      Breakpoint 8, __alloc_pages_slowpath (migratetype=0x0, classzone_idx=0x1, preferred_zone=0xffff88007fffa840, nodemask=0x0 <irq_stack_union>, high_zoneidx=ZONE_NORMAL, zonelist=0xffff88007fffc100, order=0x0, gfp_mask=0x50) at mm/page_alloc.c:2713
      
      I set the breakpoint at 2185 and it stopped at 2713, but 2713 is in the
      middle of a comment. I assume that's an addr2line imperfection and
      it's not relevant.
      
      Then I simply added __GFP_NORETRY to the gfp_mask in the stack:
      
      gdb> print gfp_mask
      $1 = 0x50
      gdb> set gfp_mask = 0x1050
      gdb> p gfp_mask
      $2 = 0x1050
      gdb> c
      Continuing.
      
      After that the livelock resolved itself immediately, the OOM victim
      quit and the workload continued without errors.
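The mask values in the gdb session above can be decoded with a short helper. This is a minimal sketch: the bit values are assumptions taken from the 3.16-era include/linux/gfp.h (they are consistent with the message, where 0x50 is ___GFP_IO|___GFP_WAIT and adding __GFP_NORETRY gives 0x1050), and the `MODEL_*` names and `model_gfp_decode()` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Bit values assumed from the 3.16-era gfp.h; only a subset is listed. */
#define MODEL_GFP_WAIT    0x10u
#define MODEL_GFP_IO      0x40u
#define MODEL_GFP_FS      0x80u
#define MODEL_GFP_NOFAIL  0x800u
#define MODEL_GFP_NORETRY 0x1000u

static const struct {
	unsigned int bit;
	const char *name;
} model_gfp_names[] = {
	{ MODEL_GFP_WAIT,    "WAIT" },
	{ MODEL_GFP_IO,      "IO" },
	{ MODEL_GFP_FS,      "FS" },
	{ MODEL_GFP_NOFAIL,  "NOFAIL" },
	{ MODEL_GFP_NORETRY, "NORETRY" },
};

/* Write the names of the known bits set in mask into buf, '|' separated. */
static void model_gfp_decode(unsigned int mask, char *buf, size_t len)
{
	size_t i, n = sizeof(model_gfp_names) / sizeof(model_gfp_names[0]);

	assert(len > 0);
	buf[0] = '\0';
	for (i = 0; i < n; i++) {
		if (!(mask & model_gfp_names[i].bit))
			continue;
		if (buf[0] != '\0')
			strncat(buf, "|", len - strlen(buf) - 1);
		strncat(buf, model_gfp_names[i].name, len - strlen(buf) - 1);
	}
}
```

Decoding 0x50 this way yields WAIT|IO with FS clear, which is why the allocation counts as a !__GFP_FS one; 0x1050 additionally carries NORETRY.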
      
      The problem was probably introduced in commit 11e33f6a.
      
      	/*
      	 * In this implementation, order <= PAGE_ALLOC_COSTLY_ORDER
      	 * means __GFP_NOFAIL, but that may not be true in other
      	 * implementations.
      	 */
      	if (order <= PAGE_ALLOC_COSTLY_ORDER)
      		return 1;
      
      Retrying forever and depending on the OOM killer to send a SIGKILL is
      only ok if the victim task isn't sleeping in uninterruptible state
      waiting, in this case, for a kernel thread to release a page lock.
      
      In this case the kernel thread holding the lock would never be picked
      by the OOM killer in the first place, so this is an error path in ext4
      that is probably never exercised.
      
      The objective of an implicit __GFP_NOFAIL behavior (but that fails the
      allocation if TIF_MEMDIE is set on the task, unlike a real
      __GFP_NOFAIL that never fails) I assume is to avoid spurious
      VM_FAULT_OOM to make the OOM killer more reliable, but I don't think
      an almost implicit __GFP_NOFAIL is safe in general. __GFP_NOFAIL is
      unsafe too in fact but at least it's very rarely used.
      
      For now we can start by letting the allocations that hold lowlevel
      filesystem locks (__GFP_FS clear) fail so they can release those
      locks. Those locks tend to be uninterruptible too.
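The retry decision discussed here can be modelled in a few lines. This is a simplified sketch, not the actual patch: the flag values and `MODEL_COSTLY_ORDER` are assumptions based on the 3.16-era headers, `model_should_alloc_retry()` is a hypothetical name, and other conditions of the real should_alloc_retry are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed 3.16-era flag values; names are hypothetical. */
#define MODEL_GFP_FS       0x80u
#define MODEL_GFP_NOFAIL   0x800u
#define MODEL_GFP_NORETRY  0x1000u
#define MODEL_COSTLY_ORDER 3	/* assumed value of PAGE_ALLOC_COSTLY_ORDER */

/* Returns true if the allocator model would keep retrying the allocation. */
static bool model_should_alloc_retry(unsigned int gfp_mask, unsigned int order)
{
	assert(order < 32);
	if (gfp_mask & MODEL_GFP_NORETRY)
		return false;
	if (gfp_mask & MODEL_GFP_NOFAIL)
		return true;
	/* Let !__GFP_FS callers fail so they can drop their filesystem locks. */
	if (!(gfp_mask & MODEL_GFP_FS))
		return false;
	/* The implicit "small orders never fail" rule quoted from should_alloc_retry. */
	return order <= MODEL_COSTLY_ORDER;
}
```

With this model the kworker's order-0 allocation with gfp_mask 0x50 returns false and fails back to ext4, while a GFP_KERNEL-style mask (0xd0 under the assumed values) at order 0 still retries.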
      
      This will reduce the scope of the problem, but I'd prefer to drop
      that entire check quoted above in should_alloc_retry. If a
      kernel thread would use_mm() and then take the mmap_sem for writing,
      and then run a GFP_KERNEL allocation that invokes the OOM killer to
      kill one process that is waiting in down_read in __do_page_fault the
      same problem would emerge.
      
      In short it's a tradeoff between the accuracy of the OOM killer (not
      causing spurious allocation failures in addition to killing the task)
      and the risk of livelock.
      
      Furthermore, the longer we hold off on this change, the more likely
      it is that the failure paths of those ext4 allocations done by kernel
      threads (never picked by the OOM killer, which would set TIF_MEMDIE
      and let alloc_pages fail once in a while) will never be exercised.
      
      After the fix, the identical trinity load that reliably reproduces the
      problem completes and, in addition to the OOM killer info, I get this
      in the kernel log as expected (before the fix, I'd get the livelock
      instead of the allocation error below).
      
      kworker/u16:0: page allocation failure: order:0, mode:0x50
      CPU: 2 PID: 6006 Comm: kworker/u16:0 Tainted: G        W     3.16.0-rc1+ #6
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
      Workqueue: writeback bdi_writeback_workfn (flush-254:0)
       0000000000000000 ffff880006d3f408 ffffffff81679b51 ffff88007fc8f068
       0000000000000050 ffff880006d3f498 ffffffff811748c2 0000000000000010
       ffffffffffffffff ffff88007fffc128 ffffffff810d0bed ffff880006d3f468
      Call Trace:
       [<ffffffff81679b51>] dump_stack+0x4e/0x68
       [<ffffffff811748c2>] warn_alloc_failed+0xe2/0x130
       [<ffffffff810d0bed>] ? trace_hardirqs_on+0xd/0x10
       [<ffffffff811791d0>] __alloc_pages_nodemask+0x910/0xb10
       [<ffffffff811becab>] alloc_pages_current+0x8b/0x120
       [<ffffffff8116e450>] __page_cache_alloc+0x10/0x20
       [<ffffffff8116f03c>] pagecache_get_page+0x7c/0x1f0
       [<ffffffff81299004>] ext4_mb_load_buddy+0x274/0x3b0
       [<ffffffff8129a3f2>] ext4_mb_regular_allocator+0x1e2/0x480
       [<ffffffff81296931>] ? ext4_mb_use_preallocated+0x31/0x600
       [<ffffffff8129de28>] ext4_mb_new_blocks+0x568/0x7f0
       [<ffffffff81290fd3>] ext4_ext_map_blocks+0x683/0x1970
       [<ffffffff81265688>] ext4_map_blocks+0x168/0x4d0
       [<ffffffff8126af77>] ext4_writepages+0x6e7/0x1030
       [<ffffffff8117ce81>] do_writepages+0x21/0x50
       [<ffffffff812096c0>] __writeback_single_inode+0x40/0x550
       [<ffffffff8120b311>] writeback_sb_inodes+0x281/0x560
       [<ffffffff8120b68f>] __writeback_inodes_wb+0x9f/0xd0
       [<ffffffff8120b94b>] wb_writeback+0x28b/0x510
       [<ffffffff8120befc>] bdi_writeback_workfn+0x11c/0x6a0
       [<ffffffff81092c8b>] ? process_one_work+0x15b/0x620
       [<ffffffff81092cf5>] process_one_work+0x1c5/0x620
       [<ffffffff81092c8b>] ? process_one_work+0x15b/0x620
       [<ffffffff8109376b>] worker_thread+0x11b/0x4f0
       [<ffffffff81093650>] ? init_pwq+0x190/0x190
       [<ffffffff8109ba54>] kthread+0xe4/0x100
       [<ffffffff8109b970>] ? __init_kthread_worker+0x70/0x70
       [<ffffffff8168162c>] ret_from_fork+0x7c/0xb0
      Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
    • Revert "Merge tag 'usb-4.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb" · fb684033
      Andrea Arcangeli authored
      This reverts commit 42e3a58b, reversing
      changes made to 4fd48b45.
  2. 13 May, 2015 6 commits
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 110bc767
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Handle max TX power properly wrt VIFs and the MAC in iwlwifi, from
          Avri Altman.
      
       2) Use the correct FW API for scan completions in iwlwifi, from Avraham
          Stern.
      
       3) FW monitor in iwlwifi accidentally uses unmapped memory, fix from Liad
          Kaufman.
      
       4) rhashtable conversion of mac80211 station table was buggy, the
          virtual interface was not taken into account.  Fix from Johannes
          Berg.
      
       5) Fix deadlock in rtlwifi by not using a zero timeout for
          usb_control_msg(), from Larry Finger.
      
       6) Update reordering state before calculating loss detection, from
          Yuchung Cheng.
      
       7) Fix off by one in bluetooth firmware parsing, from Dan Carpenter.
      
       8) Fix extended frame handling in xilinx_can driver, from Jeppe
          Ledet-Pedersen.
      
       9) Fix CODEL packet scheduler behavior in the presence of TSO packets,
          from Eric Dumazet.
      
      10) Fix NAPI budget testing in fm10k driver, from Alexander Duyck.
      
      11) macvlan needs to propagate promisc settings down to the lower
          device, from Vlad Yasevich.
      
      12) igb driver can oops when changing number of rings, from Toshiaki
          Makita.
      
      13) Source specific default routes not handled properly in ipv6, from
          Markus Stenberg.
      
      14) Use after free in tc_ctl_tfilter(), from WANG Cong.
      
      15) Use softirq spinlocking in netxen driver, from Tony Camuso.
      
      16) Two ARM bpf JIT fixes from Nicolas Schichan.
      
      17) Handle MSG_DONTWAIT properly in ring based AF_PACKET sends, from
          Mathias Kretschmer.
      
      18) Fix x86 bpf JIT implementation of FROM_{BE16,LE16,LE32}, from Alexei
          Starovoitov.
      
      19) ll_temac driver DMA maps TX packet header with incorrect length, fix
          from Michal Simek.
      
      20) We removed pm_qos bits from netdevice.h, but some indirect
          references remained.  Kill them.  From David Ahern.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (90 commits)
        net: Remove remaining remnants of pm_qos from netdevice.h
        e1000e: Add pm_qos header
        net: phy: micrel: Fix regression in kszphy_probe
        net: ll_temac: Fix DMA map size bug
        x86: bpf_jit: fix FROM_BE16 and FROM_LE16/32 instructions
        netns: return RTM_NEWNSID instead of RTM_GETNSID on a get
        Update be2net maintainers' email addresses
        net_sched: gred: use correct backlog value in WRED mode
        pppoe: drop pppoe device in pppoe_unbind_sock_work
        net: qca_spi: Fix possible race during probe
        net: mdio-gpio: Allow for unspecified bus id
        af_packet / TX_RING not fully non-blocking (w/ MSG_DONTWAIT).
        bnx2x: limit fw delay in kdump to 5s after boot
        ARM: net: delegate filter to kernel interpreter when imm_offset() return value can't fit into 12bits.
        ARM: net fix emit_udiv() for BPF_ALU | BPF_DIV | BPF_K intruction.
        mpls: Change reserved label names to be consistent with netbsd
        usbnet: avoid integer overflow in start_xmit
        netxen_nic: use spin_[un]lock_bh around tx_clean_lock (2)
        net: xgene_enet: Set hardware dependency
        net: amd-xgbe: Add hardware dependency
        ...
    • net: Remove remaining remnants of pm_qos from netdevice.h · 01d460dd
      David Ahern authored
      Commit e2c65448 removed pm_qos from struct net_device but left the
      comment and header file. Remove those.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Cc: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • e1000e: Add pm_qos header · 5684044f
      David Ahern authored
      Commit e2c65448 moved pm_qos_req to e1000_adapter. Add the header file
      that defines the struct.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Cc: Thomas Graf <tgraf@suug.ch>
      Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: micrel: Fix regression in kszphy_probe · bced8701
      Niklas Cassel authored
      Don't do clock-mode-select if clk == NULL,
      since when building without CONFIG_HAVE_CLK,
      clk_get returns NULL and clk_get_rate returns 0.
      
      Doing clock-mode-select in this case causes kszphy_probe to
      return -EINVAL and thus prevents the device from being probed.
      
      The original code (before regression) would return 0
      when building without CONFIG_HAVE_CLK.
      
      Cc: stable <stable@vger.kernel.org> # 3.18+
      Fixes: 1fadee0c ("net/phy: micrel: Add clock support for
      KSZ8021/KSZ8031")
      Reviewed-by: Fabio Estevam <fabio.estevam@freescale.com>
      Reviewed-by: Johan Hovold <johan@kernel.org>
      Signed-off-by: Niklas Cassel <niklass@axis.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
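The guard described in the kszphy_probe fix above can be sketched as follows. This is a user-space model under stated assumptions: `struct clk_model` and the `model_*` functions are hypothetical, and the accepted rate windows around 25 MHz and 50 MHz are illustrative guesses, not values taken from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct clk; NULL models a build without CONFIG_HAVE_CLK. */
struct clk_model {
	unsigned long rate;
};

/* Mirrors the behaviour named in the message: a NULL clock reports rate 0. */
static unsigned long model_clk_get_rate(const struct clk_model *clk)
{
	return clk ? clk->rate : 0;
}

/*
 * Returns 0 on success or -EINVAL for an unsupported reference clock rate.
 * The rate windows below are assumptions for illustration.
 */
static int model_probe_clock(const struct clk_model *clk, bool *ref_clk_25mhz)
{
	unsigned long rate;

	assert(ref_clk_25mhz != NULL);
	if (clk == NULL)
		return 0;	/* no clock: skip clock-mode-select entirely */

	rate = model_clk_get_rate(clk);
	if (rate > 24500000 && rate < 25500000)
		*ref_clk_25mhz = true;
	else if (rate > 49500000 && rate < 50500000)
		*ref_clk_25mhz = false;
	else
		return -EINVAL;
	return 0;
}
```

Without the NULL check, the rate of 0 would fall through to the -EINVAL branch, which is the probe failure the commit describes.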
    • net: ll_temac: Fix DMA map size bug · 44d4f8d7
      Michal Simek authored
      The DMA mapping is created with skb->len instead of headlen,
      which is the length actually used for DMA.
      Signed-off-by: Michal Simek <michal.simek@xilinx.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • x86: bpf_jit: fix FROM_BE16 and FROM_LE16/32 instructions · 343f845b
      Alexei Starovoitov authored
      FROM_BE16:
      'ror %reg, 8' doesn't clear upper bits of the register,
      so use additional 'movzwl' insn to zero extend 16 bits into 64
      
      FROM_LE16:
      should zero extend lower 16 bits into 64 bit
      
      FROM_LE32:
      should zero extend lower 32 bits into 64 bit
      
      Fixes: 89aa0758 ("net: sock: allow eBPF programs to be attached to sockets")
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
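The zero-extension requirement in the bpf_jit fix above can be checked with a small model of a 64-bit register. This is a sketch assuming a little-endian host such as x86, where FROM_LE only needs truncation; the `model_*` function names are hypothetical and the code emulates the effect of the instructions rather than reproducing the JIT.

```c
#include <assert.h>
#include <stdint.h>

/* Swap the two low bytes in place, like a 16-bit rotate by 8; bits 16..63 stay untouched. */
static uint64_t model_ror16_only(uint64_t reg)
{
	uint16_t lo = (uint16_t)reg;

	lo = (uint16_t)((lo >> 8) | (lo << 8));
	return (reg & ~UINT64_C(0xffff)) | lo;
}

/* FROM_BE16 with the extra zero extension (the role of 'movzwl'). */
static uint64_t model_from_be16(uint64_t reg)
{
	uint64_t out = model_ror16_only(reg) & UINT64_C(0xffff);

	assert((out >> 16) == 0);
	return out;
}

/* FROM_LE16: no swap on a little-endian host, only zero extension. */
static uint64_t model_from_le16(uint64_t reg)
{
	return reg & UINT64_C(0xffff);
}

/* FROM_LE32: keep the low 32 bits and clear the rest. */
static uint64_t model_from_le32(uint64_t reg)
{
	return reg & UINT64_C(0xffffffff);
}
```

Comparing `model_ror16_only()` with `model_from_be16()` on a register with stale upper bits shows the difference the fix targets: the rotate alone leaves those bits in the result.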
  3. 12 May, 2015 11 commits
    • Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus · 6c9d370c
      Linus Torvalds authored
      Pull MIPS fixes from Ralf Baechle:
       "One build fix for build breakage of all MIPS SMP kernels caused by
        Rusty's fix of obsolete use of cpu mask helpers, another to fix the FP
        ABI selection when loading an ELF binary"
      
      * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
        MIPS: fix FP mode selection in lieu of .MIPS.abiflags data
        MIPS: SMP: Fix build error.
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma · 03906ca3
      Linus Torvalds authored
      Pull rdma fixes from Doug Ledford:
       - update MAINTAINERS git repo pointer
       - printk garbage fix
       - fix for qib and iw_cxgb4 bugs introduced in 4.1 window
       - fix for an older iWARP netlink bug
       - fix a memcpy issue in ehca driver
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
        infiniband: Remove duplicated KERN_<LEVEL> from pr_<level> uses
        IB/qib: fix test of unsigned variable
        RDMA/core: Fix for parsing netlink string attribute
        MAINTAINERS: update the official rdma git repo
        iw_cxgb4: use wildcard mapping for getting remote addr info
        IB/ehca: use correct destination for memcpy
    • netns: return RTM_NEWNSID instead of RTM_GETNSID on a get · e3d8ecb7
      Nicolas Dichtel authored
      Usually, RTM_NEWxxx is returned on a get (same as a dump).
      
      Fixes: 0c7aecd4 ("netns: add rtnl cmd to add and get peer netns ids")
      Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'for-v4.1-rc' of git://git.infradead.org/battery-2.6 · cc49e8c9
      Linus Torvalds authored
      Pull power supply and reset fixes from Sebastian Reichel:
       "misc fixes"
      
      * tag 'for-v4.1-rc' of git://git.infradead.org/battery-2.6:
        power: bq27x00_battery: Add missing MODULE_ALIAS
        power: reset: Add MFD_SYSCON depends for brcmstb
        power: reset: ltc2952: Remove bogus hrtimer_start() return value checks
        power_supply: fix oops in collie_battery driver
        power/reset: at91: fix return value check in at91_reset_platform_probe()
        MAINTAINERS: Add me as maintainer of Nokia N900 power supply drivers
        axp288_fuel_gauge: Add original author details
    • infiniband: Remove duplicated KERN_<LEVEL> from pr_<level> uses · f4f01b54
      Joe Perches authored
      These KERN_<LEVEL> uses are unnecessary with pr_<level> and cause
      bad logging output so remove them.
      Signed-off-by: Joe Perches <joe@perches.com>
      Acked-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/qib: fix test of unsigned variable · ec40f925
      Mike Marciniszyn authored
      Commit d4988623 ("IB/qib: use arch_phys_wc_add()")
      adjusted mtrr initialization to use the new interface.
      
      Unfortunately, the new interface returns a signed
      value and the patch tested the unsigned wc_cookie.
      
      Fix the issue by changing the type of wc_cookie to int.  For
      the success case, ret is left at zero to avoid
      a warning from the caller.  For failure, wc_cookie
      is used as the ret.
      Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      ec40f925
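The signedness problem described in the IB/qib fix above can be sketched in plain C. This is a minimal illustration, not the driver code: `fake_wc_add()` and `struct fake_dev` are hypothetical stand-ins for `arch_phys_wc_add()` and the qib device data. It shows why the cookie has to be a signed `int` for the error test to work.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for arch_phys_wc_add(): returns a cookie >= 0 on
 * success or a negative errno on failure.
 */
static int fake_wc_add(int fail)
{
	return fail ? -EINVAL : 3;
}

/* Hypothetical device data; only the cookie matters here. */
struct fake_dev {
	int wc_cookie;	/* signed, so a negative errno survives the store */
};

/*
 * If wc_cookie were unsigned, "wc_cookie < 0" could never be true and a
 * failure would go unnoticed. With an int the test behaves as intended:
 * ret stays 0 on success and carries the errno on failure.
 */
static int fake_enable_wc(struct fake_dev *dd, int fail)
{
	int ret = 0;

	dd->wc_cookie = fake_wc_add(fail);
	if (dd->wc_cookie < 0)
		ret = dd->wc_cookie;
	return ret;
}
```

Calling `fake_enable_wc()` with a failing stand-in returns `-EINVAL`; with a succeeding one it returns 0 and stores the cookie.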
    • Tatyana Nikolova's avatar
      RDMA/core: Fix for parsing netlink string attribute · ec04847c
      Tatyana Nikolova authored
      The string iwpm_ulib_name is recorded in a nlmsg as a netlink attribute.
      Without this fix, parsing of the nlmsg by the userspace port mapper service
      fails because of an unknown attribute length, so the port mapper service does
      not register the client that sent the nlmsg.
      Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
      Cc: <stable@vger.kernel.org> #v3.16
      Reviewed-By: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      ec04847c
    • Paul Burton's avatar
      MIPS: fix FP mode selection in lieu of .MIPS.abiflags data · 620b1550
      Paul Burton authored
      Commit 46490b57 ("MIPS: kernel: elf: Improve the overall ABI and FPU
      mode checks") reworked the ELF FP ABI mode selection logic, but when
      CONFIG_MIPS_O32_FP64_SUPPORT is enabled it breaks the use of binaries
      which have no PT_MIPS_ABIFLAGS program header & associated
      .MIPS.abiflags section.
      
      A default mode is selected based upon whether the ELF contains MIPS32 or
      MIPS64 code, but that selection is made in arch_elf_pt_proc.
      arch_elf_pt_proc only executes when a PT_MIPS_ABIFLAGS program header is
      found. If one is not found then arch_elf_pt_proc is never called, and no
      default overall_fp_mode value is selected. When arch_check_elf is
      called, both abi0 & abi1 are MIPS_ABI_FP_UNKNOWN which leads to both
      prog_req & interp_req being set to none_req. none_req matches none of
      the conditions for mode selection at the end of arch_check_elf, so
      overall_fp_mode is left untouched. Finally once mips_set_personality_fp
      is called the BUG() in the default case is then hit & the kernel likely
      panics.
      
      Fix this by moving the selection of a default overall mode to the start
      of arch_check_elf, which runs once per ELF executed regardless of
      whether it has a PT_MIPS_ABIFLAGS program header.
      Signed-off-by: Paul Burton <paul.burton@imgtec.com>
      Cc: Markos Chandras <markos.chandras@imgtec.com>
      Cc: Matthew Fortune <matthew.fortune@imgtec.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: stable@vger.kernel.org # v4.0+
      Patchwork: http://patchwork.linux-mips.org/patch/9978/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      620b1550
    • Sathya Perla's avatar
      Update be2net maintainers' email addresses · 6938f855
      Sathya Perla authored
      Emulex developers' email addresses are now "@avagotech" instead of
      "@emulex". I'm also replacing Subbu with Padmanabh and Sriharsha in the
      maintainers list. The driver's heading was outdated and did not include
      some of the chip types (BE3, Lancer and Skyhawk) that the driver has
      been supporting for a long time. I've updated this too.
      Signed-off-by: Sathya Perla <sathya.perla@avagotech.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6938f855
    • Ralf Baechle's avatar
      MIPS: SMP: Fix build error. · cafb45b2
      Ralf Baechle authored
        CC      arch/mips/kernel/smp.o
      arch/mips/kernel/smp.c: In function ‘start_secondary’:
      arch/mips/kernel/smp.c:149:2: error: passing argument 2 of ‘cpumask_set_cpu’ discards ‘volatile’ qualifier from pointer target type [-Werror]
        cpumask_set_cpu(cpu, &cpu_callin_map);
        ^
      In file included from ./arch/mips/include/asm/processor.h:14:0,
                       from ./arch/mips/include/asm/thread_info.h:15,
                       from include/linux/thread_info.h:54,
                       from include/asm-generic/preempt.h:4,
                       from arch/mips/include/generated/asm/preempt.h:1,
                       from include/linux/preempt.h:18,
                       from include/linux/interrupt.h:8,
                       from arch/mips/kernel/smp.c:24:
      include/linux/cpumask.h:272:91: note: expected ‘struct cpumask *’ but argument is of type ‘volatile struct cpumask_t *’
       static inline void cpumask_set_cpu(unsigned int cpu, struct cpumask *dstp)
                                                                                                 ^
      arch/mips/kernel/smp.c: In function ‘smp_prepare_boot_cpu’:
      arch/mips/kernel/smp.c:211:2: error: passing argument 2 of ‘cpumask_set_cpu’ discards ‘volatile’ qualifier from pointer target type [-Werror]
        cpumask_set_cpu(0, &cpu_callin_map);
        ^
      In file included from ./arch/mips/include/asm/processor.h:14:0,
                       from ./arch/mips/include/asm/thread_info.h:15,
                       from include/linux/thread_info.h:54,
                       from include/asm-generic/preempt.h:4,
                       from arch/mips/include/generated/asm/preempt.h:1,
                       from include/linux/preempt.h:18,
                       from include/linux/interrupt.h:8,
                       from arch/mips/kernel/smp.c:24:
      include/linux/cpumask.h:272:91: note: expected ‘struct cpumask *’ but argument is of type ‘volatile struct cpumask_t *’
       static inline void cpumask_set_cpu(unsigned int cpu, struct cpumask *dstp)
                                                                                                 ^
      arch/mips/kernel/smp.c: In function ‘__cpu_up’:
      arch/mips/kernel/smp.c:221:10: error: passing argument 2 of ‘cpumask_test_cpu’ discards ‘volatile’ qualifier from pointer target type [-Werror]
        while (!cpumask_test_cpu(cpu, &cpu_callin_map))
                ^
      In file included from ./arch/mips/include/asm/processor.h:14:0,
                       from ./arch/mips/include/asm/thread_info.h:15,
                       from include/linux/thread_info.h:54,
                       from include/asm-generic/preempt.h:4,
                       from arch/mips/include/generated/asm/preempt.h:1,
                       from include/linux/preempt.h:18,
                       from include/linux/interrupt.h:8,
                       from arch/mips/kernel/smp.c:24:
      include/linux/cpumask.h:294:90: note: expected ‘const struct cpumask *’ but argument is of type ‘volatile struct cpumask_t *’
       static inline int cpumask_test_cpu(int cpu, const struct cpumask *cpumask)
                                                                                                ^
      cc1: all warnings being treated as errors
      make[2]: *** [arch/mips/kernel/smp.o] Error 1
      make[1]: *** [arch/mips/kernel] Error 2
      make: *** [arch/mips] Error 2
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      cafb45b2
    • Doug Ledford's avatar
      MAINTAINERS: update the official rdma git repo · 2936ae04
      Doug Ledford authored
      Linus prefers kernel.org repos to github repos for security.
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      2936ae04
  4. 11 May, 2015 14 commits
    • Linus Torvalds's avatar
      Merge branch 'for-4.1' of git://linux-nfs.org/~bfields/linux · 4cfceaf0
      Linus Torvalds authored
      Pull nfsd bugfixes from Bruce Fields:
       "Mainly pnfs fixes (and for problems with generic callback code made
        more obvious by pnfs)"
      
      * 'for-4.1' of git://linux-nfs.org/~bfields/linux:
        nfsd: skip CB_NULL probes for 4.1 or later
        nfsd: fix callback restarts
        nfsd: split transport vs operation errors for callbacks
        svcrpc: fix potential GSSX_ACCEPT_SEC_CONTEXT decoding failures
        nfsd: fix pNFS return on close semantics
        nfsd: fix the check for confirmed openowner in nfs4_preprocess_stateid_op
        nfsd/blocklayout: pretend we can send deviceid notifications
      4cfceaf0
    • Steve Wise's avatar
      iw_cxgb4: use wildcard mapping for getting remote addr info · 940fd304
      Steve Wise authored
      For listening endpoints bound to the wildcard address, we need to pass
      the wildcard address mapping to iwpm_get_remote_info() instead of the
      mapped address of the new child connection.
      
      Without this fix, and with iwarp port mapping enabled, each iw_cxgb4
      connection that is spawned from a listening endpoint bound to the wildcard
      address, will generate an annoying dmesg entry about failing to find
      the remote address mapping info, and the connection state displayed in
      debugfs under /sys/kernel/debug/iw_cxgb4/<pci-slot-no>/eps  will not have
      the peer's address/port mapping info.  The connection still works though.
      
      Fixes: 5b6b8fe6 ("RDMA/cxgb4: Report the actual address of the remote connecting peer")
      Signed-off-by: Steve Wise <swise@opengridcomputing.com>
      Reviewed-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com>
      Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      940fd304
    • Nicholas Mc Guire's avatar
      IB/ehca: use correct destination for memcpy · 94634e98
      Nicholas Mc Guire authored
      Using an element of a struct as the address for the memcpy of the whole
      struct may introduce a buffer overflow and does not help readability
      either. Simply pass the real thing as the first argument to memcpy.
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      94634e98
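The memcpy destination issue in the IB/ehca fix above can be illustrated with a small sketch. The types and function below (`struct fake_parms`, `struct fake_holder`, `fake_store_parms()`) are hypothetical and are not taken from the ehca driver; they only show the general pattern of passing the struct itself as the destination rather than one of its members.

```c
#include <string.h>

/* Hypothetical parameter block that is copied as a whole. */
struct fake_parms {
	unsigned int first;
	unsigned int second;
	unsigned int third;
};

/* Hypothetical container that embeds the parameter block. */
struct fake_holder {
	unsigned int tag;
	struct fake_parms parms;
};

static void fake_store_parms(struct fake_holder *h,
			     const struct fake_parms *src)
{
	/*
	 * Risky pattern: memcpy(&h->parms.second, src, sizeof(*src));
	 * The destination is one member, but the size is that of the whole
	 * struct, so the copy would run past the end of h->parms.
	 *
	 * Safer and clearer: the destination is the struct itself, and the
	 * size is taken from that same destination.
	 */
	memcpy(&h->parms, src, sizeof(h->parms));
}
```

When the member used as the destination happens to be the first one, both addresses are equal and nothing overflows, but the code then depends on the struct layout never changing.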
    • Linus Torvalds's avatar
      Merge tag 'spi-fix-v4.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi · ef208162
      Linus Torvalds authored
      Pull spi fixes from Mark Brown:
       "A number of driver specific fixes (including several missing
        dependencies for randconfig type cases) plus two core fixes.
      
        One makes the setup_transfer() callback optional which unbreaks some
        drivers which had been merged with it omitted due to local versions of
        this patch and another ensures that we don't corrupt data by leaking
        internal dummy buffers to callers, causing the callers to think they
        allocated those buffers"
      
      * tag 'spi-fix-v4.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
        spi: fsl-espi: fix behaviour for full-duplex xfers
        spi: fsl-spi: fix devm_ioremap_resource() error case
        spi: Kconfig: Add SOC_LS1021A to SPI_FSL_DSPI dependence
        spi/omap2-mcpsi: Always call spi_finalize_current_message()
        spi: fsl-spi: use devm_ioremap_resource() to map parameter ram on CPM1
        spi: bitbang: Make setup_transfer() callback optional
        spi: check tx_buf and rx_buf in spi_unmap_msg
        spi: bcm2835: change timeout of polling driver to 1s
        spi: bcm2835: Add GPIOLIB dependency
      ef208162
    • Linus Torvalds's avatar
      Merge tag 'iommu-fixes-v4.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · a156e068
      Linus Torvalds authored
      Pull iommu fixes from Joerg Roedel:
       "Three fixes have queued up:
      
         - reference count fix in the AMD IOMMUv2 driver
      
         - sign extension fix in the ARM-SMMU driver
      
         - build fix for rockchip driver with device tree"
      
      * tag 'iommu-fixes-v4.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
        iommu/arm-smmu: Fix sign-extension of upstream bus addresses at stage 1
        iommu/rockchip: Fix build without CONFIG_OF
        iommu/amd: Fix bug in put_pasid_state_wait
      a156e068
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 9c922a55
      Linus Torvalds authored
      Pull crypto fixes from Herbert Xu:
       "This fixes the implementation of CRC32 on arm64 where it incorrectly
        applied negation on the result.
      
        It also fixes the arm64 implementations of SHA/SHA256 where in some
        cases it may end up finalising the result twice"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
        crypto: arm64/sha2-ce - prevent asm code finalization in final() path
        crypto: arm64/sha1-ce - prevent asm code finalization in final() path
        crypto: arm64/crc32 - bring in line with generic CRC32
      9c922a55
    • Linus Torvalds's avatar
      Merge branch 'for-4.1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata · b3e5838a
      Linus Torvalds authored
      Pull libata fixes from Tejun Heo:
       "Rather big for fixes pull.
      
         - SCC controllers never lived to see the light of the day.  Both
           libata and ide drivers removed.
      
         - In some configurations, link power management policy changes
           sometimes cause delayed spurious PHY events which can develop into
           noticeable failures.  This has been reported several times over the
           years.  Gabriele's patches suppress PHY events for a while after
           LPM policy changes which should help most of these failures without
           causing too much problem for hotplug use cases.
      
         - A few controller specific fixes"
      
      [ Hmm.  I don't think removing SCC support is really a "fix", but hey, it
        removes a lot of lines of code.  Which I like.  So ...  good riddance ]
      
      * 'for-4.1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
        ahci: avoton port-disable reset-quirk
        ata: select DW_DMAC in case of SATA_DWC
        libata: Blacklist queued TRIM on all Samsung 800-series
        libata: Ignore spurious PHY event on LPM policy change
        libata: Add helper to determine when PHY events should be ignored
        ata: ahci_st: fixup layering violations / drvdata errors
        Remove celleb-only SCC PATA drivers
      b3e5838a
    • Linus Torvalds's avatar
      Merge tag 'md/4.1-rc3-fixes' of git://neil.brown.name/md · c91aa67e
      Linus Torvalds authored
      Pull md bugfixes from Neil Brown:
       "A few fixes for md.
      
        Most of these are related to the new "batched stripe writeout", but
        there are a few others"
      
      * tag 'md/4.1-rc3-fixes' of git://neil.brown.name/md:
        md/raid5: fix handling of degraded stripes in batches.
        md/raid5: fix allocation of 'scribble' array.
        md/raid5: don't record new size if resize_stripes fails.
        md/raid5: avoid reading parity blocks for full-stripe write to degraded array
        md/raid5: more incorrect BUG_ON in handle_stripe_fill.
        md/raid5: new alloc_stripe() to allocate an initialize a stripe.
        md-raid0: conditional mddev->queue access to suit dm-raid
      c91aa67e
    • David Ward's avatar
      net_sched: gred: use correct backlog value in WRED mode · 145a42b3
      David Ward authored
      In WRED mode, the backlog for a single virtual queue (VQ) should not be
      used to determine queue behavior; instead the backlog is summed across
      all VQs. This sum is currently used when calculating the average queue
      lengths. It also needs to be used when determining if the queue's hard
      limit has been reached, or when reporting each VQ's backlog via netlink.
      q->backlog will only be used if the queue switches out of WRED mode.
      Signed-off-by: David Ward <david.ward@ll.mit.edu>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      145a42b3
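The backlog selection described in the gred fix above can be sketched as follows. This is a simplified model under stated assumptions, not the qdisc code: `struct fake_vq`, `struct fake_table`, `fake_backlog()` and `fake_fits()` are hypothetical names, and the sum is recomputed on each call here for clarity rather than tracked incrementally.

```c
#include <stddef.h>

#define FAKE_MAX_VQ 4

/* Hypothetical virtual queue: bytes queued and hard limit in bytes. */
struct fake_vq {
	unsigned int backlog;
	unsigned int limit;
};

/* Hypothetical table of virtual queues with a WRED mode flag. */
struct fake_table {
	struct fake_vq vq[FAKE_MAX_VQ];
	size_t nvq;
	int wred_mode;
};

/*
 * In WRED mode the relevant backlog is the sum over all VQs; otherwise it
 * is the backlog of the single VQ.
 */
static unsigned int fake_backlog(const struct fake_table *t, size_t idx)
{
	unsigned int sum = 0;
	size_t i;

	if (!t->wred_mode)
		return t->vq[idx].backlog;
	for (i = 0; i < t->nvq; i++)
		sum += t->vq[i].backlog;
	return sum;
}

/* Hard-limit check that uses the mode-dependent backlog. */
static int fake_fits(const struct fake_table *t, size_t idx,
		     unsigned int pkt_len)
{
	return fake_backlog(t, idx) + pkt_len <= t->vq[idx].limit;
}
```

With two VQs holding 100 and 200 bytes and a 250-byte limit, a 10-byte packet fits VQ 0 outside WRED mode but is rejected in WRED mode, because the summed backlog of 300 bytes already exceeds the limit.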
    • Felix Fietkau's avatar
      pppoe: drop pppoe device in pppoe_unbind_sock_work · 665a6cd8
      Felix Fietkau authored
      After a PADT is received and the socket is closed, user space no
      longer drops the reference to the pppoe device.
      This leads to errors like this:
      
      [  488.570000] unregister_netdevice: waiting for eth0.2 to become free. Usage count = 2
      
      Fixes: 287f3a94 ("pppoe: Use workqueue to die properly when a PADT is received")
      Signed-off-by: Felix Fietkau <nbd@openwrt.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      665a6cd8
    • Will Deacon's avatar
      iommu/arm-smmu: Fix sign-extension of upstream bus addresses at stage 1 · 5dc5616e
      Will Deacon authored
      Stage 1 translation is controlled by two sets of page tables (TTBR0 and
      TTBR1) which grow up and down from zero respectively in the ARMv8
      translation regime. For the SMMU, we only care about TTBR0 and, in the
      case of a 48-bit virtual space, we expect to map virtual addresses 0x0
      through to 0xffff_ffff_ffff.
      
      Given that some masters may be incapable of emitting virtual addresses
      targeting TTBR1 (e.g. because they sit on a 48-bit bus), the SMMU
      architecture allows bit 47 to be sign-extended, halving the virtual
      range of TTBR0 but allowing TTBR1 to be used. This is controlled by the
      SEP field in TTBCR2.
      
      The SMMU driver incorrectly enables this sign-extension feature, which
      causes problems when userspace addresses are programmed into a master
      device with the SMMU expecting to map the incoming transactions via
      TTBR0; if the top bit of address is set, we will instead get a
      translation fault since TTBR1 walks are disabled in the TTBCR.
      
      This patch fixes the issue by disabling sign-extension of a fixed
      virtual address bit and instead basing the behaviour on the upstream bus
      size: the incoming address is zero extended unless the upstream bus is
      only 49 bits wide, in which case bit 48 is used as the sign bit and is
      replicated to the upper bits.
      
      Cc: <stable@vger.kernel.org> # v4.0+
      Reported-by: Varun Sethi <varun.sethi@freescale.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      5dc5616e
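The address extension rule described in the arm-smmu fix above (zero-extend, except that a 49-bit upstream bus uses bit 48 as the sign bit) can be modelled in a few lines of C. This is only a software sketch of the behaviour the commit message describes; `fake_extend_addr()` and its `bus_bits` parameter are hypothetical and do not correspond to driver functions or register fields.

```c
#include <stdint.h>

/*
 * Model of how an incoming address is widened to 64 bits:
 *  - bus narrower than 64 bits: keep the low bus_bits and zero-extend;
 *  - bus exactly 49 bits wide: bit 48 is treated as the sign bit and is
 *    replicated into all upper bits.
 */
static uint64_t fake_extend_addr(uint64_t addr, unsigned int bus_bits)
{
	uint64_t mask;

	if (bus_bits >= 64)
		return addr;	/* nothing to extend */

	mask = (UINT64_C(1) << bus_bits) - 1;
	addr &= mask;
	if (bus_bits == 49 && (addr & (UINT64_C(1) << 48)))
		addr |= ~mask;	/* replicate the sign bit upwards */
	return addr;
}
```

In this model an address with bit 47 set on a 48-bit bus stays a low address, so it would still be handled through TTBR0, which is the case the commit message says used to fault.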
    • Mark Brown's avatar
      Merge remote-tracking branches 'spi/fix/fsl-cpm', 'spi/fix/fsl-dspi' and... · c8b35042
      Mark Brown authored
      Merge remote-tracking branches 'spi/fix/fsl-cpm', 'spi/fix/fsl-dspi' and 'spi/fix/fsl-espi' into spi-linus
      c8b35042
    • Mark Brown's avatar
      Merge tag 'spi-v4.1-rc1' into spi-linus · bed5e4d8
      Mark Brown authored
      spi: Fixes for v4.1
      
      A few driver fixes plus two changes for the core, one to make the
      setup_transfer() callback optional which fixes crashes in some drivers
      which were updated to use new interfaces without apparent testing and
      one to ensure we don't expose the data buffers we use for dummy
      transfers to drivers which avoids potential issues with multiple
      accesses to them or reuse.
      
      # gpg: Signature made Sat 25 Apr 2015 10:59:47 BST using RSA key ID 5D5487D0
      # gpg: key CD7BEEBC: no public key for trusted key - skipped
      # gpg: key CD7BEEBC marked as ultimately trusted
      # gpg: key AF88CD16: no public key for trusted key - skipped
      # gpg: key AF88CD16 marked as ultimately trusted
      # gpg: key 16005C11: no public key for trusted key - skipped
      # gpg: key 16005C11 marked as ultimately trusted
      # gpg: key 5621E907: no public key for trusted key - skipped
      # gpg: key 5621E907 marked as ultimately trusted
      # gpg: key 5C6153AD: no public key for trusted key - skipped
      # gpg: key 5C6153AD marked as ultimately trusted
      # gpg: Good signature from "Mark Brown <broonie@sirena.org.uk>"
      # gpg:                 aka "Mark Brown <broonie@debian.org>"
      # gpg:                 aka "Mark Brown <broonie@kernel.org>"
      # gpg:                 aka "Mark Brown <broonie@tardis.ed.ac.uk>"
      # gpg:                 aka "Mark Brown <broonie@linaro.org>"
      # gpg:                 aka "Mark Brown <Mark.Brown@linaro.org>"
      bed5e4d8
    • Stefan Wahren's avatar
      net: qca_spi: Fix possible race during probe · 268be0f7
      Stefan Wahren authored
      Registering the netdev before setting the priv data is unsafe.
      So fix this possible race by setting the priv data first.
      Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
      Cc: <stable@vger.kernel.org> # v3.18+
      Fixes: 291ab06e (net: qualcomm: new Ethernet over SPI driver for QCA7000)
      Signed-off-by: David S. Miller <davem@davemloft.net>
      268be0f7
  5. 10 May, 2015 1 commit