1. 17 Nov, 2015 21 commits
    • Kosuke Tatsukawa's avatar
      tty: fix stall caused by missing memory barrier in drivers/tty/n_tty.c · 80910ccd
      Kosuke Tatsukawa authored
      commit e81107d4 upstream.
      
      My colleague ran into a program stall on a x86_64 server, where
      n_tty_read() was waiting for data even if there was data in the buffer
      in the pty.  kernel stack for the stuck process looks like below.
       #0 [ffff88303d107b58] __schedule at ffffffff815c4b20
       #1 [ffff88303d107bd0] schedule at ffffffff815c513e
       #2 [ffff88303d107bf0] schedule_timeout at ffffffff815c7818
       #3 [ffff88303d107ca0] wait_woken at ffffffff81096bd2
       #4 [ffff88303d107ce0] n_tty_read at ffffffff8136fa23
       #5 [ffff88303d107dd0] tty_read at ffffffff81368013
       #6 [ffff88303d107e20] __vfs_read at ffffffff811a3704
       #7 [ffff88303d107ec0] vfs_read at ffffffff811a3a57
       #8 [ffff88303d107f00] sys_read at ffffffff811a4306
       #9 [ffff88303d107f50] entry_SYSCALL_64_fastpath at ffffffff815c86d7
      
      There seems to be two problems causing this issue.
      
      First, in drivers/tty/n_tty.c, __receive_buf() stores the data and
      updates ldata->commit_head using smp_store_release() and then checks
      the wait queue using waitqueue_active().  However, since there is no
      memory barrier, __receive_buf() could return without calling
      wake_up_interactive_poll(), and at the same time, n_tty_read() could
      start to wait in wait_woken() as in the following chart.
      
              __receive_buf()                         n_tty_read()
      ------------------------------------------------------------------------
      if (waitqueue_active(&tty->read_wait))
      /* Memory operations issued after the
         RELEASE may be completed before the
         RELEASE operation has completed */
                                              add_wait_queue(&tty->read_wait, &wait);
                                              ...
                                              if (!input_available_p(tty, 0)) {
      smp_store_release(&ldata->commit_head,
                        ldata->read_head);
                                              ...
                                              timeout = wait_woken(&wait,
                                                TASK_INTERRUPTIBLE, timeout);
      ------------------------------------------------------------------------
      
      The second problem is that n_tty_read() also lacks a memory barrier
      call and could also cause __receive_buf() to return without calling
      wake_up_interactive_poll(), and n_tty_read() to wait in wait_woken()
      as in the chart below.
      
              __receive_buf()                         n_tty_read()
      ------------------------------------------------------------------------
                                              spin_lock_irqsave(&q->lock, flags);
                                              /* from add_wait_queue() */
                                              ...
                                              if (!input_available_p(tty, 0)) {
                                              /* Memory operations issued after the
                                                 RELEASE may be completed before the
                                                 RELEASE operation has completed */
      smp_store_release(&ldata->commit_head,
                        ldata->read_head);
      if (waitqueue_active(&tty->read_wait))
                                              __add_wait_queue(q, wait);
                                              spin_unlock_irqrestore(&q->lock,flags);
                                              /* from add_wait_queue() */
                                              ...
                                              timeout = wait_woken(&wait,
                                                TASK_INTERRUPTIBLE, timeout);
      ------------------------------------------------------------------------
      
      There are also other places in drivers/tty/n_tty.c which have similar
      calls to waitqueue_active(), so instead of adding many memory barrier
      calls, this patch simply removes the call to waitqueue_active(),
      leaving just wake_up*() behind.
      
      This fixes both problems because, even though the memory access before
      or after the spinlocks in both wake_up*() and add_wait_queue() can
      sneak into the critical section, it cannot go past it and the critical
      section assures that they will be serialized (please see "INTER-CPU
      ACQUIRING BARRIER EFFECTS" in Documentation/memory-barriers.txt for a
      better explanation).  Moreover, the resulting code is much simpler.
      
      Latency measurement using a ping-pong test over a pty doesn't show any
      visible performance drop.
      Signed-off-by: default avatarKosuke Tatsukawa <tatsu@ab.jp.nec.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      [bwh: Backported to 3.2:
       - Use wake_up_interruptible(), not wake_up_interruptible_poll()
       - There are only two spurious uses of waitqueue_active() to remove]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      80910ccd
    • Vincent Palatin's avatar
      usb: Add device quirk for Logitech PTZ cameras · 32b95ad7
      Vincent Palatin authored
      commit 72194739 upstream.
      
      Add a device quirk for the Logitech PTZ Pro Camera and its sibling the
      ConferenceCam CC3000e Camera.
      This fixes the failed camera enumeration on some boot, particularly on
      machines with fast CPU.
      
      Tested by connecting a Logitech PTZ Pro Camera to a machine with a
      Haswell Core i7-4600U CPU @ 2.10GHz, and doing thousands of reboot cycles
      while recording the kernel logs and taking camera picture after each boot.
      Before the patch, more than 7% of the boots show some enumeration transfer
      failures and in a few of them, the kernel is giving up before actually
      enumerating the webcam. After the patch, the enumeration has been correct
      on every reboot.
      Signed-off-by: default avatarVincent Palatin <vpalatin@chromium.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      32b95ad7
    • Yao-Wen Mao's avatar
      USB: Add reset-resume quirk for two Plantronics usb headphones. · e365a50e
      Yao-Wen Mao authored
      commit 8484bf29 upstream.
      
      These two headphones need a reset-resume quirk to properly resume to
      original volume level.
      Signed-off-by: default avatarYao-Wen Mao <yaowen@google.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      e365a50e
    • Dan Carpenter's avatar
      iio: accel: sca3000: memory corruption in sca3000_read_first_n_hw_rb() · 9d2d631c
      Dan Carpenter authored
      commit eda7d0f3 upstream.
      
      "num_read" is in byte units but we are write u16s so we end up write
      twice as much as intended.
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarJonathan Cameron <jic23@kernel.org>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      9d2d631c
    • John Stultz's avatar
      clocksource: Fix abs() usage w/ 64bit values · 3f980809
      John Stultz authored
      commit 67dfae0c upstream.
      
      This patch fixes one cases where abs() was being used with 64-bit
      nanosecond values, where the result may be capped at 32-bits.
      
      This potentially could cause watchdog false negatives on 32-bit
      systems, so this patch addresses the issue by using abs64().
      Signed-off-by: default avatarJohn Stultz <john.stultz@linaro.org>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Link: http://lkml.kernel.org/r/1442279124-7309-2-git-send-email-john.stultz@linaro.orgSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      3f980809
    • NeilBrown's avatar
      md/raid0: apply base queue limits *before* disk_stack_limits · e1263d46
      NeilBrown authored
      commit 66eefe5d upstream.
      
      Calling e.g. blk_queue_max_hw_sectors() after calls to
      disk_stack_limits() discards the settings determined by
      disk_stack_limits().
      So we need to make those calls first.
      
      Fixes: 199dc6ed ("md/raid0: update queue parameter in a safer location.")
      Reported-by: default avatarJes Sorensen <Jes.Sorensen@redhat.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.com>
      [bwh: Backported to 3.2: the code being moved looks a little different]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      e1263d46
    • NeilBrown's avatar
      md/raid0: update queue parameter in a safer location. · 1afa0468
      NeilBrown authored
      commit 199dc6ed upstream.
      
      When a (e.g.) RAID5 array is reshaped to RAID0, the updating
      of queue parameters (e.g. max number of sectors per bio) is
      done in the wrong place.
      It should be part of ->run, but it is actually part of ->takeover.
      This means it happens before level_store() calls:
      
      	blk_set_stacking_limits(&mddev->queue->limits);
      
      and so it ineffective.  This can lead to errors from underlying
      devices.
      
      So move all the relevant settings out of create_stripe_zones()
      and into raid0_run().
      
      As this can lead to a bug-on it is suitable for any -stable
      kernel which supports reshape to RAID0.  So 2.6.35 or later.
      As the bug has been present for five years there is no urgency,
      so no need to rush into -stable.
      
      Fixes: 9af204cf ("md: Add support for Raid5->Raid0 and Raid10->Raid0 takeover")
      Reported-by: default avatarYi Zhang <yizhan@redhat.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.com>
      [bwh: Backported to 3.2:
       - md has no discard or write-same support
       - md is not used by dm-raid so mddev->queue is never null
       - Open-code rdev_for_each()
       - Adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      1afa0468
    • Steve French's avatar
      Do not fall back to SMBWriteX in set_file_size error cases · c62cf38b
      Steve French authored
      commit 646200a0 upstream.
      
      The error paths in set_file_size for cifs and smb3 are incorrect.
      
      In the unlikely event that a server did not support set file info
      of the file size, the code incorrectly falls back to trying SMBWriteX
      (note that only the original core SMB Write, used for example by DOS,
      can set the file size this way - this actually  does not work for the more
      recent SMBWriteX).  The idea was since the old DOS SMB Write could set
      the file size if you write zero bytes at that offset then use that if
      server rejects the normal set file info call.
      
      Fortunately the SMBWriteX will never be sent on the wire (except when
      file size is zero) since the length and offset fields were reversed
      in the two places in this function that call SMBWriteX causing
      the fall back path to return an error. It is also important to never call
      an SMB request from an SMB2/sMB3 session (which theoretically would
      be possible, and can cause a brief session drop, although the client
      recovers) so this should be fixed.  In practice this path does not happen
      with modern servers but the error fall back to SMBWriteX is clearly wrong.
      
      Removing the calls to SMBWriteX in the error paths in cifs_set_file_size
      
      Pointed out by PaX/grsecurity team
      Signed-off-by: default avatarSteve French <steve.french@primarydata.com>
      Reported-by: default avatarPaX Team <pageexec@freemail.hu>
      CC: Emese Revfy <re.emese@gmail.com>
      CC: Brad Spengler <spender@grsecurity.net>
      [bwh: Backported to 3.2: deleted code looks slightly different]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      c62cf38b
    • Mel Gorman's avatar
      mm: hugetlbfs: skip shared VMAs when unmapping private pages to satisfy a fault · 846bc2d8
      Mel Gorman authored
      commit 2f84a899 upstream.
      
      SunDong reported the following on
      
        https://bugzilla.kernel.org/show_bug.cgi?id=103841
      
      	I think I find a linux bug, I have the test cases is constructed. I
      	can stable recurring problems in fedora22(4.0.4) kernel version,
      	arch for x86_64.  I construct transparent huge page, when the parent
      	and child process with MAP_SHARE, MAP_PRIVATE way to access the same
      	huge page area, it has the opportunity to lead to huge page copy on
      	write failure, and then it will munmap the child corresponding mmap
      	area, but then the child mmap area with VM_MAYSHARE attributes, child
      	process munmap this area can trigger VM_BUG_ON in set_vma_resv_flags
      	functions (vma - > vm_flags & VM_MAYSHARE).
      
      There were a number of problems with the report (e.g.  it's hugetlbfs that
      triggers this, not transparent huge pages) but it was fundamentally
      correct in that a VM_BUG_ON in set_vma_resv_flags() can be triggered that
      looks like this
      
      	 vma ffff8804651fd0d0 start 00007fc474e00000 end 00007fc475e00000
      	 next ffff8804651fd018 prev ffff8804651fd188 mm ffff88046b1b1800
      	 prot 8000000000000027 anon_vma           (null) vm_ops ffffffff8182a7a0
      	 pgoff 0 file ffff88106bdb9800 private_data           (null)
      	 flags: 0x84400fb(read|write|shared|mayread|maywrite|mayexec|mayshare|dontexpand|hugetlb)
      	 ------------
      	 kernel BUG at mm/hugetlb.c:462!
      	 SMP
      	 Modules linked in: xt_pkttype xt_LOG xt_limit [..]
      	 CPU: 38 PID: 26839 Comm: map Not tainted 4.0.4-default #1
      	 Hardware name: Dell Inc. PowerEdge R810/0TT6JF, BIOS 2.7.4 04/26/2012
      	 set_vma_resv_flags+0x2d/0x30
      
      The VM_BUG_ON is correct because private and shared mappings have
      different reservation accounting but the warning clearly shows that the
      VMA is shared.
      
      When a private COW fails to allocate a new page then only the process
      that created the VMA gets the page -- all the children unmap the page.
      If the children access that data in the future then they get killed.
      
      The problem is that the same file is mapped shared and private.  During
      the COW, the allocation fails, the VMAs are traversed to unmap the other
      private pages but a shared VMA is found and the bug is triggered.  This
      patch identifies such VMAs and skips them.
      Signed-off-by: default avatarMel Gorman <mgorman@techsingularity.net>
      Reported-by: default avatarSunDong <sund_sky@126.com>
      Reviewed-by: default avatarMichal Hocko <mhocko@suse.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: David Rientjes <rientjes@google.com>
      Reviewed-by: default avatarNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      846bc2d8
    • Ben Hutchings's avatar
      genirq: Fix race in register_irq_proc() · bde3a53c
      Ben Hutchings authored
      commit 95c2b175 upstream.
      
      Per-IRQ directories in procfs are created only when a handler is first
      added to the irqdesc, not when the irqdesc is created.  In the case of
      a shared IRQ, multiple tasks can race to create a directory.  This
      race condition seems to have been present forever, but is easier to
      hit with async probing.
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Link: http://lkml.kernel.org/r/1443266636.2004.2.camel@decadent.org.ukSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      bde3a53c
    • Thomas Gleixner's avatar
      x86/process: Add proper bound checks in 64bit get_wchan() · 5311d93d
      Thomas Gleixner authored
      commit eddd3826 upstream.
      
      Dmitry Vyukov reported the following using trinity and the memory
      error detector AddressSanitizer
      (https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerForKernel).
      
      [ 124.575597] ERROR: AddressSanitizer: heap-buffer-overflow on
      address ffff88002e280000
      [ 124.576801] ffff88002e280000 is located 131938492886538 bytes to
      the left of 28857600-byte region [ffffffff81282e0a, ffffffff82e0830a)
      [ 124.578633] Accessed by thread T10915:
      [ 124.579295] inlined in describe_heap_address
      ./arch/x86/mm/asan/report.c:164
      [ 124.579295] #0 ffffffff810dd277 in asan_report_error
      ./arch/x86/mm/asan/report.c:278
      [ 124.580137] #1 ffffffff810dc6a0 in asan_check_region
      ./arch/x86/mm/asan/asan.c:37
      [ 124.581050] #2 ffffffff810dd423 in __tsan_read8 ??:0
      [ 124.581893] #3 ffffffff8107c093 in get_wchan
      ./arch/x86/kernel/process_64.c:444
      
      The address checks in the 64bit implementation of get_wchan() are
      wrong in several ways:
      
       - The lower bound of the stack is not the start of the stack
         page. It's the start of the stack page plus sizeof (struct
         thread_info)
      
       - The upper bound must be:
      
             top_of_stack - TOP_OF_KERNEL_STACK_PADDING - 2 * sizeof(unsigned long).
      
         The 2 * sizeof(unsigned long) is required because the stack pointer
         points at the frame pointer. The layout on the stack is: ... IP FP
         ... IP FP. So we need to make sure that both IP and FP are in the
         bounds.
      
      Fix the bound checks and get rid of the mix of numeric constants, u64
      and unsigned long. Making all unsigned long allows us to use the same
      function for 32bit as well.
      
      Use READ_ONCE() when accessing the stack. This does not prevent a
      concurrent wakeup of the task and the stack changing, but at least it
      avoids TOCTOU.
      
      Also check task state at the end of the loop. Again that does not
      prevent concurrent changes, but it avoids walking for nothing.
      
      Add proper comments while at it.
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Reported-by: default avatarSasha Levin <sasha.levin@oracle.com>
      Based-on-patch-from: Wolfram Gloger <wmglo@dent.med.uni-muenchen.de>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarBorislav Petkov <bp@alien8.de>
      Reviewed-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Cc: Kostya Serebryany <kcc@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: kasan-dev <kasan-dev@googlegroups.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Wolfram Gloger <wmglo@dent.med.uni-muenchen.de>
      Link: http://lkml.kernel.org/r/20150930083302.694788319@linutronix.deSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      [bwh: Backported to 3.2:
       - s/READ_ONCE/ACCESS_ONCE/
       - Remove use of TOP_OF_KERNEL_STACK_PADDING, not defined here and would
         be defined as 0]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      5311d93d
    • James Hogan's avatar
      MIPS: dma-default: Fix 32-bit fall back to GFP_DMA · b9f15ae6
      James Hogan authored
      commit 53960059 upstream.
      
      If there is a DMA zone (usually 24bit = 16MB I believe), but no DMA32
      zone, as is the case for some 32-bit kernels, then massage_gfp_flags()
      will cause DMA memory allocated for devices with a 32..63-bit
      coherent_dma_mask to fall back to using __GFP_DMA, even though there may
      only be 32-bits of physical address available anyway.
      
      Correct that case to compare against a mask the size of phys_addr_t
      instead of always using a 64-bit mask.
      Signed-off-by: default avatarJames Hogan <james.hogan@imgtec.com>
      Fixes: a2e715a8 ("MIPS: DMA: Fix computation of DMA flags from device's coherent_dma_mask.")
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/9610/Signed-off-by: default avatarRalf Baechle <ralf@linux-mips.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      b9f15ae6
    • shengyong's avatar
      UBI: return ENOSPC if no enough space available · 02eb3b90
      shengyong authored
      commit 7c7feb2e upstream.
      
      UBI: attaching mtd1 to ubi0
      UBI: scanning is finished
      UBI error: init_volumes: not enough PEBs, required 706, available 686
      UBI error: ubi_wl_init: no enough physical eraseblocks (-20, need 1)
      UBI error: ubi_attach_mtd_dev: failed to attach mtd1, error -12 <= NOT ENOMEM
      UBI error: ubi_init: cannot attach mtd1
      
      If available PEBs are not enough when initializing volumes, return -ENOSPC
      directly. If available PEBs are not enough when initializing WL, return
      -ENOSPC instead of -ENOMEM.
      Signed-off-by: default avatarSheng Yong <shengyong1@huawei.com>
      Signed-off-by: default avatarRichard Weinberger <richard@nod.at>
      Reviewed-by: default avatarDavid Gstir <david@sigma-star.at>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      02eb3b90
    • Richard Weinberger's avatar
      UBI: Validate data_size · 73d95e7b
      Richard Weinberger authored
      commit 281fda27 upstream.
      
      Make sure that data_size is less than LEB size.
      Otherwise a handcrafted UBI image is able to trigger
      an out of bounds memory access in ubi_compare_lebs().
      Signed-off-by: default avatarRichard Weinberger <richard@nod.at>
      Reviewed-by: default avatarDavid Gstir <david@sigma-star.at>
      [bwh: Backported to 3.2: drop first argument to ubi_err()]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      73d95e7b
    • Malcolm Crossley's avatar
      x86/xen: Do not clip xen_e820_map to xen_e820_map_entries when sanitizing map · a92a5446
      Malcolm Crossley authored
      commit 64c98e7f upstream.
      
      Sanitizing the e820 map may produce extra E820 entries which would result in
      the topmost E820 entries being removed. The removed entries would typically
      include the top E820 usable RAM region and thus result in the domain having
      signicantly less RAM available to it.
      
      Fix by allowing sanitize_e820_map to use the full size of the allocated E820
      array.
      Signed-off-by: default avatarMalcolm Crossley <malcolm.crossley@citrix.com>
      Reviewed-by: default avatarBoris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      [bwh: Backported to 3.2:
       s/xen_e820_map_entries/memmap.nr_entries/; s/xen_e820_map/map/g]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      a92a5446
    • Andreas Schwab's avatar
      m68k: Define asmlinkage_protect · d599a135
      Andreas Schwab authored
      commit 8474ba74 upstream.
      
      Make sure the compiler does not modify arguments of syscall functions.
      This can happen if the compiler generates a tailcall to another
      function.  For example, without asmlinkage_protect sys_openat is compiled
      into this function:
      
      sys_openat:
      	clr.l %d0
      	move.w 18(%sp),%d0
      	move.l %d0,16(%sp)
      	jbra do_sys_open
      
      Note how the fourth argument is modified in place, modifying the register
      %d4 that gets restored from this stack slot when the function returns to
      user-space.  The caller may expect the register to be unmodified across
      system calls.
      Signed-off-by: default avatarAndreas Schwab <schwab@linux-m68k.org>
      Signed-off-by: default avatarGeert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      d599a135
    • Felix Fietkau's avatar
      ath9k: declare required extra tx headroom · 01352209
      Felix Fietkau authored
      commit 029cd037 upstream.
      
      ath9k inserts padding between the 802.11 header and the data area (to
      align it). Since it didn't declare this extra required headroom, this
      led to some nasty issues like randomly dropped packets in some setups.
      Signed-off-by: default avatarFelix Fietkau <nbd@openwrt.org>
      Signed-off-by: default avatarKalle Valo <kvalo@codeaurora.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      01352209
    • Mark Brown's avatar
      regmap: debugfs: Don't bother actually printing when calculating max length · 6767c520
      Mark Brown authored
      commit 176fc2d5 upstream.
      
      The in kernel snprintf() will conveniently return the actual length of
      the printed string even if not given an output beffer at all so just do
      that rather than relying on the user to pass in a suitable buffer,
      ensuring that we don't need to worry if the buffer was truncated due to
      the size of the buffer passed in.
      Reported-by: default avatarRasmus Villemoes <linux@rasmusvillemoes.dk>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      6767c520
    • Mark Brown's avatar
      regmap: debugfs: Ensure we don't underflow when printing access masks · 9bd80295
      Mark Brown authored
      commit b763ec17 upstream.
      
      If a read is attempted which is smaller than the line length then we may
      underflow the subtraction we're doing with the unsigned size_t type so
      move some of the calculation to be additions on the right hand side
      instead in order to avoid this.
      Reported-by: default avatarRasmus Villemoes <linux@rasmusvillemoes.dk>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      9bd80295
    • Peter Zijlstra's avatar
      module: Fix locking in symbol_put_addr() · 3895ff2d
      Peter Zijlstra authored
      commit 275d7d44 upstream.
      
      Poma (on the way to another bug) reported an assertion triggering:
      
        [<ffffffff81150529>] module_assert_mutex_or_preempt+0x49/0x90
        [<ffffffff81150822>] __module_address+0x32/0x150
        [<ffffffff81150956>] __module_text_address+0x16/0x70
        [<ffffffff81150f19>] symbol_put_addr+0x29/0x40
        [<ffffffffa04b77ad>] dvb_frontend_detach+0x7d/0x90 [dvb_core]
      
      Laura Abbott <labbott@redhat.com> produced a patch which lead us to
      inspect symbol_put_addr(). This function has a comment claiming it
      doesn't need to disable preemption around the module lookup
      because it holds a reference to the module it wants to find, which
      therefore cannot go away.
      
      This is wrong (and a false optimization too, preempt_disable() is really
      rather cheap, and I doubt any of this is on uber critical paths,
      otherwise it would've retained a pointer to the actual module anyway and
      avoided the second lookup).
      
      While its true that the module cannot go away while we hold a reference
      on it, the data structure we do the lookup in very much _CAN_ change
      while we do the lookup. Therefore fix the comment and add the
      required preempt_disable().
      Reported-by: default avatarpoma <pomidorabelisima@gmail.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      Fixes: a6e6abd5 ("module: remove module_text_address()")
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      3895ff2d
    • Ben Hutchings's avatar
      Revert "KVM: MMU: fix validation of mmio page fault" · 005f90fa
      Ben Hutchings authored
      This reverts commit 41e3025e, which
      was commit 6f691251 upstream.
      
      The fix is only needed after commit f8f55942 ("KVM: MMU: fast
      invalidate all mmio sptes"), included in Linux 3.11.
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      005f90fa
  2. 13 Oct, 2015 19 commits
    • Ben Hutchings's avatar
      Linux 3.2.72 · 0149138c
      Ben Hutchings authored
      0149138c
    • Ben Hutchings's avatar
      Revert "sctp: Fix race between OOTB responce and route removal" · 77d4e6b9
      Ben Hutchings authored
      This reverts commit 117b8a10, which
      was commit 29c4afc4 upstream.  The bug
      it fixes upstream clearly doesn't exist in 3.2.
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      77d4e6b9
    • Jan Kara's avatar
      jbd2: avoid infinite loop when destroying aborted journal · 78ad4aa1
      Jan Kara authored
      commit 841df7df upstream.
      
      Commit 6f6a6fda "jbd2: fix ocfs2 corrupt when updating journal
      superblock fails" changed jbd2_cleanup_journal_tail() to return EIO
      when the journal is aborted. That makes logic in
      jbd2_log_do_checkpoint() bail out which is fine, except that
      jbd2_journal_destroy() expects jbd2_log_do_checkpoint() to always make
      a progress in cleaning the journal. Without it jbd2_journal_destroy()
      just loops in an infinite loop.
      
      Fix jbd2_journal_destroy() to cleanup journal checkpoint lists of
      jbd2_log_do_checkpoint() fails with error.
      Reported-by: default avatarEryu Guan <guaneryu@gmail.com>
      Tested-by: default avatarEryu Guan <guaneryu@gmail.com>
      Fixes: 6f6a6fdaSigned-off-by: default avatarJan Kara <jack@suse.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Roland Dreier <roland@kernel.org>
      78ad4aa1
    • Helge Deller's avatar
      parisc: Filter out spurious interrupts in PA-RISC irq handler · c5ae4d40
      Helge Deller authored
      commit b1b4e435 upstream.
      
      When detecting a serial port on newer PA-RISC machines (with iosapic) we have a
      long way to go to find the right IRQ line, registering it, then registering the
      serial port and the irq handler for the serial port. During this phase spurious
      interrupts for the serial port may happen which then crashes the kernel because
      the action handler might not have been set up yet.
      
      So, basically it's a race condition between the serial port hardware and the
      CPU which sets up the necessary fields in the irq sructs. The main reason for
      this race is, that we unmask the serial port irqs too early without having set
      up everything properly before (which isn't easily possible because we need the
      IRQ number to register the serial ports).
      
      This patch is a work-around for this problem. It adds checks to the CPU irq
      handler to verify if the IRQ action field has been initialized already. If not,
      we just skip this interrupt (which isn't critical for a serial port at bootup).
      The real fix would probably involve rewriting all PA-RISC specific IRQ code
      (for CPU, IOSAPIC, GSC and EISA) to use IRQ domains with proper parenting of
      the irq chips and proper irq enabling along this line.
      
      This bug has been in the PA-RISC port since the beginning, but the crashes
      happened very rarely with currently used hardware.  But on the latest machine
      which I bought (a C8000 workstation), which uses the fastest CPUs (4 x PA8900,
      1GHz) and which has the largest possible L1 cache size (64MB each), the kernel
      crashed at every boot because of this race. So, without this patch the machine
      would currently be unuseable.
      
      For the record, here is the flow logic:
      1. serial_init_chip() in 8250_gsc.c calls iosapic_serial_irq().
      2. iosapic_serial_irq() calls txn_alloc_irq() to find the irq.
      3. iosapic_serial_irq() calls cpu_claim_irq() to register the CPU irq
      4. cpu_claim_irq() unmasks the CPU irq (which it shouldn't!)
      5. serial_init_chip() then registers the 8250 port.
      Problems:
      - In step 4 the CPU irq shouldn't have been registered yet, but after step 5
      - If serial irq happens between 4 and 5 have finished, the kernel will crash
      
      Cc: linux-parisc@vger.kernel.org
      Signed-off-by: default avatarHelge Deller <deller@gmx.de>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      c5ae4d40
    • Michal Kubeček's avatar
      ipv6: update ip6_rt_last_gc every time GC is run · 24277c76
      Michal Kubeček authored
      commit 49a18d86 upstream.
      
      As pointed out by Eric Dumazet, net->ipv6.ip6_rt_last_gc should
      hold the last time garbage collector was run so that we should
      update it whenever fib6_run_gc() calls fib6_clean_all(), not only
      if we got there from ip6_dst_gc().
      Signed-off-by: default avatarMichal Kubecek <mkubecek@suse.cz>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      24277c76
    • Michal Kubeček's avatar
      ipv6: prevent fib6_run_gc() contention · 2491f018
      Michal Kubeček authored
      commit 2ac3ac8f upstream.
      
      On a high-traffic router with many processors and many IPv6 dst
      entries, soft lockup in fib6_run_gc() can occur when number of
      entries reaches gc_thresh.
      
      This happens because fib6_run_gc() uses fib6_gc_lock to allow
      only one thread to run the garbage collector but ip6_dst_gc()
      doesn't update net->ipv6.ip6_rt_last_gc until fib6_run_gc()
      returns. On a system with many entries, this can take some time
      so that in the meantime, other threads pass the tests in
      ip6_dst_gc() (ip6_rt_last_gc is still not updated) and wait for
      the lock. They then have to run the garbage collector one after
      another which blocks them for quite long.
      
      Resolve this by replacing special value ~0UL of expire parameter
      to fib6_run_gc() by explicit "force" parameter to choose between
      spin_lock_bh() and spin_trylock_bh() and call fib6_run_gc() with
      force=false if gc_thresh is reached but not max_size.
      Signed-off-by: default avatarMichal Kubecek <mkubecek@suse.cz>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      2491f018
    • Kirill A. Shutemov's avatar
      perf tools: Fix build with perl 5.18 · 4e959244
      Kirill A. Shutemov authored
      commit 575bf1d0 upstream.
      
      perl.h from new Perl release doesn't like -Wundef and -Wswitch-default:
      
      /usr/lib/perl5/core_perl/CORE/perl.h:548:5: error: "SILENT_NO_TAINT_SUPPORT" is not defined [-Werror=undef]
       #if SILENT_NO_TAINT_SUPPORT && !defined(NO_TAINT_SUPPORT)
           ^
      /usr/lib/perl5/core_perl/CORE/perl.h:556:5: error: "NO_TAINT_SUPPORT" is not defined [-Werror=undef]
       #if NO_TAINT_SUPPORT
           ^
      In file included from /usr/lib/perl5/core_perl/CORE/perl.h:3471:0,
                       from util/scripting-engines/trace-event-perl.c:30:
      /usr/lib/perl5/core_perl/CORE/sv.h:1455:5: error: "NO_TAINT_SUPPORT" is not defined [-Werror=undef]
       #if NO_TAINT_SUPPORT
           ^
      In file included from /usr/lib/perl5/core_perl/CORE/perl.h:3472:0,
                       from util/scripting-engines/trace-event-perl.c:30:
      /usr/lib/perl5/core_perl/CORE/regexp.h:436:5: error: "NO_TAINT_SUPPORT" is not defined [-Werror=undef]
       #if NO_TAINT_SUPPORT
           ^
      In file included from /usr/lib/perl5/core_perl/CORE/hv.h:592:0,
                       from /usr/lib/perl5/core_perl/CORE/perl.h:3480,
                       from util/scripting-engines/trace-event-perl.c:30:
      /usr/lib/perl5/core_perl/CORE/hv_func.h: In function ‘S_perl_hash_siphash_2_4’:
      /usr/lib/perl5/core_perl/CORE/hv_func.h:222:3: error: switch missing default case [-Werror=switch-default]
         switch( left )
         ^
      /usr/lib/perl5/core_perl/CORE/hv_func.h: In function ‘S_perl_hash_superfast’:
      /usr/lib/perl5/core_perl/CORE/hv_func.h:274:5: error: switch missing default case [-Werror=switch-default]
           switch (rem) { \
           ^
      /usr/lib/perl5/core_perl/CORE/hv_func.h: In function ‘S_perl_hash_murmur3’:
      /usr/lib/perl5/core_perl/CORE/hv_func.h:398:5: error: switch missing default case [-Werror=switch-default]
           switch(bytes_in_carry) { /* how many bytes in carry */
           ^
      
      Let's disable the warnings for code which uses perl.h.
      Signed-off-by: default avatarKirill A. Shutemov <kirill@shutemov.name>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1372063394-20126-1-git-send-email-kirill@shutemov.nameSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Vinson Lee <vlee@twopensource.com>
      4e959244
    • Wilson Kok's avatar
      fib_rules: fix fib rule dumps across multiple skbs · c7e9f97d
      Wilson Kok authored
      [ Upstream commit 41fc0143 ]
      
      dump_rules returns skb length and not error.
      But when family == AF_UNSPEC, the caller of dump_rules
      assumes that it returns an error. Hence, when family == AF_UNSPEC,
      we continue trying to dump on -EMSGSIZE errors resulting in
      incorrect dump idx carried between skbs belonging to the same dump.
      This results in fib rule dump always only dumping rules that fit
      into the first skb.
      
      This patch fixes dump_rules to return error so that we exit correctly
      and idx is correctly maintained between skbs that are part of the
      same dump.
      Signed-off-by: default avatarWilson Kok <wkok@cumulusnetworks.com>
      Signed-off-by: default avatarRoopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [bwh: Backported to 3.2:
       - s/portid/pid/
       - Check whether fib_nl_fill_rule() returns < 0, as it may return > 0 on
         success (thanks to Roland Dreier)]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Roland Dreier <roland@purestorage.com>
      c7e9f97d
    • Richard Laing's avatar
      net/ipv6: Correct PIM6 mrt_lock handling · ea43243c
      Richard Laing authored
      [ Upstream commit 25b4a44c ]
      
      In the IPv6 multicast routing code the mrt_lock was not being released
      correctly in the MFC iterator, as a result adding or deleting a MIF would
      cause a hang because the mrt_lock could not be acquired.
      
      This fix is a copy of the code for the IPv4 case and ensures that the lock
      is released correctly.
      Signed-off-by: default avatarRichard Laing <richard.laing@alliedtelesis.co.nz>
      Acked-by: default avatarCong Wang <cwang@twopensource.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      ea43243c
    • dingtianhong's avatar
      bonding: correct the MAC address for "follow" fail_over_mac policy · d2e03097
      dingtianhong authored
      [ Upstream commit a951bc1e ]
      
      The "follow" fail_over_mac policy is useful for multiport devices that
      either become confused or incur a performance penalty when multiple
      ports are programmed with the same MAC address, but the same MAC
      address still may happened by this steps for this policy:
      
      1) echo +eth0 > /sys/class/net/bond0/bonding/slaves
         bond0 has the same mac address with eth0, it is MAC1.
      
      2) echo +eth1 > /sys/class/net/bond0/bonding/slaves
         eth1 is backup, eth1 has MAC2.
      
      3) ifconfig eth0 down
         eth1 became active slave, bond will swap MAC for eth0 and eth1,
         so eth1 has MAC1, and eth0 has MAC2.
      
      4) ifconfig eth1 down
         there is no active slave, and eth1 still has MAC1, eth2 has MAC2.
      
      5) ifconfig eth0 up
         the eth0 became active slave again, the bond set eth0 to MAC1.
      
      Something wrong here, then if you set eth1 up, the eth0 and eth1 will have the same
      MAC address, it will break this policy for ACTIVE_BACKUP mode.
      
      This patch will fix this problem by finding the old active slave and
      swap them MAC address before change active slave.
      Signed-off-by: default avatarDing Tianhong <dingtianhong@huawei.com>
      Tested-by: default avatarNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [bwh: Backported to 3.2:
       - bond_for_each_slave() takes an extra int paramter
       - Use compare_ether_addr() instead of ether_addr_equal()]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      d2e03097
    • Eric Dumazet's avatar
      ipv6: lock socket in ip6_datagram_connect() · c1a7dedb
      Eric Dumazet authored
      [ Upstream commit 03645a11 ]
      
      ip6_datagram_connect() is doing a lot of socket changes without
      socket being locked.
      
      This looks wrong, at least for udp_lib_rehash() which could corrupt
      lists because of concurrent udp_sk(sk)->udp_portaddr_hash accesses.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      c1a7dedb
    • Herbert Xu's avatar
      net: Fix skb csum races when peeking · 58a5897a
      Herbert Xu authored
      [ Upstream commit 89c22d8c ]
      
      When we calculate the checksum on the recv path, we store the
      result in the skb as an optimisation in case we need the checksum
      again down the line.
      
      This is in fact bogus for the MSG_PEEK case as this is done without
      any locking.  So multiple threads can peek and then store the result
      to the same skb, potentially resulting in bogus skb states.
      
      This patch fixes this by only storing the result if the skb is not
      shared.  This preserves the optimisations for the few cases where
      it can be done safely due to locking or other reasons, e.g., SIOCINQ.
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      58a5897a
    • Oleg Nesterov's avatar
      net: pktgen: fix race between pktgen_thread_worker() and kthread_stop() · 7d41d849
      Oleg Nesterov authored
      [ Upstream commit fecdf8be ]
      
      pktgen_thread_worker() is obviously racy, kthread_stop() can come
      between the kthread_should_stop() check and set_current_state().
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      Reported-by: default avatarJan Stancek <jstancek@redhat.com>
      Reported-by: default avatarMarcelo Leitner <mleitner@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      7d41d849
    • Stephen Smalley's avatar
      net/tipc: initialize security state for new connection socket · 79bff4bc
      Stephen Smalley authored
      [ Upstream commit fdd75ea8 ]
      
      Calling connect() with an AF_TIPC socket would trigger a series
      of error messages from SELinux along the lines of:
      SELinux: Invalid class 0
      type=AVC msg=audit(1434126658.487:34500): avc:  denied  { <unprintable> }
        for pid=292 comm="kworker/u16:5" scontext=system_u:system_r:kernel_t:s0
        tcontext=system_u:object_r:unlabeled_t:s0 tclass=<unprintable>
        permissive=0
      
      This was due to a failure to initialize the security state of the new
      connection sock by the tipc code, leaving it with junk in the security
      class field and an unlabeled secid.  Add a call to security_sk_clone()
      to inherit the security state from the parent socket.
      Reported-by: default avatarTim Shearer <tim.shearer@overturenetworks.com>
      Signed-off-by: default avatarStephen Smalley <sds@tycho.nsa.gov>
      Acked-by: default avatarPaul Moore <paul@paul-moore.com>
      Acked-by: default avatarYing Xue <ying.xue@windriver.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [bwh: Backported to 3.2: adjust context, indentation]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      79bff4bc
    • Linus Torvalds's avatar
      Initialize msg/shm IPC objects before doing ipc_addid() · 2ef259c0
      Linus Torvalds authored
      commit b9a53227 upstream.
      
      As reported by Dmitry Vyukov, we really shouldn't do ipc_addid() before
      having initialized the IPC object state.  Yes, we initialize the IPC
      object in a locked state, but with all the lockless RCU lookup work,
      that IPC object lock no longer means that the state cannot be seen.
      
      We already did this for the IPC semaphore code (see commit e8577d1f:
      "ipc/sem.c: fully initialize sem_array before making it visible") but we
      clearly forgot about msg and shm.
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Cc: Davidlohr Bueso <dbueso@suse.de>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      [bwh: Backported to 3.2:
       - Adjust context
       - The error path being moved looks a little different]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      2ef259c0
    • Manfred Spraul's avatar
      ipc/sem.c: fully initialize sem_array before making it visible · 0bdf1e82
      Manfred Spraul authored
      commit e8577d1f upstream.
      
      ipc_addid() makes a new ipc identifier visible to everyone.  New objects
      start as locked, so that the caller can complete the initialization
      after the call.  Within struct sem_array, at least sma->sem_base and
      sma->sem_nsems are accessed without any locks, therefore this approach
      doesn't work.
      
      Thus: Move the ipc_addid() to the end of the initialization.
      Signed-off-by: default avatarManfred Spraul <manfred@colorfullife.com>
      Reported-by: default avatarRik van Riel <riel@redhat.com>
      Acked-by: default avatarRik van Riel <riel@redhat.com>
      Acked-by: default avatarDavidlohr Bueso <dave@stgolabs.net>
      Acked-by: default avatarRafael Aquini <aquini@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      [bwh: Backported to 3.2:
       - Adjust context
       - The error path being moved looks a little different]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      0bdf1e82
    • Sasha Levin's avatar
      RDS: verify the underlying transport exists before creating a connection · 987ad6ee
      Sasha Levin authored
      commit 74e98eb0 upstream.
      
      There was no verification that an underlying transport exists when creating
      a connection, this would cause dereferencing a NULL ptr.
      
      It might happen on sockets that weren't properly bound before attempting to
      send a message, which will cause a NULL ptr deref:
      
      [135546.047719] kasan: GPF could be caused by NULL-ptr deref or user memory accessgeneral protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN
      [135546.051270] Modules linked in:
      [135546.051781] CPU: 4 PID: 15650 Comm: trinity-c4 Not tainted 4.2.0-next-20150902-sasha-00041-gbaa1222-dirty #2527
      [135546.053217] task: ffff8800835bc000 ti: ffff8800bc708000 task.ti: ffff8800bc708000
      [135546.054291] RIP: __rds_conn_create (net/rds/connection.c:194)
      [135546.055666] RSP: 0018:ffff8800bc70fab0  EFLAGS: 00010202
      [135546.056457] RAX: dffffc0000000000 RBX: 0000000000000f2c RCX: ffff8800835bc000
      [135546.057494] RDX: 0000000000000007 RSI: ffff8800835bccd8 RDI: 0000000000000038
      [135546.058530] RBP: ffff8800bc70fb18 R08: 0000000000000001 R09: 0000000000000000
      [135546.059556] R10: ffffed014d7a3a23 R11: ffffed014d7a3a21 R12: 0000000000000000
      [135546.060614] R13: 0000000000000001 R14: ffff8801ec3d0000 R15: 0000000000000000
      [135546.061668] FS:  00007faad4ffb700(0000) GS:ffff880252000000(0000) knlGS:0000000000000000
      [135546.062836] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [135546.063682] CR2: 000000000000846a CR3: 000000009d137000 CR4: 00000000000006a0
      [135546.064723] Stack:
      [135546.065048]  ffffffffafe2055c ffffffffafe23fc1 ffffed00493097bf ffff8801ec3d0008
      [135546.066247]  0000000000000000 00000000000000d0 0000000000000000 ac194a24c0586342
      [135546.067438]  1ffff100178e1f78 ffff880320581b00 ffff8800bc70fdd0 ffff880320581b00
      [135546.068629] Call Trace:
      [135546.069028] ? __rds_conn_create (include/linux/rcupdate.h:856 net/rds/connection.c:134)
      [135546.069989] ? rds_message_copy_from_user (net/rds/message.c:298)
      [135546.071021] rds_conn_create_outgoing (net/rds/connection.c:278)
      [135546.071981] rds_sendmsg (net/rds/send.c:1058)
      [135546.072858] ? perf_trace_lock (include/trace/events/lock.h:38)
      [135546.073744] ? lockdep_init (kernel/locking/lockdep.c:3298)
      [135546.074577] ? rds_send_drop_to (net/rds/send.c:976)
      [135546.075508] ? __might_fault (./arch/x86/include/asm/current.h:14 mm/memory.c:3795)
      [135546.076349] ? __might_fault (mm/memory.c:3795)
      [135546.077179] ? rds_send_drop_to (net/rds/send.c:976)
      [135546.078114] sock_sendmsg (net/socket.c:611 net/socket.c:620)
      [135546.078856] SYSC_sendto (net/socket.c:1657)
      [135546.079596] ? SYSC_connect (net/socket.c:1628)
      [135546.080510] ? trace_dump_stack (kernel/trace/trace.c:1926)
      [135546.081397] ? ring_buffer_unlock_commit (kernel/trace/ring_buffer.c:2479 kernel/trace/ring_buffer.c:2558 kernel/trace/ring_buffer.c:2674)
      [135546.082390] ? trace_buffer_unlock_commit (kernel/trace/trace.c:1749)
      [135546.083410] ? trace_event_raw_event_sys_enter (include/trace/events/syscalls.h:16)
      [135546.084481] ? do_audit_syscall_entry (include/trace/events/syscalls.h:16)
      [135546.085438] ? trace_buffer_unlock_commit (kernel/trace/trace.c:1749)
      [135546.085515] rds_ib_laddr_check(): addr 36.74.25.172 ret -99 node type -1
      Acked-by: default avatarSantosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      987ad6ee
    • Jason Wang's avatar
      virtio-net: drop NETIF_F_FRAGLIST · e4afe1f1
      Jason Wang authored
      commit 48900cb6 upstream.
      
      virtio declares support for NETIF_F_FRAGLIST, but assumes
      that there are at most MAX_SKB_FRAGS + 2 fragments which isn't
      always true with a fraglist.
      
      A longer fraglist in the skb will make the call to skb_to_sgvec overflow
      the sg array, leading to memory corruption.
      
      Drop NETIF_F_FRAGLIST so we only get what we can handle.
      
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: default avatarJason Wang <jasowang@redhat.com>
      Acked-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      e4afe1f1
    • Marcelo Leitner's avatar
      ipv6: addrconf: validate new MTU before applying it · 1c825dac
      Marcelo Leitner authored
      commit 77751427 upstream.
      
      Currently we don't check if the new MTU is valid or not and this allows
      one to configure a smaller than minimum allowed by RFCs or even bigger
      than interface own MTU, which is a problem as it may lead to packet
      drops.
      
      If you have a daemon like NetworkManager running, this may be exploited
      by remote attackers by forging RA packets with an invalid MTU, possibly
      leading to a DoS. (NetworkManager currently only validates for values
      too small, but not for too big ones.)
      
      The fix is just to make sure the new value is valid. That is, between
      IPV6_MIN_MTU and interface's MTU.
      
      Note that similar check is already performed at
      ndisc_router_discovery(), for when kernel itself parses the RA.
      Signed-off-by: default avatarMarcelo Ricardo Leitner <mleitner@redhat.com>
      Signed-off-by: default avatarSabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      1c825dac