1. 20 Nov, 2016 40 commits
    • Al Viro's avatar
      alpha: fix copy_from_user() · 0a093364
      Al Viro authored
      commit 2561d309 upstream.
      
      it should clear the destination even when access_ok() fails.
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      [bwh: Backported to 3.16: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      0a093364
    • Will Deacon's avatar
      arm64: spinlocks: implement smp_mb__before_spinlock() as smp_mb() · ca88b6d9
      Will Deacon authored
      commit 872c63fb upstream.
      
      smp_mb__before_spinlock() is intended to upgrade a spin_lock() operation
      to a full barrier, such that prior stores are ordered with respect to
      loads and stores occuring inside the critical section.
      
      Unfortunately, the core code defines the barrier as smp_wmb(), which
      is insufficient to provide the required ordering guarantees when used in
      conjunction with our load-acquire-based spinlock implementation.
      
      This patch overrides the arm64 definition of smp_mb__before_spinlock()
      to map to a full smp_mb().
      
      Cc: Peter Zijlstra <peterz@infradead.org>
      Reported-by: default avatarAlan Stern <stern@rowland.harvard.edu>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      ca88b6d9
    • Suzuki K Poulose's avatar
      kvm-arm: Unmap shadow pagetables properly · 21274362
      Suzuki K Poulose authored
      commit 293f2936 upstream.
      
      On arm/arm64, we depend on the kvm_unmap_hva* callbacks (via
      mmu_notifiers::invalidate_*) to unmap the stage2 pagetables when
      the userspace buffer gets unmapped. However, when the Hypervisor
      process exits without explicit unmap of the guest buffers, the only
      notifier we get is kvm_arch_flush_shadow_all() (via mmu_notifier::release
      ) which does nothing on arm. Later this causes us to access pages that
      were already released [via exit_mmap() -> unmap_vmas()] when we actually
      get to unmap the stage2 pagetable [via kvm_arch_destroy_vm() ->
      kvm_free_stage2_pgd()]. This triggers crashes with CONFIG_DEBUG_PAGEALLOC,
      which unmaps any free'd pages from the linear map.
      
       [  757.644120] Unable to handle kernel paging request at virtual address
        ffff800661e00000
       [  757.652046] pgd = ffff20000b1a2000
       [  757.655471] [ffff800661e00000] *pgd=00000047fffe3003, *pud=00000047fcd8c003,
        *pmd=00000047fcc7c003, *pte=00e8004661e00712
       [  757.666492] Internal error: Oops: 96000147 [#3] PREEMPT SMP
       [  757.672041] Modules linked in:
       [  757.675100] CPU: 7 PID: 3630 Comm: qemu-system-aar Tainted: G      D
       4.8.0-rc1 #3
       [  757.683240] Hardware name: AppliedMicro X-Gene Mustang Board/X-Gene Mustang Board,
        BIOS 3.06.15 Aug 19 2016
       [  757.692938] task: ffff80069cdd3580 task.stack: ffff8006adb7c000
       [  757.698840] PC is at __flush_dcache_area+0x1c/0x40
       [  757.703613] LR is at kvm_flush_dcache_pmd+0x60/0x70
       [  757.708469] pc : [<ffff20000809dbdc>] lr : [<ffff2000080b4a70>] pstate: 20000145
       ...
       [  758.357249] [<ffff20000809dbdc>] __flush_dcache_area+0x1c/0x40
       [  758.363059] [<ffff2000080b6748>] unmap_stage2_range+0x458/0x5f0
       [  758.368954] [<ffff2000080b708c>] kvm_free_stage2_pgd+0x34/0x60
       [  758.374761] [<ffff2000080b2280>] kvm_arch_destroy_vm+0x20/0x68
       [  758.380570] [<ffff2000080aa330>] kvm_put_kvm+0x210/0x358
       [  758.385860] [<ffff2000080aa524>] kvm_vm_release+0x2c/0x40
       [  758.391239] [<ffff2000082ad234>] __fput+0x114/0x2e8
       [  758.396096] [<ffff2000082ad46c>] ____fput+0xc/0x18
       [  758.400869] [<ffff200008104658>] task_work_run+0x108/0x138
       [  758.406332] [<ffff2000080dc8ec>] do_exit+0x48c/0x10e8
       [  758.411363] [<ffff2000080dd5fc>] do_group_exit+0x6c/0x130
       [  758.416739] [<ffff2000080ed924>] get_signal+0x284/0xa18
       [  758.421943] [<ffff20000808a098>] do_signal+0x158/0x860
       [  758.427060] [<ffff20000808aad4>] do_notify_resume+0x6c/0x88
       [  758.432608] [<ffff200008083624>] work_pending+0x10/0x14
       [  758.437812] Code: 9ac32042 8b010001 d1000443 8a230000 (d50b7e20)
      
      This patch fixes the issue by moving the kvm_free_stage2_pgd() to
      kvm_arch_flush_shadow_all().
      Tested-by: default avatarItaru Kitayama <itaru.kitayama@riken.jp>
      Reported-by: default avatarItaru Kitayama <itaru.kitayama@riken.jp>
      Reported-by: default avatarJames Morse <james.morse@arm.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: default avatarSuzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: default avatarChristoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      21274362
    • Mathias Krause's avatar
      xfrm_user: propagate sec ctx allocation errors · a52b4c2a
      Mathias Krause authored
      commit 2f30ea50 upstream.
      
      When we fail to attach the security context in xfrm_state_construct()
      we'll return 0 as error value which, in turn, will wrongly claim success
      to userland when, in fact, we won't be adding / updating the XFRM state.
      
      This is a regression introduced by commit fd21150a ("[XFRM] netlink:
      Inline attach_encap_tmpl(), attach_sec_ctx(), and attach_one_addr()").
      
      Fix it by propagating the error returned by security_xfrm_state_alloc()
      in this case.
      
      Fixes: fd21150a ("[XFRM] netlink: Inline attach_encap_tmpl()...")
      Signed-off-by: default avatarMathias Krause <minipli@googlemail.com>
      Cc: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: default avatarSteffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      a52b4c2a
    • Takashi Iwai's avatar
      ALSA: rawmidi: Fix possible deadlock with virmidi registration · 2c86c6eb
      Takashi Iwai authored
      commit 816f318b upstream.
      
      When a seq-virmidi driver is initialized, it registers a rawmidi
      instance with its callback to create an associated seq kernel client.
      Currently it's done throughly in rawmidi's register_mutex context.
      Recently it was found that this may lead to a deadlock another rawmidi
      device that is being attached with the sequencer is accessed, as both
      open with the same register_mutex.  This was actually triggered by
      syzkaller, as Dmitry Vyukov reported:
      
      ======================================================
       [ INFO: possible circular locking dependency detected ]
       4.8.0-rc1+ #11 Not tainted
       -------------------------------------------------------
       syz-executor/7154 is trying to acquire lock:
        (register_mutex#5){+.+.+.}, at: [<ffffffff84fd6d4b>] snd_rawmidi_kernel_open+0x4b/0x260 sound/core/rawmidi.c:341
      
       but task is already holding lock:
        (&grp->list_mutex){++++.+}, at: [<ffffffff850138bb>] check_and_subscribe_port+0x5b/0x5c0 sound/core/seq/seq_ports.c:495
      
       which lock already depends on the new lock.
      
       the existing dependency chain (in reverse order) is:
      
       -> #1 (&grp->list_mutex){++++.+}:
          [<ffffffff8147a3a8>] lock_acquire+0x208/0x430 kernel/locking/lockdep.c:3746
          [<ffffffff863f6199>] down_read+0x49/0xc0 kernel/locking/rwsem.c:22
          [<     inline     >] deliver_to_subscribers sound/core/seq/seq_clientmgr.c:681
          [<ffffffff85005c5e>] snd_seq_deliver_event+0x35e/0x890 sound/core/seq/seq_clientmgr.c:822
          [<ffffffff85006e96>] > snd_seq_kernel_client_dispatch+0x126/0x170 sound/core/seq/seq_clientmgr.c:2418
          [<ffffffff85012c52>] snd_seq_system_broadcast+0xb2/0xf0 sound/core/seq/seq_system.c:101
          [<ffffffff84fff70a>] snd_seq_create_kernel_client+0x24a/0x330 sound/core/seq/seq_clientmgr.c:2297
          [<     inline     >] snd_virmidi_dev_attach_seq sound/core/seq/seq_virmidi.c:383
          [<ffffffff8502d29f>] snd_virmidi_dev_register+0x29f/0x750 sound/core/seq/seq_virmidi.c:450
          [<ffffffff84fd208c>] snd_rawmidi_dev_register+0x30c/0xd40 sound/core/rawmidi.c:1645
          [<ffffffff84f816d3>] __snd_device_register.part.0+0x63/0xc0 sound/core/device.c:164
          [<     inline     >] __snd_device_register sound/core/device.c:162
          [<ffffffff84f8235d>] snd_device_register_all+0xad/0x110 sound/core/device.c:212
          [<ffffffff84f7546f>] snd_card_register+0xef/0x6c0 sound/core/init.c:749
          [<ffffffff85040b7f>] snd_virmidi_probe+0x3ef/0x590 sound/drivers/virmidi.c:123
          [<ffffffff833ebf7b>] platform_drv_probe+0x8b/0x170 drivers/base/platform.c:564
          ......
      
       -> #0 (register_mutex#5){+.+.+.}:
          [<     inline     >] check_prev_add kernel/locking/lockdep.c:1829
          [<     inline     >] check_prevs_add kernel/locking/lockdep.c:1939
          [<     inline     >] validate_chain kernel/locking/lockdep.c:2266
          [<ffffffff814791f4>] __lock_acquire+0x4d44/0x4d80 kernel/locking/lockdep.c:3335
          [<ffffffff8147a3a8>] lock_acquire+0x208/0x430 kernel/locking/lockdep.c:3746
          [<     inline     >] __mutex_lock_common kernel/locking/mutex.c:521
          [<ffffffff863f0ef1>] mutex_lock_nested+0xb1/0xa20 kernel/locking/mutex.c:621
          [<ffffffff84fd6d4b>] snd_rawmidi_kernel_open+0x4b/0x260 sound/core/rawmidi.c:341
          [<ffffffff8502e7c7>] midisynth_subscribe+0xf7/0x350 sound/core/seq/seq_midi.c:188
          [<     inline     >] subscribe_port sound/core/seq/seq_ports.c:427
          [<ffffffff85013cc7>] check_and_subscribe_port+0x467/0x5c0 sound/core/seq/seq_ports.c:510
          [<ffffffff85015da9>] snd_seq_port_connect+0x2c9/0x500 sound/core/seq/seq_ports.c:579
          [<ffffffff850079b8>] snd_seq_ioctl_subscribe_port+0x1d8/0x2b0 sound/core/seq/seq_clientmgr.c:1480
          [<ffffffff84ffe9e4>] snd_seq_do_ioctl+0x184/0x1e0 sound/core/seq/seq_clientmgr.c:2225
          [<ffffffff84ffeae8>] snd_seq_kernel_client_ctl+0xa8/0x110 sound/core/seq/seq_clientmgr.c:2440
          [<ffffffff85027664>] snd_seq_oss_midi_open+0x3b4/0x610 sound/core/seq/oss/seq_oss_midi.c:375
          [<ffffffff85023d67>] snd_seq_oss_synth_setup_midi+0x107/0x4c0 sound/core/seq/oss/seq_oss_synth.c:281
          [<ffffffff8501b0a8>] snd_seq_oss_open+0x748/0x8d0 sound/core/seq/oss/seq_oss_init.c:274
          [<ffffffff85019d8a>] odev_open+0x6a/0x90 sound/core/seq/oss/seq_oss.c:138
          [<ffffffff84f7040f>] soundcore_open+0x30f/0x640 sound/sound_core.c:639
          ......
      
       other info that might help us debug this:
      
       Possible unsafe locking scenario:
      
              CPU0                    CPU1
              ----                    ----
         lock(&grp->list_mutex);
                                      lock(register_mutex#5);
                                      lock(&grp->list_mutex);
         lock(register_mutex#5);
      
       *** DEADLOCK ***
      ======================================================
      
      The fix is to simply move the registration parts in
      snd_rawmidi_dev_register() to the outside of the register_mutex lock.
      The lock is needed only to manage the linked list, and it's not
      necessarily to cover the whole initialization process.
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      [bwh: Backported to 3.16: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      2c86c6eb
    • Takashi Iwai's avatar
      ALSA: timer: Fix zero-division by continue of uninitialized instance · 0431fcc2
      Takashi Iwai authored
      commit 9f8a7658 upstream.
      
      When a user timer instance is continued without the explicit start
      beforehand, the system gets eventually zero-division error like:
      
        divide error: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN
        CPU: 1 PID: 27320 Comm: syz-executor Not tainted 4.8.0-rc3-next-20160825+ #8
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
         task: ffff88003c9b2280 task.stack: ffff880027280000
         RIP: 0010:[<ffffffff858e1a6c>]  [<     inline     >] ktime_divns include/linux/ktime.h:195
         RIP: 0010:[<ffffffff858e1a6c>]  [<ffffffff858e1a6c>] snd_hrtimer_callback+0x1bc/0x3c0 sound/core/hrtimer.c:62
        Call Trace:
         <IRQ>
         [<     inline     >] __run_hrtimer kernel/time/hrtimer.c:1238
         [<ffffffff81504335>] __hrtimer_run_queues+0x325/0xe70 kernel/time/hrtimer.c:1302
         [<ffffffff81506ceb>] hrtimer_interrupt+0x18b/0x420 kernel/time/hrtimer.c:1336
         [<ffffffff8126d8df>] local_apic_timer_interrupt+0x6f/0xe0 arch/x86/kernel/apic/apic.c:933
         [<ffffffff86e13056>] smp_apic_timer_interrupt+0x76/0xa0 arch/x86/kernel/apic/apic.c:957
         [<ffffffff86e1210c>] apic_timer_interrupt+0x8c/0xa0 arch/x86/entry/entry_64.S:487
         <EOI>
         .....
      
      Although a similar issue was spotted and a fix patch was merged in
      commit [6b760bb2: ALSA: timer: fix division by zero after
      SNDRV_TIMER_IOCTL_CONTINUE], it seems covering only a part of
      iceberg.
      
      In this patch, we fix the issue a bit more drastically.  Basically the
      continue of an uninitialized timer is supposed to be a fresh start, so
      we do it for user timers.  For the direct snd_timer_continue() call,
      there is no way to pass the initial tick value, so we kick out for the
      uninitialized case.
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      [bwh: Backported to 3.16:
       - Adjust context
       - In _snd_timer_stop(), check the value of 'event' instead of 'stop']
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      0431fcc2
    • Ard Biesheuvel's avatar
      crypto: cryptd - initialize child shash_desc on import · 151b56d3
      Ard Biesheuvel authored
      commit 0bd22235 upstream.
      
      When calling .import() on a cryptd ahash_request, the structure members
      that describe the child transform in the shash_desc need to be initialized
      like they are when calling .init()
      Signed-off-by: default avatarArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      151b56d3
    • Wei Yongjun's avatar
      ipv6: addrconf: fix dev refcont leak when DAD failed · fbdcb5e2
      Wei Yongjun authored
      commit 751eb6b6 upstream.
      
      In general, when DAD detected IPv6 duplicate address, ifp->state
      will be set to INET6_IFADDR_STATE_ERRDAD and DAD is stopped by a
      delayed work, the call tree should be like this:
      
      ndisc_recv_ns
        -> addrconf_dad_failure        <- missing ifp put
           -> addrconf_mod_dad_work
             -> schedule addrconf_dad_work()
               -> addrconf_dad_stop()  <- missing ifp hold before call it
      
      addrconf_dad_failure() called with ifp refcont holding but not put.
      addrconf_dad_work() call addrconf_dad_stop() without extra holding
      refcount. This will not cause any issue normally.
      
      But the race between addrconf_dad_failure() and addrconf_dad_work()
      may cause ifp refcount leak and netdevice can not be unregister,
      dmesg show the following messages:
      
      IPv6: eth0: IPv6 duplicate address fe80::XX:XXXX:XXXX:XX detected!
      ...
      unregister_netdevice: waiting for eth0 to become free. Usage count = 1
      
      Fixes: c15b1cca ("ipv6: move DAD and addrconf_verify processing
      to workqueue")
      Signed-off-by: default avatarWei Yongjun <weiyongjun1@huawei.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      fbdcb5e2
    • Chris Mason's avatar
      Btrfs: remove root_log_ctx from ctx list before btrfs_sync_log returns · 9323870d
      Chris Mason authored
      commit cbd60aa7 upstream.
      
      We use a btrfs_log_ctx structure to pass information into the
      tree log commit, and get error values out.  It gets added to a per
      log-transaction list which we walk when things go bad.
      
      Commit d1433deb added an optimization to skip waiting for the log
      commit, but didn't take root_log_ctx out of the list.  This
      patch makes sure we remove things before exiting.
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      Fixes: d1433debSigned-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      9323870d
    • Forrest Liu's avatar
      Btrfs: add missing blk_finish_plug in btrfs_sync_log() · bc892cbf
      Forrest Liu authored
      commit 3da5ab56 upstream.
      
      Add missing blk_finish_plug in btrfs_sync_log()
      Signed-off-by: default avatarForrest Liu <forrestl@synology.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.cz>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      bc892cbf
    • Gregor Boirie's avatar
      iio:core: fix IIO_VAL_FRACTIONAL sign handling · d91d35de
      Gregor Boirie authored
      commit 171c0091 upstream.
      
      7985e7c1 ("iio: Introduce a new fractional value type") introduced a
      new IIO_VAL_FRACTIONAL value type meant to represent rational type numbers
      expressed by a numerator and denominator combination.
      
      Formating of IIO_VAL_FRACTIONAL values relies upon do_div() usage. This
      fails handling negative values properly since parameters are reevaluated
      as unsigned values.
      Fix this by using div_s64_rem() instead. Computed integer part will carry
      properly signed value. Formatted fractional part will always be positive.
      
      Fixes: 7985e7c1 ("iio: Introduce a new fractional value type")
      Signed-off-by: default avatarGregor Boirie <gregor.boirie@parrot.com>
      Reviewed-by: default avatarLars-Peter Clausen <lars@metafoo.de>
      Signed-off-by: default avatarJonathan Cameron <jic23@kernel.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      d91d35de
    • Jeffrey Hugo's avatar
      efi/libstub: Allocate headspace in efi_get_memory_map() · b4bcdf68
      Jeffrey Hugo authored
      commit dadb57ab upstream.
      
      efi_get_memory_map() allocates a buffer to store the memory map that it
      retrieves.  This buffer may need to be reused by the client after
      ExitBootServices() is called, at which point allocations are not longer
      permitted.  To support this usecase, provide the allocated buffer size back
      to the client, and allocate some additional headroom to account for any
      reasonable growth in the map that is likely to happen between the call to
      efi_get_memory_map() and the client reusing the buffer.
      Signed-off-by: default avatarJeffrey Hugo <jhugo@codeaurora.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Leif Lindholm <leif.lindholm@linaro.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarMatt Fleming <matt@codeblueprint.co.uk>
      [bwh: Backported to 3.16:
       - Adjust filenames, context
       - In allocate_new_fdt_and_exit_boot(), only fill memory_map
       - Drop changes to efi_random_alloc()]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      b4bcdf68
    • Yoshihiro Shimoda's avatar
      usb: renesas_usbhs: fix clearing the {BRDY,BEMP}STS condition · 35f6b4a9
      Yoshihiro Shimoda authored
      commit 519d8bd4 upstream.
      
      The previous driver is possible to stop the transfer wrongly.
      For example:
       1) An interrupt happens, but not BRDY interruption.
       2) Read INTSTS0. And than state->intsts0 is not set to BRDY.
       3) BRDY is set to 1 here.
       4) Read BRDYSTS.
       5) Clear the BRDYSTS. And then. the BRDY is cleared wrongly.
      
      Remarks:
       - The INTSTS0.BRDY is read only.
        - If any bits of BRDYSTS are set to 1, the BRDY is set to 1.
        - If BRDYSTS is 0, the BRDY is set to 0.
      
      So, this patch adds condition to avoid such situation. (And about
      NRDYSTS, this is not used for now. But, avoiding any side effects,
      this patch doesn't touch it.)
      
      Fixes: d5c6a1e0 ("usb: renesas_usbhs: fixup interrupt status clear method")
      Signed-off-by: default avatarYoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
      Signed-off-by: default avatarFelipe Balbi <felipe.balbi@linux.intel.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      35f6b4a9
    • Balbir Singh's avatar
      sched/core: Fix a race between try_to_wake_up() and a woken up task · 1a06e104
      Balbir Singh authored
      commit 135e8c92 upstream.
      
      The origin of the issue I've seen is related to
      a missing memory barrier between check for task->state and
      the check for task->on_rq.
      
      The task being woken up is already awake from a schedule()
      and is doing the following:
      
      	do {
      		schedule()
      		set_current_state(TASK_(UN)INTERRUPTIBLE);
      	} while (!cond);
      
      The waker, actually gets stuck doing the following in
      try_to_wake_up():
      
      	while (p->on_cpu)
      		cpu_relax();
      
      Analysis:
      
      The instance I've seen involves the following race:
      
       CPU1					CPU2
      
       while () {
         if (cond)
           break;
         do {
           schedule();
           set_current_state(TASK_UN..)
         } while (!cond);
      					wakeup_routine()
      					  spin_lock_irqsave(wait_lock)
         raw_spin_lock_irqsave(wait_lock)	  wake_up_process()
       }					  try_to_wake_up()
       set_current_state(TASK_RUNNING);	  ..
       list_del(&waiter.list);
      
      CPU2 wakes up CPU1, but before it can get the wait_lock and set
      current state to TASK_RUNNING the following occurs:
      
       CPU3
       wakeup_routine()
       raw_spin_lock_irqsave(wait_lock)
       if (!list_empty)
         wake_up_process()
         try_to_wake_up()
         raw_spin_lock_irqsave(p->pi_lock)
         ..
         if (p->on_rq && ttwu_wakeup())
         ..
         while (p->on_cpu)
           cpu_relax()
         ..
      
      CPU3 tries to wake up the task on CPU1 again since it finds
      it on the wait_queue, CPU1 is spinning on wait_lock, but immediately
      after CPU2, CPU3 got it.
      
      CPU3 checks the state of p on CPU1, it is TASK_UNINTERRUPTIBLE and
      the task is spinning on the wait_lock. Interestingly since p->on_rq
      is checked under pi_lock, I've noticed that try_to_wake_up() finds
      p->on_rq to be 0. This was the most confusing bit of the analysis,
      but p->on_rq is changed under runqueue lock, rq_lock, the p->on_rq
      check is not reliable without this fix IMHO. The race is visible
      (based on the analysis) only when ttwu_queue() does a remote wakeup
      via ttwu_queue_remote. In which case the p->on_rq change is not
      done uder the pi_lock.
      
      The result is that after a while the entire system locks up on
      the raw_spin_irqlock_save(wait_lock) and the holder spins infintely
      
      Reproduction of the issue:
      
      The issue can be reproduced after a long run on my system with 80
      threads and having to tweak available memory to very low and running
      memory stress-ng mmapfork test. It usually takes a long time to
      reproduce. I am trying to work on a test case that can reproduce
      the issue faster, but thats work in progress. I am still testing the
      changes on my still in a loop and the tests seem OK thus far.
      
      Big thanks to Benjamin and Nick for helping debug this as well.
      Ben helped catch the missing barrier, Nick caught every missing
      bit in my theory.
      Signed-off-by: default avatarBalbir Singh <bsingharora@gmail.com>
      [ Updated comment to clarify matching barriers. Many
        architectures do not have a full barrier in switch_to()
        so that cannot be relied upon. ]
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: default avatarBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nicholas Piggin <nicholas.piggin@gmail.com>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/e02cce7b-d9ca-1ad0-7a61-ea97c7582b37@gmail.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      1a06e104
    • Linus Walleij's avatar
      iio: accel: kxsd9: Fix scaling bug · 3481ab8a
      Linus Walleij authored
      commit 307fe9dd upstream.
      
      All the scaling of the KXSD9 involves multiplication with a
      fraction number < 1.
      
      However the scaling value returned from IIO_INFO_SCALE was
      unpredictable as only the micros of the value was assigned, and
      not the integer part, resulting in scaling like this:
      
      $cat in_accel_scale
      -1057462640.011978
      
      Fix this by assigning zero to the integer part.
      Tested-by: default avatarJonathan Cameron <jic23@kernel.org>
      Signed-off-by: default avatarLinus Walleij <linus.walleij@linaro.org>
      Signed-off-by: default avatarJonathan Cameron <jic23@kernel.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      3481ab8a
    • Kweh, Hock Leong's avatar
      iio: fix pressure data output unit in hid-sensor-attributes · ba041065
      Kweh, Hock Leong authored
      commit 36afb176 upstream.
      
      According to IIO ABI definition, IIO_PRESSURE data output unit is
      kilopascal:
      http://lxr.free-electrons.com/source/Documentation/ABI/testing/sysfs-bus-iio
      
      This patch fix output unit of HID pressure sensor IIO driver from pascal to
      kilopascal to follow IIO ABI definition.
      Signed-off-by: default avatarKweh, Hock Leong <hock.leong.kweh@intel.com>
      Reviewed-by: default avatarSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      Signed-off-by: default avatarJonathan Cameron <jic23@kernel.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      ba041065
    • Sabrina Dubroca's avatar
      l2tp: fix use-after-free during module unload · 047a7e9c
      Sabrina Dubroca authored
      commit 2f86953e upstream.
      
      Tunnel deletion is delayed by both a workqueue (l2tp_tunnel_delete -> wq
       -> l2tp_tunnel_del_work) and RCU (sk_destruct -> RCU ->
      l2tp_tunnel_destruct).
      
      By the time l2tp_tunnel_destruct() runs to destroy the tunnel and finish
      destroying the socket, the private data reserved via the net_generic
      mechanism has already been freed, but l2tp_tunnel_destruct() actually
      uses this data.
      
      Make sure tunnel deletion for the netns has completed before returning
      from l2tp_exit_net() by first flushing the tunnel removal workqueue, and
      then waiting for RCU callbacks to complete.
      
      Fixes: 167eb17e ("l2tp: create tunnel sockets in the right namespace")
      Signed-off-by: default avatarSabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      047a7e9c
    • Emanuel Czirai's avatar
      x86/AMD: Apply erratum 665 on machines without a BIOS fix · b9ebc787
      Emanuel Czirai authored
      commit d1992996 upstream.
      
      AMD F12h machines have an erratum which can cause DIV/IDIV to behave
      unpredictably. The workaround is to set MSRC001_1029[31] but sometimes
      there is no BIOS update containing that workaround so let's do it
      ourselves unconditionally. It is simple enough.
      
      [ Borislav: Wrote commit message. ]
      Signed-off-by: default avatarEmanuel Czirai <icanrealizeum@gmail.com>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Cc: Yaowu Xu <yaowu@google.com>
      Link: http://lkml.kernel.org/r/20160902053550.18097-1-bp@alien8.deSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      [bwh: Backported to 3.16:
       - Add an if-statement to init_amd() in place of the switch
       - Adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      b9ebc787
    • Erez Shitrit's avatar
      IB/ipoib: Fix memory corruption in ipoib cm mode connect flow · 469b4997
      Erez Shitrit authored
      commit 546481c2 upstream.
      
      When a new CM connection is being requested, ipoib driver copies data
      from the path pointer in the CM/tx object, the path object might be
      invalid at the point and memory corruption will happened later when now
      the CM driver will try using that data.
      
      The next scenario demonstrates it:
      	neigh_add_path --> ipoib_cm_create_tx -->
      	queue_work (pointer to path is in the cm/tx struct)
      	#while the work is still in the queue,
      	#the port goes down and causes the ipoib_flush_paths:
      	ipoib_flush_paths --> path_free --> kfree(path)
      	#at this point the work scheduled starts.
      	ipoib_cm_tx_start --> copy from the (invalid)path pointer:
      	(memcpy(&pathrec, &p->path->pathrec, sizeof pathrec);)
      	 -> memory corruption.
      
      To fix that the driver now starts the CM/tx connection only if that
      specific path exists in the general paths database.
      This check is protected with the relevant locks, and uses the gid from
      the neigh member in the CM/tx object which is valid according to the ref
      count that was taken by the CM/tx.
      
      Fixes: 839fcaba ('IPoIB: Connected mode experimental support')
      Signed-off-by: default avatarErez Shitrit <erezsh@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leon@kernel.org>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      469b4997
    • Erez Shitrit's avatar
      IB/core: Fix use after free in send_leave function · 44caa356
      Erez Shitrit authored
      commit 68c6bcdd upstream.
      
      The function send_leave sets the member: group->query_id
      (group->query_id = ret) after calling the sa_query, but leave_handler
      can be executed before the setting and it might delete the group object,
      and will get a memory corruption.
      
      Additionally, this patch gets rid of group->query_id variable which is
      not used.
      
      Fixes: faec2f7b ('IB/sa: Track multicast join/leave requests')
      Signed-off-by: default avatarErez Shitrit <erezsh@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leon@kernel.org>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      44caa356
    • Steven Rostedt's avatar
      x86/paravirt: Do not trace _paravirt_ident_*() functions · 05554fc3
      Steven Rostedt authored
      commit 15301a57 upstream.
      
      Łukasz Daniluk reported that on a RHEL kernel that his machine would lock up
      after enabling function tracer. I asked him to bisect the functions within
      available_filter_functions, which he did and it came down to three:
      
        _paravirt_nop(), _paravirt_ident_32() and _paravirt_ident_64()
      
      It was found that this is only an issue when noreplace-paravirt is added
      to the kernel command line.
      
      This means that those functions are most likely called within critical
      sections of the funtion tracer, and must not be traced.
      
      In newer kenels _paravirt_nop() is defined within gcc asm(), and is no
      longer an issue.  But both _paravirt_ident_{32,64}() causes the
      following splat when they are traced:
      
       mm/pgtable-generic.c:33: bad pmd ffff8800d2435150(0000000001d00054)
       mm/pgtable-generic.c:33: bad pmd ffff8800d3624190(0000000001d00070)
       mm/pgtable-generic.c:33: bad pmd ffff8800d36a5110(0000000001d00054)
       mm/pgtable-generic.c:33: bad pmd ffff880118eb1450(0000000001d00054)
       NMI watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [systemd-journal:469]
       Modules linked in: e1000e
       CPU: 2 PID: 469 Comm: systemd-journal Not tainted 4.6.0-rc4-test+ #513
       Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
       task: ffff880118f740c0 ti: ffff8800d4aec000 task.ti: ffff8800d4aec000
       RIP: 0010:[<ffffffff81134148>]  [<ffffffff81134148>] queued_spin_lock_slowpath+0x118/0x1a0
       RSP: 0018:ffff8800d4aefb90  EFLAGS: 00000246
       RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff88011eb16d40
       RDX: ffffffff82485760 RSI: 000000001f288820 RDI: ffffea0000008030
       RBP: ffff8800d4aefb90 R08: 00000000000c0000 R09: 0000000000000000
       R10: ffffffff821c8e0e R11: 0000000000000000 R12: ffff880000200fb8
       R13: 00007f7a4e3f7000 R14: ffffea000303f600 R15: ffff8800d4b562e0
       FS:  00007f7a4e3d7840(0000) GS:ffff88011eb00000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 00007f7a4e3f7000 CR3: 00000000d3e71000 CR4: 00000000001406e0
       Call Trace:
         _raw_spin_lock+0x27/0x30
         handle_pte_fault+0x13db/0x16b0
         handle_mm_fault+0x312/0x670
         __do_page_fault+0x1b1/0x4e0
         do_page_fault+0x22/0x30
         page_fault+0x28/0x30
         __vfs_read+0x28/0xe0
         vfs_read+0x86/0x130
         SyS_read+0x46/0xa0
         entry_SYSCALL_64_fastpath+0x1e/0xa8
       Code: 12 48 c1 ea 0c 83 e8 01 83 e2 30 48 98 48 81 c2 40 6d 01 00 48 03 14 c5 80 6a 5d 82 48 89 0a 8b 41 08 85 c0 75 09 f3 90 8b 41 08 <85> c0 74 f7 4c 8b 09 4d 85 c9 74 08 41 0f 18 09 eb 02 f3 90 8b
      Reported-by: default avatarŁukasz Daniluk <lukasz.daniluk@intel.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      05554fc3
    • Vegard Nossum's avatar
      ALSA: timer: fix NULL pointer dereference in read()/ioctl() race · e1df1219
      Vegard Nossum authored
      commit 11749e08 upstream.
      
      I got this with syzkaller:
      
          ==================================================================
          BUG: KASAN: null-ptr-deref on address 0000000000000020
          Read of size 32 by task syz-executor/22519
          CPU: 1 PID: 22519 Comm: syz-executor Not tainted 4.8.0-rc2+ #169
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2
          014
           0000000000000001 ffff880111a17a00 ffffffff81f9f141 ffff880111a17a90
           ffff880111a17c50 ffff880114584a58 ffff880114584a10 ffff880111a17a80
           ffffffff8161fe3f ffff880100000000 ffff880118d74a48 ffff880118d74a68
          Call Trace:
           [<ffffffff81f9f141>] dump_stack+0x83/0xb2
           [<ffffffff8161fe3f>] kasan_report_error+0x41f/0x4c0
           [<ffffffff8161ff74>] kasan_report+0x34/0x40
           [<ffffffff82c84b54>] ? snd_timer_user_read+0x554/0x790
           [<ffffffff8161e79e>] check_memory_region+0x13e/0x1a0
           [<ffffffff8161e9c1>] kasan_check_read+0x11/0x20
           [<ffffffff82c84b54>] snd_timer_user_read+0x554/0x790
           [<ffffffff82c84600>] ? snd_timer_user_info_compat.isra.5+0x2b0/0x2b0
           [<ffffffff817d0831>] ? proc_fault_inject_write+0x1c1/0x250
           [<ffffffff817d0670>] ? next_tgid+0x2a0/0x2a0
           [<ffffffff8127c278>] ? do_group_exit+0x108/0x330
           [<ffffffff8174653a>] ? fsnotify+0x72a/0xca0
           [<ffffffff81674dfe>] __vfs_read+0x10e/0x550
           [<ffffffff82c84600>] ? snd_timer_user_info_compat.isra.5+0x2b0/0x2b0
           [<ffffffff81674cf0>] ? do_sendfile+0xc50/0xc50
           [<ffffffff81745e10>] ? __fsnotify_update_child_dentry_flags+0x60/0x60
           [<ffffffff8143fec6>] ? kcov_ioctl+0x56/0x190
           [<ffffffff81e5ada2>] ? common_file_perm+0x2e2/0x380
           [<ffffffff81746b0e>] ? __fsnotify_parent+0x5e/0x2b0
           [<ffffffff81d93536>] ? security_file_permission+0x86/0x1e0
           [<ffffffff816728f5>] ? rw_verify_area+0xe5/0x2b0
           [<ffffffff81675355>] vfs_read+0x115/0x330
           [<ffffffff81676371>] SyS_read+0xd1/0x1a0
           [<ffffffff816762a0>] ? vfs_write+0x4b0/0x4b0
           [<ffffffff82001c2c>] ? __this_cpu_preempt_check+0x1c/0x20
           [<ffffffff8150455a>] ? __context_tracking_exit.part.4+0x3a/0x1e0
           [<ffffffff816762a0>] ? vfs_write+0x4b0/0x4b0
           [<ffffffff81005524>] do_syscall_64+0x1c4/0x4e0
           [<ffffffff810052fc>] ? syscall_return_slowpath+0x16c/0x1d0
           [<ffffffff83c3276a>] entry_SYSCALL64_slow_path+0x25/0x25
          ==================================================================
      
      There are a couple of problems that I can see:
      
       - ioctl(SNDRV_TIMER_IOCTL_SELECT), which potentially sets
         tu->queue/tu->tqueue to NULL on memory allocation failure, so read()
         would get a NULL pointer dereference like the above splat
      
       - the same ioctl() can free tu->queue/to->tqueue which means read()
         could potentially see (and dereference) the freed pointer
      
      We can fix both by taking the ioctl_lock mutex when dereferencing
      ->queue/->tqueue, since that's always held over all the ioctl() code.
      
      Just looking at the code I find it likely that there are more problems
      here such as tu->qhead pointing outside the buffer if the size is
      changed concurrently using SNDRV_TIMER_IOCTL_PARAMS.
      Signed-off-by: default avatarVegard Nossum <vegard.nossum@oracle.com>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      e1df1219
    • Michal Hocko's avatar
      kernel/fork: fix CLONE_CHILD_CLEARTID regression in nscd · e717abc5
      Michal Hocko authored
      commit 735f2770 upstream.
      
      Commit fec1d011 ("[PATCH] Disable CLONE_CHILD_CLEARTID for abnormal
      exit") has caused a subtle regression in nscd which uses
      CLONE_CHILD_CLEARTID to clear the nscd_certainly_running flag in the
      shared databases, so that the clients are notified when nscd is
      restarted.  Now, when nscd uses a non-persistent database, clients that
      have it mapped keep thinking the database is being updated by nscd, when
      in fact nscd has created a new (anonymous) one (for non-persistent
      databases it uses an unlinked file as backend).
      
      The original proposal for the CLONE_CHILD_CLEARTID change claimed
      (https://lkml.org/lkml/2006/10/25/233):
      
      : The NPTL library uses the CLONE_CHILD_CLEARTID flag on clone() syscalls
      : on behalf of pthread_create() library calls.  This feature is used to
      : request that the kernel clear the thread-id in user space (at an address
      : provided in the syscall) when the thread disassociates itself from the
      : address space, which is done in mm_release().
      :
      : Unfortunately, when a multi-threaded process incurs a core dump (such as
      : from a SIGSEGV), the core-dumping thread sends SIGKILL signals to all of
      : the other threads, which then proceed to clear their user-space tids
      : before synchronizing in exit_mm() with the start of core dumping.  This
      : misrepresents the state of process's address space at the time of the
      : SIGSEGV and makes it more difficult for someone to debug NPTL and glibc
      : problems (misleading him/her to conclude that the threads had gone away
      : before the fault).
      :
      : The fix below is to simply avoid the CLONE_CHILD_CLEARTID action if a
      : core dump has been initiated.
      
      The resulting patch from Roland (https://lkml.org/lkml/2006/10/26/269)
      seems to have a larger scope than the original patch asked for.  It
      seems that limitting the scope of the check to core dumping should work
      for SIGSEGV issue describe above.
      
      [Changelog partly based on Andreas' description]
      Fixes: fec1d011 ("[PATCH] Disable CLONE_CHILD_CLEARTID for abnormal exit")
      Link: http://lkml.kernel.org/r/1471968749-26173-1-git-send-email-mhocko@kernel.orgSigned-off-by: default avatarMichal Hocko <mhocko@suse.com>
      Tested-by: default avatarWilliam Preston <wpreston@suse.com>
      Acked-by: default avatarOleg Nesterov <oleg@redhat.com>
      Cc: Roland McGrath <roland@hack.frob.com>
      Cc: Andreas Schwab <schwab@suse.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      e717abc5
    • Neal Cardwell's avatar
      tcp: fastopen: fix rcv_wup initialization for TFO server on SYN/data · 9e2c5a54
      Neal Cardwell authored
      commit 28b346cb upstream.
      
      Yuchung noticed that on the first TFO server data packet sent after
      the (TFO) handshake, the server echoed the TCP timestamp value in the
      SYN/data instead of the timestamp value in the final ACK of the
      handshake. This problem did not happen on regular opens.
      
      The tcp_replace_ts_recent() logic that decides whether to remember an
      incoming TS value needs tp->rcv_wup to hold the latest receive
      sequence number that we have ACKed (latest tp->rcv_nxt we have
      ACKed). This commit fixes this issue by ensuring that a TFO server
      properly updates tp->rcv_wup to match tp->rcv_nxt at the time it sends
      a SYN/ACK for the SYN/data.
      Reported-by: default avatarYuchung Cheng <ycheng@google.com>
      Signed-off-by: default avatarNeal Cardwell <ncardwell@google.com>
      Signed-off-by: default avatarYuchung Cheng <ycheng@google.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarSoheil Hassas Yeganeh <soheil@google.com>
      Fixes: 168a8f58 ("tcp: TCP Fast Open Server - main code path")
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [bwh: Backported to 3.16: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      9e2c5a54
    • Nicolas Dichtel's avatar
      ipv6: add missing netconf notif when 'all' is updated · 37129ea3
      Nicolas Dichtel authored
      commit d26c638c upstream.
      
      The 'default' value was not advertised.
      
      Fixes: f3a1bfb1 ("rtnl/ipv6: use netconf msg to advertise forwarding status")
      Signed-off-by: default avatarNicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      37129ea3
    • Jimi Damon's avatar
      serial: 8250: added acces i/o products quad and octal serial cards · 41a50683
      Jimi Damon authored
      commit c8d19242 upstream.
      
      Added devices ids for acces i/o products quad and octal serial cards
      that make use of existing Pericom PI7C9X7954 and PI7C9X7958
      configurations .
      Signed-off-by: default avatarJimi Damon <jdamon@accesio.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      41a50683
    • Takashi Sakamoto's avatar
      ALSA: fireworks: accessing to user space outside spinlock · 3aa1076b
      Takashi Sakamoto authored
      commit 6b1ca4bc upstream.
      
      In hwdep interface of fireworks driver, accessing to user space is in a
      critical section with disabled local interrupt. Depending on architecture,
      accessing to user space can cause page fault exception. Then local
      processor stores machine status and handles the synchronous event. A
      handler corresponding to the event can call task scheduler to wait for
      preparing pages. In a case of usage of single core processor, the state to
      disable local interrupt is worse because it don't handle usual interrupts
      from hardware.
      
      This commit fixes this bug, performing the accessing outside spinlock. This
      commit also gives up counting the number of queued response messages to
      simplify ring-buffer management.
      Reported-by: default avatarVaishali Thakkar <vaishali.thakkar@oracle.com>
      Fixes: 555e8a8f('ALSA: fireworks: Add command/response functionality into hwdep interface')
      Signed-off-by: default avatarTakashi Sakamoto <o-takashi@sakamocchi.jp>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      3aa1076b
    • Tejun Heo's avatar
      kernfs: don't depend on d_find_any_alias() when generating notifications · 01324354
      Tejun Heo authored
      commit df6a58c5 upstream.
      
      kernfs_notify_workfn() sends out file modified events for the
      scheduled kernfs_nodes.  Because the modifications aren't from
      userland, it doesn't have the matching file struct at hand and can't
      use fsnotify_modify().  Instead, it looked up the inode and then used
      d_find_any_alias() to find the dentry and used fsnotify_parent() and
      fsnotify() directly to generate notifications.
      
      The assumption was that the relevant dentries would have been pinned
      if there are listeners, which isn't true as inotify doesn't pin
      dentries at all and watching the parent doesn't pin the child dentries
      even for dnotify.  This led to, for example, inotify watchers not
      getting notifications if the system is under memory pressure and the
      matching dentries got reclaimed.  It can also be triggered through
      /proc/sys/vm/drop_caches or a remount attempt which involves shrinking
      dcache.
      
      fsnotify_parent() only uses the dentry to access the parent inode,
      which kernfs can do easily.  Update kernfs_notify_workfn() so that it
      uses fsnotify() directly for both the parent and target inodes without
      going through d_find_any_alias().  While at it, supply the target file
      name to fsnotify() from kernfs_node->name.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Reported-by: default avatarEvgeny Vereshchagin <evvers@ya.ru>
      Fixes: d911d987 ("kernfs: make kernfs_notify() trigger inotify events too")
      Cc: John McCutchan <john@johnmccutchan.com>
      Cc: Robert Love <rlove@rlove.org>
      Cc: Eric Paris <eparis@parisplace.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      01324354
    • Eric Biggers's avatar
      dm crypt: fix free of bad values after tfm allocation failure · afa82093
      Eric Biggers authored
      commit 5d0be84e upstream.
      
      If crypt_alloc_tfms() had to allocate multiple tfms and it failed before
      the last allocation, then it would call crypt_free_tfms() and could free
      pointers from uninitialized memory -- due to the crypt_free_tfms() check
      for non-zero cc->tfms[i].  Fix by allocating zeroed memory.
      Signed-off-by: default avatarEric Biggers <ebiggers@google.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      [bwh: Backported to 3.16: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      afa82093
    • Trond Myklebust's avatar
      NFSv4.x: Fix a refcount leak in nfs_callback_up_net · c512940b
      Trond Myklebust authored
      commit 98b0f80c upstream.
      
      On error, the callers expect us to return without bumping
      nn->cb_users[].
      Signed-off-by: default avatarTrond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      c512940b
    • Vegard Nossum's avatar
      ALSA: timer: fix NULL pointer dereference on memory allocation failure · e5170157
      Vegard Nossum authored
      commit 8ddc0563 upstream.
      
      I hit this with syzkaller:
      
          kasan: CONFIG_KASAN_INLINE enabled
          kasan: GPF could be caused by NULL-ptr deref or user memory access
          general protection fault: 0000 [#1] PREEMPT SMP KASAN
          CPU: 0 PID: 1327 Comm: a.out Not tainted 4.8.0-rc2+ #190
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
          task: ffff88011278d600 task.stack: ffff8801120c0000
          RIP: 0010:[<ffffffff82c8ba07>]  [<ffffffff82c8ba07>] snd_hrtimer_start+0x77/0x100
          RSP: 0018:ffff8801120c7a60  EFLAGS: 00010006
          RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000007
          RDX: 0000000000000009 RSI: 1ffff10023483091 RDI: 0000000000000048
          RBP: ffff8801120c7a78 R08: ffff88011a5cf768 R09: ffff88011a5ba790
          R10: 0000000000000002 R11: ffffed00234b9ef1 R12: ffff880114843980
          R13: ffffffff84213c00 R14: ffff880114843ab0 R15: 0000000000000286
          FS:  00007f72958f3700(0000) GS:ffff88011aa00000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
          CR2: 0000000000603001 CR3: 00000001126ab000 CR4: 00000000000006f0
          Stack:
           ffff880114843980 ffff880111eb2dc0 ffff880114843a34 ffff8801120c7ad0
           ffffffff82c81ab1 0000000000000000 ffffffff842138e0 0000000100000000
           ffff880111eb2dd0 ffff880111eb2dc0 0000000000000001 ffff880111eb2dc0
          Call Trace:
           [<ffffffff82c81ab1>] snd_timer_start1+0x331/0x670
           [<ffffffff82c85bfd>] snd_timer_start+0x5d/0xa0
           [<ffffffff82c8795e>] snd_timer_user_ioctl+0x88e/0x2830
           [<ffffffff8159f3a0>] ? __follow_pte.isra.49+0x430/0x430
           [<ffffffff82c870d0>] ? snd_timer_pause+0x80/0x80
           [<ffffffff815a26fa>] ? do_wp_page+0x3aa/0x1c90
           [<ffffffff8132762f>] ? put_prev_entity+0x108f/0x21a0
           [<ffffffff82c870d0>] ? snd_timer_pause+0x80/0x80
           [<ffffffff816b0733>] do_vfs_ioctl+0x193/0x1050
           [<ffffffff813510af>] ? cpuacct_account_field+0x12f/0x1a0
           [<ffffffff816b05a0>] ? ioctl_preallocate+0x200/0x200
           [<ffffffff81002f2f>] ? syscall_trace_enter+0x3cf/0xdb0
           [<ffffffff815045ba>] ? __context_tracking_exit.part.4+0x9a/0x1e0
           [<ffffffff81002b60>] ? exit_to_usermode_loop+0x190/0x190
           [<ffffffff82001a97>] ? check_preemption_disabled+0x37/0x1e0
           [<ffffffff81d93889>] ? security_file_ioctl+0x89/0xb0
           [<ffffffff816b167f>] SyS_ioctl+0x8f/0xc0
           [<ffffffff816b15f0>] ? do_vfs_ioctl+0x1050/0x1050
           [<ffffffff81005524>] do_syscall_64+0x1c4/0x4e0
           [<ffffffff83c32b2a>] entry_SYSCALL64_slow_path+0x25/0x25
          Code: c7 c7 c4 b9 c8 82 48 89 d9 4c 89 ee e8 63 88 7f fe e8 7e 46 7b fe 48 8d 7b 48 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 04 84 c0 7e 65 80 7b 48 00 74 0e e8 52 46
          RIP  [<ffffffff82c8ba07>] snd_hrtimer_start+0x77/0x100
           RSP <ffff8801120c7a60>
          ---[ end trace 5955b08db7f2b029 ]---
      
      This can happen if snd_hrtimer_open() fails to allocate memory and
      returns an error, which is currently not checked by snd_timer_open():
      
          ioctl(SNDRV_TIMER_IOCTL_SELECT)
           - snd_timer_user_tselect()
      	- snd_timer_close()
      	   - snd_hrtimer_close()
      	      - (struct snd_timer *) t->private_data = NULL
              - snd_timer_open()
                 - snd_hrtimer_open()
                    - kzalloc() fails; t->private_data is still NULL
      
          ioctl(SNDRV_TIMER_IOCTL_START)
           - snd_timer_user_start()
      	- snd_timer_start()
      	   - snd_timer_start1()
      	      - snd_hrtimer_start()
      		- t->private_data == NULL // boom
      Signed-off-by: default avatarVegard Nossum <vegard.nossum@oracle.com>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      e5170157
    • Vegard Nossum's avatar
      ALSA: timer: fix division by zero after SNDRV_TIMER_IOCTL_CONTINUE · b4782f57
      Vegard Nossum authored
      commit 6b760bb2 upstream.
      
      I got this:
      
          divide error: 0000 [#1] PREEMPT SMP KASAN
          CPU: 1 PID: 1327 Comm: a.out Not tainted 4.8.0-rc2+ #189
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
          task: ffff8801120a9580 task.stack: ffff8801120b0000
          RIP: 0010:[<ffffffff82c8bd9a>]  [<ffffffff82c8bd9a>] snd_hrtimer_callback+0x1da/0x3f0
          RSP: 0018:ffff88011aa87da8  EFLAGS: 00010006
          RAX: 0000000000004f76 RBX: ffff880112655e88 RCX: 0000000000000000
          RDX: 0000000000000000 RSI: ffff880112655ea0 RDI: 0000000000000001
          RBP: ffff88011aa87e00 R08: ffff88013fff905c R09: ffff88013fff9048
          R10: ffff88013fff9050 R11: 00000001050a7b8c R12: ffff880114778a00
          R13: ffff880114778ab4 R14: ffff880114778b30 R15: 0000000000000000
          FS:  00007f071647c700(0000) GS:ffff88011aa80000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
          CR2: 0000000000603001 CR3: 0000000112021000 CR4: 00000000000006e0
          Stack:
           0000000000000000 ffff880114778ab8 ffff880112655ea0 0000000000004f76
           ffff880112655ec8 ffff880112655e80 ffff880112655e88 ffff88011aa98fc0
           00000000b97ccf2b dffffc0000000000 ffff88011aa98fc0 ffff88011aa87ef0
          Call Trace:
           <IRQ>
           [<ffffffff813abce7>] __hrtimer_run_queues+0x347/0xa00
           [<ffffffff82c8bbc0>] ? snd_hrtimer_close+0x130/0x130
           [<ffffffff813ab9a0>] ? retrigger_next_event+0x1b0/0x1b0
           [<ffffffff813ae1a6>] ? hrtimer_interrupt+0x136/0x4b0
           [<ffffffff813ae220>] hrtimer_interrupt+0x1b0/0x4b0
           [<ffffffff8120f91e>] local_apic_timer_interrupt+0x6e/0xf0
           [<ffffffff81227ad3>] ? kvm_guest_apic_eoi_write+0x13/0xc0
           [<ffffffff83c35086>] smp_apic_timer_interrupt+0x76/0xa0
           [<ffffffff83c3416c>] apic_timer_interrupt+0x8c/0xa0
           <EOI>
           [<ffffffff83c3239c>] ? _raw_spin_unlock_irqrestore+0x2c/0x60
           [<ffffffff82c8185d>] snd_timer_start1+0xdd/0x670
           [<ffffffff82c87015>] snd_timer_continue+0x45/0x80
           [<ffffffff82c88100>] snd_timer_user_ioctl+0x1030/0x2830
           [<ffffffff8159f3a0>] ? __follow_pte.isra.49+0x430/0x430
           [<ffffffff82c870d0>] ? snd_timer_pause+0x80/0x80
           [<ffffffff815a26fa>] ? do_wp_page+0x3aa/0x1c90
           [<ffffffff815aa4f8>] ? handle_mm_fault+0xbc8/0x27f0
           [<ffffffff815a9930>] ? __pmd_alloc+0x370/0x370
           [<ffffffff82c870d0>] ? snd_timer_pause+0x80/0x80
           [<ffffffff816b0733>] do_vfs_ioctl+0x193/0x1050
           [<ffffffff816b05a0>] ? ioctl_preallocate+0x200/0x200
           [<ffffffff81002f2f>] ? syscall_trace_enter+0x3cf/0xdb0
           [<ffffffff815045ba>] ? __context_tracking_exit.part.4+0x9a/0x1e0
           [<ffffffff81002b60>] ? exit_to_usermode_loop+0x190/0x190
           [<ffffffff82001a97>] ? check_preemption_disabled+0x37/0x1e0
           [<ffffffff81d93889>] ? security_file_ioctl+0x89/0xb0
           [<ffffffff816b167f>] SyS_ioctl+0x8f/0xc0
           [<ffffffff816b15f0>] ? do_vfs_ioctl+0x1050/0x1050
           [<ffffffff81005524>] do_syscall_64+0x1c4/0x4e0
           [<ffffffff83c32b2a>] entry_SYSCALL64_slow_path+0x25/0x25
          Code: e8 fc 42 7b fe 8b 0d 06 8a 50 03 49 0f af cf 48 85 c9 0f 88 7c 01 00 00 48 89 4d a8 e8 e0 42 7b fe 48 8b 45 c0 48 8b 4d a8 48 99 <48> f7 f9 49 01 c7 e8 cb 42 7b fe 48 8b 55 d0 48 b8 00 00 00 00
          RIP  [<ffffffff82c8bd9a>] snd_hrtimer_callback+0x1da/0x3f0
           RSP <ffff88011aa87da8>
          ---[ end trace 6aa380f756a21074 ]---
      
      The problem happens when you call ioctl(SNDRV_TIMER_IOCTL_CONTINUE) on a
      completely new/unused timer -- it will have ->sticks == 0, which causes a
      divide by 0 in snd_hrtimer_callback().
      Signed-off-by: default avatarVegard Nossum <vegard.nossum@oracle.com>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      b4782f57
    • Mukesh Ojha's avatar
      powerpc/powernv : Drop reference added by kset_find_obj() · 91ed1914
      Mukesh Ojha authored
      commit a9cbf0b2 upstream.
      
      In a situation, where Linux kernel gets notified about duplicate error log
      from OPAL, it is been observed that kernel fails to remove sysfs entries
      (/sys/firmware/opal/elog/0xXXXXXXXX) of such error logs. This is because,
      we currently search the error log/dump kobject in the kset list via
      'kset_find_obj()' routine. Which eventually increment the reference count
      by one, once it founds the kobject.
      
      So, unless we decrement the reference count by one after it found the kobject,
      we would not be able to release the kobject properly later.
      
      This patch adds the 'kobject_put()' which was missing earlier.
      Signed-off-by: default avatarMukesh Ojha <mukesh02@linux.vnet.ibm.com>
      Reviewed-by: default avatarVasant Hegde <hegdevasant@linux.vnet.ibm.com>
      Signed-off-by: default avatarBenjamin Herrenschmidt <benh@kernel.crashing.org>
      [bwh: Backported to 3.16: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      91ed1914
    • Rob Clark's avatar
      drm/msm: protect against faults from copy_from_user() in submit ioctl · 1cc8a445
      Rob Clark authored
      commit d78d383a upstream.
      
      An evil userspace could try to cause deadlock by passing an unfaulted-in
      GEM bo as submit->bos (or submit->cmds) table.  Which will trigger
      msm_gem_fault() while we already hold struct_mutex.  See:
      
      https://github.com/freedreno/msmtest/blob/master/evilsubmittest.cSigned-off-by: default avatarRob Clark <robdclark@gmail.com>
      [bwh: Backported to 3.16: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      1cc8a445
    • Rob Clark's avatar
      drm/msm: fix use of copy_from_user() while holding spinlock · b7ba904b
      Rob Clark authored
      commit 89f82cbb upstream.
      
      Use instead __copy_from_user_inatomic() and fallback to slow-path where
      we drop and re-aquire the lock in case of fault.
      Reported-by: default avatarVaishali Thakkar <vaishali.thakkar@oracle.com>
      Signed-off-by: default avatarRob Clark <robdclark@gmail.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      b7ba904b
    • Rob Clark's avatar
      drm/msm: use mutex_lock_interruptible for submit ioctl · cc5fdd0c
      Rob Clark authored
      commit b5b4c264 upstream.
      
      Be kinder to things that do lots of signal handling (ie. Xorg)
      Signed-off-by: default avatarRob Clark <robdclark@gmail.com>
      [bwh: Backported to 3.16: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      cc5fdd0c
    • Vegard Nossum's avatar
      fs/seq_file: fix out-of-bounds read · 20402281
      Vegard Nossum authored
      commit 088bf2ff upstream.
      
      seq_read() is a nasty piece of work, not to mention buggy.
      
      It has (I think) an old bug which allows unprivileged userspace to read
      beyond the end of m->buf.
      
      I was getting these:
      
          BUG: KASAN: slab-out-of-bounds in seq_read+0xcd2/0x1480 at addr ffff880116889880
          Read of size 2713 by task trinity-c2/1329
          CPU: 2 PID: 1329 Comm: trinity-c2 Not tainted 4.8.0-rc1+ #96
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
          Call Trace:
            kasan_object_err+0x1c/0x80
            kasan_report_error+0x2cb/0x7e0
            kasan_report+0x4e/0x80
            check_memory_region+0x13e/0x1a0
            kasan_check_read+0x11/0x20
            seq_read+0xcd2/0x1480
            proc_reg_read+0x10b/0x260
            do_loop_readv_writev.part.5+0x140/0x2c0
            do_readv_writev+0x589/0x860
            vfs_readv+0x7b/0xd0
            do_readv+0xd8/0x2c0
            SyS_readv+0xb/0x10
            do_syscall_64+0x1b3/0x4b0
            entry_SYSCALL64_slow_path+0x25/0x25
          Object at ffff880116889100, in cache kmalloc-4096 size: 4096
          Allocated:
          PID = 1329
            save_stack_trace+0x26/0x80
            save_stack+0x46/0xd0
            kasan_kmalloc+0xad/0xe0
            __kmalloc+0x1aa/0x4a0
            seq_buf_alloc+0x35/0x40
            seq_read+0x7d8/0x1480
            proc_reg_read+0x10b/0x260
            do_loop_readv_writev.part.5+0x140/0x2c0
            do_readv_writev+0x589/0x860
            vfs_readv+0x7b/0xd0
            do_readv+0xd8/0x2c0
            SyS_readv+0xb/0x10
            do_syscall_64+0x1b3/0x4b0
            return_from_SYSCALL_64+0x0/0x6a
          Freed:
          PID = 0
          (stack is not available)
          Memory state around the buggy address:
           ffff88011688a000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
           ffff88011688a080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
          >ffff88011688a100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      		       ^
           ffff88011688a180: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
           ffff88011688a200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
          ==================================================================
          Disabling lock debugging due to kernel taint
      
      This seems to be the same thing that Dave Jones was seeing here:
      
        https://lkml.org/lkml/2016/8/12/334
      
      There are multiple issues here:
      
        1) If we enter the function with a non-empty buffer, there is an attempt
           to flush it. But it was not clearing m->from after doing so, which
           means that if we try to do this flush twice in a row without any call
           to traverse() in between, we are going to be reading from the wrong
           place -- the splat above, fixed by this patch.
      
        2) If there's a short write to userspace because of page faults, the
           buffer may already contain multiple lines (i.e. pos has advanced by
           more than 1), but we don't save the progress that was made so the
           next call will output what we've already returned previously. Since
           that is a much less serious issue (and I have a headache after
           staring at seq_read() for the past 8 hours), I'll leave that for now.
      
      Link: http://lkml.kernel.org/r/1471447270-32093-1-git-send-email-vegard.nossum@oracle.comSigned-off-by: default avatarVegard Nossum <vegard.nossum@oracle.com>
      Reported-by: default avatarDave Jones <davej@codemonkey.org.uk>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      20402281
    • Nicolas Iooss's avatar
      printk: fix parsing of "brl=" option · dea3302c
      Nicolas Iooss authored
      commit ae6c33ba upstream.
      
      Commit bbeddf52 ("printk: move braille console support into separate
      braille.[ch] files") moved the parsing of braille-related options into
      _braille_console_setup(), changing the type of variable str from char*
      to char**.  In this commit, memcmp(str, "brl,", 4) was correctly updated
      to memcmp(*str, "brl,", 4) but not memcmp(str, "brl=", 4).
      
      Update the code to make "brl=" option work again and replace memcmp()
      with strncmp() to make the compiler able to detect such an issue.
      
      Fixes: bbeddf52 ("printk: move braille console support into separate braille.[ch] files")
      Link: http://lkml.kernel.org/r/20160823165700.28952-1-nicolas.iooss_linux@m4x.orgSigned-off-by: default avatarNicolas Iooss <nicolas.iooss_linux@m4x.org>
      Cc: Joe Perches <joe@perches.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      dea3302c
    • Russell King's avatar
      ARM: sa1100: clear reset status prior to reboot · 60a9f376
      Russell King authored
      commit da60626e upstream.
      
      Clear the current reset status prior to rebooting the platform.  This
      adds the bit missing from 04fef228 ("[ARM] pxa: introduce
      reset_status and clear_reset_status for driver's usage").
      
      Fixes: 04fef228 ("[ARM] pxa: introduce reset_status and clear_reset_status for driver's usage")
      Signed-off-by: default avatarRussell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      60a9f376
    • Chen-Yu Tsai's avatar
      clocksource/drivers/sun4i: Clear interrupts after stopping timer in probe function · 4e12317c
      Chen-Yu Tsai authored
      commit b53e7d00 upstream.
      
      The bootloader (U-boot) sometimes uses this timer for various delays.
      It uses it as a ongoing counter, and does comparisons on the current
      counter value. The timer counter is never stopped.
      
      In some cases when the user interacts with the bootloader, or lets
      it idle for some time before loading Linux, the timer may expire,
      and an interrupt will be pending. This results in an unexpected
      interrupt when the timer interrupt is enabled by the kernel, at
      which point the event_handler isn't set yet. This results in a NULL
      pointer dereference exception, panic, and no way to reboot.
      
      Clear any pending interrupts after we stop the timer in the probe
      function to avoid this.
      Signed-off-by: default avatarChen-Yu Tsai <wens@csie.org>
      Signed-off-by: default avatarDaniel Lezcano <daniel.lezcano@linaro.org>
      Acked-by: default avatarMaxime Ripard <maxime.ripard@free-electrons.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      4e12317c