1. 23 Mar, 2017 3 commits
    • Guenter Roeck's avatar
      usb: hub: Do not attempt to autosuspend disconnected devices · f5cccf49
      Guenter Roeck authored
      While running a bind/unbind stress test with the dwc3 usb driver on rk3399,
      the following crash was observed.
      
      Unable to handle kernel NULL pointer dereference at virtual address 00000218
      pgd = ffffffc00165f000
      [00000218] *pgd=000000000174f003, *pud=000000000174f003,
      				*pmd=0000000001750003, *pte=00e8000001751713
      Internal error: Oops: 96000005 [#1] PREEMPT SMP
      Modules linked in: uinput uvcvideo videobuf2_vmalloc cmac
      ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat rfcomm
      xt_mark fuse bridge stp llc zram btusb btrtl btbcm btintel bluetooth
      ip6table_filter mwifiex_pcie mwifiex cfg80211 cdc_ether usbnet r8152 mii joydev
      snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device ppp_async
      ppp_generic slhc tun
      CPU: 1 PID: 29814 Comm: kworker/1:1 Not tainted 4.4.52 #507
      Hardware name: Google Kevin (DT)
      Workqueue: pm pm_runtime_work
      task: ffffffc0ac540000 ti: ffffffc0af4d4000 task.ti: ffffffc0af4d4000
      PC is at autosuspend_check+0x74/0x174
      LR is at autosuspend_check+0x70/0x174
      ...
      Call trace:
      [<ffffffc00080dcc0>] autosuspend_check+0x74/0x174
      [<ffffffc000810500>] usb_runtime_idle+0x20/0x40
      [<ffffffc000785ae0>] __rpm_callback+0x48/0x7c
      [<ffffffc000786af0>] rpm_idle+0x1e8/0x498
      [<ffffffc000787cdc>] pm_runtime_work+0x88/0xcc
      [<ffffffc000249bb8>] process_one_work+0x390/0x6b8
      [<ffffffc00024abcc>] worker_thread+0x480/0x610
      [<ffffffc000251a80>] kthread+0x164/0x178
      [<ffffffc0002045d0>] ret_from_fork+0x10/0x40
      
      Source:
      
      (gdb) l *0xffffffc00080dcc0
      0xffffffc00080dcc0 is in autosuspend_check
      (drivers/usb/core/driver.c:1778).
      1773		/* We don't need to check interfaces that are
      1774		 * disabled for runtime PM.  Either they are unbound
      1775		 * or else their drivers don't support autosuspend
      1776		 * and so they are permanently active.
      1777		 */
      1778		if (intf->dev.power.disable_depth)
      1779			continue;
      1780		if (atomic_read(&intf->dev.power.usage_count) > 0)
      1781			return -EBUSY;
      1782		w |= intf->needs_remote_wakeup;
      
      Code analysis shows that intf is set to NULL in usb_disable_device() prior
      to setting actconfig to NULL. At the same time, usb_runtime_idle() does not
      lock the usb device, and neither does any of the functions in the
      traceback. This means that there is no protection against a race condition
      where usb_disable_device() is removing dev->actconfig->interface[] pointers
      while those are being accessed from autosuspend_check().
      
      To solve the problem, synchronize and validate device state between
      autosuspend_check() and usb_disconnect().
      Acked-by: default avatarAlan Stern <stern@rowland.harvard.edu>
      Signed-off-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f5cccf49
    • Guenter Roeck's avatar
      usb: hub: Fix error loop seen after hub communication errors · 245b2eec
      Guenter Roeck authored
      While stress testing a usb controller using a bind/unbind looop, the
      following error loop was observed.
      
      usb 7-1.2: new low-speed USB device number 3 using xhci-hcd
      usb 7-1.2: hub failed to enable device, error -108
      usb 7-1-port2: cannot disable (err = -22)
      usb 7-1-port2: couldn't allocate usb_device
      usb 7-1-port2: cannot disable (err = -22)
      hub 7-1:1.0: hub_ext_port_status failed (err = -22)
      hub 7-1:1.0: hub_ext_port_status failed (err = -22)
      hub 7-1:1.0: activate --> -22
      hub 7-1:1.0: hub_ext_port_status failed (err = -22)
      hub 7-1:1.0: hub_ext_port_status failed (err = -22)
      hub 7-1:1.0: activate --> -22
      hub 7-1:1.0: hub_ext_port_status failed (err = -22)
      hub 7-1:1.0: hub_ext_port_status failed (err = -22)
      hub 7-1:1.0: activate --> -22
      hub 7-1:1.0: hub_ext_port_status failed (err = -22)
      hub 7-1:1.0: hub_ext_port_status failed (err = -22)
      hub 7-1:1.0: activate --> -22
      hub 7-1:1.0: hub_ext_port_status failed (err = -22)
      hub 7-1:1.0: hub_ext_port_status failed (err = -22)
      hub 7-1:1.0: activate --> -22
      hub 7-1:1.0: hub_ext_port_status failed (err = -22)
      hub 7-1:1.0: hub_ext_port_status failed (err = -22)
      hub 7-1:1.0: activate --> -22
      hub 7-1:1.0: hub_ext_port_status failed (err = -22)
      hub 7-1:1.0: hub_ext_port_status failed (err = -22)
      hub 7-1:1.0: activate --> -22
      hub 7-1:1.0: hub_ext_port_status failed (err = -22)
      hub 7-1:1.0: hub_ext_port_status failed (err = -22)
      hub 7-1:1.0: activate --> -22
      hub 7-1:1.0: hub_ext_port_status failed (err = -22)
      hub 7-1:1.0: hub_ext_port_status failed (err = -22)
      ** 57 printk messages dropped ** hub 7-1:1.0: activate --> -22
      ** 82 printk messages dropped ** hub 7-1:1.0: hub_ext_port_status failed (err = -22)
      
      This continues forever. After adding tracebacks into the code,
      the call sequence leading to this is found to be as follows.
      
      [<ffffffc0007fc8e0>] hub_activate+0x368/0x7b8
      [<ffffffc0007fceb4>] hub_resume+0x2c/0x3c
      [<ffffffc00080b3b8>] usb_resume_interface.isra.6+0x128/0x158
      [<ffffffc00080b5d0>] usb_suspend_both+0x1e8/0x288
      [<ffffffc00080c9c4>] usb_runtime_suspend+0x3c/0x98
      [<ffffffc0007820a0>] __rpm_callback+0x48/0x7c
      [<ffffffc00078217c>] rpm_callback+0xa8/0xd4
      [<ffffffc000786234>] rpm_suspend+0x84/0x758
      [<ffffffc000786ca4>] rpm_idle+0x2c8/0x498
      [<ffffffc000786ed4>] __pm_runtime_idle+0x60/0xac
      [<ffffffc00080eba8>] usb_autopm_put_interface+0x6c/0x7c
      [<ffffffc000803798>] hub_event+0x10ac/0x12ac
      [<ffffffc000249bb8>] process_one_work+0x390/0x6b8
      [<ffffffc00024abcc>] worker_thread+0x480/0x610
      [<ffffffc000251a80>] kthread+0x164/0x178
      [<ffffffc0002045d0>] ret_from_fork+0x10/0x40
      
      kick_hub_wq() is called from hub_activate() even after failures to
      communicate with the hub. This results in an endless sequence of
      hub event -> hub activate -> wq trigger -> hub event -> ...
      
      Provide two solutions for the problem.
      
      - Only trigger the hub event queue if communication with the hub
        is successful.
      - After a suspend failure, only resume already suspended interfaces
        if the communication with the device is still possible.
      
      Each of the changes fixes the observed problem. Use both to improve
      robustness.
      Acked-by: default avatarAlan Stern <stern@rowland.harvard.edu>
      Signed-off-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      245b2eec
    • Gerd Hoffmann's avatar
      ohci-pci: add qemu quirk · 21a60f6e
      Gerd Hoffmann authored
      On a loaded virtualization host (dozen guests booting at the same time)
      it may happen that the ohci controller emulation doesn't manage to do
      timely frame processing, with the result that the io watchdog fires and
      considers the controller being dead, even though it's only the emulation
      being unusual slow due to the load peak.
      
      So, add a quirk for qemu and don't use the watchdog in case we figure we
      are running on emulated ohci.  The virtual ohci controller masquerades
      as apple ohci controller, but we can identify it by subsystem id.
      Signed-off-by: default avatarGerd Hoffmann <kraxel@redhat.com>
      Signed-off-by: default avatarAlan Stern <stern@rowland.harvard.edu>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      21a60f6e
  2. 17 Mar, 2017 19 commits
  3. 16 Mar, 2017 8 commits
  4. 12 Mar, 2017 5 commits
    • Linus Torvalds's avatar
      Linux 4.11-rc2 · 4495c08e
      Linus Torvalds authored
      4495c08e
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 56b24d1b
      Linus Torvalds authored
      Pull s390 fixes from Martin Schwidefsky:
      
       - four patches to get the new cputime code in shape for s390
      
       - add the new statx system call
      
       - a few bug fixes
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390: wire up statx system call
        KVM: s390: Fix guest migration for huge guests resulting in panic
        s390/ipl: always use load normal for CCW-type re-IPL
        s390/timex: micro optimization for tod_to_ns
        s390/cputime: provide archicture specific cputime_to_nsecs
        s390/cputime: reset all accounting fields on fork
        s390/cputime: remove last traces of cputime_t
        s390: fix in-kernel program checks
        s390/crypt: fix missing unlock in ctr_paes_crypt on error path
      56b24d1b
    • Linus Torvalds's avatar
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 5a45a5a8
      Linus Torvalds authored
      Pull x86 fixes from Thomas Gleixner:
      
       - a fix for the kexec/purgatory regression which was introduced in the
         merge window via an innocent sparse fix. We could have reverted that
         commit, but on deeper inspection it turned out that the whole
         machinery is neither documented nor robust. So a proper cleanup was
         done instead
      
       - the fix for the TLB flush issue which was discovered recently
      
       - a simple typo fix for a reboot quirk
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/tlb: Fix tlb flushing when lguest clears PGE
        kexec, x86/purgatory: Unbreak it and clean it up
        x86/reboot/quirks: Fix typo in ASUS EeeBook X205TA reboot quirk
      5a45a5a8
    • Linus Torvalds's avatar
      Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · ecade114
      Linus Torvalds authored
      Pull irq fixes from Thomas Gleixner:
      
       - a workaround for a GIC erratum
      
       - a missing stub function for CONFIG_IRQDOMAIN=n
      
       - fixes for a couple of type inconsistencies
      
      * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        irqchip/crossbar: Fix incorrect type of register size
        irqchip/gicv3-its: Add workaround for QDF2400 ITS erratum 0065
        irqdomain: Add empty irq_domain_check_msi_remap
        irqchip/crossbar: Fix incorrect type of local variables
      ecade114
    • Daniel Borkmann's avatar
      x86/tlb: Fix tlb flushing when lguest clears PGE · 2c4ea6e2
      Daniel Borkmann authored
      Fengguang reported random corruptions from various locations on x86-32
      after commits d2852a22 ("arch: add ARCH_HAS_SET_MEMORY config") and
      9d876e79 ("bpf: fix unlocking of jited image when module ronx not set")
      that uses the former. While x86-32 doesn't have a JIT like x86_64, the
      bpf_prog_lock_ro() and bpf_prog_unlock_ro() got enabled due to
      ARCH_HAS_SET_MEMORY, whereas Fengguang's test kernel doesn't have module
      support built in and therefore never had the DEBUG_SET_MODULE_RONX setting
      enabled.
      
      After investigating the crashes further, it turned out that using
      set_memory_ro() and set_memory_rw() didn't have the desired effect, for
      example, setting the pages as read-only on x86-32 would still let
      probe_kernel_write() succeed without error. This behavior would manifest
      itself in situations where the vmalloc'ed buffer was accessed prior to
      set_memory_*() such as in case of bpf_prog_alloc(). In cases where it
      wasn't, the page attribute changes seemed to have taken effect, leading to
      the conclusion that a TLB invalidate didn't happen. Moreover, it turned out
      that this issue reproduced with qemu in "-cpu kvm64" mode, but not for
      "-cpu host". When the issue occurs, change_page_attr_set_clr() did trigger
      a TLB flush as expected via __flush_tlb_all() through cpa_flush_range(),
      though.
      
      There are 3 variants for issuing a TLB flush: invpcid_flush_all() (depends
      on CPU feature bits X86_FEATURE_INVPCID, X86_FEATURE_PGE), cr4 based flush
      (depends on X86_FEATURE_PGE), and cr3 based flush.  For "-cpu host" case in
      my setup, the flush used invpcid_flush_all() variant, whereas for "-cpu
      kvm64", the flush was cr4 based. Switching the kvm64 case to cr3 manually
      worked fine, and further investigating the cr4 one turned out that
      X86_CR4_PGE bit was not set in cr4 register, meaning the
      __native_flush_tlb_global_irq_disabled() wrote cr4 twice with the same
      value instead of clearing X86_CR4_PGE in the first write to trigger the
      flush.
      
      It turned out that X86_CR4_PGE was cleared from cr4 during init from
      lguest_arch_host_init() via adjust_pge(). The X86_FEATURE_PGE bit is also
      cleared from there due to concerns of using PGE in guest kernel that can
      lead to hard to trace bugs (see bff672e6 ("lguest: documentation V:
      Host") in init()). The CPU feature bits are cleared in dynamic
      boot_cpu_data, but they never propagated to __flush_tlb_all() as it uses
      static_cpu_has() instead of boot_cpu_has() for testing which variant of TLB
      flushing to use, meaning they still used the old setting of the host
      kernel.
      
      Clearing via setup_clear_cpu_cap(X86_FEATURE_PGE) so this would propagate
      to static_cpu_has() checks is too late at this point as sections have been
      patched already, so for now, it seems reasonable to switch back to
      boot_cpu_has(X86_FEATURE_PGE) as it was prior to commit c109bf95
      ("x86/cpufeature: Remove cpu_has_pge"). This lets the TLB flush trigger via
      cr3 as originally intended, properly makes the new page attributes visible
      and thus fixes the crashes seen by Fengguang.
      
      Fixes: c109bf95 ("x86/cpufeature: Remove cpu_has_pge")
      Reported-by: default avatarFengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Cc: bp@suse.de
      Cc: Kees Cook <keescook@chromium.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: netdev@vger.kernel.org
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: lkp@01.org
      Cc: Laura Abbott <labbott@redhat.com>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernrl.org/r/20170301125426.l4nf65rx4wahohyl@wfg-t540p.sh.intel.com
      Link: http://lkml.kernel.org/r/25c41ad9eca164be4db9ad84f768965b7eb19d9e.1489191673.git.daniel@iogearbox.netSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      2c4ea6e2
  5. 11 Mar, 2017 5 commits
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 106e4da6
      Linus Torvalds authored
      Pull KVM fixes from Radim Krčmář:
       "ARM updates from Marc Zyngier:
         - vgic updates:
           - Honour disabling the ITS
           - Don't deadlock when deactivating own interrupts via MMIO
           - Correctly expose the lact of IRQ/FIQ bypass on GICv3
      
         - I/O virtualization:
           - Make KVM_CAP_NR_MEMSLOTS big enough for large guests with many
             PCIe devices
      
         - General bug fixes:
           - Gracefully handle exception generated with syndroms that the host
             doesn't understand
           - Properly invalidate TLBs on VHE systems
      
        x86:
         - improvements in emulation of VMCLEAR, VMX MSR bitmaps, and VCPU
           reset
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: nVMX: do not warn when MSR bitmap address is not backed
        KVM: arm64: Increase number of user memslots to 512
        KVM: arm/arm64: Remove KVM_PRIVATE_MEM_SLOTS definition that are unused
        KVM: arm/arm64: Enable KVM_CAP_NR_MEMSLOTS on arm/arm64
        KVM: Add documentation for KVM_CAP_NR_MEMSLOTS
        KVM: arm/arm64: VGIC: Fix command handling while ITS being disabled
        arm64: KVM: Survive unknown traps from guests
        arm: KVM: Survive unknown traps from guests
        KVM: arm/arm64: Let vcpu thread modify its own active state
        KVM: nVMX: reset nested_run_pending if the vCPU is going to be reset
        kvm: nVMX: VMCLEAR should not cause the vCPU to shut down
        KVM: arm/arm64: vgic-v3: Don't pretend to support IRQ/FIQ bypass
        arm64: KVM: VHE: Clear HCR_TGE when invalidating guest TLBs
      106e4da6
    • Linus Torvalds's avatar
      Merge tag 'extable-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux · 4b050f22
      Linus Torvalds authored
      Pull extable.h fix from Paul Gortmaker:
       "Fixup for arch/score after extable.h introduction.
      
        It seems that Guenter is the only one on the planet doing builds for
        arch/score -- we don't have compile coverage for it in linux-next or
        in the kbuild-bot either. Guenter couldn't even recall where he got
        his toolchain, but was kind enough to share it with me so I could
        validate this change and also add arch/score to my build coverage.
      
        I sat on this a bit in case there was any other fallout in other arch
        dirs, but since this still seems to be the only one, I might as well
        send it on its way"
      
      * tag 'extable-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux:
        score: Fix implicit includes now failing build after extable change
      4b050f22
    • Linus Torvalds's avatar
      Merge tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random · 84c37c16
      Linus Torvalds authored
      Pull random updates from Ted Ts'o:
       "Change get_random_{int,log} to use the CRNG used by /dev/urandom and
        getrandom(2). It's faster and arguably more secure than cut-down MD5
        that we had been using.
      
        Also do some code cleanup"
      
      * tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random:
        random: move random_min_urandom_seed into CONFIG_SYSCTL ifdef block
        random: convert get_random_int/long into get_random_u32/u64
        random: use chacha20 for get_random_int/long
        random: fix comment for unused random_min_urandom_seed
        random: remove variable limit
        random: remove stale urandom_init_wait
        random: remove stale maybe_reseed_primary_crng
      84c37c16
    • Guenter Roeck's avatar
      score: Fix implicit includes now failing build after extable change · 0acf6119
      Guenter Roeck authored
      After changing from module.h to extable.h, score builds fail with:
      
        arch/score/kernel/traps.c: In function 'do_ri':
        arch/score/kernel/traps.c:248:4: error: implicit declaration of function 'user_disable_single_step'
        arch/score/mm/extable.c: In function 'fixup_exception':
        arch/score/mm/extable.c:32:38: error: dereferencing pointer to incomplete type
        arch/score/mm/extable.c:34:24: error: dereferencing pointer to incomplete type
      
      because extable.h doesn't drag in the same amount of headers as the
      module.h did.  Add in the headers which were implicitly expected.
      
      Fixes: 90858794 ("module.h: remove extable.h include now users have migrated")
      Signed-off-by: default avatarGuenter Roeck <linux@roeck-us.net>
      [PG: tweak commit log; refresh for sched header refactoring.]
      Signed-off-by: default avatarPaul Gortmaker <paul.gortmaker@windriver.com>
      0acf6119
    • Linus Torvalds's avatar
      Merge tag 'tty-4.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · 434fd635
      Linus Torvalds authored
      Pull tty/serial fixes frpm Greg KH:
       "Here are two bugfixes for tty stuff for 4.11-rc2.
      
        One of them resolves the pretty bad bug in the n_hdlc code that
        Alexander Popov found and fixed and has been reported everywhere. The
        other just fixes a samsung serial driver issue when DMA fails on some
        systems.
      
        Both have been in linux-next with no reported issues"
      
      * tag 'tty-4.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
        serial: samsung: Continue to work if DMA request fails
        tty: n_hdlc: get rid of racy n_hdlc.tbuf
      434fd635