1. 12 Jan, 2018 4 commits
    • signal/powerpc: Document conflicts with SI_USER and SIGFPE and SIGTRAP · cf4674c4
      Eric W. Biederman authored
      Setting si_code to 0 results in userspace seeing an si_code of 0.
      This is the same si_code as SI_USER.  POSIX and common sense require
      that SI_USER not be a signal-specific si_code.  As such, this use of 0
      for the si_code is a pretty horribly broken ABI.

      Further, use of si_code == 0 guaranteed that copy_siginfo_to_user saw a
      value of __SI_KILL and now sees a value of SIL_KILL, with the result
      that the uid and pid fields are copied, which might copy the si_addr
      field by accident but certainly not by design.  This makes for a very
      flaky implementation.
      
      Utilizing FPE_FIXME and TRAP_FIXME, siginfo_layout() will now return
      SIL_FAULT and the appropriate fields will be reliably copied.
      
      Possible ABI fixes include:
        - Send the signal without siginfo
        - Don't generate a signal
        - Possibly assign and use an appropriate si_code
        - Don't handle cases which can't happen
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Kumar Gala <kumar.gala@freescale.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: linuxppc-dev@lists.ozlabs.org
      Ref: 9bad068c ("[PATCH] ppc32: support for e500 and 85xx")
      Ref: 0ed70f61 ("PPC32: Provide proper siginfo information on various exceptions.")
      History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
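The ambiguity described in this commit can be illustrated with a small userspace model. This is only a sketch: `demo_layout`, `DEMO_FIXME`, and the `demo_*` functions are hypothetical names, not kernel code, and it assumes the placeholder keeps the value 0 so that what userspace sees does not change.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdbool.h>

/* Stand-ins for the kernel's SIL_KILL and SIL_FAULT layouts. */
enum demo_layout { DEMO_KILL, DEMO_FAULT };

/* Assumption: the placeholder keeps the ambiguous value 0. */
#define DEMO_FIXME 0

static bool demo_is_fault_signal(int sig)
{
    return sig == SIGFPE || sig == SIGTRAP || sig == SIGSEGV ||
           sig == SIGBUS || sig == SIGILL;
}

/* Without a special case, si_code 0 on SIGFPE looks exactly like SI_USER. */
enum demo_layout demo_layout_naive(int sig, int si_code)
{
    assert(sig > 0); /* signal numbers are positive */
    if (si_code > 0 && demo_is_fault_signal(sig))
        return DEMO_FAULT;
    return DEMO_KILL;
}

/* With the special case, the ambiguous pair is routed to the fault layout. */
enum demo_layout demo_layout_fixme(int sig, int si_code)
{
    if ((sig == SIGFPE || sig == SIGTRAP) && si_code == DEMO_FIXME)
        return DEMO_FAULT;
    return demo_layout_naive(sig, si_code);
}
```

In this model a SIGFPE with si_code 0 gets the fault layout, but a SIGFPE sent with kill() is indistinguishable from it, which is why the commit only documents the conflict and lists possible ABI fixes.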
    • signal/metag: Document a conflict with SI_USER with SIGFPE · b80328be
      Eric W. Biederman authored
      Setting si_code to 0 results in userspace seeing an si_code of 0.
      This is the same si_code as SI_USER.  POSIX and common sense require
      that SI_USER not be a signal-specific si_code.  As such, this use of 0
      for the si_code is a pretty horribly broken ABI.

      Further, use of si_code == 0 guaranteed that copy_siginfo_to_user saw a
      value of __SI_KILL and now sees a value of SIL_KILL, with the result
      that the uid and pid fields are copied, which might copy the si_addr
      field by accident but certainly not by design.  This makes for a very
      flaky implementation.

      Utilizing FPE_FIXME, siginfo_layout() will now return SIL_FAULT and the
      appropriate fields will be reliably copied.

      Possible ABI fixes include:
        - Send the signal without siginfo
        - Don't generate a signal
        - Possibly assign and use an appropriate si_code
        - Don't handle cases which can't happen
      
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: linux-metag@vger.kernel.org
      Ref: ac919f08 ("metag: Traps")
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
    • signal/parisc: Document a conflict with SI_USER with SIGFPE · b5daf2b9
      Eric W. Biederman authored
      Setting si_code to 0 results in userspace seeing an si_code of 0.
      This is the same si_code as SI_USER.  POSIX and common sense require
      that SI_USER not be a signal-specific si_code.  As such, this use of 0
      for the si_code is a pretty horribly broken ABI.

      Further, use of si_code == 0 guaranteed that copy_siginfo_to_user saw a
      value of __SI_KILL and now sees a value of SIL_KILL, with the result
      that the uid and pid fields are copied, which might copy the si_addr
      field by accident but certainly not by design.  This makes for a very
      flaky implementation.

      Utilizing FPE_FIXME, siginfo_layout() will now return SIL_FAULT and the
      appropriate fields will be reliably copied.

      This bug is 13 years old and parisc machines are no longer being built,
      so I don't know if it is possible or worth fixing.  But it is at least
      worth documenting this so other architectures don't make the same
      mistake.

      Possible ABI fixes include:
        - Send the signal without siginfo
        - Don't generate a signal
        - Possibly assign and use an appropriate si_code
        - Don't handle cases which can't happen
      
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Helge Deller <deller@gmx.de>
      Cc: linux-parisc@vger.kernel.org
      Ref: 313c01d3 ("[PATCH] PA-RISC update for 2.6.0")
      History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
    • signal/openrisc: Fix do_unaligned_access to send the proper signal · 500d5830
      Eric W. Biederman authored
      While reviewing the signal sending on openrisc, the do_unaligned_access
      function stood out because it is obviously wrong.  A comment refers to
      an si_code set above, when in fact si_code is never set.  This leads to
      a random si_code being sent to userspace in the event of an unaligned
      access.

      Looking further, SIGBUS with BUS_ADRALN is the proper pair of signal
      and si_code to send for an unaligned access.  That is what other
      architectures do and what is required by POSIX.

      Given that do_unaligned_access is broken in a way that no one can be
      relying on it on openrisc, fix the code to just do the right thing.
      
      Cc: stable@vger.kernel.org
      Fixes: 769a8a96 ("OpenRISC: Traps")
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: openrisc@lists.librecores.org
      Acked-by: Stafford Horne <shorne@gmail.com>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
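As a userspace illustration of the signal and si_code pairing this commit settles on, the sketch below fills in a `siginfo_t` the way a handler would expect to see it for an unaligned access. `demo_unaligned_siginfo` is a hypothetical helper, not the kernel function, and it does not deliver any signal.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Build the siginfo a handler should see for an unaligned access at addr. */
siginfo_t demo_unaligned_siginfo(void *addr)
{
    siginfo_t info;

    /* Zero everything first so no field, si_code included, is left random. */
    memset(&info, 0, sizeof(info));
    info.si_signo = SIGBUS;
    info.si_errno = 0;
    info.si_code = BUS_ADRALN;
    info.si_addr = addr;

    /* A fault code must never collide with SI_USER. */
    assert(info.si_code != SI_USER);
    return info;
}
```

Setting every field explicitly after clearing the structure is the point: the bug described above came from a field that was assumed to be set but never was.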
  2. 06 Jan, 2018 1 commit
  3. 04 Jan, 2018 1 commit
    • signal: Simplify and fix kdb_send_sig · 0b44bf9a
      Eric W. Biederman authored
      - Rename from kdb_send_sig_info to kdb_send_sig
        As there is no meaningful siginfo sent
      
      - Use SEND_SIG_PRIV instead of generating a siginfo for a kdb
        signal.  The generated siginfo had a bogus rationale and was
        not correct in the face of pid namespaces.  SEND_SIG_PRIV
        is simpler and actually correct.
      
      - As the code grabs siglock, just send the signal with siglock
        held instead of dropping siglock and attempting to grab it again.
      
      - Move the sig_valid test into kdb_kill where it can generate
        a good error message.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
  4. 31 Dec, 2017 20 commits
    • Linux 4.15-rc6 · 30a7acd5
      Linus Torvalds authored
    • Merge branch 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · f39d7d78
      Linus Torvalds authored
      Pull x86 fixes from Thomas Gleixner:
       "A couple of fixlets for x86:
      
         - Fix the ESPFIX double fault handling for 5-level pagetables
      
         - Fix the commandline parsing for 'apic=' on 32bit systems and update
           documentation
      
         - Make zombie stack traces reliable
      
         - Fix kexec with stack canary
      
         - Fix the delivery mode for APICs which was missed when the x86
           vector management was converted to single target delivery. Caused a
           regression due to the broken hardware which ignores affinity
           settings in lowest prio delivery mode.
      
         - Unbreak modules when AMD memory encryption is enabled
      
         - Remove an unused parameter of prepare_switch_to"
      
      * 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/apic: Switch all APICs to Fixed delivery mode
        x86/apic: Update the 'apic=' description of setting APIC driver
        x86/apic: Avoid wrong warning when parsing 'apic=' in X86-32 case
        x86-32: Fix kexec with stack canary (CONFIG_CC_STACKPROTECTOR)
        x86: Remove unused parameter of prepare_switch_to
        x86/stacktrace: Make zombie stack traces reliable
        x86/mm: Unbreak modules that use the DMA API
        x86/build: Make isoimage work on Debian
        x86/espfix/64: Fix espfix double-fault handling on 5-level systems
    • Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 52c90f2d
      Linus Torvalds authored
      Pull x86 page table isolation fixes from Thomas Gleixner:
       "Four patches addressing the PTI fallout as discussed and debugged
        yesterday:
      
         - Remove stale and pointless TLB flush invocations from the hotplug
           code
      
         - Remove stale preempt_disable/enable from __native_flush_tlb()
      
         - Plug the memory leak in the write_ldt() error path"
      
      * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/ldt: Make LDT pgtable free conditional
        x86/ldt: Plug memory leak in error path
        x86/mm: Remove preempt_disable/enable() from __native_flush_tlb()
        x86/smpboot: Remove stale TLB flush invocations
    • Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · cea92e84
      Linus Torvalds authored
      Pull timer fixes from Thomas Gleixner:
       "A pile of fixes for long standing issues with the timer wheel and the
        NOHZ code:
      
         - Prevent timer base confusion across the nohz switch, which can
           cause unlocked access and data corruption
      
         - Reinitialize the stale base clock on cpu hotplug to prevent subtle
           side effects including rollovers on 32bit
      
         - Prevent an interrupt storm when the timer softirq is already
           pending caused by tick_nohz_stop_sched_tick()
      
         - Move the timer start tracepoint to a place where it actually makes
           sense
      
         - Add documentation to timerqueue functions as they caused confusion
           several times now"
      
      * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        timerqueue: Document return values of timerqueue_add/del()
        timers: Invoke timer_start_debug() where it makes sense
        nohz: Prevent a timer interrupt storm in tick_nohz_stop_sched_tick()
        timers: Reinitialize per cpu bases on hotplug
        timers: Use deferrable base independent of base::nohz_active
    • Merge branch 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 8d517bdf
      Linus Torvalds authored
      Pull smp fixlet from Thomas Gleixner:
       "A trivial build warning fix for newer compilers"
      
      * 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        cpu/hotplug: Move inline keyword at the beginning of declaration
    • Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 4c470317
      Linus Torvalds authored
      Pull scheduler fixes from Thomas Gleixner:
       "Three patches addressing the fallout of the CPU_ISOLATION changes
        especially with NO_HZ_FULL plus documentation of boot parameter
        dependency"
      
      * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched/isolation: Document boot parameters dependency on CONFIG_CPU_ISOLATION=y
        sched/isolation: Enable CONFIG_CPU_ISOLATION=y by default
        sched/isolation: Make CONFIG_NO_HZ_FULL select CONFIG_CPU_ISOLATION
    • Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e7c632fc
      Linus Torvalds authored
      Pull perf fixes from Thomas Gleixner:
      
       - plug a memory leak in the intel pmu init code
      
       - clang fixes
      
       - tooling fix to avoid including kernel headers
      
       - a fix for jvmti to generate correct debug information for inlined
         code
      
       - replace backtick with a regular shell function
      
       - fix the build in hardened environments
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf/x86/intel: Plug memory leak in intel_pmu_init()
        x86/asm: Allow again using asm.h when building for the 'bpf' clang target
        tools arch s390: Do not include header files from the kernel sources
        perf jvmti: Generate correct debug information for inlined code
        perf tools: Fix up build in hardened environments
        perf tools: Use shell function for perl cflags retrieval
    • Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 88fa025d
      Linus Torvalds authored
      Pull irq fixes from Thomas Gleixner:
       "A rather large update after the kaisered maintainer finally found time
        to handle regression reports.
      
         - The larger part addresses a regression caused by the x86 vector
           management rework.
      
           The reservation based model does not work reliably for MSI
           interrupts, if they cannot be masked (yes, yet another hw
           engineering trainwreck). The reason is that the reservation mode
           assigns a dummy vector when the interrupt is allocated and switches
           to a real vector when the interrupt is requested.
      
           If the MSI entry cannot be masked then the initialization might
           raise an interrupt before the interrupt is requested, which ends up
           as spurious interrupt and causes device malfunction and worse. The
           fix is to exclude MSI interrupts which do not support masking from
           reservation mode and assign a real vector right away.
      
         - Extend the extra lockdep class setup for nested interrupts with a
           class for the recently added irq_desc::request_mutex so lockdep can
           differentiate and does not emit false positive warnings.
      
         - A ratelimit guard for the bad irq printout so in case a bad irq
           comes back immediately the system does not drown in dmesg spam"
      
      * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        genirq/msi, x86/vector: Prevent reservation mode for non maskable MSI
        genirq/irqdomain: Rename early argument of irq_domain_activate_irq()
        x86/vector: Use IRQD_CAN_RESERVE flag
        genirq: Introduce IRQD_CAN_RESERVE flag
        genirq/msi: Handle reactivation only on success
        gpio: brcmstb: Make really use of the new lockdep class
        genirq: Guard handle_bad_irq log messages
        kernel/irq: Extend lockdep class for request mutex
    • Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 31336ed9
      Linus Torvalds authored
      Pull objtool fixes from Thomas Gleixner:
       "Three fixlets for objtool:
      
         - Address two segfaults related to missing parameter and clang
           objects
      
         - Make it compile clean with clang"
      
      * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        objtool: Fix seg fault with clang-compiled objects
        objtool: Fix seg fault caused by missing parameter
        objtool: Fix Clang enum conversion warning
    • Merge tag 'char-misc-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · 8371e5a0
      Linus Torvalds authored
      Pull char/misc fixes from Greg KH:
       "Here are six small fixes of some of the char/misc drivers that have
        been sent in to resolve reported issues.
      
        Nothing major, a binder use-after-free fix, some thunderbolt bugfixes,
        a hyper-v bugfix, and an nvmem driver fix. All of these have been in
        linux-next with no reported issues for a while"
      
      * tag 'char-misc-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
        nvmem: meson-mx-efuse: fix reading from an offset other than 0
        binder: fix proc->files use-after-free
        vmbus: unregister device_obj->channels_kset
        thunderbolt: Mask ring interrupt properly when polling starts
        MAINTAINERS: Add thunderbolt.rst to the Thunderbolt driver entry
        thunderbolt: Make pathname to force_power shorter
    • Merge tag 'driver-core-4.15-rc6' of... · 4288e6b4
      Linus Torvalds authored
      Merge tag 'driver-core-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
      
      Pull driver core fixes from Greg KH:
       "Here are two driver core fixes for 4.15-rc6, resolving some reported
        issues.
      
        The first is a cacheinfo fix for DT based systems to resolve a
        reported issue that has been around for a while, and the other is to
        resolve a regression in the kobject uevent code that showed up in
        4.15-rc1.
      
        Both have been in linux-next for a while with no reported issues"
      
      * tag 'driver-core-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
        kobject: fix suppressing modalias in uevents delivered over netlink
        drivers: base: cacheinfo: fix cache type for non-architected system cache
    • Merge tag 'staging-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging · 29a9b000
      Linus Torvalds authored
      Pull staging fixes from Greg KH:
       "Here are three staging driver fixes for 4.15-rc6
      
        The first resolves a bug in the lustre driver that came about due to a
        broken cleanup patch, due to crazy list usage in that codebase.
      
        The remaining two are ion driver fixes, finally getting the CMA
        interaction to work properly, resolving two regressions in that area
        of the code.
      
        All have been in linux-next with no reported issues for a while"
      
      * tag 'staging-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
        staging: android: ion: Fix dma direction for dma_sync_sg_for_cpu/device
        staging: ion: Fix ion_cma_heap allocations
        staging: lustre: lnet: Fix recent breakage from list_for_each conversion
    • Merge tag 'tty-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · bc7236fb
      Linus Torvalds authored
      Pull TTY fix from Greg KH:
       "Here is a single tty fix for a reported issue that you wrote the patch
        for :)
      
        It's been in linux-next for a week or so with no reported issues"
      
      * tag 'tty-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
        n_tty: fix EXTPROC vs ICANON interaction with TIOCINQ (aka FIONREAD)
    • Merge tag 'usb-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · a9746e40
      Linus Torvalds authored
      Pull USB/PHY fixes from Greg KH:
       "Here are a number of small USB and PHY driver fixes for 4.15-rc6.
      
        Nothing major, but there are a number of regression fixes in here that
        resolve issues that have been reported a bunch. There are also the
        usual xhci fixes as well as a number of new usb serial device ids.
      
        All of these have been in linux-next for a while with no reported
        issues"
      
      * tag 'usb-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
        usb: xhci: Add XHCI_TRUST_TX_LENGTH for Renesas uPD720201
        xhci: Fix use-after-free in xhci debugfs
        xhci: Fix xhci debugfs NULL pointer dereference in resume from hibernate
        USB: serial: ftdi_sio: add id for Airbus DS P8GR
        usb: Add device quirk for Logitech HD Pro Webcam C925e
        usb: add RESET_RESUME for ELSA MicroLink 56K
        usbip: fix usbip bind writing random string after command in match_busid
        usbip: stub_rx: fix static checker warning on unnecessary checks
        usbip: prevent leaking socket pointer address in messages
        usbip: stub: stop printing kernel pointer addresses in messages
        usbip: vhci: stop printing kernel pointer addresses in messages
        USB: Fix off by one in type-specific length check of BOS SSP capability
        USB: serial: option: adding support for YUGA CLM920-NC5
        phy: rcar-gen3-usb2: select USB_COMMON
        phy: rockchip-typec: add pm_runtime_disable in err case
        phy: cpcap-usb: Fix platform_get_irq_byname's error checking.
        phy: tegra: fix device-tree node lookups
        USB: serial: qcserial: add Sierra Wireless EM7565
        USB: serial: option: add support for Telit ME910 PID 0x1101
        USB: chipidea: msm: fix ulpi-node lookup
    • MAINTAINERS: mark arch/blackfin/ and its gubbins as orphaned · c0b23903
      Adam Borowski authored
      The blackfin architecture has seen no maintainer action of any kind since
      April 2015.  No new code, no pull requests, no acks to patches, no response
      to mails, nothing.
      
      The web site has an expired certificate (expiration Sep 2017, issued in
      2013), the mailing list sees no answers either, with one exception:
      
        https://sourceforge.net/p/adi-buildroot/mailman/adi-buildroot-devel/
        >
        > Steven is no longer working on this for ADI. Acked by me if this works. Thanks.
        >
        > Best regards,
        > Aaron Wu
        > Analog Devices Inc.
      
      But, Aaron doesn't seem to respond to queries either.
      Signed-off-by: Adam Borowski <kilobyte@angband.pl>
      Acked-by: Linus Walleij <linus.walleij@linaro.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc · 6bba94d0
      Linus Torvalds authored
      Pull sparc bugfix from David Miller.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
        sparc64: repair calling incorrect hweight function from stubs
    • x86/ldt: Make LDT pgtable free conditional · 7f414195
      Thomas Gleixner authored
      Andy prefers to be paranoid about the pagetable free in the error path of
      write_ldt(). Make it conditional and warn whenever the installment of a
      secondary LDT fails.
      Requested-by: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86/ldt: Plug memory leak in error path · a62d6985
      Thomas Gleixner authored
      The error path in write_ldt() tries to free 'old_ldt' instead of the newly
      allocated 'new_ldt', resulting in a memory leak. It also fails to clean up a
      half-populated LDT pagetable, which is not a leak as it gets cleaned up
      when the process exits.
      
      Free both the potentially half populated LDT pagetable and the newly
      allocated LDT struct. This can be done unconditionally because once an LDT
      is mapped subsequent maps will succeed, because the PTE page is already
      populated and the two LDTs fit into that single page.
      Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Dominik Brodowski <linux@dominikbrodowski.net>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linuxfoundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Fixes: f55f0501 ("x86/pti: Put the LDT in its own PGD if PTI is on")
      Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1712311121340.1899@nanos
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
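The error-path rule in this commit (on failure, release what was just allocated, not what is still installed) can be sketched in plain userspace C. Everything here is hypothetical illustration: `demo_table`, `demo_table_replace`, and the `install_fails` switch are invented names standing in for the LDT structures and the mapping step, and the live counter exists only to make a leak visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct demo_table {
    size_t nr_entries;
    int *entries;
};

/* Counts live tables so a leak shows up as a nonzero remainder. */
static int demo_live_tables;

struct demo_table *demo_table_alloc(size_t nr_entries)
{
    struct demo_table *t = malloc(sizeof(*t));

    if (!t)
        return NULL;
    t->entries = calloc(nr_entries ? nr_entries : 1, sizeof(*t->entries));
    if (!t->entries) {
        free(t);
        return NULL;
    }
    t->nr_entries = nr_entries;
    demo_live_tables++;
    return t;
}

void demo_table_free(struct demo_table *t)
{
    if (!t)
        return;
    free(t->entries);
    free(t);
    demo_live_tables--;
}

/*
 * Replace *slot with a freshly allocated table. install_fails simulates
 * the install step failing after the allocation succeeded.
 */
int demo_table_replace(struct demo_table **slot, size_t nr_entries,
                       bool install_fails)
{
    struct demo_table *old_table, *new_table;

    assert(slot != NULL);
    old_table = *slot;
    new_table = demo_table_alloc(nr_entries);
    if (!new_table)
        return -1;

    if (install_fails) {
        /* Free the new table; the old one is still installed and in use. */
        demo_table_free(new_table);
        return -1;
    }

    *slot = new_table;
    demo_table_free(old_table);
    return 0;
}
```

Freeing `old_table` in the failure branch instead would both leak `new_table` and leave `*slot` pointing at freed memory, which is the class of mistake the commit describes.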
    • x86/mm: Remove preempt_disable/enable() from __native_flush_tlb() · decab088
      Thomas Gleixner authored
      The preempt_disable/enable() pair in __native_flush_tlb() was added in
      commit:
      
        5cf0791d ("x86/mm: Disable preemption during CR3 read+write")
      
      ... to protect the UP variant of flush_tlb_mm_range().
      
      That preempt_disable/enable() pair should have been added to the UP variant
      of flush_tlb_mm_range() instead.
      
      The UP variant was removed with commit:
      
        ce4a4e56 ("x86/mm: Remove the UP asm/tlbflush.h code, always use the (formerly) SMP code")
      
      ... but the preempt_disable/enable() pair stayed around.
      
      The latest change to __native_flush_tlb() in commit:
      
        6fd166aa ("x86/mm: Use/Fix PCID to optimize user/kernel switches")
      
      ... added an access to a per CPU variable outside the preempt disabled
      regions, which makes no sense at all. __native_flush_tlb() must always
      be called with at least preemption disabled.
      
      Remove the preempt_disable/enable() pair and add a WARN_ON_ONCE() to catch
      bad callers independent of the smp_processor_id() debugging.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: <stable@vger.kernel.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Dominik Brodowski <linux@dominikbrodowski.net>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linuxfoundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20171230211829.679325424@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/smpboot: Remove stale TLB flush invocations · 322f8b8b
      Thomas Gleixner authored
      smpboot_setup_warm_reset_vector() and smpboot_restore_warm_reset_vector()
      invoke local_flush_tlb() for no obvious reason.
      
      Digging in history revealed that the original code in the 2.1 era added
      those because the code manipulated a swapper_pg_dir pagetable entry. The
      pagetable manipulation was removed long ago in the 2.3 timeframe, but the
      TLB flush invocations stayed around forever.
      
      Remove them along with the pointless pr_debug()s which come from the same 2.1
      change.
      Reported-by: Dominik Brodowski <linux@dominikbrodowski.net>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: <stable@vger.kernel.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linuxfoundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20171230211829.586548655@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  5. 30 Dec, 2017 6 commits
  6. 29 Dec, 2017 8 commits
    • timerqueue: Document return values of timerqueue_add/del() · 9f4533cd
      Thomas Gleixner authored
      The return values of timerqueue_add/del() are not documented in the kernel doc
      comment. Add proper documentation.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: rt@linutronix.de
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Link: https://lkml.kernel.org/r/20171222145337.872681338@linutronix.de
    • timers: Invoke timer_start_debug() where it makes sense · fd45bb77
      Thomas Gleixner authored
      The timer start debug function is called before the proper timer base is
      set. As a consequence the trace data contains the stale CPU and flags
      values.
      
      Call the debug function after setting the new base and flags.
      
      Fixes: 500462a9 ("timers: Switch to a non-cascading wheel")
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: stable@vger.kernel.org
      Cc: rt@linutronix.de
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Link: https://lkml.kernel.org/r/20171222145337.792907137@linutronix.de
    • nohz: Prevent a timer interrupt storm in tick_nohz_stop_sched_tick() · 5d62c183
      Thomas Gleixner authored
      The conditions in irq_exit() to invoke tick_nohz_irq_exit() which
      subsequently invokes tick_nohz_stop_sched_tick() are:
      
        if ((idle_cpu(cpu) && !need_resched()) || tick_nohz_full_cpu(cpu))
      
      If need_resched() is not set, but a timer softirq is pending then this is
      an indication that the softirq code punted and delegated the execution to
      softirqd. need_resched() is not true because the current interrupted task
      takes precedence over softirqd.
      
      Invoking tick_nohz_irq_exit() in this case can cause an endless loop of
      timer interrupts because the timer wheel contains an expired timer, but
      softirqs are not yet executed. So it returns an immediate expiry request,
      which causes the timer to fire immediately again. Lather, rinse and
      repeat....
      
      Prevent that by adding a check for a pending timer soft interrupt to the
      conditions in tick_nohz_stop_sched_tick() which avoid calling
      get_next_timer_interrupt(). That keeps the tick sched timer on the tick and
      prevents a repetitive programming of an already expired timer.
      Reported-by: Sebastian Siewior <bigeasy@linutronix.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712272156050.2431@nanos
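The loop described in this commit can be reduced to a toy decision function. This is a model under stated assumptions, not the kernel code: `demo_next_event` and its parameters are hypothetical, time is a plain counter, and `check_pending` switches between the behaviour before and after the fix.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model: return the time at which the next timer event gets programmed.
 * wheel_next_expiry is what the timer wheel reports as its earliest timer.
 */
unsigned long demo_next_event(unsigned long now, unsigned long tick_period,
                              unsigned long wheel_next_expiry,
                              bool timer_softirq_pending, bool check_pending)
{
    assert(tick_period > 0);

    /* Guarded path: the expired timer has not run yet, so stay on the tick. */
    if (check_pending && timer_softirq_pending)
        return now + tick_period;

    /* An already expired timer yields an immediate expiry request. */
    if (wheel_next_expiry <= now)
        return now;

    return wheel_next_expiry;
}
```

With an expired timer still queued and the softirq pending, the unguarded call returns `now` every time it is asked, which is the storm; the guarded call returns `now + tick_period` and lets the deferred softirq work drain the wheel first.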
    • timers: Reinitialize per cpu bases on hotplug · 26456f87
      Thomas Gleixner authored
      The timer wheel bases are not (re)initialized on CPU hotplug. That leaves
      them with a potentially stale clk and next_expiry value, which can cause
      trouble when the CPU is plugged.
      
      Add a prepare callback which forwards the clock, sets next_expiry to far in
      the future and resets the control flags to a known state.
      
      Set base->must_forward_clk so the first timer which is queued will try to
      forward the clock to current jiffies.
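
      A minimal userspace sketch of such a prepare step is shown below. The
      model_* names, the number of bases and the "far in the future" delta are
      assumptions made for illustration. The is_idle field is likewise an
      assumed example of a control flag; only must_forward_clk is named in
      this message.

```c
#include <stdbool.h>

/* Hypothetical model of a per-CPU timer base; not the kernel structure. */
#define MODEL_NR_BASES  2
#define MODEL_MAX_DELTA ((1UL << 30) - 1) /* assumed "far in the future" offset */

struct model_timer_base {
    unsigned long clk;          /* wheel clock in jiffies */
    unsigned long next_expiry;  /* cached earliest expiry in jiffies */
    bool is_idle;               /* assumed control flag */
    bool must_forward_clk;      /* forward clk when the first timer is queued */
};

/* Bring every base of a CPU into a known state before it comes online. */
static void model_timers_prepare_cpu(struct model_timer_base bases[MODEL_NR_BASES],
                                     unsigned long now_jiffies)
{
    for (int i = 0; i < MODEL_NR_BASES; i++) {
        bases[i].clk = now_jiffies;                           /* forward the clock */
        bases[i].next_expiry = now_jiffies + MODEL_MAX_DELTA; /* nothing due soon */
        bases[i].is_idle = false;
        bases[i].must_forward_clk = true;
    }
}
```

      After the call, no base carries a clk or next_expiry value left over
      from the time before the CPU was unplugged.
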
      
      Fixes: 500462a9 ("timers: Switch to a non-cascading wheel")
      Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712272152200.2431@nanos
      26456f87
    • Anna-Maria Gleixner's avatar
      timers: Use deferrable base independent of base::nohz_active · ced6d5c1
      Anna-Maria Gleixner authored
      During boot and before base::nohz_active is set in the timer bases, deferrable
      timers are enqueued into the standard timer base. This works correctly as
      long as base::nohz_active is false.
      
      Once base::nohz_active is set and a timer which was enqueued before that
      is accessed, the lock selector code chooses the lock of the deferrable
      base. This causes unlocked access to the standard base, and if the timer
      is removed, the pending flag in the standard base bitmap is not cleared,
      which causes get_next_timer_interrupt() to return bogus values.
      
      To prevent that, the deferrable timers must be enqueued in the deferrable
      base, even when base::nohz_active is not set. Those deferrable timers also
      need to be expired unconditionally.
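
      The mismatch and the fix can be illustrated with a small model of the
      base selection. All model_* names are hypothetical and the functions only
      mirror the behaviour described in this message, not the actual kernel
      code.

```c
#include <stdbool.h>

/* Hypothetical model of the two timer bases and the deferrable flag. */
enum model_base { MODEL_BASE_STD = 0, MODEL_BASE_DEF = 1 };
#define MODEL_TIMER_DEFERRABLE 0x1u

/*
 * Old behaviour: the chosen base depends on nohz_active. A deferrable timer
 * enqueued while nohz_active is false lands in the standard base, but a
 * later lookup with nohz_active set selects the deferrable base.
 */
static enum model_base model_select_base_old(unsigned int flags, bool nohz_active)
{
    if (nohz_active && (flags & MODEL_TIMER_DEFERRABLE))
        return MODEL_BASE_DEF;
    return MODEL_BASE_STD;
}

/* New behaviour: the timer flag alone decides, so enqueue and lookup agree. */
static enum model_base model_select_base_new(unsigned int flags)
{
    return (flags & MODEL_TIMER_DEFERRABLE) ? MODEL_BASE_DEF : MODEL_BASE_STD;
}
```

      With the old selector, the same deferrable timer maps to different bases
      before and after nohz_active is set, which is the source of the unlocked
      access. The new selector returns the same base at both points in time.
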
      
      Fixes: 500462a9 ("timers: Switch to a non-cascading wheel")
      Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: stable@vger.kernel.org
      Cc: rt@linutronix.de
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
      Link: https://lkml.kernel.org/r/20171222145337.633328378@linutronix.de
      ced6d5c1
    • Thomas Gleixner's avatar
      genirq/msi, x86/vector: Prevent reservation mode for non maskable MSI · bc976233
      Thomas Gleixner authored
      The new reservation mode for interrupts assigns a dummy vector when the
      interrupt is allocated and assigns a real vector when the interrupt is
      requested. The reservation mode prevents vector pressure when devices with
      a large number of queues/interrupts are initialized, but only a minimal
      subset of those queues/interrupts is actually used.
      
      This mode has an issue with MSI interrupts which cannot be masked. If the
      driver is not careful or the hardware emits an interrupt before the device
      irq is requested by the driver, then the interrupt ends up on the dummy
      vector as a spurious interrupt, which can cause malfunction of the device
      or, in the worst case, a lockup of the machine.
      
      Change the logic for the reservation mode so that the early activation of
      MSI interrupts checks whether:
      
       - the device is a PCI/MSI device
       - the reservation mode of the underlying irqdomain is activated
       - PCI/MSI masking is globally enabled
       - the PCI/MSI device uses either MSI-X, which supports masking, or
         MSI with the maskbit supported.
      
      If one of those conditions is false, then clear the reservation mode flag
      in the irq data of the interrupt and invoke irq_domain_activate_irq() with
      the reserve argument cleared. In the x86 vector code, clear the can_reserve
      flag in the vector allocation data so a subsequent free_irq() won't create
      the same situation again. The interrupt stays assigned to a real vector
      until pci_disable_msi() is invoked and all allocations are undone.
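
      The four conditions listed above amount to a simple predicate, sketched
      here as a userspace model. The structure and its field names are
      hypothetical stand-ins for the domain, MSI descriptor and global state
      which the real check inspects.

```c
#include <stdbool.h>

/* Hypothetical flattened view of the state the check looks at. */
struct model_msi_dev {
    bool is_pci_msi;                /* the device is a PCI/MSI device */
    bool domain_reservation_active; /* irqdomain has reservation mode activated */
    bool masking_globally_enabled;  /* PCI/MSI masking is not globally disabled */
    bool is_msix;                   /* MSI-X, which supports masking */
    bool has_maskbit;               /* plain MSI with the maskbit supported */
};

/* Reservation mode may be kept only if all four conditions hold. */
static bool model_can_use_reservation(const struct model_msi_dev *d)
{
    if (!d->is_pci_msi)
        return false;
    if (!d->domain_reservation_active)
        return false;
    if (!d->masking_globally_enabled)
        return false;
    return d->is_msix || d->has_maskbit;
}
```

      When the predicate is false, the interrupt is activated with the reserve
      argument cleared and gets a real vector right away, as described above.
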
      
      Fixes: 4900be83 ("x86/vector/msi: Switch to global reservation mode")
      Reported-by: Alexandru Chirvasitu <achirvasub@gmail.com>
      Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Alexandru Chirvasitu <achirvasub@gmail.com>
      Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Maciej W. Rozycki <macro@linux-mips.org>
      Cc: Mikael Pettersson <mikpelinux@gmail.com>
      Cc: Josh Poulson <jopoulso@microsoft.com>
      Cc: Mihai Costache <v-micos@microsoft.com>
      Cc: Stephen Hemminger <sthemmin@microsoft.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: linux-pci@vger.kernel.org
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: Dexuan Cui <decui@microsoft.com>
      Cc: Simon Xiao <sixiao@microsoft.com>
      Cc: Saeed Mahameed <saeedm@mellanox.com>
      Cc: Jork Loeser <Jork.Loeser@microsoft.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: devel@linuxdriverproject.org
      Cc: KY Srinivasan <kys@microsoft.com>
      Cc: Alan Cox <alan@linux.intel.com>
      Cc: Sakari Ailus <sakari.ailus@intel.com>
      Cc: linux-media@vger.kernel.org
      Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712291406420.1899@nanos
      Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712291409460.1899@nanos
      bc976233
    • Thomas Gleixner's avatar
      genirq/irqdomain: Rename early argument of irq_domain_activate_irq() · 702cb0a0
      Thomas Gleixner authored
      The 'early' argument of irq_domain_activate_irq() is actually used to
      denote reservation mode. To avoid confusion, rename it before abuse
      happens.
      
      No functional change.
      
      Fixes: 72491643 ("genirq/irqdomain: Update irq_domain_ops.activate() signature")
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Alexandru Chirvasitu <achirvasub@gmail.com>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Maciej W. Rozycki <macro@linux-mips.org>
      Cc: Mikael Pettersson <mikpelinux@gmail.com>
      Cc: Josh Poulson <jopoulso@microsoft.com>
      Cc: Mihai Costache <v-micos@microsoft.com>
      Cc: Stephen Hemminger <sthemmin@microsoft.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: linux-pci@vger.kernel.org
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: Dexuan Cui <decui@microsoft.com>
      Cc: Simon Xiao <sixiao@microsoft.com>
      Cc: Saeed Mahameed <saeedm@mellanox.com>
      Cc: Jork Loeser <Jork.Loeser@microsoft.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: devel@linuxdriverproject.org
      Cc: KY Srinivasan <kys@microsoft.com>
      Cc: Alan Cox <alan@linux.intel.com>
      Cc: Sakari Ailus <sakari.ailus@intel.com>
      Cc: linux-media@vger.kernel.org
      702cb0a0
    • Thomas Gleixner's avatar
      x86/vector: Use IRQD_CAN_RESERVE flag · 945f50a5
      Thomas Gleixner authored
      Set the new CAN_RESERVE flag when the initial reservation for an interrupt
      happens. The flag is used in a subsequent patch to disable reservation mode
      for a certain class of MSI devices.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Alexandru Chirvasitu <achirvasub@gmail.com>
      Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Maciej W. Rozycki <macro@linux-mips.org>
      Cc: Mikael Pettersson <mikpelinux@gmail.com>
      Cc: Josh Poulson <jopoulso@microsoft.com>
      Cc: Mihai Costache <v-micos@microsoft.com>
      Cc: Stephen Hemminger <sthemmin@microsoft.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: linux-pci@vger.kernel.org
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: Dexuan Cui <decui@microsoft.com>
      Cc: Simon Xiao <sixiao@microsoft.com>
      Cc: Saeed Mahameed <saeedm@mellanox.com>
      Cc: Jork Loeser <Jork.Loeser@microsoft.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: devel@linuxdriverproject.org
      Cc: KY Srinivasan <kys@microsoft.com>
      Cc: Alan Cox <alan@linux.intel.com>
      Cc: Sakari Ailus <sakari.ailus@intel.com>
      Cc: linux-media@vger.kernel.org
      
      945f50a5