1. 30 Aug, 2018 17 commits
    • rcu: Remove rcu_state structure's ->rda field · da1df50d
      Paul E. McKenney authored
      The rcu_state structure's ->rda field was used to find the per-CPU
      rcu_data structures corresponding to that rcu_state structure.  But now
      there is only one rcu_state structure (creatively named "rcu_state")
      and one set of per-CPU rcu_data structures (creatively named "rcu_data").
      Therefore, uses of the ->rda field can always be replaced by "rcu_data",
      and this commit makes that change and removes the ->rda field.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Eliminate rcu_state structure's ->call field · ec5dd444
      Paul E. McKenney authored
      The rcu_state structure's ->call field references the corresponding RCU
      flavor's call_rcu() function.  However, now that there is only ever one
      rcu_state structure in a given build of the Linux kernel, and that flavor
      uses plain old call_rcu(), there is not a lot of point in continuing to
      have the ->call field.  This commit therefore removes it.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Remove RCU_STATE_INITIALIZER() · 358be2d3
      Paul E. McKenney authored
      Now that a given build of the Linux kernel has only one set of rcu_state,
      rcu_node, and rcu_data structures, there is no point in creating a macro
      to declare and compile-time initialize them.  This commit therefore
      just does normal declaration and compile-time initialization of these
      structures.  While in the area, this commit also removes #ifndefs of
      the no-longer-ever-defined preprocessor macro RCU_TREE_NONCORE.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Express Tiny RCU updates in terms of RCU rather than RCU-sched · 709fdce7
      Paul E. McKenney authored
      This commit renames Tiny RCU functions so that the lowest level of
      functionality is RCU (e.g., synchronize_rcu()) rather than RCU-sched
      (e.g., synchronize_sched()).  This provides greater naming compatibility
      with Tree RCU, which will in turn permit more LoC removal once
      the RCU-sched and RCU-bh update-side API is removed.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      [ paulmck: Fix Tiny call_rcu()'s EXPORT_SYMBOL() in response to a bug
        report from kbuild test robot. ]
    • rcu: Define RCU-sched API in terms of RCU for Tree RCU PREEMPT builds · 45975c7d
      Paul E. McKenney authored
      Now that RCU-preempt knows about preemption disabling, its implementation
      of synchronize_rcu() works for synchronize_sched(), and likewise for the
      other RCU-sched update-side API members.  This commit therefore confines
      the RCU-sched update-side code to CONFIG_PREEMPT=n builds, and defines
      RCU-sched's update-side API members in terms of those of RCU-preempt.
      
      This means that any given build of the Linux kernel has only one
      update-side flavor of RCU, namely RCU-preempt for CONFIG_PREEMPT=y builds
      and RCU-sched for CONFIG_PREEMPT=n builds.  This in turn means that kernels
      built with CONFIG_RCU_NOCB_CPU=y have only one rcuo kthread per CPU.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Andi Kleen <ak@linux.intel.com>
    • rcu: Drop "wake" parameter from rcu_report_exp_rdp() · 2bbfc25b
      Paul E. McKenney authored
      The rcu_report_exp_rdp() function is always invoked with its "wake"
      argument set to "true", so this commit drops this parameter.  The only
      potential call site that would use "false" is in the code driving the
      expedited grace period, and that code uses rcu_report_exp_cpu_mult()
      instead, which therefore retains its "wake" parameter.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Update comments and help text for no more RCU-bh updaters · 82fcecfa
      Paul E. McKenney authored
      This commit updates comments and help text to account for the fact that
      RCU-bh update-side functions are now simple wrappers for their RCU or
      RCU-sched counterparts.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Define RCU-bh update API in terms of RCU · 65cfe358
      Paul E. McKenney authored
      Now that the main RCU API knows about softirq disabling and softirq's
      quiescent states, the RCU-bh update code can be dispensed with.
      This commit therefore removes the RCU-bh update-side implementation and
      defines RCU-bh's update-side API in terms of that of either RCU-preempt or
      RCU-sched, depending on the setting of the CONFIG_PREEMPT Kconfig option.
      
      In kernels built with CONFIG_RCU_NOCB_CPU=y this has the knock-on effect
      of reducing by one the number of rcuo kthreads per CPU.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Report expedited grace periods at context-switch time · ba1c64c2
      Paul E. McKenney authored
      This commit reduces the latency of expedited RCU grace periods by
      reporting a quiescent state for the CPU at context-switch time.
      In CONFIG_PREEMPT=y kernels, if the outgoing task is still within an
      RCU read-side critical section (and thus still blocking some grace
      period, perhaps including this expedited grace period), then that task
      will already have been placed on one of the leaf rcu_node structures'
      ->blkd_tasks list.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Apply RCU-bh QSes to RCU-sched and RCU-preempt when safe · d28139c4
      Paul E. McKenney authored
      One necessary step towards consolidating the three flavors of RCU is to
      make sure that the resulting consolidated "one flavor to rule them all"
      correctly handles networking denial-of-service attacks.  One thing that
      allows RCU-bh to do so is that __do_softirq() invokes rcu_bh_qs() every
      so often, and so something similar has to happen for consolidated RCU.
      
      This must be done carefully.  For example, if a preemption-disabled
      region of code takes an interrupt which does softirq processing before
      returning, consolidated RCU must ignore the resulting rcu_bh_qs()
      invocations -- preemption is still disabled, and that means an RCU
      reader for the consolidated flavor.
      
      This commit therefore creates a new rcu_softirq_qs() that is called only
      from the ksoftirqd task, thus avoiding the interrupted-a-preempted-region
      problem.  This new rcu_softirq_qs() function invokes rcu_sched_qs(),
      rcu_preempt_qs(), and rcu_preempt_deferred_qs().  The latter call handles
      any deferred quiescent states.
      
      Note that __do_softirq() still invokes rcu_bh_qs().  It will continue to
      do so until a later stage of cleanup when the RCU-bh flavor is removed.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      [ paulmck: Fix !SMP issue located by kbuild test robot. ]
    • rcu: Add warning to detect half-interrupts · e11ec65c
      Paul E. McKenney authored
      RCU's dyntick-idle code is written to tolerate half-interrupts, that is,
      either an interrupt that invokes rcu_irq_enter() but never invokes the
      corresponding rcu_irq_exit() on the one hand, or an interrupt that never
      invokes rcu_irq_enter() but does invoke the "corresponding" rcu_irq_exit()
      on the other.  These things really did happen at one time, as evidenced
      by this ca-2011 LKML post:
      
      http://lkml.kernel.org/r/20111014170019.GE2428@linux.vnet.ibm.com
      
      The reason why RCU tolerates half-interrupts is that usermode helpers
      used exceptions to invoke a system call from within the kernel such that
      the system call did a normal return (not a return from exception) to
      the calling context.  This caused rcu_irq_enter() to be invoked without
      a matching rcu_irq_exit().  However, usermode helpers have since been
      rewritten to make much more housebroken use of workqueues, kernel threads,
      and do_execve(), and therefore should no longer produce half-interrupts.
      No one knows of any other source of half-interrupts, but then again,
      no one seems insane enough to go audit the entire kernel to verify that
      half-interrupts really are a relic of the past.
      
      This commit therefore adds a pair of WARN_ON_ONCE() calls that will
      trigger in the presence of half-interrupts, which the code will continue
      to handle correctly.  If neither of these WARN_ON_ONCE() calls triggers by
      mid-2021, then perhaps RCU can stop handling half-interrupts, which
      would be a considerable simplification.
      Reported-by: Steven Rostedt <rostedt@goodmis.org>
      Reported-by: Joel Fernandes <joel@joelfernandes.org>
      Reported-by: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
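      The idea of warning once when interrupt entry and exit calls fail to
      pair up, while still tolerating the mismatch, can be sketched in
      userspace C.  This is a toy model under stated assumptions, not the
      kernel's code: struct toy_cpu, toy_warn_once(), toy_irq_enter(),
      toy_irq_exit(), and toy_idle_enter() are hypothetical names, and the
      actual checks in RCU's dyntick-idle code use WARN_ON_ONCE().

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: every name below is hypothetical, not a kernel symbol. */
struct toy_cpu {
	long irq_nesting;	/* interrupt entries minus interrupt exits */
	bool warned;		/* mimics the once-only nature of WARN_ON_ONCE() */
	int warnings;		/* number of warnings actually emitted */
};

/* Emit at most one warning per toy_cpu, no matter how often cond is true. */
static void toy_warn_once(struct toy_cpu *c, bool cond)
{
	if (cond && !c->warned) {
		c->warned = true;
		c->warnings++;
	}
}

static void toy_irq_enter(struct toy_cpu *c)
{
	assert(c->irq_nesting >= 0);	/* toy_irq_exit() never goes negative */
	c->irq_nesting++;
}

static void toy_irq_exit(struct toy_cpu *c)
{
	/* An exit with no matching entry is one kind of half-interrupt. */
	toy_warn_once(c, c->irq_nesting <= 0);
	if (c->irq_nesting > 0)
		c->irq_nesting--;
}

/*
 * Process-level transition to idle.  No interrupt handler should be in
 * flight here, so a nonzero count means an entry that never saw its exit.
 */
static void toy_idle_enter(struct toy_cpu *c)
{
	toy_warn_once(c, c->irq_nesting != 0);
	c->irq_nesting = 0;	/* keep tolerating the mismatch: resynchronize */
}
```

      In this sketch a balanced toy_irq_enter()/toy_irq_exit() pair followed
      by toy_idle_enter() stays silent, whereas an unmatched toy_irq_enter()
      produces exactly one warning and the counter is still repaired.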
    • rcu: Remove now-unused ->b.exp_need_qs field from the rcu_special union · fcc878e4
      Paul E. McKenney authored
      The ->b.exp_need_qs field is now set only to false, so this commit
      removes it.  The job this field used to do is now done by the rcu_data
      structure's ->deferred_qs field, which is a consequence of a better
      split between task-based (the rcu_node structure's ->exp_tasks field) and
      CPU-based (the aforementioned rcu_data structure's ->deferred_qs field)
      tracking of quiescent states for RCU-preempt expedited grace periods.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Allow processing deferred QSes for exiting RCU-preempt readers · 27c744e3
      Paul E. McKenney authored
      If an RCU-preempt read-side critical section is exiting, that is,
      ->rcu_read_lock_nesting is negative, then it is a good time to look
      at the possibility of reporting deferred quiescent states.  This
      commit therefore updates the checks in rcu_preempt_need_deferred_qs()
      to allow exiting critical sections to report deferred quiescent states.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcutorture: Test extended "rcu" read-side critical sections · c0335743
      Paul E. McKenney authored
      This commit makes the "rcu" torture type test extended read-side
      critical sections in order to test the deferral of RCU-preempt
      quiescent-state testing.
      
      In CONFIG_PREEMPT=n kernels, this simply duplicates the setup already
      in place for the "sched" torture type.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Defer reporting RCU-preempt quiescent states when disabled · 3e310098
      Paul E. McKenney authored
      This commit defers reporting of RCU-preempt quiescent states at
      rcu_read_unlock_special() time when any of interrupts, softirq, or
      preemption are disabled.  These deferred quiescent states are reported
      at a later RCU_SOFTIRQ, context switch, idle entry, or CPU-hotplug
      offline operation.  Of course, if another RCU read-side critical
      section has started in the meantime, the reporting of the quiescent
      state will be further deferred.
      
      This also means that disabling preemption, interrupts, and/or
      softirqs will act as an RCU-preempt read-side critical section.
      This is enforced by checking preempt_count() as needed.
      
      Some special cases must be handled on an ad-hoc basis, for example,
      context switch is a quiescent state even though both the scheduler and
      do_exit() disable preemption.  In these cases, additional calls to
      rcu_preempt_deferred_qs() override the preemption disabling.  Similar
      logic overrides disabled interrupts in rcu_preempt_check_callbacks()
      because in this case the quiescent state happened just before the
      corresponding scheduling-clock interrupt.
      
      In theory, this change lifts a long-standing restriction requiring
      that, if interrupts were disabled across a call to rcu_read_unlock(),
      the matching rcu_read_lock() also be contained within that
      interrupts-disabled region of code.  Because the reporting of the
      corresponding RCU-preempt quiescent state is now deferred until
      after interrupts have been enabled, it is no longer possible for this
      situation to result in deadlocks involving the scheduler's runqueue and
      priority-inheritance locks.  This may allow some code simplification that
      might reduce interrupt latency a bit.  Unfortunately, in practice this
      would also defer deboosting a low-priority task that had been subjected
      to RCU priority boosting, so real-time-response considerations might
      well force this restriction to remain in place.
      
      Because RCU-preempt grace periods are now blocked not only by RCU
      read-side critical sections, but also by disabling of interrupts,
      preemption, and softirqs, it will be possible to eliminate RCU-bh and
      RCU-sched in favor of RCU-preempt in CONFIG_PREEMPT=y kernels.  This may
      require some additional plumbing to provide the network denial-of-service
      guarantees that have been traditionally provided by RCU-bh.  Once these
      are in place, CONFIG_PREEMPT=n kernels will be able to fold RCU-bh
      into RCU-sched.  This would mean that all kernels would have but
      one flavor of RCU, which would open the door to significant code
      cleanup.
      
      Moving to a single flavor of RCU would also have the beneficial effect
      of reducing the NOCB kthreads by at least a factor of two.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      [ paulmck: Apply rcu_read_unlock_special() preempt_count() feedback
        from Joel Fernandes. ]
      [ paulmck: Adjust rcu_eqs_enter() call to rcu_preempt_deferred_qs() in
        response to bug reports from kbuild test robot. ]
      [ paulmck: Fix bug located by kbuild test robot involving recursion
        via rcu_preempt_deferred_qs(). ]
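      The deferral rule described in this commit message can be modeled in a
      few lines of userspace C.  This is only a sketch under simplifying
      assumptions: it pretends that every outermost unlock owes a quiescent
      state, and struct toy_cpu_state, toy_read_lock(), toy_read_unlock(),
      and toy_deferred_qs() are hypothetical names rather than kernel
      functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: every name below is hypothetical, not a kernel symbol. */
struct toy_cpu_state {
	bool irqs_disabled;
	bool softirq_disabled;
	bool preempt_disabled;
	int read_nesting;	/* depth of read-side critical sections */
	bool deferred_qs;	/* a quiescent state is waiting to be reported */
	int qs_reported;	/* quiescent states actually reported */
};

static bool toy_anything_disabled(const struct toy_cpu_state *s)
{
	return s->irqs_disabled || s->softirq_disabled || s->preempt_disabled;
}

static void toy_read_lock(struct toy_cpu_state *s)
{
	s->read_nesting++;
}

/* Simplification: assume each outermost unlock owes a quiescent state. */
static void toy_read_unlock(struct toy_cpu_state *s)
{
	assert(s->read_nesting > 0);
	if (--s->read_nesting > 0)
		return;			/* still inside an outer reader */
	if (toy_anything_disabled(s)) {
		s->deferred_qs = true;	/* not safe to report yet: defer */
	} else {
		s->deferred_qs = false;
		s->qs_reported++;
	}
}

/* Stand-in for a later safe point, such as a context switch. */
static void toy_deferred_qs(struct toy_cpu_state *s)
{
	if (!s->deferred_qs)
		return;
	if (s->read_nesting > 0 || toy_anything_disabled(s))
		return;			/* a new reader or disabling defers again */
	s->deferred_qs = false;
	s->qs_reported++;
}
```

      The second early return in toy_deferred_qs() corresponds to the point
      made above that a reader starting in the meantime defers the report
      further.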
    • rcu: Refactor rcu_{nmi,irq}_{enter,exit}() · cf7614e1
      Byungchul Park authored
      When entering or exiting irq or NMI handlers, the current code uses
      ->dynticks_nmi_nesting to detect if it is in the outermost handler,
      that is, the one interrupting or returning to an RCU-idle context (the
      idle loop or nohz_full usermode execution).  When entering the outermost
      handler via an interrupt (as opposed to NMI), it is necessary to invoke
      rcu_dynticks_task_exit() just before the CPU is marked non-idle from an
      RCU perspective and to invoke rcu_cleanup_after_idle() just after the
      CPU is marked non-idle.  Similarly, when exiting the outermost handler
      via an interrupt, it is necessary to invoke rcu_prepare_for_idle() just
      before marking the CPU idle and to invoke rcu_dynticks_task_enter()
      just after marking the CPU idle.
      
      The decision to execute these four functions is currently taken in
      rcu_irq_enter() and rcu_irq_exit() as follows:
      
         rcu_irq_enter()
            /* A conditional branch with ->dynticks_nmi_nesting */
            rcu_nmi_enter()
               /* A conditional branch with ->dynticks */
            /* A conditional branch with ->dynticks_nmi_nesting */
      
         rcu_irq_exit()
            /* A conditional branch with ->dynticks_nmi_nesting */
            rcu_nmi_exit()
               /* A conditional branch with ->dynticks_nmi_nesting */
            /* A conditional branch with ->dynticks_nmi_nesting */
      
         rcu_nmi_enter()
            /* A conditional branch with ->dynticks */
      
         rcu_nmi_exit()
            /* A conditional branch with ->dynticks_nmi_nesting */
      
      This works, but the conditional branches in rcu_irq_enter() and
      rcu_irq_exit() are redundant with those in rcu_nmi_enter() and
      rcu_nmi_exit(), respectively.  Redundant branches are not something
      we want in the to/from-idle fastpaths, so this commit refactors
      rcu_{nmi,irq}_{enter,exit}() so they use a common inlined function passed
      a constant argument as follows:
      
         rcu_irq_enter() inlining rcu_nmi_enter_common(irq=true)
            /* A conditional branch with ->dynticks */
      
         rcu_irq_exit() inlining rcu_nmi_exit_common(irq=true)
            /* A conditional branch with ->dynticks_nmi_nesting */
      
         rcu_nmi_enter() inlining rcu_nmi_enter_common(irq=false)
            /* A conditional branch with ->dynticks */
      
         rcu_nmi_exit() inlining rcu_nmi_exit_common(irq=false)
            /* A conditional branch with ->dynticks_nmi_nesting */
      
      The combination of the constant function argument and the inlining allows
      the compiler to discard the conditionals that previously controlled
      execution of rcu_dynticks_task_exit(), rcu_cleanup_after_idle(),
      rcu_prepare_for_idle(), and rcu_dynticks_task_enter().  This reduces both
      the to-idle and from-idle path lengths by two conditional branches each,
      and improves readability as well.
      
      This commit also changes the order of execution from this:
      
      	rcu_dynticks_task_exit();
      	rcu_dynticks_eqs_exit();
      	trace_rcu_dyntick();
      	rcu_cleanup_after_idle();
      
      To this:
      
      	rcu_dynticks_task_exit();
      	rcu_dynticks_eqs_exit();
      	rcu_cleanup_after_idle();
      	trace_rcu_dyntick();
      
      In other words, the calls to rcu_cleanup_after_idle() and
      trace_rcu_dyntick() are reversed.  This has no functional effect because
      the real concern is whether a given call is before or after the call to
      rcu_dynticks_eqs_exit(), and this patch does not change that.  Before the
      call to rcu_dynticks_eqs_exit(), RCU is not yet watching the current
      CPU and after that call RCU is watching.
      
      A similar switch in calling order happens on the idle-entry path, with
      similar lack of effect for the same reasons.
      Suggested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Byungchul Park <byungchul.park@lge.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      [ paulmck: Applied Steven Rostedt feedback. ]
      Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
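      The constant-argument technique described in this commit message can
      be illustrated with a small userspace C sketch.  The names
      (struct toy_stats, toy_enter_common(), toy_irq_enter(),
      toy_nmi_enter()) are hypothetical, and the counters merely stand in
      for the real hooks.  Plain "static inline" is only a hint to the
      compiler, so whether the "if (irq)" tests are actually folded away
      here depends on the compiler and optimization level.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy counters standing in for the real hooks; all names are hypothetical. */
struct toy_stats {
	int task_exit_calls;	/* stand-in for the hook run before marking non-idle */
	int eqs_exit_calls;	/* stand-in for marking the CPU non-idle */
	int cleanup_calls;	/* stand-in for the hook run after marking non-idle */
};

/*
 * Common body shared by both entry points.  Each wrapper passes a
 * compile-time constant for irq, so once this function is inlined the
 * "if (irq)" tests can be resolved at compile time.
 */
static inline void toy_enter_common(struct toy_stats *st, bool outermost,
				    const bool irq)
{
	if (!outermost)
		return;			/* the one remaining run-time branch */
	if (irq)
		st->task_exit_calls++;
	st->eqs_exit_calls++;
	if (irq)
		st->cleanup_calls++;
	/* The two irq-only hooks always run as a pair. */
	assert(st->task_exit_calls == st->cleanup_calls);
}

static void toy_irq_enter(struct toy_stats *st, bool outermost)
{
	toy_enter_common(st, outermost, true);
}

static void toy_nmi_enter(struct toy_stats *st, bool outermost)
{
	toy_enter_common(st, outermost, false);
}
```

      Calling toy_nmi_enter() bumps only eqs_exit_calls, while
      toy_irq_enter() also bumps the two irq-only counters, mirroring the
      split between the NMI and interrupt paths described above.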
  2. 26 Aug, 2018 10 commits
    • Linux 4.19-rc1 · 5b394b2d
      Linus Torvalds authored
    • Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · b933d6eb
      Linus Torvalds authored
      Pull timer update from Thomas Gleixner:
       "New defines for the compat time* types so they can be shared between
        32bit and 64bit builds. Not used yet, but merging them now allows the
        actual conversions to be merged through different maintainer trees
        without dependencies
      
        We still have compat interfaces for 32bit on 64bit even with the new
        2038 safe timespec/val variants because pointer size is different. And
        for the old style timespec/val interfaces we need yet another 'compat'
        interface for both 32bit native and 32bit on 64bit"
      
      * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        y2038: Provide aliases for compat helpers
    • Merge branch 'ida-4.19' of git://git.infradead.org/users/willy/linux-dax · aba16dc5
      Linus Torvalds authored
      Pull IDA updates from Matthew Wilcox:
       "A better IDA API:
      
            id = ida_alloc(ida, GFP_xxx);
            ida_free(ida, id);
      
        rather than the cumbersome ida_simple_get(), ida_simple_remove().
      
        The new IDA API is similar to ida_simple_get() but better named.  The
        internal restructuring of the IDA code removes the bitmap
        preallocation nonsense.
      
        I hope the net -200 lines of code is convincing"
      
      * 'ida-4.19' of git://git.infradead.org/users/willy/linux-dax: (29 commits)
        ida: Change ida_get_new_above to return the id
        ida: Remove old API
        test_ida: check_ida_destroy and check_ida_alloc
        test_ida: Convert check_ida_conv to new API
        test_ida: Move ida_check_max
        test_ida: Move ida_check_leaf
        idr-test: Convert ida_check_nomem to new API
        ida: Start new test_ida module
        target/iscsi: Allocate session IDs from an IDA
        iscsi target: fix session creation failure handling
        drm/vmwgfx: Convert to new IDA API
        dmaengine: Convert to new IDA API
        ppc: Convert vas ID allocation to new IDA API
        media: Convert entity ID allocation to new IDA API
        ppc: Convert mmu context allocation to new IDA API
        Convert net_namespace to new IDA API
        cb710: Convert to new IDA API
        rsxx: Convert to new IDA API
        osd: Convert to new IDA API
        sd: Convert to new IDA API
        ...
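      The alloc/free pairing quoted in this pull request can be mimicked in
      userspace to show how callers are expected to use it.  The sketch
      below is a toy stand-in and NOT the kernel's IDA: struct toy_ida,
      toy_ida_alloc(), toy_ida_free(), and the TOY_IDA_MAX limit are
      hypothetical, there is no GFP argument, and handing out the smallest
      free ID from a fixed-size table is an assumption of this toy.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_IDA_MAX 64	/* arbitrary limit chosen for this toy */

/* Toy userspace stand-in; this is not the kernel's struct ida. */
struct toy_ida {
	bool used[TOY_IDA_MAX];
};

/* Return the smallest free ID, or -1 if the toy table is full. */
static int toy_ida_alloc(struct toy_ida *ida)
{
	for (int id = 0; id < TOY_IDA_MAX; id++) {
		if (!ida->used[id]) {
			ida->used[id] = true;
			return id;
		}
	}
	return -1;
}

/* Release an ID previously returned by toy_ida_alloc(). */
static void toy_ida_free(struct toy_ida *ida, int id)
{
	assert(id >= 0 && id < TOY_IDA_MAX && ida->used[id]);
	ida->used[id] = false;
}
```

      A caller allocates an ID, uses it for the lifetime of the object it
      names, and then frees it, after which the same ID may be handed out
      again.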
    • Merge tag 'gcc-plugins-v4.19-rc1-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · c4726e77
      Linus Torvalds authored
      Pull gcc plugin fix from Kees Cook:
       "Lift gcc test into Kconfig. This is for better behavior when the
        kernel is built with Clang, reported by Stefan Agner"
      
      * tag 'gcc-plugins-v4.19-rc1-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        gcc-plugins: Disable when building under Clang
    • Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · d207ea8e
      Linus Torvalds authored
      Pull perf updates from Thomas Gleixner:
       "Kernel:
         - Improve kallsyms coverage
         - Add x86 entry trampolines to kcore
         - Fix ARM SPE handling
         - Correct PPC event post processing
      
        Tools:
         - Make the build system more robust
         - Small fixes and enhancements all over the place
         - Update kernel ABI header copies
         - Preparatory work for converting libtraceevent to a shared library
         - License cleanups"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (100 commits)
        tools arch: Update arch/x86/lib/memcpy_64.S copy used in 'perf bench mem memcpy'
        tools arch x86: Update tools's copy of cpufeatures.h
        perf python: Fix pyrf_evlist__read_on_cpu() interface
        perf mmap: Store real cpu number in 'struct perf_mmap'
        perf tools: Remove ext from struct kmod_path
        perf tools: Add gzip_is_compressed function
        perf tools: Add lzma_is_compressed function
        perf tools: Add is_compressed callback to compressions array
        perf tools: Move the temp file processing into decompress_kmodule
        perf tools: Use compression id in decompress_kmodule()
        perf tools: Store compression id into struct dso
        perf tools: Add compression id into 'struct kmod_path'
        perf tools: Make is_supported_compression() static
        perf tools: Make decompress_to_file() function static
        perf tools: Get rid of dso__needs_decompress() call in __open_dso()
        perf tools: Get rid of dso__needs_decompress() call in symbol__disassemble()
        perf tools: Get rid of dso__needs_decompress() call in read_object_code()
        tools lib traceevent: Change to SPDX License format
        perf llvm: Allow passing options to llc in addition to clang
        perf parser: Improve error message for PMU address filters
        ...
    • Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 2a8a2b7c
      Linus Torvalds authored
      Pull x86 fixes from Thomas Gleixner:
      
       - Correct the L1TF fallout on 32bit and the off by one in the 'too much
         RAM for protection' calculation.
      
       - Add a helpful kernel message for the 'too much RAM' case
      
       - Unbreak the VDSO in case that the compiler decides to use indirect
         jumps/calls and emits retpolines which cannot be resolved because the
         kernel uses its own thunks, which does not work for the VDSO. Make it
         use the builtin thunks.
      
       - Re-export start_thread() which was unexported when the 32/64bit
         implementation was unified. start_thread() is required by modular
         binfmt handlers.
      
       - Trivial cleanups
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/speculation/l1tf: Suggest what to do on systems with too much RAM
        x86/speculation/l1tf: Fix off-by-one error when warning that system has too much RAM
        x86/kvm/vmx: Remove duplicate l1d flush definitions
        x86/speculation/l1tf: Fix overflow in l1tf_pfn_limit() on 32bit
        x86/process: Re-export start_thread()
        x86/mce: Add notifier_block forward declaration
        x86/vdso: Fix vDSO build if a retpoline is emitted
    • Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · de375035
      Linus Torvalds authored
      Pull irq update from Thomas Gleixner:
       "A small set of updates/fixes for the irq subsystem:
      
         - Allow GICv3 interrupts to be configured as wake-up sources to
           enable wakeup from suspend
      
         - Make the error handling of the STM32 irqchip init function work
      
         - A set of small cleanups and improvements"
      
      * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        irqchip/gic-v3: Allow interrupt to be configured as wake-up sources
        irqchip/tango: Set irq handler and data in one go
        dt-bindings: irqchip: renesas-irqc: Document r8a774a1 support
        irqchip/s3c24xx: Remove unneeded comparison of unsigned long to 0
        irqchip/stm32: Fix init error handling
        irqchip/bcm7038-l1: Hide cpu offline callback when building for !SMP
    • Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · a9ce3233
      Linus Torvalds authored
      Pull locking update from Thomas Gleixner:
       "Mark the switch cases which fall through to the next case with the
        proper comment so the fallthrough compiler checks can be enabled"
      
      * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        futex: Mark expected switch fall-throughs
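      As an illustration of what marking an expected fall-through looks
      like, here is a minimal C sketch.  The enum toy_cmd values and
      toy_handle() are hypothetical and unrelated to the futex code this
      pull request touches.  The comment form shown is one that GCC's
      -Wimplicit-fallthrough check accepts at its default level.

```c
#include <assert.h>

/* Hypothetical command codes, for illustration only. */
enum toy_cmd { TOY_CMD_PREPARE, TOY_CMD_RUN, TOY_CMD_STOP };

/* Return the number of steps performed for the given command. */
static int toy_handle(enum toy_cmd cmd)
{
	int steps = 0;

	switch (cmd) {
	case TOY_CMD_PREPARE:
		steps++;
		/* fall through */
	case TOY_CMD_RUN:
		steps++;
		break;
	case TOY_CMD_STOP:
		break;
	default:
		assert(0 && "unknown command");
	}
	return steps;
}
```

      Here TOY_CMD_PREPARE deliberately continues into the TOY_CMD_RUN
      case, and the comment records that intent so a compiler check for
      accidental fall-through can stay enabled.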
    • Merge tag 'libnvdimm-for-4.19_dax-memory-failure' of... · 2923b27e
      Linus Torvalds authored
      Merge tag 'libnvdimm-for-4.19_dax-memory-failure' of gitolite.kernel.org:pub/scm/linux/kernel/git/nvdimm/nvdimm
      
      Pull libnvdimm memory-failure update from Dave Jiang:
       "As it stands, memory_failure() gets thoroughly confused by dev_pagemap
        backed mappings. The recovery code has specific enabling for several
        possible page states and needs new enabling to handle poison in dax
        mappings.
      
        In order to support reliable reverse mapping of user space addresses:
      
         1/ Add new locking in the memory_failure() rmap path to prevent races
            that would typically be handled by the page lock.
      
         2/ Since dev_pagemap pages are hidden from the page allocator and the
            "compound page" accounting machinery, add a mechanism to determine
            the size of the mapping that encompasses a given poisoned pfn.
      
         3/ Given pmem errors can be repaired, change the speculatively
            accessed poison protection, mce_unmap_kpfn(), to be reversible and
            otherwise allow ongoing access from the kernel.
      
        A side effect of this enabling is that MADV_HWPOISON becomes usable
        for dax mappings, however the primary motivation is to allow the
        system to survive userspace consumption of hardware-poison via dax.
        Specifically the current behavior is:
      
           mce: Uncorrected hardware memory error in user-access at af34214200
           {1}[Hardware Error]: It has been corrected by h/w and requires no further action
           mce: [Hardware Error]: Machine check events logged
           {1}[Hardware Error]: event severity: corrected
           Memory failure: 0xaf34214: reserved kernel page still referenced by 1 users
           [..]
           Memory failure: 0xaf34214: recovery action for reserved kernel page: Failed
           mce: Memory error not recovered
           <reboot>
      
        ...and with these changes:
      
           Injecting memory failure for pfn 0x20cb00 at process virtual address 0x7f763dd00000
           Memory failure: 0x20cb00: Killing dax-pmd:5421 due to hardware memory corruption
           Memory failure: 0x20cb00: recovery action for dax page: Recovered
      
        Given all the cross dependencies I propose taking this through
        nvdimm.git with acks from Naoya, x86/core, x86/RAS, and of course dax
        folks"
      
      * tag 'libnvdimm-for-4.19_dax-memory-failure' of gitolite.kernel.org:pub/scm/linux/kernel/git/nvdimm/nvdimm:
        libnvdimm, pmem: Restore page attributes when clearing errors
        x86/memory_failure: Introduce {set, clear}_mce_nospec()
        x86/mm/pat: Prepare {reserve, free}_memtype() for "decoy" addresses
        mm, memory_failure: Teach memory_failure() about dev_pagemap pages
        filesystem-dax: Introduce dax_lock_mapping_entry()
        mm, memory_failure: Collect mapping size in collect_procs()
        mm, madvise_inject_error: Let memory_failure() optionally take a page reference
        mm, dev_pagemap: Do not clear ->mapping on final put
        mm, madvise_inject_error: Disable MADV_SOFT_OFFLINE for ZONE_DEVICE pages
        filesystem-dax: Set page->index
        device-dax: Set page->index
        device-dax: Enable page_mapping()
        device-dax: Convert to vmf_insert_mixed and vm_fault_t
    • Merge tag 'libnvdimm-for-4.19_misc' of gitolite.kernel.org:pub/scm/linux/kernel/git/nvdimm/nvdimm · 828bf6e9
      Linus Torvalds authored
      Pull libnvdimm updates from Dave Jiang:
       "Collection of misc libnvdimm patches for 4.19 submission:
      
         - Adding support to read locked nvdimm capacity.
      
         - Change test code to make DSM failure code injection an override.
      
         - Add support for calculating the maximum contiguous area for a namespace.
      
         - Add support for queueing a short ARS when there is an ongoing ARS
           for the nvdimm.
      
         - Allow NULL to be passed in to ->direct_access() for kaddr and pfn
           params.
      
         - Improve smart injection support for nvdimm emulation testing.
      
         - Fix test code that supports emulating controller temperature.
      
         - Fix hang on error before devm_memremap_pages()
      
         - Fix a bug that causes user memory corruption when data is returned
           to the user for ars_status.
      
         - Maintainer updates for Ross Zwisler emails and adding Jan Kara to
           fsdax"
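
The item above about passing NULL for the kaddr and pfn parameters describes optional out-parameters. The following is a minimal userspace sketch of that calling convention, not the kernel's ->direct_access() implementation: the struct, the function name toy_direct_access(), the page size constant, and the return-value convention are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096UL   /* assumed page size for this sketch */

/* Hypothetical stand-in for a DAX-capable device: one flat memory range. */
struct toy_dax_dev {
    unsigned char *base;      /* start of the mapped range */
    unsigned long nr_pages;   /* size of the range in pages */
    unsigned long base_pfn;   /* page frame number of the first page */
};

/*
 * Report how many pages are usable starting at pgoff (at most nr_pages),
 * or -1 if the request is out of range.  kaddr and pfn are optional:
 * a caller that only needs the page count passes NULL for both, and
 * the callee skips the corresponding work.
 */
static long toy_direct_access(const struct toy_dax_dev *dev,
                              unsigned long pgoff, long nr_pages,
                              void **kaddr, unsigned long *pfn)
{
    long avail;

    assert(dev != NULL);
    if (nr_pages <= 0 || pgoff >= dev->nr_pages)
        return -1;

    if (kaddr)   /* fill in the address only when the caller asked for it */
        *kaddr = dev->base + pgoff * TOY_PAGE_SIZE;
    if (pfn)     /* likewise for the page frame number */
        *pfn = dev->base_pfn + pgoff;

    avail = (long)(dev->nr_pages - pgoff);
    return avail < nr_pages ? avail : nr_pages;
}
```

With this shape, a caller that only wants to know how much of the range is addressable no longer has to declare dummy variables to receive values it never reads.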
      
      * tag 'libnvdimm-for-4.19_misc' of gitolite.kernel.org:pub/scm/linux/kernel/git/nvdimm/nvdimm:
        libnvdimm: fix ars_status output length calculation
        device-dax: avoid hang on error before devm_memremap_pages()
        tools/testing/nvdimm: improve emulation of smart injection
        filesystem-dax: Do not request kaddr and pfn when not required
        md/dm-writecache: Don't request pointer dummy_addr when not required
        dax/super: Do not request a pointer kaddr when not required
        tools/testing/nvdimm: kaddr and pfn can be NULL to ->direct_access()
        s390, dcssblk: kaddr and pfn can be NULL to ->direct_access()
        libnvdimm, pmem: kaddr and pfn can be NULL to ->direct_access()
        acpi/nfit: queue issuing of ars when an uc error notification comes in
        libnvdimm: Export max available extent
        libnvdimm: Use max contiguous area for namespace size
        MAINTAINERS: Add Jan Kara for filesystem DAX
        MAINTAINERS: update Ross Zwisler's email address
        tools/testing/nvdimm: Fix support for emulating controller temperature
        tools/testing/nvdimm: Make DSM failure code injection an override
        acpi, nfit: Prefer _DSM over _LSR for namespace label reads
        libnvdimm: Introduce locked DIMM capacity support
      828bf6e9
  3. 25 Aug, 2018 8 commits
    • Merge tag 'armsoc-late' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · b3262720
      Linus Torvalds authored
      Pull ARM SoC late updates from Olof Johansson:
       "A couple of late-merged changes that would be useful to get in this
        merge window:
      
         - Driver support for reset of audio complex on Meson platforms. The
           audio driver went in this merge window, and these changes have been
           in -next for a while (just not in our tree).
      
         - Power management fixes for IOMMU on Rockchip platforms, getting
           closer to kexec working on them, including Chromebooks.
      
         - Another pass updating "arm,psci" -> "psci" for some properties that
           have snuck in since last time it was done"
      
      * tag 'armsoc-late' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
        iommu/rockchip: Move irq request past pm_runtime_enable
        iommu/rockchip: Handle errors returned from PM framework
        arm64: rockchip: Force CONFIG_PM on Rockchip systems
        ARM: rockchip: Force CONFIG_PM on Rockchip systems
        arm64: dts: Fix various entry-method properties to reflect documentation
        reset: imx7: Fix always writing bits as 0
        reset: meson: add meson audio arb driver
        reset: meson: add dt-bindings for meson-axg audio arb
      b3262720
    • Merge tag 'kbuild-v4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild · 1bc27677
      Linus Torvalds authored
      Pull more Kbuild updates from Masahiro Yamada:
      
       - add build_{menu,n,g,x}config targets for compile-testing Kconfig
      
       - fix and improve recursive dependency detection in Kconfig
      
       - fix parallel building of menuconfig/nconfig
      
       - fix syntax error in clang-version.sh
      
       - suppress distracting log from syncconfig
      
       - remove obsolete "rpm" target
      
       - remove VMLINUX_SYMBOL(_STR) macro entirely
      
       - fix microblaze build with CONFIG_DYNAMIC_FTRACE
      
       - move compiler test for dead code/data elimination to Kconfig
      
       - rename well-known LDFLAGS variable to KBUILD_LDFLAGS
      
       - misc fixes and cleanups
      
      * tag 'kbuild-v4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
        kbuild: rename LDFLAGS to KBUILD_LDFLAGS
        kbuild: pass LDFLAGS to recordmcount.pl
        kbuild: test dead code/data elimination support in Kconfig
        initramfs: move gen_initramfs_list.sh from scripts/ to usr/
        vmlinux.lds.h: remove stale <linux/export.h> include
        export.h: remove VMLINUX_SYMBOL() and VMLINUX_SYMBOL_STR()
        Coccinelle: remove pci_alloc_consistent semantic to detect in zalloc-simple.cocci
        kbuild: make sorting initramfs contents independent of locale
        kbuild: remove "rpm" target, which is alias of "rpm-pkg"
        kbuild: Fix LOADLIBES rename in Documentation/kbuild/makefiles.txt
        kconfig: suppress "configuration written to .config" for syncconfig
        kconfig: fix "Can't open ..." in parallel build
        kbuild: Add a space after `!` to prevent parsing as file pattern
        scripts: modpost: check memory allocation results
        kconfig: improve the recursive dependency report
        kconfig: report recursive dependency involving 'imply'
        kconfig: error out when seeing recursive dependency
        kconfig: add build-only configurator targets
        scripts/dtc: consolidate include path options in Makefile
      1bc27677
    • Merge tag 'for-linus-20180825' of git://git.kernel.dk/linux-block · b8dcdab3
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "A few small fixes for this merge window:
      
         - Locking imbalance fix for bcache (Shan Hai)
      
         - A few small fixes for wbt. One is a cleanup/prep, one is a fix for
           an existing issue, and the last two are fixes for changes that went
           into this merge window (me)"
      
      * tag 'for-linus-20180825' of git://git.kernel.dk/linux-block:
        blk-wbt: don't maintain inflight counts if disabled
        blk-wbt: fix has-sleeper queueing check
        blk-wbt: use wq_has_sleeper() for wq active check
        blk-wbt: move disable check into get_limit()
        bcache: release dc->writeback_lock properly in bch_writeback_thread()
      b8dcdab3
    • Merge tag 'upstream-4.19-rc1-fix' of git://git.infradead.org/linux-ubifs · db84abf5
      Linus Torvalds authored
      Pull UBIFS fix from Richard Weinberger:
       "Remove an empty file from UBIFS source"
      
      * tag 'upstream-4.19-rc1-fix' of git://git.infradead.org/linux-ubifs:
        ubifs: Remove empty file.h
      db84abf5
    • Merge tag '4.19-rc-smb3' of git://git.samba.org/sfrench/cifs-2.6 · 04faac10
      Linus Torvalds authored
      Pull cifs fixes from Steve French:
       "Three small SMB3 fixes, one for stable"
      
      * tag '4.19-rc-smb3' of git://git.samba.org/sfrench/cifs-2.6:
        cifs: update internal module version number for cifs.ko to 2.12
        cifs: check kmalloc before use
        cifs: check if SMB2 PDU size has been padded and suppress the warning
        cifs: create a define for how many iovs we need for an SMB2_open()
      04faac10
    • mm/cow: don't bother write protecting already write-protected pages · 1b2de5d0
      Linus Torvalds authored
      This is not normally noticeable, but repeated forks are unnecessarily
      expensive because they repeatedly dirty the parent page tables during
      the page table copy operation.
      
      It's trivial to just avoid write protecting the page table entry if it
      was already not writable.
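
The idea in the two paragraphs above can be sketched in userspace C. This is a toy model rather than the kernel's page-table copy code: the struct, the TOY_PTE_WRITE bit, and the stores counter are assumptions introduced only to make the saved work visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PTE_WRITE 0x2u   /* hypothetical "writable" bit */

/* Toy page table: the entries plus a count of stores made to them. */
struct toy_pgtable {
    uint32_t *pte;    /* array of entries */
    size_t nr;        /* number of entries */
    size_t stores;    /* how many times an entry was rewritten in place */
};

/*
 * Copy a copy-on-write mapping from src (parent) to dst (child).
 * The parent entry is rewritten only if it is still writable, so a
 * parent that has already been write protected by an earlier fork is
 * left untouched and its page table is not dirtied again.
 */
static void toy_copy_cow(struct toy_pgtable *src, struct toy_pgtable *dst)
{
    size_t i;

    assert(src != NULL && dst != NULL && src->nr == dst->nr);
    for (i = 0; i < src->nr; i++) {
        uint32_t pte = src->pte[i];

        if (pte & TOY_PTE_WRITE) {
            pte &= ~TOY_PTE_WRITE;   /* write protect for COW */
            src->pte[i] = pte;       /* this store dirties the parent table */
            src->stores++;
        }
        dst->pte[i] = pte;           /* child always gets the protected entry */
    }
}
```

In this model a second call on the same parent performs no stores to the parent table at all, which is the saving for repeated forks that the commit message describes.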
      
      This patch was inspired by
      
          https://bugzilla.kernel.org/show_bug.cgi?id=200447
      
      which points to an ancient "waste time re-doing fork" issue in the
      presence of lots of signals.
      
      That bug was fixed by Eric Biederman's signal handling series
      culminating in commit c3ad2c3b ("signal: Don't restart fork when
      signals come in"), but the unnecessary work for repeated forks is still
      worth fixing, particularly since the fix is trivial.
      
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1b2de5d0
    • hpfs: remove unnecessary checks on the value of r when assigning error code · e0fcfe1f
      Colin Ian King authored
      At the point where r is being checked for different values, r is always
      going to be equal to 2 as the previous if statements jump to end or end1
      if r is not 2.  Hence the assignment to err can be simplified to just
      an assignment without any checks on the value of r.
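
The pattern described above can be illustrated with a small sketch. It is not the hpfs code: the function names are hypothetical, the early return stands in for the jumps to end/end1, and the error codes are placeholders chosen only so the example compiles in userspace.

```c
#include <assert.h>
#include <errno.h>

/* Before: r is re-tested although the guard already forces r == 2. */
static int toy_err_checked(int r)
{
    if (r != 2)
        return 0;   /* stands in for the jumps to end or end1 */

    /* Only the first arm can ever be taken here; the rest is dead code. */
    return r == 2 ? -ENOSPC : r == 1 ? -EIO : 0;
}

/* After: the assignment no longer inspects r. */
static int toy_err_simplified(int r)
{
    if (r != 2)
        return 0;

    assert(r == 2);   /* the invariant that made the old checks dead */
    return -ENOSPC;
}
```

Both functions return the same value for every input, which is why a static analyzer can flag the extra arms of the conditional as logically dead.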
      
      Detected by CoverityScan, CID#1226737 ("Logically dead code")
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Reviewed-by: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e0fcfe1f
    • libata: maintainership update · 7634ccd2
      Jens Axboe authored
      Tejun Heo wrote:
      >
      > I asked Jens whether he could take care of the libata tree and he
      > thankfully agreed, so, from now on, Jens will be the libata
      > maintainer.
      >
      > Thanks a lot!
      
      Thanks for your work in this area. I still remember the first Linux
      storage summit we did in Vancouver 2001, Tejun was invited to talk about
      his libata error handling work. Before that, it was basically a
      crapshoot if we recovered properly or not... A lot of water has flowed
      under the bridge since then!
      
      Here's an "official" patch. Linus, can you apply it?
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      7634ccd2
  4. 24 Aug, 2018 5 commits
    • Merge branch 'for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata · 05193597
      Linus Torvalds authored
      Pull libata updates from Tejun Heo:
       "Nothing too interesting. Mostly ahci and ahci_platform changes, many
        around power management"
      
      * 'for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: (22 commits)
        ata: ahci_platform: enable to get and control reset
        ata: libahci_platform: add reset control support
        ata: add an extra argument to ahci_platform_get_resources()
        ata: sata_rcar: Add r8a77965 support
        ata: sata_rcar: exclude setting of PHY registers in Gen3
        ata: sata_rcar: really mask all interrupts on Gen2 and later
        Revert "ata: ahci_platform: allow disabling of hotplug to save power"
        ata: libahci: Allow reconfigure of DEVSLP register
        ata: libahci: Correct setting of DEVSLP register
        ata: ahci: Enable DEVSLP by default on x86 with SLP_S0
        ata: ahci: Support state with min power but Partial low power state
        Revert "ata: ahci_platform: convert kcalloc to devm_kcalloc"
        ata: sata_rcar: Add rudimentary Runtime PM support
        ata: sata_rcar: Provide a short-hand for &pdev->dev
        ata: Only output sg element mapped number in verbose debug
        ata: Guard ata_scsi_dump_cdb() by ATA_VERBOSE_DEBUG
        ata: ahci_platform: convert kcalloc to devm_kcalloc
        ata: ahci_platform: convert kzallloc to kcalloc
        ata: ahci_platform: correct parameter documentation for ahci_platform_shutdown
        libata: remove ata_sff_data_xfer_noirq()
        ...
      05193597
    • Merge branch 'for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup · 59676610
      Linus Torvalds authored
      Pull cgroup updates from Tejun Heo:
       "Just one commit from Steven to take the spin lock out of trace event
        handlers"
      
      * 'for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
        cgroup/tracing: Move taking of spin lock out of trace event handlers
      59676610
    • Merge branch 'for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq · 9022ada8
      Linus Torvalds authored
      Pull workqueue updates from Tejun Heo:
       "Over the lockdep cross-release churn, workqueue lost some of the
        existing annotations. Johannes Berg restored and also improved
        them"
      
      * 'for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
        workqueue: re-add lockdep dependencies for flushing
        workqueue: skip lockdep wq dependency in cancel_work_sync()
      9022ada8
    • Merge tag 'iommu-updates-v4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · 18b8bfdf
      Linus Torvalds authored
      Pull IOMMU updates from Joerg Roedel:
      
       - PASID table handling updates for the Intel VT-d driver. It implements
         a global PASID space now so that applications using multiple devices
         will just have one PASID.
      
       - A new config option to make iommu passthrough mode the default.
      
       - New sysfs attribute for iommu groups to export the type of the
         default domain.
      
       - A debugfs interface (for debug only) usable by IOMMU drivers to
         export internals to user-space.
      
       - R-Car Gen3 SoCs support for the ipmmu-vmsa driver
      
       - The ARM-SMMU now aborts transactions from unknown devices and devices
         not attached to any domain.
      
       - Various cleanups and smaller fixes all over the place.
      
      * tag 'iommu-updates-v4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (42 commits)
        iommu/omap: Fix cache flushes on L2 table entries
        iommu: Remove the ->map_sg indirection
        iommu/arm-smmu-v3: Abort all transactions if SMMU is enabled in kdump kernel
        iommu/arm-smmu-v3: Prevent any devices access to memory without registration
        iommu/ipmmu-vmsa: Don't register as BUS IOMMU if machine doesn't have IPMMU-VMSA
        iommu/ipmmu-vmsa: Clarify supported platforms
        iommu/ipmmu-vmsa: Fix allocation in atomic context
        iommu: Add config option to set passthrough as default
        iommu: Add sysfs attribyte for domain type
        iommu/arm-smmu-v3: sync the OVACKFLG to PRIQ consumer register
        iommu/arm-smmu: Error out only if not enough context interrupts
        iommu/io-pgtable-arm-v7s: Abort allocation when table address overflows the PTE
        iommu/io-pgtable-arm: Fix pgtable allocation in selftest
        iommu/vt-d: Remove the obsolete per iommu pasid tables
        iommu/vt-d: Apply per pci device pasid table in SVA
        iommu/vt-d: Allocate and free pasid table
        iommu/vt-d: Per PCI device pasid table interfaces
        iommu/vt-d: Add for_each_device_domain() helper
        iommu/vt-d: Move device_domain_info to header
        iommu/vt-d: Apply global PASID in SVA
        ...
      18b8bfdf
    • Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux · d972604f
      Linus Torvalds authored
      Pull thermal management updates from Zhang Rui:
      
       - Add Daniel Lezcano as the reviewer of thermal framework and SoC
         driver changes (Daniel Lezcano).
      
       - Fix a bug in intel_dts_soc_thermal driver, which does not translate
         IO-APIC GSI (Global System Interrupt) into Linux irq number (Hans de
         Goede).
      
       - For device tree bindings, allow cooling devices sharing the same
         trip point with the same contribution value to share a cooling map
         (Viresh Kumar).
      
      * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
        dt-bindings: thermal: Allow multiple devices to share cooling map
        MAINTAINERS: Add Daniel Lezcano as designated reviewer for thermal
        Thermal: Intel SoC DTS: Translate IO-APIC GSI number to linux irq number
      d972604f