  1. 26 May, 2019 1 commit
  2. 20 Apr, 2019 4 commits
    • random: add a spinlock_t to struct batched_entropy · b7d5dc21
      Sebastian Andrzej Siewior authored
      The per-CPU variable batched_entropy_uXX is protected by get_cpu_var().
      This is just a preempt_disable() which ensures that the variable is only
      accessed from the local CPU. It does not protect against users on the same CPU
      from another context. It is possible that a preemptible context reads
      slot 0 and then an interrupt occurs and the same value is read again.
      
      The above scenario is confirmed by lockdep if we add a spinlock:
      | ================================
      | WARNING: inconsistent lock state
      | 5.1.0-rc3+ #42 Not tainted
      | --------------------------------
      | inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
      | ksoftirqd/9/56 [HC0[0]:SC1[1]:HE0:SE0] takes:
      | (____ptrval____) (batched_entropy_u32.lock){+.?.}, at: get_random_u32+0x3e/0xe0
      | {SOFTIRQ-ON-W} state was registered at:
      |   _raw_spin_lock+0x2a/0x40
      |   get_random_u32+0x3e/0xe0
      |   new_slab+0x15c/0x7b0
      |   ___slab_alloc+0x492/0x620
      |   __slab_alloc.isra.73+0x53/0xa0
      |   kmem_cache_alloc_node+0xaf/0x2a0
      |   copy_process.part.41+0x1e1/0x2370
      |   _do_fork+0xdb/0x6d0
      |   kernel_thread+0x20/0x30
      |   kthreadd+0x1ba/0x220
      |   ret_from_fork+0x3a/0x50
      …
      | other info that might help us debug this:
      |  Possible unsafe locking scenario:
      |
      |        CPU0
      |        ----
      |   lock(batched_entropy_u32.lock);
      |   <Interrupt>
      |     lock(batched_entropy_u32.lock);
      |
      |  *** DEADLOCK ***
      |
      | stack backtrace:
      | Call Trace:
      …
      |  kmem_cache_alloc_trace+0x20e/0x270
      |  ipmi_alloc_recv_msg+0x16/0x40
      …
      |  __do_softirq+0xec/0x48d
      |  run_ksoftirqd+0x37/0x60
      |  smpboot_thread_fn+0x191/0x290
      |  kthread+0xfe/0x130
      |  ret_from_fork+0x3a/0x50
      
      Add a spinlock_t to the batched_entropy data structure and acquire the
      lock while accessing it. Acquire the lock with disabled interrupts
      because this function may be used from interrupt context.
      
      Remove the batched_entropy_reset_lock lock. Now that we have a lock for
      the data structure, we can access it from a remote CPU.
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      b7d5dc21
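A minimal userspace sketch of the locking idea in the commit above, not the kernel code itself. It assumes a pthread mutex as a stand-in for the new spinlock_t (a mutex cannot model interrupt context; the commit takes the lock with interrupts disabled), and `BATCH_WORDS`, `batch_refill()` and `batch_get_u32()` are hypothetical names. The point is only that the position check, the refill, and the slot read happen as one critical section, so no slot can be handed out twice.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define BATCH_WORDS 16 /* hypothetical batch size */

/* Userspace stand-in for the kernel's struct batched_entropy. */
struct batched_entropy {
	uint32_t entropy[BATCH_WORDS];
	size_t position;
	pthread_mutex_t lock; /* stands in for the new spinlock_t */
};

/* Hypothetical refill source: a counter makes slot reuse easy to spot. */
static uint32_t refill_counter;

static void batch_refill(struct batched_entropy *b)
{
	for (size_t i = 0; i < BATCH_WORDS; i++)
		b->entropy[i] = ++refill_counter;
	b->position = 0;
}

/* Hand out one slot; check, refill and advance all run under the lock. */
uint32_t batch_get_u32(struct batched_entropy *b)
{
	uint32_t value;

	pthread_mutex_lock(&b->lock);
	if (b->position % BATCH_WORDS == 0)
		batch_refill(b);
	value = b->entropy[b->position++];
	pthread_mutex_unlock(&b->lock);
	return value;
}
```

Because the lock now lives inside the structure, another thread (a remote CPU in the kernel's case) could reset `position` under the same lock, which is why a separate reset lock is no longer needed.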
    • random: document get_random_int() family · 92e507d2
      George Spelvin authored
      Explain what these functions are for and when they offer
      an advantage over get_random_bytes().
      
      (We still need documentation on rng_is_initialized(), the
      random_ready_callback system, and early boot in general.)
      Signed-off-by: George Spelvin <lkml@sdf.org>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      92e507d2
    • random: fix CRNG initialization when random.trust_cpu=1 · fe6f1a6a
      Jon DeVree authored
      When the system boots with random.trust_cpu=1 it doesn't initialize the
      per-NUMA CRNGs because it skips the rest of the CRNG startup code. This
      means that the code from 1e7f583a ("random: make /dev/urandom scalable
      for silly userspace programs") is not used when random.trust_cpu=1.
      
      crash> dmesg | grep random:
      [    0.000000] random: get_random_bytes called from start_kernel+0x94/0x530 with crng_init=0
      [    0.314029] random: crng done (trusting CPU's manufacturer)
      crash> print crng_node_pool
      $6 = (struct crng_state **) 0x0
      
      After adding the missing call to numa_crng_init() the per-NUMA CRNGs are
      initialized again:
      
      crash> dmesg | grep random:
      [    0.000000] random: get_random_bytes called from start_kernel+0x94/0x530 with crng_init=0
      [    0.314031] random: crng done (trusting CPU's manufacturer)
      crash> print crng_node_pool
      $1 = (struct crng_state **) 0xffff9a915f4014a0
      
      The call to invalidate_batched_entropy() was also missing. This is
      important for architectures like PPC and S390 which only have the
      arch_get_random_seed_* functions.
      
      Fixes: 39a8883a ("random: add a config option to trust the CPU's hwrng")
      Signed-off-by: Jon DeVree <nuxi@vault24.org>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      fe6f1a6a
    • random: move rand_initialize() earlier · d5553523
      Kees Cook authored
      Right now rand_initialize() is run as an early_initcall(), but it only
      depends on timekeeping_init() (for mixing ktime_get_real() into the
      pools). However, the call to boot_init_stack_canary() for stack canary
      initialization runs earlier, which triggers a warning at boot:
      
      random: get_random_bytes called from start_kernel+0x357/0x548 with crng_init=0
      
      Instead, this moves rand_initialize() to after timekeeping_init(), and moves
      canary initialization here as well.
      
      Note that this warning may still remain for machines that do not have
      UEFI RNG support (which initializes the RNG pools during setup_arch()),
      or for x86 machines without RDRAND (or booting without "random.trust_cpu=on"
      or CONFIG_RANDOM_TRUST_CPU=y).
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      d5553523
  3. 17 Apr, 2019 4 commits
  4. 14 Apr, 2019 6 commits
    • Linux 5.1-rc5 · dc4060a5
      Linus Torvalds authored
      dc4060a5
    • Merge branch 'page-refs' (page ref overflow) · 6b3a7077
      Linus Torvalds authored
      Merge page ref overflow branch.
      
      Jann Horn reported that he can overflow the page ref count with
      sufficient memory (and a filesystem that is intentionally extremely
      slow).
      
      Admittedly it's not exactly easy.  To have more than four billion
      references to a page requires a minimum of 32GB of kernel memory just
      for the pointers to the pages, much less any metadata to keep track of
      those pointers.  Jann needed a total of 140GB of memory and a specially
      crafted filesystem that leaves all reads pending (in order to not ever
      free the page references and just keep adding more).
      
      Still, we have a fairly straightforward way to limit the two obvious
      user-controllable sources of page references: direct-IO like page
      references gotten through get_user_pages(), and the splice pipe page
      duplication.  So let's just do that.
      
      * branch page-refs:
        fs: prevent page refcount overflow in pipe_buf_get
        mm: prevent get_user_pages() from overflowing page refcount
        mm: add 'try_get_page()' helper function
        mm: make page ref count overflow check tighter and more explicit
      6b3a7077
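The 32GB figure in the merge message above can be checked with a short worked calculation. This sketch assumes "four billion" means 2^32 references and that each reference is held through an 8-byte pointer; `pointer_bytes_for_refs()` is a hypothetical helper name.

```c
#include <stdint.h>

/* Bytes needed to hold one pointer per page reference. */
uint64_t pointer_bytes_for_refs(uint64_t refs, uint64_t pointer_size)
{
	return refs * pointer_size;
}
```

With `refs = 1ULL << 32` and `pointer_size = 8`, the result is 2^35 bytes, i.e. 32 GiB, before counting any metadata that tracks those pointers.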
    • fs: prevent page refcount overflow in pipe_buf_get · 15fab63e
      Matthew Wilcox authored
      Change pipe_buf_get() to return a bool indicating whether it succeeded
      in raising the refcount of the page (if the thing in the pipe is a page).
      This removes another mechanism for overflowing the page refcount.  All
      callers converted to handle a failure.
      Reported-by: Jann Horn <jannh@google.com>
      Signed-off-by: Matthew Wilcox <willy@infradead.org>
      Cc: stable@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      15fab63e
    • mm: prevent get_user_pages() from overflowing page refcount · 8fde12ca
      Linus Torvalds authored
      If the page refcount wraps around past zero, it will be freed while
      there are still four billion references to it.  One of the possible
      avenues for an attacker to try to make this happen is by doing direct IO
      on a page multiple times.  This patch makes get_user_pages() refuse to
      take a new page reference if there are already more than two billion
      references to the page.
      Reported-by: Jann Horn <jannh@google.com>
      Acked-by: Matthew Wilcox <willy@infradead.org>
      Cc: stable@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      8fde12ca
    • mm: add 'try_get_page()' helper function · 88b1a17d
      Linus Torvalds authored
      This is the same as the traditional 'get_page()' function, but instead
      of unconditionally incrementing the reference count of the page, it only
      does so if the count was "safe".  It returns whether the reference count
      was incremented (and is marked __must_check, since the caller obviously
      has to be aware of it).
      
      Also like 'get_page()', you can't use this function unless you already
      had a reference to the page.  The intent is that you can use this
      exactly like get_page(), but in situations where you want to limit the
      maximum reference count.
      
      The code currently does an unconditional WARN_ON_ONCE() if we ever hit
      the reference count issues (either zero or negative), as a notification
      that the conditional non-increment actually happened.
      
      NOTE! The count access for the "safety" check is inherently racy, but
      that doesn't matter since the buffer we use is basically half the range
      of the reference count (ie we look at the sign of the count).
      Acked-by: Matthew Wilcox <willy@infradead.org>
      Cc: Jann Horn <jannh@google.com>
      Cc: stable@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      88b1a17d
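A userspace sketch of the "increment only if the count is safe" idea described above, under stated assumptions: `struct page_ref` and `try_get_ref()` are hypothetical names, a C11 `atomic_int` stands in for the page reference count, and the kernel's warning and `__must_check` annotation are left out. As in the commit message, the check and the increment are not one atomic step; the sign bit provides half the counter range as a safety buffer.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a page with a signed reference count. */
struct page_ref {
	atomic_int count;
};

/*
 * Take a reference only while the count is strictly positive.
 * Zero means the caller never held a reference; a negative value means
 * the count has already wrapped past the halfway point.
 */
bool try_get_ref(struct page_ref *ref)
{
	if (atomic_load(&ref->count) <= 0)
		return false;
	atomic_fetch_add(&ref->count, 1);
	return true;
}
```

A caller would treat a `false` return like any other allocation-style failure and back out, rather than assuming the reference was taken.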
    • mm: make page ref count overflow check tighter and more explicit · f958d7b5
      Linus Torvalds authored
      We have a VM_BUG_ON() to check that the page reference count doesn't
      underflow (or get close to overflow) by checking the sign of the count.
      
      That's all fine, but we actually want to allow people to use a "get page
      ref unless it's already very high" helper function, and we want that one
      to use the sign of the page ref (without triggering this VM_BUG_ON).
      
      Change the VM_BUG_ON to only check for small underflows (or _very_ close
      to overflowing), and ignore overflows which have strayed into negative
      territory.
      Acked-by: Matthew Wilcox <willy@infradead.org>
      Cc: Jann Horn <jannh@google.com>
      Cc: stable@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f958d7b5
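One way to express a check that fires only for small underflows is a single unsigned comparison. The sketch below is illustrative: the helper name `page_ref_looks_broken()` and the window of 127 counts are assumptions made here, not details stated in the commit message.

```c
#include <stdbool.h>

/*
 * True only for counts in [-127, 0]: zero, or a small underflow.
 * Large negative values (a count that overflowed into the sign bit)
 * wrap around in the unsigned addition and are deliberately ignored.
 */
bool page_ref_looks_broken(int count)
{
	return (unsigned int)count + 127u <= 127u;
}
```

Under these assumptions, counts of 0 and -127 trip the check, while 1, -128 and very large negative values do not, which leaves the sign of the count free for a "get unless already very high" helper to use.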
  5. 13 Apr, 2019 14 commits
  6. 12 Apr, 2019 11 commits
    • clk: imx: Fix PLL_1416X not rounding rates · f89b9e1b
      Leonard Crestez authored
      Code which initializes the "clk_init_data.ops" checks pll->rate_table
      before that field is ever assigned to so it always picks
      "clk_pll1416x_min_ops".
      
      This breaks dynamic rate rounding for features such as cpufreq.
      
      Fix by checking pll_clk->rate_table instead; here pll_clk refers to
      the constant initialization data coming from the per-SoC clk driver.
      Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
      Fixes: 8646d4dc ("clk: imx: Add PLLs driver for imx8mm soc")
      Signed-off-by: Stephen Boyd <sboyd@kernel.org>
      f89b9e1b
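The ordering bug described above can be reproduced in a small standalone sketch. All type and function names here (`pll_init_data`, `pll_state`, `clk_ops_sketch`, `pick_ops_buggy()`, `pick_ops_fixed()`) are hypothetical stand-ins, not the driver's real definitions; only the shape of the mistake is taken from the commit message.

```c
#include <stddef.h>

struct rate_entry { unsigned long rate; };
struct clk_ops_sketch { const char *name; };

static const struct clk_ops_sketch min_ops  = { "min_ops" };
static const struct clk_ops_sketch full_ops = { "full_ops" };

/* Constant per-SoC initialization data (plays the role of pll_clk). */
struct pll_init_data { const struct rate_entry *rate_table; };

/* Runtime state being set up (plays the role of pll). */
struct pll_state {
	const struct rate_entry *rate_table;
	const struct clk_ops_sketch *ops;
};

/* Buggy order: pll.rate_table is still NULL when the ops are chosen. */
const struct clk_ops_sketch *pick_ops_buggy(const struct pll_init_data *pll_clk)
{
	struct pll_state pll = { 0 };

	pll.ops = pll.rate_table ? &full_ops : &min_ops;
	pll.rate_table = pll_clk->rate_table; /* assigned too late */
	return pll.ops;
}

/* Fixed: decide from the init data, which is already populated. */
const struct clk_ops_sketch *pick_ops_fixed(const struct pll_init_data *pll_clk)
{
	struct pll_state pll = { 0 };

	pll.ops = pll_clk->rate_table ? &full_ops : &min_ops;
	pll.rate_table = pll_clk->rate_table;
	return pll.ops;
}
```

Even with a non-empty rate table, the buggy variant always returns the minimal ops, which matches the symptom of rates never being rounded.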
    • clk: mediatek: fix clk-gate flag setting · b3cf181c
      Weiyi Lu authored
      CLK_SET_RATE_PARENT would be dropped.
      Merge the two flag settings together to correct the error.
      
      Fixes: 5a1cc4c2 ("clk: mediatek: Add flags to mtk_gate")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Weiyi Lu <weiyi.lu@mediatek.com>
      Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com>
      Signed-off-by: Stephen Boyd <sboyd@kernel.org>
      b3cf181c
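A sketch of how a flag can get dropped and how merging the settings avoids it. It assumes the two settings were plain assignments where the second overwrote the first; that ordering, the helper names, and the bit values below are illustrative assumptions, not taken from the commit.

```c
/* Illustrative bit values; assumptions for this sketch only. */
#define CLK_SET_RATE_PARENT (1UL << 2)
#define CLK_IGNORE_UNUSED   (1UL << 3)

/* Two separate assignments: the second silently discards the first. */
unsigned long gate_flags_buggy(unsigned long flags)
{
	unsigned long init_flags;

	init_flags = CLK_SET_RATE_PARENT;
	init_flags = flags; /* CLK_SET_RATE_PARENT is lost here */
	return init_flags;
}

/* Merged setting: caller flags and the mandatory flag are OR-ed together. */
unsigned long gate_flags_fixed(unsigned long flags)
{
	return flags | CLK_SET_RATE_PARENT;
}
```

With `flags = CLK_IGNORE_UNUSED`, the first variant returns only the caller's flag, while the merged variant keeps both bits set.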
    • Merge tag 'dma-mapping-5.1-1' of git://git.infradead.org/users/hch/dma-mapping · 8ee15f32
      Linus Torvalds authored
      Pull dma-mapping fixes from Christoph Hellwig:
       "Fix a sparc64 sun4v_pci regression introduced in this merge window,
        and a dma-debug stacktrace regression from the big refactor last
        merge window"
      
      * tag 'dma-mapping-5.1-1' of git://git.infradead.org/users/hch/dma-mapping:
        dma-debug: only skip one stackframe entry
        sparc64/pci_sun4v: fix ATU checks for large DMA masks
      8ee15f32
    • Merge tag 'iommu-fix-v5.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · 4876191c
      Linus Torvalds authored
      Pull IOMMU fix from Joerg Roedel:
       "Fix an AMD IOMMU issue where the driver didn't correctly set up the
        exclusion range in the hardware registers, resulting in exclusion
        ranges being one page too big.
      
        This can cause data corruption if the address of that last page is
        used by DMA operations"
      
      * tag 'iommu-fix-v5.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
        iommu/amd: Set exclusion range correctly
      4876191c
    • Merge tag 'clang-format-for-linus-v5.1-rc5' of git://github.com/ojeda/linux · 8e72d95d
      Linus Torvalds authored
      Pull clang-format update from Miguel Ojeda:
       "The usual roughly-per-release .clang-format macro list update"
      
      * tag 'clang-format-for-linus-v5.1-rc5' of git://github.com/ojeda/linux:
        clang-format: Update with the latest for_each macro list
      8e72d95d
    • Merge tag 'mmc-v5.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc · ea951a94
      Linus Torvalds authored
      Pull MMC host fixes from Ulf Hansson:
      
       - alcor: Stabilize data write requests
      
       - sdhci-omap: Fix command error path during tuning
      
      * tag 'mmc-v5.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
        mmc: sdhci-omap: Don't finish_mrq() on a command error during tuning
        mmc: alcor: don't write data before command has completed
      ea951a94
    • Merge tag 'sound-5.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 372686e6
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "Well, this one became unpleasantly larger than previous pull requests,
        but it's a kind of usual pattern: now it contains a collection of ASoC
        fixes, and nothing to worry about too much.

        The fixes for ASoC core (DAPM, DPCM, topology) are all small and just
        cover corner cases. The remaining changes are driver-specific, many of
        which are for x86 platforms and new drivers like STM32, in addition to
        the usual fixups for HD-audio"
      
      * tag 'sound-5.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (66 commits)
        ASoC: wcd9335: Fix missing regmap requirement
        ALSA: hda: Fix racy display power access
        ASoC: pcm: fix error handling when try_module_get() fails.
        ASoC: stm32: sai: fix master clock management
        ASoC: Intel: kbl: fix wrong number of channels
        ALSA: hda - Add two more machines to the power_save_blacklist
        ASoC: pcm: update module refcount if module_get_upon_open is set
        ASoC: core: conditionally increase module refcount on component open
        ASoC: stm32: fix sai driver name initialisation
        ASoC: topology: Use the correct dobj to free enum control values and texts
        ALSA: seq: Fix OOB-reads from strlcpy
        ASoC: intel: skylake: add remove() callback for component driver
        ASoC: cs35l35: Disable regulators on driver removal
        ALSA: xen-front: Do not use stream buffer size before it is set
        ASoC: rockchip: pdm: change dma burst to 8
        ASoC: rockchip: pdm: fix regmap_ops hang issue
        ASoC: simple-card: don't select DPCM via simple-audio-card
        ASoC: audio-graph-card: don't select DPCM via audio-graph-card
        ASoC: tlv320aic32x4: Change author's name
        ALSA: hda/realtek - Add quirk for Tuxedo XC 1509
        ...
      372686e6
    • Merge tag 'acpi-5.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · f2a73469
      Linus Torvalds authored
      Pull ACPI fix from Rafael Wysocki:
       "Fix an ACPICA issue introduced during the 4.20 development cycle and
        causing some systems to crash because of leftover operation region
        data still maintained after the operation region in question has gone
        away (Erik Schmauss)"
      
      * tag 'acpi-5.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        ACPICA: Namespace: remove address node from global list after method termination
      f2a73469
    • Merge tag 'drm-fixes-2019-04-12' of git://anongit.freedesktop.org/drm/drm · 58890f31
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Fixes across the driver spectrum this week, the mediatek fbdev support
        might be a bit late for this round, but I looked over it and it's not
        very large and seems like a useful feature for them.
      
        Otherwise the main thing is a regression fix for an i915 5.0 bug that
        caused black screens on a bunch of Dell XPS 15s, I think. I know at
        least Fedora is waiting for this to land, and the udl fix is also for
        a regression since 5.0 where unplugging the device would end badly.
      
        core:
         - make atomic hooks optional
      
        i915:
         - Revert a 5.0 regression where some eDP panels stopped working
         - DSI related fixes for platforms up to IceLake
         - GVT (regression fix, warning fix, use-after free fix)
      
        amdgpu:
         - Cursor fixes
         - missing PCI ID fix for KFD
         - XGMI fix
         - shadow buffer handling after reset fix
      
        udl:
         - fix unplugging device crashes.
      
        mediatek:
         - stabilise MT2701 HDMI support
         - fbdev support
      
        tegra:
         - fix for build regression in rc1.
      
        sun4i:
         - Allwinner A6 max freq improvements
         - null ptr deref fix
      
        dw-hdmi:
         - SCDC configuration improvements
      
        omap:
         - CEC clock management policy fix"
      
      * tag 'drm-fixes-2019-04-12' of git://anongit.freedesktop.org/drm/drm: (32 commits)
        gpu: host1x: Fix compile error when IOMMU API is not available
        drm/i915/gvt: Roundup fb->height into tile's height at calucation fb->size
        drm/i915/dp: revert back to max link rate and lane count on eDP
        drm/i915/icl: Fix port disable sequence for mipi-dsi
        drm/i915/icl: Ungate ddi clocks before IO enable
        drm/mediatek: no change parent rate in round_rate() for MT2701 hdmi phy
        drm/mediatek: using new factor for tvdpll for MT2701 hdmi phy
        drm/mediatek: remove flag CLK_SET_RATE_PARENT for MT2701 hdmi phy
        drm/mediatek: make implementation of recalc_rate() for MT2701 hdmi phy
        drm/mediatek: fix the rate and divder of hdmi phy for MT2701
        drm/mediatek: fix possible object reference leak
        drm/i915: Get power refs in encoder->get_power_domains()
        drm/i915: Fix pipe_bpp readout for BXT/GLK DSI
        drm/amd/display: Fix negative cursor pos programming (v2)
        drm/sun4i: tcon top: Fix NULL/invalid pointer dereference in sun8i_tcon_top_un/bind
        drm/udl: add a release method and delay modeset teardown
        drm/i915/gvt: Prevent use-after-free in ppgtt_free_all_spt()
        drm/i915/gvt: Annotate iomem usage
        drm/sun4i: DW HDMI: Lower max. supported rate for H6
        Revert "Documentation/gpu/meson: Remove link to meson_canvas.c"
        ...
      58890f31
    • arm64: futex: Fix FUTEX_WAKE_OP atomic ops with non-zero result value · 045afc24
      Will Deacon authored
      Rather embarrassingly, our futex() FUTEX_WAKE_OP implementation doesn't
      explicitly set the return value on the non-faulting path and instead
      leaves it holding the result of the underlying atomic operation. This
      means that any FUTEX_WAKE_OP atomic operation which computes a non-zero
      value will be reported as having failed. Regrettably, I wrote the buggy
      code back in 2011 and it was upstreamed as part of the initial arm64
      support in 2012.
      
      The reasons we appear to get away with this are:
      
        1. FUTEX_WAKE_OP is rarely used and therefore doesn't appear to get
           exercised by futex() test applications
      
        2. If the result of the atomic operation is zero, the system call
           behaves correctly
      
        3. Prior to version 2.25, the only operation used by GLIBC set the
           futex to zero, and therefore worked as expected. From 2.25 onwards,
           FUTEX_WAKE_OP is not used by GLIBC at all.
      
      Fix the implementation by ensuring that the return value is either 0
      to indicate that the atomic operation completed successfully, or -EFAULT
      if we encountered a fault when accessing the user mapping.
      
      Cc: <stable@kernel.org>
      Fixes: 6170a974 ("arm64: Atomic operations")
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      045afc24
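The return-value bug described above can be illustrated in plain C, with heavy simplification: this is not the arm64 assembly, the operation is reduced to a non-atomic add, a NULL pointer stands in for a faulting user access, and both function names are hypothetical.

```c
#include <errno.h>
#include <stddef.h>

/* Buggy shape: the status variable is left holding the computed value. */
int wake_op_add_buggy(int *uaddr, int oparg, int *oldval)
{
	int ret;

	if (uaddr == NULL)
		return -EFAULT;
	*oldval = *uaddr;
	ret = *uaddr + oparg; /* result of the operation */
	*uaddr = ret;
	return ret; /* non-zero result is reported as a failure */
}

/* Fixed shape: 0 on success, -EFAULT on a fault, nothing else. */
int wake_op_add_fixed(int *uaddr, int oparg, int *oldval)
{
	if (uaddr == NULL)
		return -EFAULT;
	*oldval = *uaddr;
	*uaddr = *uaddr + oparg;
	return 0;
}
```

In the buggy shape the call only looks successful when the computed value happens to be zero, which matches point 2 in the list above.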
    • iommu/amd: Set exclusion range correctly · 3c677d20
      Joerg Roedel authored
      The exclusion range limit register needs to contain the
      base-address of the last page that is part of the range, as
      bits 0-11 of this register are treated as 0xfff by the
      hardware for comparisons.
      
      So correctly set the exclusion range in the hardware to the
      last page which is _in_ the range.
      
      Fixes: b2026aa2 ('x86, AMD IOMMU: add functions for programming IOMMU MMIO space')
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      3c677d20
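The off-by-one-page effect can be worked through numerically. This sketch assumes 4 KiB pages (consistent with bits 0-11 being treated as 0xfff); the macro and function names are hypothetical, and the "buggy" variant is one plausible way to end up one page too far, not a quote of the old driver code.

```c
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_MASK  (~((UINT64_C(1) << PAGE_SHIFT) - 1))

/* Limit pointing at the first page after the range: one page too many. */
uint64_t exclusion_limit_buggy(uint64_t start, uint64_t length)
{
	return (start + length) & PAGE_MASK;
}

/* Limit holding the base address of the last page inside the range. */
uint64_t exclusion_limit_fixed(uint64_t start, uint64_t length)
{
	return (start + length - 1) & PAGE_MASK;
}

/* Hardware view: bits 0-11 of the limit are treated as 0xfff. */
uint64_t last_excluded_byte(uint64_t limit)
{
	return limit | 0xfff;
}
```

For a range starting at 0x10000 with length 0x2000, the fixed limit is 0x11000 and the hardware excludes up to 0x11fff, while the other variant yields 0x12000 and excludes up to 0x12fff, a full page beyond the intended range.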