1. 01 Jun, 2020 24 commits
    • Merge tag 'edac_updates_for_5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras · 8b11dd54
      Linus Torvalds authored
      Pull EDAC updates from Borislav Petkov:
      
       - Fix i10nm_edac loading on some Ice Lake and Tremont/Jacobsville
         steppings due to the offset change of the bus number configuration
         register, by Qiuxu Zhuo.
      
       - The usual cleanups and fixes all over the place.
      
      * tag 'edac_updates_for_5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
        EDAC/amd64: Remove redundant assignment to variable ret in hw_info_get()
        EDAC/skx: Use the mcmtr register to retrieve close_pg/bank_xor_enable
        EDAC/i10nm: Update driver to support different bus number config register offsets
        EDAC, {skx,i10nm}: Make some configurations CPU model specific
        EDAC/amd8131: Remove defined but not used bridge_str
        EDAC/thunderx: Make symbols static
        MAINTAINERS: Remove sifive_l2_cache.c from EDAC-SIFIVE pattern
        EDAC/xgene: Remove set but not used address local var
        EDAC/armada_xp: Fix some log messages
      8b11dd54
    • Merge tag 'printk-for-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux · ca1f5df2
      Linus Torvalds authored
      Pull printk updates from Petr Mladek:
      
       - Benjamin Herrenschmidt solved a problem with non-matched console
         aliases by first checking consoles defined on the command line. It is
         a more conservative approach than the previous attempts.
      
       - Benjamin also made sure that the console accessible via /dev/console
         always has CON_CONSDEV flag.
      
       - Andy Shevchenko added the %ptT modifier for printing struct time64_t.
         It extends the existing %ptR handling for struct rtc_time.
      
       - Bruno Meneguele fixed /dev/kmsg error value returned by unsupported
         SEEK_CUR.
      
       - Tetsuo Handa removed unused pr_cont_once().
      
      ... and a few small fixes.
      
      * tag 'printk-for-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
        printk: Remove pr_cont_once()
        printk: handle blank console arguments passed in.
        kernel/printk: add kmsg SEEK_CUR handling
        printk: Fix a typo in comment "interator"->"iterator"
        usb: pulse8-cec: Switch to use %ptT
        ARM: bcm2835: Switch to use %ptT
        lib/vsprintf: Print time64_t in human readable format
        lib/vsprintf: update comment about simple_strto<foo>() functions
        printk: Correctly set CON_CONSDEV even when preferred console was not registered
        printk: Fix preferred console selection with multiple matches
        printk: Move console matching logic into a separate function
        printk: Convert a use of sprintf to snprintf in console_unlock
      ca1f5df2
    • Merge tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt · 4d67829e
      Linus Torvalds authored
      Pull fsverity updates from Eric Biggers:
       "Fix kerneldoc warnings and some coding style inconsistencies.
      
        This mirrors the similar cleanups being done in fs/crypto/"
      
      * tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
        fs-verity: remove unnecessary extern keywords
        fs-verity: fix all kerneldoc warnings
      4d67829e
    • Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt · afdb0f2e
      Linus Torvalds authored
      Pull fscrypt updates from Eric Biggers:
      
       - Add the IV_INO_LBLK_32 encryption policy flag which modifies the
         encryption to be optimized for eMMC inline encryption hardware.
      
       - Make the test_dummy_encryption mount option for ext4 and f2fs support
         v2 encryption policies.
      
       - Fix kerneldoc warnings and some coding style inconsistencies.
      
      * tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
        fscrypt: add support for IV_INO_LBLK_32 policies
        fscrypt: make test_dummy_encryption use v2 by default
        fscrypt: support test_dummy_encryption=v2
        fscrypt: add fscrypt_add_test_dummy_key()
        linux/parser.h: add include guards
        fscrypt: remove unnecessary extern keywords
        fscrypt: name all function parameters
        fscrypt: fix all kerneldoc warnings
      afdb0f2e
    • Merge tag 'pstore-v5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · 829f3b94
      Linus Torvalds authored
      Pull pstore updates from Kees Cook:
       "Fixes and new features for pstore.
      
        This is a pretty big set of changes (relative to past pstore pulls),
        but it has been in -next for a while. The biggest change here is the
        ability to support a block device as a pstore backend, which has been
        desired for a while. A lot of additional fixes and refactorings are
        also included, mostly in support of the new features.
      
         - refactor pstore locking for safer module unloading (Kees Cook)
      
         - remove orphaned records from pstorefs when backend unloaded (Kees
           Cook)
      
         - refactor dump_oops parameter into max_reason (Pavel Tatashin)
      
         - introduce pstore/zone for common code for contiguous storage
           (WeiXiong Liao)
      
         - introduce pstore/blk for block device backend (WeiXiong Liao)
      
         - introduce mtd backend (WeiXiong Liao)"
      
      * tag 'pstore-v5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (35 commits)
        mtd: Support kmsg dumper based on pstore/blk
        pstore/blk: Introduce "best_effort" mode
        pstore/blk: Support non-block storage devices
        pstore/blk: Provide way to query pstore configuration
        pstore/zone: Provide way to skip "broken" zone for MTD devices
        Documentation: Add details for pstore/blk
        pstore/zone,blk: Add ftrace frontend support
        pstore/zone,blk: Add console frontend support
        pstore/zone,blk: Add support for pmsg frontend
        pstore/blk: Introduce backend for block devices
        pstore/zone: Introduce common layer to manage storage zones
        ramoops: Add "max-reason" optional field to ramoops DT node
        pstore/ram: Introduce max_reason and convert dump_oops
        pstore/platform: Pass max_reason to kmesg dump
        printk: Introduce kmsg_dump_reason_str()
        printk: honor the max_reason field in kmsg_dumper
        printk: Collapse shutdown types into a single dump reason
        pstore/ftrace: Provide ftrace log merging routine
        pstore/ram: Refactor ftrace buffer merging
        pstore/ram: Refactor DT size parsing
        ...
      829f3b94
    • Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 81e8c10d
      Linus Torvalds authored
      Pull crypto updates from Herbert Xu:
       "API:
         - Introduce crypto_shash_tfm_digest() and use it wherever possible.
         - Fix use-after-free and race in crypto_spawn_alg.
         - Add support for parallel and batch requests to crypto_engine.
      
        Algorithms:
         - Update jitter RNG for SP800-90B compliance.
         - Always use jitter RNG as seed in drbg.
      
        Drivers:
         - Add Arm CryptoCell driver cctrng.
         - Add support for SEV-ES to the PSP driver in ccp"
      
      * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (114 commits)
        crypto: hisilicon - fix driver compatibility issue with different versions of devices
        crypto: engine - do not requeue in case of fatal error
        crypto: cavium/nitrox - Fix a typo in a comment
        crypto: hisilicon/qm - change debugfs file name from qm_regs to regs
        crypto: hisilicon/qm - add DebugFS for xQC and xQE dump
        crypto: hisilicon/zip - add debugfs for Hisilicon ZIP
        crypto: hisilicon/hpre - add debugfs for Hisilicon HPRE
        crypto: hisilicon/sec2 - add debugfs for Hisilicon SEC
        crypto: hisilicon/qm - add debugfs to the QM state machine
        crypto: hisilicon/qm - add debugfs for QM
        crypto: stm32/crc32 - protect from concurrent accesses
        crypto: stm32/crc32 - don't sleep in runtime pm
        crypto: stm32/crc32 - fix multi-instance
        crypto: stm32/crc32 - fix run-time self test issue.
        crypto: stm32/crc32 - fix ext4 chksum BUG_ON()
        crypto: hisilicon/zip - Use temporary sqe when doing work
        crypto: hisilicon - add device error report through abnormal irq
        crypto: hisilicon - remove codes of directly report device errors through MSI
        crypto: hisilicon - QM memory management optimization
        crypto: hisilicon - unify initial value assignment into QM
        ...
      81e8c10d
    • Merge tag 'i3c/for-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux · 729ea4e0
      Linus Torvalds authored
      Pull i3c update from Boris Brezillon:
       "Fix GETMRL's logic"
      
      * tag 'i3c/for-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux:
        i3c master: GETMRL's 3rd byte is optional even with BCR_IBI_PAYLOAD
      729ea4e0
    • Merge tag 'regulator-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator · d30fc97c
      Linus Torvalds authored
      Pull regulator updates from Mark Brown:
       "The big change in this release is that Matti Vaittinen has factored
        out the linear ranges support into a separate library in lib/ since it
        is also useful for at least the power subsystem (and most likely
        others too), it helps subsystems which need to map register values
        into more useful real world values do so with minimal per-driver code.
      
         - Factoring out of the linear ranges support into a library in lib/
           from Matti Vaittinen.
      
         - Trace points for bypass mode.
      
         - Use the consumer name in debugfs to make it easier to understand.
      
         - New drivers for Maxim MAX77826 and MAX8998"
      
      * tag 'regulator-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: (23 commits)
        regulator: max8998: max8998_set_current_limit() can be static
        dt-bindings: regulator: Convert anatop regulator to json-schema
        regulator: core: Add regulator bypass trace points
        regulator: extract voltage balancing code to the separate function
        regulator/mfd: max8998: Document charger regulator
        regulator: max8998: Add charger regulator
        MAINTAINERS: Add maintainer entry for linear ranges helper
        regulator: bd718x7: remove voltage change restriction from BD71847 LDOs
        lib: linear_ranges: Add missing MODULE_LICENSE()
        regulator: use linear_ranges helper
        power: supply: bd70528: rename linear_range to avoid collision
        lib/test_linear_ranges: add a test for the 'linear_ranges'
        lib: add linear ranges helpers
        regulator: db8500-prcmu: Use true,false for bool variable
        regulator: bd718x7: remove voltage change restriction from BD71847
        regulator: max77826: Remove erroneous additionalProperties
        regulator: qcom-rpmh: Fix typos in pm8150 and pm8150l
        regulator: Document bindings for max77826
        regulator: max77826: Add max77826 regulator driver
        regulator: tps80031: remove redundant assignment to variables ret and val
        ...
      d30fc97c
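The linear-ranges idea described in the regulator pull above (mapping register selector values to real-world values with little per-driver code) can be illustrated in userspace. This is a minimal sketch, not the lib/ helper itself: `struct range_sketch` and `range_sketch_value()` are hypothetical names made up for this example, and any table of values passed to it is the caller's own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model of one linear range: selectors
 * min_sel..max_sel map to min, min + step, min + 2 * step, ... */
struct range_sketch {
	unsigned int min;      /* value at min_sel, e.g. in microvolts */
	unsigned int min_sel;  /* lowest register selector in the range */
	unsigned int max_sel;  /* highest register selector in the range */
	unsigned int step;     /* value increment per selector */
};

/* Look up the value for a selector across an array of ranges.
 * Returns 0 on success, -1 if no range covers the selector. */
static int range_sketch_value(const struct range_sketch *r, size_t n,
			      unsigned int sel, unsigned int *val)
{
	assert(r != NULL && val != NULL);

	for (size_t i = 0; i < n; i++) {
		if (sel >= r[i].min_sel && sel <= r[i].max_sel) {
			*val = r[i].min + (sel - r[i].min_sel) * r[i].step;
			return 0;
		}
	}
	return -1;
}
```

A driver-like caller would describe its hardware as a small array of such ranges and call the lookup with the selector read from a register, instead of open-coding the arithmetic for each device.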
    • Merge tag 'spi-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi · a36de5eb
      Linus Torvalds authored
      Pull spi updates from Mark Brown:
       "This has been a very active release for the DesignWare driver in
        particular - after a long period of inactivity we have had a lot of
        people actively working on it for unrelated reasons this cycle with
        some of that work still not landed.
      
        Otherwise it's been fairly quiet for the subsystem.
      
        Highlights include:
      
         - Lots of performance improvements and fixes for the DesignWare
           driver from Serge Semin, Andy Shevchenko, Wan Ahmad Zainie, Clement
           Leger, Dinh Nguyen and Jarkko Nikula.
      
         - Support for octal mode transfers in spidev.
      
         - Slave mode support for the Rockchip drivers.
      
         - Support for AMD controllers, Broadcom mspi and Raspberry Pi 4, and
           Intel Elkhart Lake"
      
      * tag 'spi-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: (125 commits)
        spi: spi-fsl-dspi: fix native data copy
        spi: Convert DW SPI binding to DT schema
        spi: dw: Refactor mid_spi_dma_setup() to separate DMA and IRQ config
        spi: dw: Make DMA request line assignments explicit for Intel Medfield
        spi: bcm2835: Remove shared interrupt support
        dt-bindings: snps,dw-apb-ssi: add optional reset property
        spi: dw: add reset control
        spi: bcm2835: Enable shared interrupt support
        spi: bcm2835: Implement shutdown callback
        spi: dw: Use regset32 DebugFS method to create regdump file
        spi: dw: Add DMA support to the DW SPI MMIO driver
        spi: dw: Cleanup generic DW DMA code namings
        spi: dw: Add DW SPI DMA/PCI/MMIO dependency on the DW SPI core
        spi: dw: Remove DW DMA code dependency from DW_DMAC_PCI
        spi: dw: Move Non-DMA code to the DW PCIe-SPI driver
        spi: dw: Add core suffix to the DW APB SSI core source file
        spi: dw: Fix Rx-only DMA transfers
        spi: dw: Use DMA max burst to set the request thresholds
        spi: dw: Parameterize the DMA Rx/Tx burst length
        spi: dw: Add SPI Rx-done wait method to DMA-based transfer
        ...
      a36de5eb
    • Merge tag 'regmap-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap · 213fd09e
      Linus Torvalds authored
      Pull regmap updates from Mark Brown:
       "This has been a very active release for the regmap API for some
        reason, a lot of it due to new devices with odd requirements that can
        sensibly be handled here.
      
         - Add support for buses implementing a custom reg_update_bits()
           method in case the bus has a native operation for this.
      
         - Support 16 bit register addresses in SMBus.
      
         - Allow customization of the device attached to regmap-irq.
      
         - Helpers for bitfield operations and per-port field initializations"
      
      * tag 'regmap-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
        regmap: provide helpers for simple bit operations
        regmap: add helper for per-port regfield initialization
        regmap-i2c: add 16-bit width registers support
        regmap: Simplify implementation of the regmap_field_read_poll_timeout() macro
        regmap: Simplify implementation of the regmap_read_poll_timeout() macro
        regmap: add reg_sequence helpers
        regmap-irq: make it possible to add irq_chip do a specific device node
        regmap: Add bus reg_update_bits() support
        regmap: debugfs: check count when read regmap file
      213fd09e
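The "update bits" operation mentioned in the regmap pull above is, in its generic form, a masked read-modify-write. The sketch below models that fallback in userspace, assuming an in-memory array in place of a real bus. `sketch_regs` and `update_bits_sketch()` are hypothetical names for illustration and are not the regmap API.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NUM_REGS 4

/* Hypothetical in-memory "device": an array standing in for registers. */
static uint32_t sketch_regs[SKETCH_NUM_REGS];

/* Generic fallback: read the register, replace only the bits selected by
 * mask, and write the result back. A bus with a native masked-write
 * operation could do the same thing in a single transaction. */
static int update_bits_sketch(unsigned int reg, uint32_t mask, uint32_t val)
{
	uint32_t old, updated;

	assert((SKETCH_NUM_REGS) > 0);
	if (reg >= SKETCH_NUM_REGS)
		return -1;

	old = sketch_regs[reg];
	updated = (old & ~mask) | (val & mask);
	if (updated != old)
		sketch_regs[reg] = updated;	/* skip the write if nothing changed */
	return 0;
}
```

Bits of `val` outside `mask` are ignored, so callers can pass a full register image and still touch only the field they own.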
    • Merge tag 'hwmon-for-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging · 129b9a5c
      Linus Torvalds authored
      Pull hwmon updates from Guenter Roeck:
       "Infrastructure:
         - Add notification support
      
        New drivers:
         - Baikal-T1 PVT sensor driver
         - amd_energy driver to report energy counters
         - Driver for Maxim MAX16601
         - Gateworks System Controller
      
        Various:
         - applesmc: avoid overlong udelay()
         - dell-smm: Use one DMI match for all XPS models
         - ina2xx: Implement alert functions
         - lm70: Add support for ACPI
         - lm75: Fix coding-style warnings
         - lm90: Add max6654 support to lm90 driver
         - nct7802: Replace container_of() API
         - nct7904: Set default timeout
         - nct7904: Add watchdog function
         - pmbus: Improve initialization of 'currpage' and 'currphase'"
      
      * tag 'hwmon-for-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: (24 commits)
        hwmon: Add Baikal-T1 PVT sensor driver
        hwmon: Add notification support
        dt-bindings: hwmon: Add Baikal-T1 PVT sensor binding
        hwmon: (applesmc) avoid overlong udelay()
        hwmon: (nct7904) Set default timeout
        hwmon: (amd_energy) Missing platform_driver_unregister() on error in amd_energy_init()
        MAINTAINERS: add entry for AMD energy driver
        hwmon: (amd_energy) Add documentation
        hwmon: Add amd_energy driver to report energy counters
        hwmon: (nct7802) Replace container_of() API
        hwmon: (lm90) Add max6654 support to lm90 driver
        hwmon : (nct6775) Use kobj_to_dev() API
        hwmon: (pmbus) Driver for Maxim MAX16601
        hwmon: (pmbus) Improve initialization of 'currpage' and 'currphase'
        hwmon: (adt7411) update contact email
        hwmon: (lm75) Fix all coding-style warnings on lm75 driver
        hwmon: Reduce indentation level in __hwmon_device_register()
        hwmon: (ina2xx) Implement alert functions
        hwmon: (lm70) Add support for ACPI
        hwmon: (dell-smm) Use one DMI match for all XPS models
        ...
      129b9a5c
    • Merge tag 'tpmdd-next-20200522' of git://git.infradead.org/users/jjs/linux-tpmdd · b6f91ab6
      Linus Torvalds authored
      Pull tpm updates from Jarkko Sakkinen.
      
      * tag 'tpmdd-next-20200522' of git://git.infradead.org/users/jjs/linux-tpmdd:
        tpm: eventlog: Replace zero-length array with flexible-array member
        tpm/tpm_ftpm_tee: Use UUID API for exporting the UUID
      b6f91ab6
    • Mark Brown's avatar
    • Mark Brown's avatar
    • kbuild test robot's avatar
      0b0c0bd8
    • Borislav Petkov's avatar
    • Petr Mladek's avatar
      8b390ab7
    • Merge branch 'for-5.8' into for-linus · d053cf0d
      Petr Mladek authored
      d053cf0d
    • Petr Mladek's avatar
      6a0af9fc
    • mtd: Support kmsg dumper based on pstore/blk · 78c08247
      WeiXiong Liao authored
      This introduces mtdpstore, which is similar to mtdoops but more
      powerful. It uses pstore/blk, and aims to store panic and oops logs to
      a flash partition, where pstore can later read back and present as files
      in the mounted pstore filesystem.
      
      To make mtdpstore work, the "blkdev" of pstore/blk should be set
      as MTD device name or MTD device number. For more details, see
      Documentation/admin-guide/pstore-blk.rst
      
      This solves a number of issues:
      - Work duplication: both of pstore and mtdoops do the same job storing
        panic/oops log. They have very similar logic, registering to kmsg
        dumper and storing logs to several chunks one by one.
      - Layer violations: drivers should provide methods instead of policies.
        MTD should provide read/write/erase operations, and allow
        higher-level drivers to provide the chunk management, kmsg dump
        configuration, etc.
      - Missing features: pstore provides many additional features, including
        presenting the logs as files, logging dump time and count, and
        supporting other frontends like pmsg, console, etc.
      Signed-off-by: WeiXiong Liao <liaoweixiong@allwinnertech.com>
      Link: https://lore.kernel.org/lkml/20200511233229.27745-11-keescook@chromium.org/
      Link: https://lore.kernel.org/r/1589266715-4168-1-git-send-email-liaoweixiong@allwinnertech.com
      Signed-off-by: Kees Cook <keescook@chromium.org>
      78c08247
    • pstore/blk: Introduce "best_effort" mode · f8feafea
      Kees Cook authored
      In order to use arbitrary block devices as a pstore backend, provide a
      new module param named "best_effort", which will allow using any block
      device, even if it has not provided a panic_write callback.
      
      Link: https://lore.kernel.org/lkml/20200511233229.27745-12-keescook@chromium.org/
      Signed-off-by: Kees Cook <keescook@chromium.org>
      f8feafea
    • pstore/blk: Support non-block storage devices · 7dcb7848
      WeiXiong Liao authored
      Add support for non-block devices (e.g. MTD). A non-block driver calls
      pstore_blk_register_device() to register itself.
      
      In addition, pstore/zone is updated to handle non-block devices,
      where an erase must be done before a write. Without this, there is no
      way to remove records stored to an MTD.
      Signed-off-by: WeiXiong Liao <liaoweixiong@allwinnertech.com>
      Link: https://lore.kernel.org/lkml/20200511233229.27745-10-keescook@chromium.org/
      Co-developed-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      7dcb7848
    • pstore/blk: Provide way to query pstore configuration · 1525fb3b
      WeiXiong Liao authored
      In order to configure itself, the MTD backend needs to be able to query
      the current pstore configuration. Introduce pstore_blk_get_config() for
      this purpose.
      Signed-off-by: WeiXiong Liao <liaoweixiong@allwinnertech.com>
      Link: https://lore.kernel.org/lkml/20200511233229.27745-9-keescook@chromium.org/
      Co-developed-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      1525fb3b
    • pstore/zone: Provide way to skip "broken" zone for MTD devices · 335426c6
      WeiXiong Liao authored
      One requirement to support MTD devices in pstore/zone is having a
      way to declare certain regions as broken. Add this support to
      pstore/zone.
      
      The MTD driver should return -ENOMSG when encountering a bad region,
      which tells pstore/zone to skip and try the next one.
      Signed-off-by: WeiXiong Liao <liaoweixiong@allwinnertech.com>
      Link: https://lore.kernel.org/lkml/20200511233229.27745-8-keescook@chromium.org/
      Co-developed-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Link: //lore.kernel.org/lkml/20200512173801.222666-1-colin.king@canonical.com
      Signed-off-by: Kees Cook <keescook@chromium.org>
      335426c6
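The skip-on-`-ENOMSG` convention described in the commit above can be sketched as a simple loop. This is a userspace illustration under stated assumptions, not the pstore/zone code: the callback type `zone_write_fn`, the functions `write_first_good_zone()` and `demo_write()`, the wrap-around search order, and the `-ENOSPC` result when every zone is bad are all hypothetical choices made for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical backend write hook: returns 0 on success, -ENOMSG if the
 * zone is bad and should be skipped, or another negative errno. */
typedef int (*zone_write_fn)(size_t zone, const char *buf, size_t len);

/* Try zones starting at 'start', wrapping around once. Returns the index
 * of the zone that was written, or a negative errno if none was usable. */
static int write_first_good_zone(zone_write_fn wr, size_t nzones,
				 size_t start, const char *buf, size_t len)
{
	assert(wr != NULL);

	for (size_t i = 0; i < nzones; i++) {
		size_t zone = (start + i) % nzones;
		int ret = wr(zone, buf, len);

		if (ret == -ENOMSG)
			continue;	/* bad region: try the next zone */
		if (ret < 0)
			return ret;	/* any other error: give up */
		return (int)zone;
	}
	return -ENOSPC;		/* every zone was reported bad */
}

/* Demonstration backend: pretends zones 0 and 1 are bad. */
static int demo_write(size_t zone, const char *buf, size_t len)
{
	(void)buf;
	(void)len;
	return zone < 2 ? -ENOMSG : 0;
}
```

The point of a dedicated error code is that the caller can tell "this region is unusable, move on" apart from a failure that should abort the dump.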
  2. 31 May, 2020 15 commits
    • Linux 5.7 · 3d77e6a8
      Linus Torvalds authored
      3d77e6a8
    • checkpatch/coding-style: deprecate 80-column warning · bdc48fa1
      Joe Perches authored
      Yes, staying within 80 columns is certainly still _preferred_.  But
      it's not the hard limit that the checkpatch warnings imply, and other
      concerns can most certainly dominate.
      
      Increase the default limit to 100 characters.  Not because 100
      characters is some hard limit either, but that's certainly a "what are
      you doing" kind of value and less likely to be about the occasional
      slightly longer lines.
      
      Miscellanea:
      
       - to avoid unnecessary whitespace changes in files, checkpatch will no
         longer emit a warning about line length when scanning files unless
         --strict is also used
      
       - Add a bit to coding-style about alignment to open parenthesis
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      bdc48fa1
    • Merge tag 'x86-urgent-2020-05-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 8fc984ae
      Linus Torvalds authored
      Pull x86 fixes from Thomas Gleixner:
       "A pile of x86 fixes:
      
         - Prevent a memory leak in ioperm which was caused by the stupid
           assumption that the exit cleanup is always called for current,
           which is not the case when fork fails after taking a reference on
           the ioperm bitmap.
      
         - Fix an arithmetic overflow in the DMA code on 32bit systems
      
         - Fill gaps in the xstate copy with defaults instead of leaving them
           uninitialized
      
         - Revert: "Make __X32_SYSCALL_BIT be unsigned long" as it turned out
           that existing user space fails to build"
      
      * tag 'x86-urgent-2020-05-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/ioperm: Prevent a memory leak when fork fails
        x86/dma: Fix max PFN arithmetic overflow on 32 bit systems
        copy_xstate_to_kernel(): don't leave parts of destination uninitialized
        x86/syscalls: Revert "x86/syscalls: Make __X32_SYSCALL_BIT be unsigned long"
      8fc984ae
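The class of bug behind the "max PFN arithmetic overflow on 32 bit systems" fix listed above can be shown in userspace. The sketch below is an illustration only and does not reproduce the actual patch: `uint32_t` stands in for a 32-bit `unsigned long`, and `SKETCH_PAGE_SHIFT`, `max_pfn_overflowing()` and `max_pfn_safe()` are hypothetical names, with 4 KiB pages assumed.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

static_assert(SKETCH_PAGE_SHIFT < 32, "shift must be below the word size");

/* Overflow-prone form: 4 GiB does not fit in 32 bits, so the product
 * wraps to 0 before the shift is ever applied. */
static uint32_t max_pfn_overflowing(void)
{
	uint32_t bytes = (uint32_t)4 * 1024 * 1024 * 1024;

	return bytes >> SKETCH_PAGE_SHIFT;
}

/* Safe form: compute the page count directly as a power of two, so no
 * intermediate value exceeds 32 bits. */
static uint32_t max_pfn_safe(void)
{
	return (uint32_t)1 << (32 - SKETCH_PAGE_SHIFT);
}
```

On a 64-bit `unsigned long` both forms agree, which is why this kind of mistake tends to surface only on 32-bit builds.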
    • Merge tag 'sched-urgent-2020-05-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 3d042823
      Linus Torvalds authored
      Pull scheduler fix from Thomas Gleixner:
       "A single scheduler fix preventing a crash in NUMA balancing.
      
        The current->mm check is not reliable as the mm might be temporary due
        to use_mm() in a kthread. Check for PF_KTHREAD explicitly"
      
      * tag 'sched-urgent-2020-05-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched/fair: Don't NUMA balance for kthreads
      3d042823
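The reasoning in the scheduler fix above (an `mm` pointer alone does not prove a task is a user task, because a kernel thread may hold one temporarily) can be modeled with a small userspace sketch. Everything here is hypothetical and simplified for illustration: `struct task_sketch`, `struct mm_sketch`, the `SKETCH_PF_KTHREAD` bit value, and both predicate functions are stand-ins, not kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PF_KTHREAD 0x1u	/* hypothetical flag bit for this sketch */

struct mm_sketch { int unused; };

struct task_sketch {
	unsigned int flags;
	struct mm_sketch *mm;	/* may be borrowed temporarily by a kthread */
};

/* Unreliable test: a kernel thread that temporarily adopted an mm looks
 * like a user task here. */
static bool eligible_by_mm_only(const struct task_sketch *t)
{
	assert(t != NULL);
	return t->mm != NULL;
}

/* More robust test: reject kernel threads explicitly, then check mm. */
static bool eligible_checked(const struct task_sketch *t)
{
	assert(t != NULL);
	if (t->flags & SKETCH_PF_KTHREAD)
		return false;
	return t->mm != NULL;
}
```

For a task with the kthread flag set and a non-NULL `mm`, the first predicate accepts it while the second rejects it, which is the distinction the fix relies on.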
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 19835b1b
      Linus Torvalds authored
      Pull networking fixes from David Miller:
       "Another week, another set of bug fixes:
      
         1) Fix pskb_pull length in __xfrm_transport_prep(), from Xin Long.
      
         2) Fix double xfrm_state put in esp{4,6}_gro_receive(), also from Xin
            Long.
      
         3) Re-arm discovery timer properly in mac80211 mesh code, from Linus
            Lüssing.
      
         4) Prevent buffer overflows in nf_conntrack_pptp debug code, from
            Pablo Neira Ayuso.
      
         5) Fix race in ktls code between tls_sw_recvmsg() and
            tls_decrypt_done(), from Vinay Kumar Yadav.
      
         6) Fix crashes on TCP fallback in MPTCP code, from Paolo Abeni.
      
         7) More validation is necessary of untrusted GSO packets coming from
            virtualization devices, from Willem de Bruijn.
      
         8) Fix endianness of bnxt_en firmware message length accesses, from
            Edwin Peer.
      
         9) Fix infinite loop in sch_fq_pie, from Davide Caratti.
      
        10) Fix lockdep splat in DSA by setting lockless TX in netdev features
            for slave ports, from Vladimir Oltean.
      
        11) Fix suspend/resume crashes in mlx5, from Mark Bloch.
      
        12) Fix use after free in bpf fmod_ret, from Alexei Starovoitov.
      
        13) ARP retransmit timer guard uses wrong offset, from Hongbin Liu.
      
        14) Fix leak in inetdev_init(), from Yang Yingliang.
      
        15) Don't try to use inet hash and unhash in l2tp code, results in
            crashes. From Eric Dumazet"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (77 commits)
        l2tp: add sk_family checks to l2tp_validate_socket
        l2tp: do not use inet_hash()/inet_unhash()
        net: qrtr: Allocate workqueue before kernel_bind
        mptcp: remove msk from the token container at destruction time.
        mptcp: fix race between MP_JOIN and close
        mptcp: fix unblocking connect()
        net/sched: act_ct: add nat mangle action only for NAT-conntrack
        devinet: fix memleak in inetdev_init()
        virtio_vsock: Fix race condition in virtio_transport_recv_pkt
        drivers/net/ibmvnic: Update VNIC protocol version reporting
        NFC: st21nfca: add missed kfree_skb() in an error path
        neigh: fix ARP retransmit timer guard
        bpf, selftests: Add a verifier test for assigning 32bit reg states to 64bit ones
        bpf, selftests: Verifier bounds tests need to be updated
        bpf: Fix a verifier issue when assigning 32bit reg states to 64bit ones
        bpf: Fix use-after-free in fmod_ret check
        net/mlx5e: replace EINVAL in mlx5e_flower_parse_meta()
        net/mlx5e: Fix MLX5_TC_CT dependencies
        net/mlx5e: Properly set default values when disabling adaptive moderation
        net/mlx5e: Fix arch depending casting issue in FEC
        ...
      19835b1b
    • l2tp: add sk_family checks to l2tp_validate_socket · d9a81a22
      Eric Dumazet authored
      syzbot was able to trigger a crash by using an ISDN socket
      to fool l2tp.
      
      Fix this by making sure the UDP socket is of the proper family.
      
      BUG: KASAN: slab-out-of-bounds in setup_udp_tunnel_sock+0x465/0x540 net/ipv4/udp_tunnel.c:78
      Write of size 1 at addr ffff88808ed0c590 by task syz-executor.5/3018
      
      CPU: 0 PID: 3018 Comm: syz-executor.5 Not tainted 5.7.0-rc6-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x188/0x20d lib/dump_stack.c:118
       print_address_description.constprop.0.cold+0xd3/0x413 mm/kasan/report.c:382
       __kasan_report.cold+0x20/0x38 mm/kasan/report.c:511
       kasan_report+0x33/0x50 mm/kasan/common.c:625
       setup_udp_tunnel_sock+0x465/0x540 net/ipv4/udp_tunnel.c:78
       l2tp_tunnel_register+0xb15/0xdd0 net/l2tp/l2tp_core.c:1523
       l2tp_nl_cmd_tunnel_create+0x4b2/0xa60 net/l2tp/l2tp_netlink.c:249
       genl_family_rcv_msg_doit net/netlink/genetlink.c:673 [inline]
       genl_family_rcv_msg net/netlink/genetlink.c:718 [inline]
       genl_rcv_msg+0x627/0xdf0 net/netlink/genetlink.c:735
       netlink_rcv_skb+0x15a/0x410 net/netlink/af_netlink.c:2469
       genl_rcv+0x24/0x40 net/netlink/genetlink.c:746
       netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline]
       netlink_unicast+0x537/0x740 net/netlink/af_netlink.c:1329
       netlink_sendmsg+0x882/0xe10 net/netlink/af_netlink.c:1918
       sock_sendmsg_nosec net/socket.c:652 [inline]
       sock_sendmsg+0xcf/0x120 net/socket.c:672
       ____sys_sendmsg+0x6e6/0x810 net/socket.c:2352
       ___sys_sendmsg+0x100/0x170 net/socket.c:2406
       __sys_sendmsg+0xe5/0x1b0 net/socket.c:2439
       do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295
       entry_SYSCALL_64_after_hwframe+0x49/0xb3
      RIP: 0033:0x45ca29
      Code: 0d b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 db b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007effe76edc78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      RAX: ffffffffffffffda RBX: 00000000004fe1c0 RCX: 000000000045ca29
      RDX: 0000000000000000 RSI: 0000000020000240 RDI: 0000000000000005
      RBP: 000000000078bf00 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
      R13: 000000000000094e R14: 00000000004d5d00 R15: 00007effe76ee6d4
      
      Allocated by task 3018:
       save_stack+0x1b/0x40 mm/kasan/common.c:49
       set_track mm/kasan/common.c:57 [inline]
       __kasan_kmalloc mm/kasan/common.c:495 [inline]
       __kasan_kmalloc.constprop.0+0xbf/0xd0 mm/kasan/common.c:468
       __do_kmalloc mm/slab.c:3656 [inline]
       __kmalloc+0x161/0x7a0 mm/slab.c:3665
       kmalloc include/linux/slab.h:560 [inline]
       sk_prot_alloc+0x223/0x2f0 net/core/sock.c:1612
       sk_alloc+0x36/0x1100 net/core/sock.c:1666
       data_sock_create drivers/isdn/mISDN/socket.c:600 [inline]
       mISDN_sock_create+0x272/0x400 drivers/isdn/mISDN/socket.c:796
       __sock_create+0x3cb/0x730 net/socket.c:1428
       sock_create net/socket.c:1479 [inline]
       __sys_socket+0xef/0x200 net/socket.c:1521
       __do_sys_socket net/socket.c:1530 [inline]
       __se_sys_socket net/socket.c:1528 [inline]
       __x64_sys_socket+0x6f/0xb0 net/socket.c:1528
       do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295
       entry_SYSCALL_64_after_hwframe+0x49/0xb3
      
      Freed by task 2484:
       save_stack+0x1b/0x40 mm/kasan/common.c:49
       set_track mm/kasan/common.c:57 [inline]
       kasan_set_free_info mm/kasan/common.c:317 [inline]
       __kasan_slab_free+0xf7/0x140 mm/kasan/common.c:456
       __cache_free mm/slab.c:3426 [inline]
       kfree+0x109/0x2b0 mm/slab.c:3757
       kvfree+0x42/0x50 mm/util.c:603
       __free_fdtable+0x2d/0x70 fs/file.c:31
       put_files_struct fs/file.c:420 [inline]
       put_files_struct+0x248/0x2e0 fs/file.c:413
       exit_files+0x7e/0xa0 fs/file.c:445
       do_exit+0xb04/0x2dd0 kernel/exit.c:791
       do_group_exit+0x125/0x340 kernel/exit.c:894
       get_signal+0x47b/0x24e0 kernel/signal.c:2739
       do_signal+0x81/0x2240 arch/x86/kernel/signal.c:784
       exit_to_usermode_loop+0x26c/0x360 arch/x86/entry/common.c:161
       prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
       syscall_return_slowpath arch/x86/entry/common.c:279 [inline]
       do_syscall_64+0x6b1/0x7d0 arch/x86/entry/common.c:305
       entry_SYSCALL_64_after_hwframe+0x49/0xb3
      
      The buggy address belongs to the object at ffff88808ed0c000
       which belongs to the cache kmalloc-2k of size 2048
      The buggy address is located 1424 bytes inside of
       2048-byte region [ffff88808ed0c000, ffff88808ed0c800)
      The buggy address belongs to the page:
      page:ffffea00023b4300 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0
      flags: 0xfffe0000000200(slab)
      raw: 00fffe0000000200 ffffea0002838208 ffffea00015ba288 ffff8880aa000e00
      raw: 0000000000000000 ffff88808ed0c000 0000000100000001 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff88808ed0c480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       ffff88808ed0c500: 00 00 00 fc fc fc fc fc fc fc fc fc fc fc fc fc
      >ffff88808ed0c580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
                               ^
       ffff88808ed0c600: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
       ffff88808ed0c680: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      
      Fixes: 6b9f3423 ("l2tp: fix races in tunnel creation")
      Fixes: fd558d18 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: James Chapman <jchapman@katalix.com>
      Cc: Guillaume Nault <gnault@redhat.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Acked-by: Guillaume Nault <gnault@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d9a81a22
    • l2tp: do not use inet_hash()/inet_unhash() · 02c71b14
      Eric Dumazet authored
      syzbot recently found a way to crash the kernel [1]
      
      The issue here is that inet_hash() and inet_unhash() are currently
      only meant to be used by TCP and DCCP, since only these protocols
      provide the needed hashinfo pointer.

      L2TP uses a single list (instead of a hash table).
      
      This old bug became an issue after commit 61023658
      ("bpf: Add new cgroup attach type to enable sock modifications")
      since after this commit, sk_common_release() can be called
      while the L2TP socket is still considered 'hashed'.
      
      general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN
      KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
      CPU: 0 PID: 7063 Comm: syz-executor654 Not tainted 5.7.0-rc6-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:inet_unhash+0x11f/0x770 net/ipv4/inet_hashtables.c:600
      Code: 03 0f b6 04 02 84 c0 74 08 3c 03 0f 8e dd 04 00 00 48 8d 7d 08 44 8b 73 08 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 55 05 00 00 48 8d 7d 14 4c 8b 6d 08 48 b8 00 00
      RSP: 0018:ffffc90001777d30 EFLAGS: 00010202
      RAX: dffffc0000000000 RBX: ffff88809a6df940 RCX: ffffffff8697c242
      RDX: 0000000000000001 RSI: ffffffff8697c251 RDI: 0000000000000008
      RBP: 0000000000000000 R08: ffff88809f3ae1c0 R09: fffffbfff1514cc1
      R10: ffffffff8a8a6607 R11: fffffbfff1514cc0 R12: ffff88809a6df9b0
      R13: 0000000000000007 R14: 0000000000000000 R15: ffffffff873a4d00
      FS:  0000000001d2b880(0000) GS:ffff8880ae600000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00000000006cd090 CR3: 000000009403a000 CR4: 00000000001406f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       sk_common_release+0xba/0x370 net/core/sock.c:3210
       inet_create net/ipv4/af_inet.c:390 [inline]
       inet_create+0x966/0xe00 net/ipv4/af_inet.c:248
       __sock_create+0x3cb/0x730 net/socket.c:1428
       sock_create net/socket.c:1479 [inline]
       __sys_socket+0xef/0x200 net/socket.c:1521
       __do_sys_socket net/socket.c:1530 [inline]
       __se_sys_socket net/socket.c:1528 [inline]
       __x64_sys_socket+0x6f/0xb0 net/socket.c:1528
       do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295
       entry_SYSCALL_64_after_hwframe+0x49/0xb3
      RIP: 0033:0x441e29
      Code: e8 fc b3 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 eb 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007ffdce184148 EFLAGS: 00000246 ORIG_RAX: 0000000000000029
      RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000441e29
      RDX: 0000000000000073 RSI: 0000000000000002 RDI: 0000000000000002
      RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
      R13: 0000000000402c30 R14: 0000000000000000 R15: 0000000000000000
      Modules linked in:
      ---[ end trace 23b6578228ce553e ]---
      RIP: 0010:inet_unhash+0x11f/0x770 net/ipv4/inet_hashtables.c:600
      Code: 03 0f b6 04 02 84 c0 74 08 3c 03 0f 8e dd 04 00 00 48 8d 7d 08 44 8b 73 08 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 55 05 00 00 48 8d 7d 14 4c 8b 6d 08 48 b8 00 00
      RSP: 0018:ffffc90001777d30 EFLAGS: 00010202
      RAX: dffffc0000000000 RBX: ffff88809a6df940 RCX: ffffffff8697c242
      RDX: 0000000000000001 RSI: ffffffff8697c251 RDI: 0000000000000008
      RBP: 0000000000000000 R08: ffff88809f3ae1c0 R09: fffffbfff1514cc1
      R10: ffffffff8a8a6607 R11: fffffbfff1514cc0 R12: ffff88809a6df9b0
      R13: 0000000000000007 R14: 0000000000000000 R15: ffffffff873a4d00
      FS:  0000000001d2b880(0000) GS:ffff8880ae600000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00000000006cd090 CR3: 000000009403a000 CR4: 00000000001406f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      
      Fixes: 0d76751f ("l2tp: Add L2TPv3 IP encapsulation (no UDP) support")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: James Chapman <jchapman@katalix.com>
      Cc: Andrii Nakryiko <andriin@fb.com>
      Reported-by: syzbot+3610d489778b57cc8031@syzkaller.appspotmail.com
      02c71b14
    • net: qrtr: Allocate workqueue before kernel_bind · c6e08d62
      Chris Lew authored
      A null pointer dereference in qrtr_ns_data_ready() is seen if a client
      opens a qrtr socket before qrtr_ns_init() can bind to the control port.
      When the control port is bound, the ENETRESET error will be broadcast
      and clients will close their sockets. This results in DEL_CLIENT
      packets being sent to the ns and qrtr_ns_data_ready() being called
      without the workqueue being allocated.
      
      Allocate the workqueue before setting sk_data_ready and binding to the
      control port. This ensures that the work and workqueue structs are
      allocated and initialized before qrtr_ns_data_ready can be called.
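
The ordering described above can be illustrated with a small userspace sketch. This is not the qrtr code: `struct name_service`, `struct work_queue`, `ns_init()`, `deliver_packet()` and the other names are hypothetical stand-ins, chosen only to show why the resource a callback dereferences should exist before the callback is installed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel objects; not the qrtr code. */
struct work_queue {
    int pending;                                  /* number of queued work items */
};

struct name_service {
    struct work_queue *wq;                        /* must exist before any callback fires */
    void (*data_ready)(struct name_service *ns);  /* may fire as soon as it is installed */
};

/* Callback: queues work. It dereferences ns->wq, so wq must be allocated. */
static void ns_data_ready(struct name_service *ns)
{
    assert(ns->wq != NULL);
    ns->wq->pending++;
}

/* Simulates a packet arriving: invokes the callback if one is installed. */
static void deliver_packet(struct name_service *ns)
{
    if (ns->data_ready)
        ns->data_ready(ns);
}

/*
 * Safe ordering: allocate first, then expose the callback.
 * Returns 0 on success, -1 if the allocation fails.
 */
static int ns_init(struct name_service *ns)
{
    ns->data_ready = NULL;
    ns->wq = calloc(1, sizeof(*ns->wq));
    if (!ns->wq)
        return -1;
    ns->data_ready = ns_data_ready;   /* only now can the callback run */
    return 0;
}

/* Teardown in the reverse order: hide the callback, then free the queue. */
static void ns_exit(struct name_service *ns)
{
    ns->data_ready = NULL;
    free(ns->wq);
    ns->wq = NULL;
}
```

With this ordering, a packet delivered at any point after `ns_init()` returns finds a valid queue; installing the callback before the allocation would reopen the window the commit message describes.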
      
      Fixes: 0c2204a4 ("net: qrtr: Migrate nameservice to kernel from userspace")
      Signed-off-by: Chris Lew <clew@codeaurora.org>
      Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
      Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c6e08d62
    • Merge branch 'mptcp-a-bunch-of-fixes' · e237659c
      David S. Miller authored
      Paolo Abeni says:
      
      ====================
      mptcp: a bunch of fixes
      
      This patch series pulls together a few fixes for MPTCP bugs observed
      while stress-testing with apache bench, forced to use MPTCP and
      multiple subflows.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e237659c
    • mptcp: remove msk from the token container at destruction time. · c5c79763
      Paolo Abeni authored
      Currently we remove the msk from the token container only
      via mptcp_close(). The MPTCP master socket can also be destroyed
      via other paths (e.g. if not yet accepted, when shutting
      down the listener socket). When we hit the latter scenario,
      dangling msk references are left in the token container,
      leading to memory corruption and/or UaF.
      
      This change addresses the issue by moving the token removal
      into the msk destructor.
      
      Fixes: 79c0949e ("mptcp: Add key generation and token tree")
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c5c79763
    • mptcp: fix race between MP_JOIN and close · 10f6d46c
      Paolo Abeni authored
      If a MP_JOIN subflow completes the 3whs while another
      CPU is closing the master msk, we can hit the
      following race:
      
      CPU1                                    CPU2
      
      close()
       mptcp_close
                                              subflow_syn_recv_sock
                                               mptcp_token_get_sock
                                               mptcp_finish_join
                                                inet_sk_state_load
        mptcp_token_destroy
        inet_sk_state_store(TCP_CLOSE)
        __mptcp_flush_join_list()
                                                mptcp_sock_graft
                                                list_add_tail
        sk_common_release
         sock_orphan()
       <socket free>
      
      The MP_JOIN socket will be leaked. Additionally we can hit
      UaF for the msk 'struct socket' referenced via the 'conn'
      field.
      
      This change tries to address the issue by introducing some
      synchronization between the MP_JOIN 3whs and mptcp_close
      via the join_list spinlock. If we detect that the msk is closing,
      the MP_JOIN socket is closed too.
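
The synchronization idea can be modelled in userspace as a "check state and attach" step that shares one lock with the close path. The sketch below is an assumption-laden illustration, not the MPTCP implementation: `struct parent_conn`, `parent_close()` and `parent_try_join()` are hypothetical names, and a pthread mutex stands in for the join_list spinlock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical model of a parent connection; not the MPTCP structures. */
struct parent_conn {
    pthread_mutex_t join_lock;  /* plays the role of the join_list lock */
    bool closing;               /* set by the close path under join_lock */
    int joined;                 /* number of subflows attached so far */
};

static void parent_init(struct parent_conn *p)
{
    pthread_mutex_init(&p->join_lock, NULL);
    p->closing = false;
    p->joined = 0;
}

/*
 * Close path: mark the parent as closing under the lock, so that no
 * join can be attached after this point.
 */
static void parent_close(struct parent_conn *p)
{
    pthread_mutex_lock(&p->join_lock);
    p->closing = true;
    pthread_mutex_unlock(&p->join_lock);
}

/*
 * Join path: check the state and attach within one critical section.
 * Returns false if the parent is closing; the caller is then expected
 * to close the new subflow instead of leaking it.
 */
static bool parent_try_join(struct parent_conn *p)
{
    bool ok;

    pthread_mutex_lock(&p->join_lock);
    ok = !p->closing;
    if (ok)
        p->joined++;
    pthread_mutex_unlock(&p->join_lock);
    return ok;
}
```

Because the state check and the attach happen under the same lock that the close path takes, a join either completes before the close is visible or observes `closing` and backs out.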
      
      Fixes: f296234c ("mptcp: Add handling of incoming MP_JOIN requests")
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      10f6d46c
    • mptcp: fix unblocking connect() · 41be81a8
      Paolo Abeni authored
      Currently non-blocking connect() on MPTCP sockets fails frequently.
      If mptcp_stream_connect() is invoked to complete a previously
      attempted non-blocking connection, it will still try to create
      the first subflow via __mptcp_socket_create(). If the 3whs is
      completed and the 'can_ack' flag is already set, the latter
      will fail with -EINVAL.

      This change addresses the issue by checking for a pending connect and
      delegating the completion to the first subflow. Additionally,
      perform msk address and sk_state changes only when needed.
      
      Fixes: 2303f994 ("mptcp: Associate MPTCP context with TCP socket")
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      41be81a8
    • net/sched: act_ct: add nat mangle action only for NAT-conntrack · 05aa69e5
      wenxu authored
      Currently the nat mangle action is added by comparing the inverted and
      original tuples. It is better to check the IPS_NAT_MASK flags first, to
      avoid an unnecessary memcmp for non-NAT conntrack entries.
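
A rough sketch of the "cheap flag test before the expensive comparison" pattern follows. The types and the `NAT_MASK` value are simplified assumptions for illustration; the real conntrack structures and the kernel's IPS_NAT_MASK differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Illustrative flag bits only; not the kernel's IPS_NAT_MASK value. */
#define NAT_MASK 0x3u

/* Hypothetical, simplified tuple: two fields, no padding. */
struct tuple {
    unsigned int src;
    unsigned int dst;
};

struct conn {
    unsigned int status;          /* status flags, including NAT bits */
    struct tuple orig;            /* original direction */
    struct tuple reply_inverted;  /* reply direction, already inverted */
};

/* Returns true only when a NAT rewrite action would be needed. */
static bool needs_nat_mangle(const struct conn *ct)
{
    /* Cheap test first: connections without NAT never need mangling. */
    if (!(ct->status & NAT_MASK))
        return false;

    /* Only NAT connections pay for the tuple comparison. */
    return memcmp(&ct->orig, &ct->reply_inverted, sizeof(ct->orig)) != 0;
}
```

For non-NAT entries the function returns after a single bit test, which is the saving the commit message is after.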
      Signed-off-by: wenxu <wenxu@ucloud.cn>
      Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      05aa69e5
    • devinet: fix memleak in inetdev_init() · 1b49cd71
      Yang Yingliang authored
      When devinet_sysctl_register() fails, the memory allocated
      in neigh_parms_alloc() should be freed.
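
The general shape of such a fix is an error path that unwinds every earlier allocation. The sketch below shows the common goto-based unwind idiom in userspace C; it is not the inetdev_init() code, and `params_alloc()`, `params_release()`, `sysctl_register()` and `device_init()` are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the inetdev_init() code. */
struct params { int dummy; };
struct device { struct params *parms; };

static int live_params;   /* outstanding parameter blocks, for demonstration */

static struct params *params_alloc(void)
{
    struct params *p = calloc(1, sizeof(*p));

    if (p)
        live_params++;
    return p;
}

static void params_release(struct params *p)
{
    free(p);
    live_params--;
}

/* Stand-in for a registration step that can fail. */
static int sysctl_register(int should_fail)
{
    return should_fail ? -1 : 0;
}

/* Returns a fully initialized device, or NULL with nothing leaked. */
static struct device *device_init(int fail_register)
{
    struct device *dev = calloc(1, sizeof(*dev));

    if (!dev)
        return NULL;

    dev->parms = params_alloc();
    if (!dev->parms)
        goto err_free_dev;

    if (sysctl_register(fail_register) < 0)
        goto err_release_parms;   /* the step a leaky version would skip */

    return dev;

err_release_parms:
    params_release(dev->parms);
err_free_dev:
    free(dev);
    return NULL;
}

static void device_destroy(struct device *dev)
{
    params_release(dev->parms);
    free(dev);
}
```

The labels are ordered so that each failure point releases exactly what was acquired before it, in reverse order.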
      
      Fixes: 20e61da7 ("ipv4: fail early when creating netdev named all or default")
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1b49cd71
    • virtio_vsock: Fix race condition in virtio_transport_recv_pkt · 8692cefc
      Jia He authored
      When a client on the host tries to connect(SOCK_STREAM, O_NONBLOCK) to the
      server on the guest, the kernel panics on a ThunderX2 (armv8a server):
      
      [  463.718844] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
      [  463.718848] Mem abort info:
      [  463.718849]   ESR = 0x96000044
      [  463.718852]   EC = 0x25: DABT (current EL), IL = 32 bits
      [  463.718853]   SET = 0, FnV = 0
      [  463.718854]   EA = 0, S1PTW = 0
      [  463.718855] Data abort info:
      [  463.718856]   ISV = 0, ISS = 0x00000044
      [  463.718857]   CM = 0, WnR = 1
      [  463.718859] user pgtable: 4k pages, 48-bit VAs, pgdp=0000008f6f6e9000
      [  463.718861] [0000000000000000] pgd=0000000000000000
      [  463.718866] Internal error: Oops: 96000044 [#1] SMP
      [...]
      [  463.718977] CPU: 213 PID: 5040 Comm: vhost-5032 Tainted: G           O      5.7.0-rc7+ #139
      [  463.718980] Hardware name: GIGABYTE R281-T91-00/MT91-FS1-00, BIOS F06 09/25/2018
      [  463.718982] pstate: 60400009 (nZCv daif +PAN -UAO)
      [  463.718995] pc : virtio_transport_recv_pkt+0x4c8/0xd40 [vmw_vsock_virtio_transport_common]
      [  463.718999] lr : virtio_transport_recv_pkt+0x1fc/0xd40 [vmw_vsock_virtio_transport_common]
      [  463.719000] sp : ffff80002dbe3c40
      [...]
      [  463.719025] Call trace:
      [  463.719030]  virtio_transport_recv_pkt+0x4c8/0xd40 [vmw_vsock_virtio_transport_common]
      [  463.719034]  vhost_vsock_handle_tx_kick+0x360/0x408 [vhost_vsock]
      [  463.719041]  vhost_worker+0x100/0x1a0 [vhost]
      [  463.719048]  kthread+0x128/0x130
      [  463.719052]  ret_from_fork+0x10/0x18
      
      The race condition is as follows:
      Task1                                Task2
      =====                                =====
      __sock_release                       virtio_transport_recv_pkt
        __vsock_release                      vsock_find_bound_socket (found sk)
          lock_sock_nested
          vsock_remove_sock
          sock_orphan
            sk_set_socket(sk, NULL)
          sk->sk_shutdown = SHUTDOWN_MASK
          ...
          release_sock
                                          lock_sock
                                             virtio_transport_recv_connecting
                                               sk->sk_socket->state (panic!)
      
      The root cause is that vsock_find_bound_socket can't hold the lock_sock,
      so there is a small race window between vsock_find_bound_socket() and
      lock_sock(). If __vsock_release() is running in another task,
      sk->sk_socket will be set to NULL inadvertently.
      
      This fixes it by checking sk->sk_shutdown (suggested by Stefano) after
      lock_sock(), since sk->sk_shutdown is set to SHUTDOWN_MASK under the
      protection of lock_sock_nested().
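
The "re-check state after taking the lock" pattern can be sketched in userspace as follows. This is a simplified model under stated assumptions, not the vsock code: `struct sock_model`, `sock_release_model()` and `recv_pkt_model()` are hypothetical, a pthread mutex stands in for lock_sock(), and `SHUTDOWN_ALL` is an illustrative constant.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define SHUTDOWN_ALL 3   /* illustrative: both directions shut down */

/* Hypothetical model types; not the kernel's struct sock/struct socket. */
struct endpoint {
    int state;
};

struct sock_model {
    pthread_mutex_t lock;     /* stands in for lock_sock() */
    int shutdown;             /* written only while holding lock */
    struct endpoint *owner;   /* cleared on release, like sk->sk_socket */
};

/* Release path: orphan the socket and mark it shut down, under the lock. */
static void sock_release_model(struct sock_model *sk)
{
    pthread_mutex_lock(&sk->lock);
    sk->owner = NULL;
    sk->shutdown = SHUTDOWN_ALL;
    pthread_mutex_unlock(&sk->lock);
}

/*
 * Receive path: the socket was looked up without the lock, so it may
 * have been released in the meantime. Re-check under the lock before
 * touching sk->owner. Returns the owner's state, or -1 if released.
 */
static int recv_pkt_model(struct sock_model *sk)
{
    int ret;

    pthread_mutex_lock(&sk->lock);
    if (sk->shutdown == SHUTDOWN_ALL)
        ret = -1;                   /* released: drop the packet */
    else
        ret = sk->owner->state;     /* safe: owner is still set */
    pthread_mutex_unlock(&sk->lock);
    return ret;
}
```

The check is reliable only because the flag and the pointer are both written inside the same critical section that the reader later enters.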
      Signed-off-by: Jia He <justin.he@arm.com>
      Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8692cefc
  3. 30 May, 2020 1 commit