1. 05 Mar, 2019 5 commits
    • Linus Torvalds's avatar
      Merge tag 'mmc-v5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc · 42eaf185
      Linus Torvalds authored
      Pull MMC updates from Ulf Hansson:
       "MMC core:
         - Fixup max_discard/trim calculations
         - Announce SD specs greater than 4.0
         - Add discard support for SD cards
         - Don't do retries for CMD6 (SWITCH command)
         - Various cleanups and re-structuring
      
        MMC host:
         - cqhci:
            * Add maintainers for eMMC CQHCI driver
         - sdhci:
            * Consolidate WP GPIO code
            * Add ADMA3 DMA support for V4 enabled host
            * Fixup card detect support in pci-o2micro driver
            * Add support for CMDQ and SDMMC pads auto-calibration in tegra
              driver
            * Add DCMD support and CMDQ support, support for i.MX6ULL variant,
              fixup HS400 timing issue and add HS400_ES support for i.MX8QXP
              to esdhc-imx driver
            * Avoid CRC errors by adjusting settings to speed mode and fixup
              card initialization for high speed mode in renesas_sdhi
            * Fixup timeout settings for omap
            * Enable 8 bits bus-width support in atmel-mci
            * Convert some legacy code in jz4740 driver to use modern APIs
            * Send a CMD12 to clear DPSM at errors for STM32 sdmmc mmci
              driver"
      
      * tag 'mmc-v5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: (69 commits)
        mmc:fix a bug when max_discard is 0
        mmc: core: Add a debug print when the card may have been replaced
        mmc: core: Add sd discard timeout
        mmc: core: Add discard support to sd
        mmc: sdhci-esdhc-imx: clear the HALT bit when enable CQE
        mmc: core: do not retry CMD6 in __mmc_switch()
        mmc: core: Convert mmc_align_data_size() into an SDIO specific function
        mmc: core: Move mmc_of_parse_voltage() to host.c
        mmc: core: Convert mmc_regulator_get_ocrmask() to static
        mmc: core: Move regulator helpers to separate file
        mmc: of_mmc_spi: Convert to mmc_of_parse_voltage()
        mmc: core: Drop retries as in-parameter to mmc_wait_for_app_cmd()
        mmc: core: Convert mmc_wait_for_app_cmd() to static
        mmc: renesas_sdhi: Change HW adjustment register according to speed mode
        mmc: mmci: Send a CMD12 to clear the DPSM at errors
        mmc: sdhci-xenon: Fixup already marked switch fall-through
        mmc: sdhci-tegra: drop ->get_ro() implementation
        mmc: sdhci-omap: drop ->get_ro() implementation
        mmc: sdhci: use WP GPIO in sdhci_check_ro()
        mmc: wmt-sdmmc: Drop unused include
        ...
      42eaf185
    • Linus Torvalds's avatar
      Merge tag 'i3c/for-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux · c8d950ab
      Linus Torvalds authored
      Pull i3c updates from Boris Brezillon:
      
       - Add a /* fall-through */ comment in the dw-i3c-master driver
      
       - Update the I3C entries in MAINTAINERS to add an IRC chan
      
      * tag 'i3c/for-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux:
        i3c: master: dw-i3c-master: mark expected switch fall-through
        MAINTAINERS: Add an IRC channel for the I3C subsystem
      c8d950ab
    • Linus Torvalds's avatar
      Merge tag 'mtd/for-5.1' of git://git.infradead.org/linux-mtd · 811c16a2
      Linus Torvalds authored
      Pull MTD updates from Boris Brezillon:
       "Core MTD changes:
         - Use struct_size() where appropriate
         - mtd_{read,write}() as wrappers around mtd_{read,write}_oob()
         - Fix misuse of PTR_ERR() in docg3
         - Coding style improvements in mtdcore.c
      
        SPI NOR changes:
          Core changes:
           - Add support of octal mode I/O transfer
           - Add a bunch of SPI NOR entries to the flash_info table
      
          SPI NOR controller driver changes:
           - cadence-quadspi:
              * Add support for Octal SPI controller
              * write upto 8-bytes data in STIG mode
           - mtk-quadspi:
              * rename config to a common one
              * add SNOR_HWCAPS_READ to spi_nor_hwcaps mask
           - Add Tudor as SPI-NOR co-maintainer
      
        NAND changes:
          NAND core changes:
           - Fourth batch of fixes/cleanup to the raw NAND core impacting
             various controller drivers (Sunxi, Marvell, MTK, TMIO, OMAP2).
           - Check the return code of nand_reset() and nand_readid_op().
           - Remove ->legacy.erase and single_erase().
           - Simplify the locking.
           - Several implicit fall through annotations.
      
          Raw NAND controllers drivers changes:
           - Fix various possible object reference leaks (MTK, JZ4780, Atmel)
           - ST:
              * Add support for STM32 FMC2 NAND flash controller
           - Meson:
              * Add support for Amlogic NAND flash controller
           - Denali:
              * Several cleanup patches
           - Sunxi:
              * Several cleanup patches
           - FSMC:
              * Disable NAND on remove()
              * Reset NAND timings on resume()
      
          SPI-NAND drivers changes:
           - Toshiba:
              * Add support for all Toshiba products.
           - Macronix:
              * Fix ECC status read.
           - Gigadevice:
              * Add support for GD5F1GQ4UExxG"
      
      * tag 'mtd/for-5.1' of git://git.infradead.org/linux-mtd: (64 commits)
        mtd: spi-nor: Fix wrong abbreviation HWCPAS
        mtd: spi-nor: cadence-quadspi: fix spelling mistake: "Couldnt't" -> "Couldn't"
        mtd: spi-nor: Add support for en25qh64
        mtd: spi-nor: Add support for MX25V8035F
        mtd: spi-nor: Add support for EN25Q80A
        mtd: spi-nor: cadence-quadspi: Add support for Octal SPI controller
        dt-bindings: cadence-quadspi: Add new compatible for AM654 SoC
        mtd: spi-nor: split s25fl128s into s25fl128s0 and s25fl128s1
        mtd: spi-nor: cadence-quadspi: write upto 8-bytes data in STIG mode
        mtd: spi-nor: Add support for mx25u3235f
        mtd: rawnand: denali_dt: remove single anonymous clock support
        mtd: rawnand: mtk: fix possible object reference leak
        mtd: rawnand: jz4780: fix possible object reference leak
        mtd: rawnand: atmel: fix possible object reference leak
        mtd: rawnand: fsmc: Disable NAND on remove()
        mtd: rawnand: fsmc: Reset NAND timings on resume()
        mtd: spinand: Add support for GigaDevice GD5F1GQ4UExxG
        mtd: rawnand: denali: remove unused dma_addr field from denali_nand_info
        mtd: rawnand: denali: remove unused function argument 'raw'
        mtd: rawnand: denali: remove unneeded denali_reset_irq() call
        ...
      811c16a2
    • Linus Torvalds's avatar
      Merge tag 'vfio-v5.1-rc1' of git://github.com/awilliam/linux-vfio · a83b0423
      Linus Torvalds authored
      Pull VFIO updates from Alex Williamson:
      
       - Switch mdev to generic UUID API (Andy Shevchenko)
      
       - Fixup platform reset include paths (Masahiro Yamada)
      
       - Fix usage of MINORMASK (Chengguang Xu)
      
       - Remove noise from duplicate spapr table unsets (Alexey Kardashevskiy)
      
       - Restore device state after PM reset (Alex Williamson)
      
       - Ensure memory translation enabled for PCI ROM access (Eric Auger)
      
      * tag 'vfio-v5.1-rc1' of git://github.com/awilliam/linux-vfio:
        vfio_pci: Enable memory accesses before calling pci_map_rom
        vfio/pci: Restore device state on PM transition
        vfio/spapr_tce: Skip unsetting already unset table
        samples/vfio-mdev/mtty: expand minor range when registering chrdev region
        samples/vfio-mdev/mdpy: expand minor range when registering chrdev region
        samples/vfio-mdev/mbochs: expand minor range when registering chrdev region
        vfio: expand minor range when registering chrdev region
        vfio: platform: reset: fix up include directives to remove ccflags-y
        vfio-mdev: Switch to use new generic UUID API
      a83b0423
    • Slavomir Kaslev's avatar
      fs: Make splice() and tee() take into account O_NONBLOCK flag on pipes · ee5e0011
      Slavomir Kaslev authored
      The current implementation of splice() and tee() ignores O_NONBLOCK set
      on pipe file descriptors and checks only the SPLICE_F_NONBLOCK flag for
      blocking on pipe arguments.  This is inconsistent since splice()-ing
      from/to non-pipe file descriptors does take O_NONBLOCK into
      consideration.
      
      Fix this by promoting O_NONBLOCK, when set on a pipe, to
      SPLICE_F_NONBLOCK.
      
      Some context for how the current implementation of splice() leads to
      inconsistent behavior.  In the ongoing work[1] to add VM tracing
      capability to trace-cmd we stream tracing data over named FIFOs or
      vsockets from guests back to the host.
      
      When we receive SIGINT from user to stop tracing, we set O_NONBLOCK on
      the input file descriptor and set SPLICE_F_NONBLOCK for the next call to
      splice().  If splice() was blocked waiting on data from the input FIFO,
      after SIGINT splice() restarts with the same arguments (no
      SPLICE_F_NONBLOCK) and blocks again instead of returning -EAGAIN when no
      data is available.
      
      This differs from the splice() behavior when reading from a vsocket or
      when we're doing a traditional read()/write() loop (trace-cmd's
      --nosplice argument).
      
      With this patch applied we get the same behavior in all situations after
      setting O_NONBLOCK which also matches the behavior of doing a
      read()/write() loop instead of splice().
      
      This change does have potential of breaking users who don't expect
      EAGAIN from splice() when SPLICE_F_NONBLOCK is not set.  OTOH programs
      that set O_NONBLOCK and don't anticipate EAGAIN are arguably buggy[2].
      
       [1] https://github.com/skaslev/trace-cmd/tree/vsock
       [2] https://github.com/torvalds/linux/blob/d47e3da1759230e394096fd742aad423c291ba48/fs/read_write.c#L1425Signed-off-by: default avatarSlavomir Kaslev <kaslevs@vmware.com>
      Reviewed-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      ee5e0011
  2. 04 Mar, 2019 4 commits
    • Linus Torvalds's avatar
      Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 4f9020ff
      Linus Torvalds authored
      Pull vfs fixes from Al Viro:
       "Assorted fixes that sat in -next for a while, all over the place"
      
      * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        aio: Fix locking in aio_poll()
        exec: Fix mem leak in kernel_read_file
        copy_mount_string: Limit string length to PATH_MAX
        cgroup: saner refcounting for cgroup_root
        fix cgroup_do_mount() handling of failure exits
      4f9020ff
    • Linus Torvalds's avatar
      get rid of legacy 'get_ds()' function · 736706be
      Linus Torvalds authored
      Every in-kernel use of this function defined it to KERNEL_DS (either as
      an actual define, or as an inline function).  It's an entirely
      historical artifact, and long long long ago used to actually read the
      segment selector valueof '%ds' on x86.
      
      Which in the kernel is always KERNEL_DS.
      
      Inspired by a patch from Jann Horn that just did this for a very small
      subset of users (the ones in fs/), along with Al who suggested a script.
      I then just took it to the logical extreme and removed all the remaining
      gunk.
      
      Roughly scripted with
      
         git grep -l '(get_ds())' -- :^tools/ | xargs sed -i 's/(get_ds())/(KERNEL_DS)/'
         git grep -lw 'get_ds' -- :^tools/ | xargs sed -i '/^#define get_ds()/d'
      
      plus manual fixups to remove a few unusual usage patterns, the couple of
      inline function cases and to fix up a comment that had become stale.
      
      The 'get_ds()' function remains in an x86 kvm selftest, since in user
      space it actually does something relevant.
      Inspired-by: default avatarJann Horn <jannh@google.com>
      Inspired-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      736706be
    • Linus Torvalds's avatar
      aio: simplify - and fix - fget/fput for io_submit() · 84c4e1f8
      Linus Torvalds authored
      Al Viro root-caused a race where the IOCB_CMD_POLL handling of
      fget/fput() could cause us to access the file pointer after it had
      already been freed:
      
       "In more details - normally IOCB_CMD_POLL handling looks so:
      
         1) io_submit(2) allocates aio_kiocb instance and passes it to
            aio_poll()
      
         2) aio_poll() resolves the descriptor to struct file by req->file =
            fget(iocb->aio_fildes)
      
         3) aio_poll() sets ->woken to false and raises ->ki_refcnt of that
            aio_kiocb to 2 (bumps by 1, that is).
      
         4) aio_poll() calls vfs_poll(). After sanity checks (basically,
            "poll_wait() had been called and only once") it locks the queue.
            That's what the extra reference to iocb had been for - we know we
            can safely access it.
      
         5) With queue locked, we check if ->woken has already been set to
            true (by aio_poll_wake()) and, if it had been, we unlock the
            queue, drop a reference to aio_kiocb and bugger off - at that
            point it's a responsibility to aio_poll_wake() and the stuff
            called/scheduled by it. That code will drop the reference to file
            in req->file, along with the other reference to our aio_kiocb.
      
         6) otherwise, we see whether we need to wait. If we do, we unlock the
            queue, drop one reference to aio_kiocb and go away - eventual
            wakeup (or cancel) will deal with the reference to file and with
            the other reference to aio_kiocb
      
         7) otherwise we remove ourselves from waitqueue (still under the
            queue lock), so that wakeup won't get us. No async activity will
            be happening, so we can safely drop req->file and iocb ourselves.
      
        If wakeup happens while we are in vfs_poll(), we are fine - aio_kiocb
        won't get freed under us, so we can do all the checks and locking
        safely. And we don't touch ->file if we detect that case.
      
        However, vfs_poll() most certainly *does* touch the file it had been
        given. So wakeup coming while we are still in ->poll() might end up
        doing fput() on that file. That case is not too rare, and usually we
        are saved by the still present reference from descriptor table - that
        fput() is not the final one.
      
        But if another thread closes that descriptor right after our fget()
        and wakeup does happen before ->poll() returns, we are in trouble -
        final fput() done while we are in the middle of a method:
      
      Al also wrote a patch to take an extra reference to the file descriptor
      to fix this, but I instead suggested we just streamline the whole file
      pointer handling by submit_io() so that the generic aio submission code
      simply keeps the file pointer around until the aio has completed.
      
      Fixes: bfe4037e ("aio: implement IOCB_CMD_POLL")
      Acked-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Reported-by: syzbot+503d4cc169fcec1cb18c@syzkaller.appspotmail.com
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      84c4e1f8
    • Linus Torvalds's avatar
      x86-64: add warning for non-canonical user access address dereferences · 00c42373
      Linus Torvalds authored
      This adds a warning (once) for any kernel dereference that has a user
      exception handler, but accesses a non-canonical address.  It basically
      is a simpler - and more limited - version of commit 9da3f2b7
      ("x86/fault: BUG() when uaccess helpers fault on kernel addresses") that
      got reverted.
      
      Note that unlike that original commit, this only causes a warning,
      because there are real situations where we currently can do this
      (notably speculative argument fetching for uprobes etc).  Also, unlike
      that original commit, this _only_ triggers for #GP accesses, so the
      cases of valid kernel pointers that cross into a non-mapped page aren't
      affected.
      
      The intent of this is two-fold:
      
       - the uprobe/tracing accesses really do need to be more careful. In
         particular, from a portability standpoint it's just wrong to think
         that "a pointer is a pointer", and use the same logic for any random
         pointer value you find on the stack. It may _work_ on x86-64, but it
         doesn't necessarily work on other architectures (where the same
         pointer value can be either a kernel pointer _or_ a user pointer, and
         you really need to be much more careful in how you try to access it)
      
         The warning can hopefully end up being a reminder that just any
         random pointer access won't do.
      
       - Kees in particular wanted a way to actually report invalid uses of
         wild pointers to user space accessors, instead of just silently
         failing them. Automated fuzzers want a way to get reports if the
         kernel ever uses invalid values that the fuzzer fed it.
      
         The non-canonical address range is a fair chunk of the address space,
         and with this you can teach syzkaller to feed in invalid pointer
         values and find cases where we do not properly validate user
         addresses (possibly due to bad uses of "set_fs()").
      Acked-by: default avatarKees Cook <keescook@chromium.org>
      Cc: Jann Horn <jannh@google.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      00c42373
  3. 03 Mar, 2019 2 commits
  4. 02 Mar, 2019 11 commits
    • Linus Torvalds's avatar
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e7c42a89
      Linus Torvalds authored
      Pull x86 fixes from Thomas Gleixner:
       "Two last minute fixes:
      
         - Prevent value evaluation via functions happening in the user access
           enabled region of __put_user() (put another way: make sure to
           evaluate the value to be stored in user space _before_ enabling
           user space accesses)
      
         - Correct the definition of a Hyper-V hypercall constant"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/hyper-v: Fix definition of HV_MAX_FLUSH_REP_COUNT
        x86/uaccess: Don't leak the AC flag into __put_user() value evaluation
      e7c42a89
    • Linus Torvalds's avatar
      Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · df49fd0f
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "Nine small fixes.
      
        The resume fix is a cosmetic removal of a warning with an incorrect
        condition causing it to alarm people wrongly.
      
        The other eight patches correct a thinko in Christoph Hellwig's DMA
        conversion series. Without it all these drivers end up with 32 bit DMA
        masks meaning they bounce any page over 4GB before sending it to the
        controller.
      
        Nowadays, even laptops mostly have memory above 4GB, so this can lead
        to significant performance degradation with all the bouncing"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: core: Avoid that system resume triggers a kernel warning
        scsi: hptiop: fix calls to dma_set_mask()
        scsi: hisi_sas: fix calls to dma_set_mask_and_coherent()
        scsi: csiostor: fix calls to dma_set_mask_and_coherent()
        scsi: bfa: fix calls to dma_set_mask_and_coherent()
        scsi: aic94xx: fix calls to dma_set_mask_and_coherent()
        scsi: 3w-sas: fix calls to dma_set_mask_and_coherent()
        scsi: 3w-9xxx: fix calls to dma_set_mask_and_coherent()
        scsi: lpfc: fix calls to dma_set_mask_and_coherent()
      df49fd0f
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · c93d9218
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Fix refcount leak in act_ipt during replace, from Davide Caratti.
      
       2) Set task state properly in tun during blocking reads, from Timur
          Celik.
      
       3) Leaked reference in DSA, from Wen Yang.
      
       4) NULL deref in act_tunnel_key, from Vlad Buslov.
      
       5) cipso_v4_erro can reference the skb IPCB in inappropriate contexts
          thus referencing garbage, from Nazarov Sergey.
      
       6) Don't accept RTA_VIA and RTA_GATEWAY in contexts where those
          attributes make no sense.
      
       7) Fix hung sendto in tipc, from Tung Nguyen.
      
       8) Out-of-bounds access in netlabel, from Paul Moore.
      
       9) Grant reference leak in xen-netback, from Igor Druzhinin.
      
      10) Fix tx stalls with lan743x, from Bryan Whitehead.
      
      11) Fix interrupt storm with mv88e6xxx, from Hein Kallweit.
      
      12) Memory leak in sit on device registry failure, from Mao Wenan.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (44 commits)
        net: sit: fix memory leak in sit_init_net()
        net: dsa: mv88e6xxx: Fix statistics on mv88e6161
        geneve: correctly handle ipv6.disable module parameter
        net: dsa: mv88e6xxx: prevent interrupt storm caused by mv88e6390x_port_set_cmode
        bpf: fix sanitation rewrite in case of non-pointers
        ipv4: Add ICMPv6 support when parse route ipproto
        MIPS: eBPF: Fix icache flush end address
        lan743x: Fix TX Stall Issue
        net: phy: phylink: fix uninitialized variable in phylink_get_mac_state
        net: aquantia: regression on cpus with high cores: set mode with 8 queues
        selftests: fixes for UDP GRO
        bpf: drop refcount if bpf_map_new_fd() fails in map_create()
        net: dsa: mv88e6xxx: power serdes on/off for 10G interfaces on 6390X
        net: dsa: mv88e6xxx: Fix u64 statistics
        xen-netback: don't populate the hash cache on XenBus disconnect
        xen-netback: fix occasional leak of grant ref mappings under memory pressure
        sctp: chunk.c: correct format string for size_t in printk
        net: netem: fix skb length BUG_ON in __skb_to_sgvec
        netlabel: fix out-of-bounds memory accesses
        ipv4: Pass original device to ip_rcv_finish_core
        ...
      c93d9218
    • Linus Torvalds's avatar
      Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · fa3294c5
      Linus Torvalds authored
      Pull more crypto fixes from Herbert Xu:
       "This fixes a couple of issues in arm64/chacha that was introduced in
        5.0"
      
      * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
        crypto: arm64/chacha - fix hchacha_block_neon() for big endian
        crypto: arm64/chacha - fix chacha_4block_xor_neon() for big endian
      fa3294c5
    • Mao Wenan's avatar
      net: sit: fix memory leak in sit_init_net() · 07f12b26
      Mao Wenan authored
      If register_netdev() is failed to register sitn->fb_tunnel_dev,
      it will go to err_reg_dev and forget to free netdev(sitn->fb_tunnel_dev).
      
      BUG: memory leak
      unreferenced object 0xffff888378daad00 (size 512):
        comm "syz-executor.1", pid 4006, jiffies 4295121142 (age 16.115s)
        hex dump (first 32 bytes):
          00 e6 ed c0 83 88 ff ff 00 00 00 00 00 00 00 00  ................
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      backtrace:
          [<00000000d6dcb63e>] kvmalloc include/linux/mm.h:577 [inline]
          [<00000000d6dcb63e>] kvzalloc include/linux/mm.h:585 [inline]
          [<00000000d6dcb63e>] netif_alloc_netdev_queues net/core/dev.c:8380 [inline]
          [<00000000d6dcb63e>] alloc_netdev_mqs+0x600/0xcc0 net/core/dev.c:8970
          [<00000000867e172f>] sit_init_net+0x295/0xa40 net/ipv6/sit.c:1848
          [<00000000871019fa>] ops_init+0xad/0x3e0 net/core/net_namespace.c:129
          [<00000000319507f6>] setup_net+0x2ba/0x690 net/core/net_namespace.c:314
          [<0000000087db4f96>] copy_net_ns+0x1dc/0x330 net/core/net_namespace.c:437
          [<0000000057efc651>] create_new_namespaces+0x382/0x730 kernel/nsproxy.c:107
          [<00000000676f83de>] copy_namespaces+0x2ed/0x3d0 kernel/nsproxy.c:165
          [<0000000030b74bac>] copy_process.part.27+0x231e/0x6db0 kernel/fork.c:1919
          [<00000000fff78746>] copy_process kernel/fork.c:1713 [inline]
          [<00000000fff78746>] _do_fork+0x1bc/0xe90 kernel/fork.c:2224
          [<000000001c2e0d1c>] do_syscall_64+0xc8/0x580 arch/x86/entry/common.c:290
          [<00000000ec48bd44>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
          [<0000000039acff8a>] 0xffffffffffffffff
      Signed-off-by: default avatarMao Wenan <maowenan@huawei.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      07f12b26
    • Andrew Lunn's avatar
      net: dsa: mv88e6xxx: Fix statistics on mv88e6161 · a6da21bb
      Andrew Lunn authored
      Despite what the datesheet says, the silicon implements the older way
      of snapshoting the statistics. Change the op.
      
      Reported-by: Chris.Healy@zii.aero
      Tested-by: Chris.Healy@zii.aero
      Fixes: 0ac64c39 ("net: dsa: mv88e6xxx: mv88e6161 uses mv88e6320 stats snapshot")
      Signed-off-by: default avatarAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a6da21bb
    • Jiri Benc's avatar
      geneve: correctly handle ipv6.disable module parameter · cf1c9ccb
      Jiri Benc authored
      When IPv6 is compiled but disabled at runtime, geneve_sock_add returns
      -EAFNOSUPPORT. For metadata based tunnels, this causes failure of the whole
      operation of bringing up the tunnel.
      
      Ignore failure of IPv6 socket creation for metadata based tunnels caused by
      IPv6 not being available.
      
      This is the same fix as what commit d074bf96 ("vxlan: correctly handle
      ipv6.disable module parameter") is doing for vxlan.
      
      Note there's also commit c0a47e44 ("geneve: should not call rt6_lookup()
      when ipv6 was disabled") which fixes a similar issue but for regular
      tunnels, while this patch is needed for metadata based tunnels.
      Signed-off-by: default avatarJiri Benc <jbenc@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      cf1c9ccb
    • David S. Miller's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · f08d6114
      David S. Miller authored
      Alexei Starovoitov says:
      
      ====================
      pull-request: bpf 2019-03-01
      
      The following pull-request contains BPF updates for your *net* tree.
      
      The main changes are:
      
      1) fix sanitation rewrite, from Daniel.
      
      2) fix error path on map_new_fd, from Peng.
      
      3) fix icache flush address, from Paul.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f08d6114
    • Heiner Kallweit's avatar
      net: dsa: mv88e6xxx: prevent interrupt storm caused by mv88e6390x_port_set_cmode · ed8fe202
      Heiner Kallweit authored
      When debugging another issue I faced an interrupt storm in this
      driver (88E6390, port 9 in SGMII mode), consisting of alternating
      link-up / link-down interrupts. Analysis showed that the driver
      wanted to set a cmode that was set already. But so far
      mv88e6390x_port_set_cmode() doesn't check this and powers down
      SERDES, what causes the link to break, and eventually results in
      the described interrupt storm.
      
      Fix this by checking whether the cmode actually changes. We want
      that the very first call to mv88e6390x_port_set_cmode() always
      configures the registers, therefore initialize port.cmode with
      a value that is different from any supported cmode value.
      We have to take care that we only init the ports cmode once
      chip->info->num_ports is set.
      
      v2:
      - add small helper and init the number of actual ports only
      
      Fixes: 364e9d77 ("net: dsa: mv88e6xxx: Power on/off SERDES on cmode change")
      Signed-off-by: default avatarHeiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ed8fe202
    • Daniel Borkmann's avatar
      bpf: fix sanitation rewrite in case of non-pointers · 3612af78
      Daniel Borkmann authored
      Marek reported that he saw an issue with the below snippet in that
      timing measurements where off when loaded as unpriv while results
      were reasonable when loaded as privileged:
      
          [...]
          uint64_t a = bpf_ktime_get_ns();
          uint64_t b = bpf_ktime_get_ns();
          uint64_t delta = b - a;
          if ((int64_t)delta > 0) {
          [...]
      
      Turns out there is a bug where a corner case is missing in the fix
      d3bd7413 ("bpf: fix sanitation of alu op with pointer / scalar
      type from different paths"), namely fixup_bpf_calls() only checks
      whether aux has a non-zero alu_state, but it also needs to test for
      the case of BPF_ALU_NON_POINTER since in both occasions we need to
      skip the masking rewrite (as there is nothing to mask).
      
      Fixes: d3bd7413 ("bpf: fix sanitation of alu op with pointer / scalar type from different paths")
      Reported-by: default avatarMarek Majkowski <marek@cloudflare.com>
      Reported-by: default avatarArthur Fabre <afabre@cloudflare.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/netdev/CAJPywTJqP34cK20iLM5YmUMz9KXQOdu1-+BZrGMAGgLuBWz7fg@mail.gmail.com/T/Acked-by: default avatarSong Liu <songliubraving@fb.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      3612af78
    • Hangbin Liu's avatar
      ipv4: Add ICMPv6 support when parse route ipproto · 5e1a99ea
      Hangbin Liu authored
      For ip rules, we need to use 'ipproto ipv6-icmp' to match ICMPv6 headers.
      But for ip -6 route, currently we only support tcp, udp and icmp.
      
      Add ICMPv6 support so we can match ipv6-icmp rules for route lookup.
      
      v2: As David Ahern and Sabrina Dubroca suggested, Add an argument to
      rtm_getroute_parse_ip_proto() to handle ICMP/ICMPv6 with different family.
      Reported-by: default avatarJianlin Shi <jishi@redhat.com>
      Fixes: eacb9384 ("ipv6: support sport, dport and ip_proto in RTM_GETROUTE")
      Signed-off-by: default avatarHangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      5e1a99ea
  5. 01 Mar, 2019 14 commits
    • Paul Burton's avatar
      MIPS: eBPF: Fix icache flush end address · d1a2930d
      Paul Burton authored
      The MIPS eBPF JIT calls flush_icache_range() in order to ensure the
      icache observes the code that we just wrote. Unfortunately it gets the
      end address calculation wrong due to some bad pointer arithmetic.
      
      The struct jit_ctx target field is of type pointer to u32, and as such
      adding one to it will increment the address being pointed to by 4 bytes.
      Therefore in order to find the address of the end of the code we simply
      need to add the number of 4 byte instructions emitted, but we mistakenly
      add the number of instructions multiplied by 4. This results in the call
      to flush_icache_range() operating on a memory region 4x larger than
      intended, which is always wasteful and can cause crashes if we overrun
      into an unmapped page.
      
      Fix this by correcting the pointer arithmetic to remove the bogus
      multiplication, and use braces to remove the need for a set of brackets
      whilst also making it obvious that the target field is a pointer.
      Signed-off-by: default avatarPaul Burton <paul.burton@mips.com>
      Fixes: b6bd53f9 ("MIPS: Add missing file for eBPF JIT.")
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Yonghong Song <yhs@fb.com>
      Cc: netdev@vger.kernel.org
      Cc: bpf@vger.kernel.org
      Cc: linux-mips@vger.kernel.org
      Cc: stable@vger.kernel.org # v4.13+
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      d1a2930d
    • Bryan Whitehead's avatar
      lan743x: Fix TX Stall Issue · 90490ef7
      Bryan Whitehead authored
      It has been observed that tx queue stalls while downloading
      from certain web sites (example www.speedtest.net)
      
      The cause has been tracked down to a corner case where
      dma descriptors where not setup properly. And there for a tx
      completion interrupt was not signaled.
      
      This fix corrects the problem by properly marking the end of
      a multi descriptor transmission.
      
      Fixes: 23f0703c ("lan743x: Add main source files for new lan743x driver")
      Signed-off-by: default avatarBryan Whitehead <Bryan.Whitehead@microchip.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      90490ef7
    • Heiner Kallweit's avatar
      net: phy: phylink: fix uninitialized variable in phylink_get_mac_state · d25ed413
      Heiner Kallweit authored
      When debugging an issue I found implausible values in state->pause.
      Reason in that state->pause isn't initialized and later only single
      bits are changed. Also the struct itself isn't initialized in
      phylink_resolve(). So better initialize state->pause and other
      not yet initialized fields.
      
      v2:
      - use right function name in subject
      v3:
      - initialize additional fields
      
      Fixes: 9525ae83 ("phylink: add phylink infrastructure")
      Signed-off-by: default avatarHeiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d25ed413
    • Dmitry Bogdanov's avatar
      net: aquantia: regression on cpus with high cores: set mode with 8 queues · 15f3ddf5
      Dmitry Bogdanov authored
      Recently the maximum number of queues was increased up to 8, but
      NIC was not fully configured for 8 queues. In setups with more than 4 CPU
      cores parts of TX traffic gets lost if the kernel routes it to queues 4th-8th.
      
      This patch sets a tx hw traffic mode with 8 queues.
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=202651
      
      Fixes: 71a963cf ("net: aquantia: increase max number of hw queues")
      Reported-by: default avatarNicholas Johnson <nicholas.johnson@outlook.com.au>
      Signed-off-by: default avatarDmitry Bogdanov <dmitry.bogdanov@aquantia.com>
      Signed-off-by: default avatarIgor Russkikh <igor.russkikh@aquantia.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      15f3ddf5
    • Paolo Abeni's avatar
      selftests: fixes for UDP GRO · ada641ff
      Paolo Abeni authored
      The current implementation for UDP GRO tests is racy: the receiver
      may flush the RX queue while the sending is still transmitting and
      incorrectly report RX errors, with a wrong number of packet received.
      
      Add explicit timeouts to the receiver for both connection activation
      (first packet received for UDP) and reception completion, so that
      in the above critical scenario the receiver will wait for the
      transfer completion.
      
      Fixes: 3327a9c4 ("selftests: add functionals test for UDP GRO")
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Acked-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ada641ff
    • Linus Torvalds's avatar
      Merge tag 'iommu-fix-v5.0-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · a215ce8f
      Linus Torvalds authored
      Pull IOMMU fix from Joerg Roedel:
       "One important fix for a memory corruption issue in the Intel VT-d
        driver that triggers on hardware with deep PCI hierarchies"
      
      * tag 'iommu-fix-v5.0-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
        iommu/dmar: Fix buffer overflow during PCI bus notification
      a215ce8f
    • Linus Torvalds's avatar
      Merge branch 'akpm' (patches from Andrew) · 2d28e01d
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "2 fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        hugetlbfs: fix races and page leaks during migration
        kasan: turn off asan-stack for clang-8 and earlier
      2d28e01d
    • Mike Kravetz's avatar
      hugetlbfs: fix races and page leaks during migration · cb6acd01
      Mike Kravetz authored
      hugetlb pages should only be migrated if they are 'active'.  The
      routines set/clear_page_huge_active() modify the active state of hugetlb
      pages.
      
      When a new hugetlb page is allocated at fault time, set_page_huge_active
      is called before the page is locked.  Therefore, another thread could
      race and migrate the page while it is being added to page table by the
      fault code.  This race is somewhat hard to trigger, but can be seen by
      strategically adding udelay to simulate worst case scheduling behavior.
      Depending on 'how' the code races, various BUG()s could be triggered.
      
      To address this issue, simply delay the set_page_huge_active call until
      after the page is successfully added to the page table.
      
      Hugetlb pages can also be leaked at migration time if the pages are
      associated with a file in an explicitly mounted hugetlbfs filesystem.
      For example, consider a two node system with 4GB worth of huge pages
      available.  A program mmaps a 2G file in a hugetlbfs filesystem.  It
      then migrates the pages associated with the file from one node to
      another.  When the program exits, huge page counts are as follows:
      
        node0
        1024    free_hugepages
        1024    nr_hugepages
      
        node1
        0       free_hugepages
        1024    nr_hugepages
      
        Filesystem                         Size  Used Avail Use% Mounted on
        nodev                              4.0G  2.0G  2.0G  50% /var/opt/hugepool
      
      That is as expected.  2G of huge pages are taken from the free_hugepages
      counts, and 2G is the size of the file in the explicitly mounted
      filesystem.  If the file is then removed, the counts become:
      
        node0
        1024    free_hugepages
        1024    nr_hugepages
      
        node1
        1024    free_hugepages
        1024    nr_hugepages
      
        Filesystem                         Size  Used Avail Use% Mounted on
        nodev                              4.0G  2.0G  2.0G  50% /var/opt/hugepool
      
      Note that the filesystem still shows 2G of pages used, while there
      actually are no huge pages in use.  The only way to 'fix' the filesystem
      accounting is to unmount the filesystem
      
      If a hugetlb page is associated with an explicitly mounted filesystem,
      this information in contained in the page_private field.  At migration
      time, this information is not preserved.  To fix, simply transfer
      page_private from old to new page at migration time if necessary.
      
      There is a related race with removing a huge page from a file and
      migration.  When a huge page is removed from the pagecache, the
      page_mapping() field is cleared, yet page_private remains set until the
      page is actually freed by free_huge_page().  A page could be migrated
      while in this state.  However, since page_mapping() is not set the
      hugetlbfs specific routine to transfer page_private is not called and we
      leak the page count in the filesystem.
      
      To fix that, check for this condition before migrating a huge page.  If
      the condition is detected, return EBUSY for the page.
      
      Link: http://lkml.kernel.org/r/74510272-7319-7372-9ea6-ec914734c179@oracle.com
      Link: http://lkml.kernel.org/r/20190212221400.3512-1-mike.kravetz@oracle.com
      Fixes: bcc54222 ("mm: hugetlb: introduce page_huge_active")
      Signed-off-by: default avatarMike Kravetz <mike.kravetz@oracle.com>
      Reviewed-by: default avatarNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: <stable@vger.kernel.org>
      [mike.kravetz@oracle.com: v2]
        Link: http://lkml.kernel.org/r/7534d322-d782-8ac6-1c8d-a8dc380eb3ab@oracle.com
      [mike.kravetz@oracle.com: update comment and changelog]
        Link: http://lkml.kernel.org/r/420bcfd6-158b-38e4-98da-26d0cd85bd01@oracle.comSigned-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      cb6acd01
    • Arnd Bergmann's avatar
      kasan: turn off asan-stack for clang-8 and earlier · 6baec880
      Arnd Bergmann authored
      Building an arm64 allmodconfig kernel with clang results in over 140
      warnings about overly large stack frames, the worst ones being:
      
        drivers/gpu/drm/panel/panel-sitronix-st7789v.c:196:12: error: stack frame size of 20224 bytes in function 'st7789v_prepare'
        drivers/video/fbdev/omap2/omapfb/displays/panel-tpo-td028ttec1.c:196:12: error: stack frame size of 13120 bytes in function 'td028ttec1_panel_enable'
        drivers/usb/host/max3421-hcd.c:1395:1: error: stack frame size of 10048 bytes in function 'max3421_spi_thread'
        drivers/net/wan/slic_ds26522.c:209:12: error: stack frame size of 9664 bytes in function 'slic_ds26522_probe'
        drivers/crypto/ccp/ccp-ops.c:2434:5: error: stack frame size of 8832 bytes in function 'ccp_run_cmd'
        drivers/media/dvb-frontends/stv0367.c:1005:12: error: stack frame size of 7840 bytes in function 'stv0367ter_algo'
      
      None of these happen with gcc today, and almost all of these are the
      result of a single known issue in llvm.  Hopefully it will eventually
      get fixed with the clang-9 release.
      
      In the meantime, the best idea I have is to turn off asan-stack for
      clang-8 and earlier, so we can produce a kernel that is safe to run.
      
      I have posted three patches that address the frame overflow warnings
      that are not addressed by turning off asan-stack, so in combination with
      this change, we get much closer to a clean allmodconfig build, which in
      turn is necessary to do meaningful build regression testing.
      
      It is still possible to turn on the CONFIG_ASAN_STACK option on all
      versions of clang, and it's always enabled for gcc, but when
      CONFIG_COMPILE_TEST is set, the option remains invisible, so
      allmodconfig and randconfig builds (which are normally done with a
      forced CONFIG_COMPILE_TEST) will still result in a mostly clean build.
      
      Link: http://lkml.kernel.org/r/20190222222950.3997333-1-arnd@arndb.de
      Link: https://bugs.llvm.org/show_bug.cgi?id=38809Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Reviewed-by: default avatarQian Cai <cai@lca.pw>
      Reviewed-by: default avatarMark Brown <broonie@kernel.org>
      Acked-by: default avatarAndrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Kostya Serebryany <kcc@google.com>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      6baec880
    • Linus Torvalds's avatar
      Merge tag 'drm-fixes-2019-03-01' of git://anongit.freedesktop.org/drm/drm · 6357c812
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Three final fixes, one for a feature that is new in this kernel, one
        bochs fix for qemu riscv and one atomic modesetting fix.
      
        I've left a few of the other late fixes until next as I didn't want to
        throw in anything that wasn't really necessary"
      
      * tag 'drm-fixes-2019-03-01' of git://anongit.freedesktop.org/drm/drm:
        drm/bochs: Fix the ID mismatch error
        drm: Block fb changes for async plane updates
        drm/amd/display: Use vrr friendly pageflip throttling in DC.
      6357c812
    • Peng Sun's avatar
      bpf: drop refcount if bpf_map_new_fd() fails in map_create() · 352d20d6
      Peng Sun authored
      In bpf/syscall.c, map_create() first set map->usercnt to 1, a file
      descriptor is supposed to return to userspace. When bpf_map_new_fd()
      fails, drop the refcount.
      
      Fixes: bd5f5f4e ("bpf: Add BPF_MAP_GET_FD_BY_ID")
      Signed-off-by: default avatarPeng Sun <sironhide0null@gmail.com>
      Acked-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      352d20d6
    • Arnd Bergmann's avatar
      Merge tag 'qcom-fixes-for-5.0-rc8' of... · 6089e656
      Arnd Bergmann authored
      Merge tag 'qcom-fixes-for-5.0-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/agross/linux into arm/fixes
      
      Qualcomm ARM64 Fixes for 5.0-rc8
      
      * Fix TZ memory area size to avoid crashes during boot
      
      * tag 'qcom-fixes-for-5.0-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/agross/linux:
        arm64: dts: qcom: msm8998: Extend TZ reserved memory area
      6089e656
    • Arnd Bergmann's avatar
      Merge tag 'tee-fix-for-v5.0' of... · 36baa6ed
      Arnd Bergmann authored
      Merge tag 'tee-fix-for-v5.0' of https://git.linaro.org/people/jens.wiklander/linux-tee into arm/fixes
      
      OP-TEE driver
      - add missing of_node_put after of_device_is_available
      
      * tag 'tee-fix-for-v5.0' of https://git.linaro.org/people/jens.wiklander/linux-tee:
        tee: optee: add missing of_node_put after of_device_is_available
      36baa6ed
    • Jiong Wu's avatar
      mmc:fix a bug when max_discard is 0 · d4721339
      Jiong Wu authored
      The original purpose of the code I fix is to replace max_discard with
      max_trim if max_trim is less than max_discard. When max_discard is 0
      we should replace max_discard with max_trim as well, because
      max_discard equals 0 happens only when the max_do_calc_max_discard
      process is overflowed, so if mmc_can_trim(card) is true, max_discard
      should be replaced by an available max_trim.
      However, in the original code, there are two lines of code interfere
      the right process.
      1) if (max_discard && mmc_can_trim(card))
      when max_discard is 0, it skips the process checking if max_discard
      needs to be replaced with max_trim.
      2) if (max_trim < max_discard)
      the condition is false when max_discard is 0. it also skips the process
      that replaces max_discard with max_trim, in fact, we should replace the
      0-valued max_discard with max_trim.
      Signed-off-by: default avatarJiong Wu <Lohengrin1024@gmail.com>
      Fixes: b305882f (mmc: core: optimize mmc_calc_max_discard)
      Cc: stable@vger.kernel.org # v4.17+
      Signed-off-by: default avatarUlf Hansson <ulf.hansson@linaro.org>
      d4721339
  6. 28 Feb, 2019 4 commits