1. 16 Apr, 2018 11 commits
    • Merge branch 'tcp-zero-copy-receive' · 309c446c
      David S. Miller authored
      Eric Dumazet says:
      
      ====================
      tcp: add zero copy receive
      
      This patch series adds mmap() support to TCP sockets for RX zero copy.
      
      While tcp_mmap() patch itself is quite small (~100 LOC), optimal support
      for asynchronous mmap() required better SO_RCVLOWAT behavior, and a
      test program to demonstrate how mmap() on TCP sockets can be used.
      
      Note that mmap() (and associated munmap()) calls add more
      pressure on the per-process VM semaphore, so they might not show a
      benefit for processes with a high number of threads.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      309c446c
    • selftests: net: add tcp_mmap program · 192dc405
      Eric Dumazet authored
      This is a reference program showing how mmap() can be used
      on TCP flows to implement receive zero copy.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      192dc405
    • tcp: implement mmap() for zero copy receive · 93ab6cc6
      Eric Dumazet authored
      Some networks can make sure TCP payload can exactly fit 4KB pages,
      with well chosen MSS/MTU and architectures.
      
      Implement mmap() system call so that applications can avoid
      copying data without complex splice() games.
      
      Note that a successful mmap(X bytes) on a TCP socket consumes those
      bytes, as if recvmsg() had been called. (tp->copied += X)
      
      Only PROT_READ mappings are accepted, as skb page frags
      are fundamentally shared and read only.
      
      If tcp_mmap() finds data that is not a full page, or a patch of
      urgent data, -EINVAL is returned, no bytes are consumed.
      
      The application must fall back to recvmsg() to read the problematic sequence.
      
      mmap() won't block, regardless of the socket being in blocking or
      non-blocking mode. If not enough bytes are in the receive queue,
      mmap() returns -EAGAIN, or -EIO if the socket is in a state
      where no other bytes can be added to the receive queue.
      
      An application might use SO_RCVLOWAT, poll() and/or ioctl(FIONREAD)
      to efficiently use mmap().
      
      On the sender side, MSG_EOR might help to clearly separate unaligned
      headers and 4K-aligned chunks if necessary.
      
      Tested:
      
      mlx4 (cx-3) 40Gbit NIC, with tcp_mmap program provided in following patch.
      MTU set to 4168  (4096 TCP payload, 40 bytes IPv6 header, 32 bytes TCP header)
      
      Without mmap() (tcp_mmap -s)
      
      received 32768 MB (0 % mmap'ed) in 8.13342 s, 33.7961 Gbit,
        cpu usage user:0.034 sys:3.778, 116.333 usec per MB, 63062 c-switches
      received 32768 MB (0 % mmap'ed) in 8.14501 s, 33.748 Gbit,
        cpu usage user:0.029 sys:3.997, 122.864 usec per MB, 61903 c-switches
      received 32768 MB (0 % mmap'ed) in 8.11723 s, 33.8635 Gbit,
        cpu usage user:0.048 sys:3.964, 122.437 usec per MB, 62983 c-switches
      received 32768 MB (0 % mmap'ed) in 8.39189 s, 32.7552 Gbit,
        cpu usage user:0.038 sys:4.181, 128.754 usec per MB, 55834 c-switches
      
      With mmap() on receiver (tcp_mmap -s -z)
      
      received 32768 MB (100 % mmap'ed) in 8.03083 s, 34.2278 Gbit,
        cpu usage user:0.024 sys:1.466, 45.4712 usec per MB, 65479 c-switches
      received 32768 MB (100 % mmap'ed) in 7.98805 s, 34.4111 Gbit,
        cpu usage user:0.026 sys:1.401, 43.5486 usec per MB, 65447 c-switches
      received 32768 MB (100 % mmap'ed) in 7.98377 s, 34.4296 Gbit,
        cpu usage user:0.028 sys:1.452, 45.166 usec per MB, 65496 c-switches
      received 32768 MB (99.9969 % mmap'ed) in 8.01838 s, 34.281 Gbit,
        cpu usage user:0.02 sys:1.446, 44.7388 usec per MB, 65505 c-switches
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      93ab6cc6
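The figures quoted in the test output above can be cross-checked with a little arithmetic. The sketch below is not part of the patch; the function names are made up for illustration, and it assumes that "MB" in the tcp_mmap output means 2**20 bytes and "Gbit" means 10**9 bits per second, which is what the reported numbers are consistent with.

```python
MIB = 1 << 20  # assumption: the report counts megabytes as 2**20 bytes


def mtu_for_payload(payload, ip_header=40, tcp_header=32):
    """MTU needed so that one segment carries exactly `payload` bytes.

    Defaults are the IPv6 and TCP header sizes quoted in the commit message.
    """
    return payload + ip_header + tcp_header


def throughput_gbit(megabytes, seconds):
    """Throughput in 10**9 bits per second."""
    return megabytes * MIB * 8 / seconds / 1e9


def cpu_usec_per_mb(user_s, sys_s, megabytes):
    """Total CPU time (user + sys) spent per megabyte, in microseconds."""
    return (user_s + sys_s) * 1e6 / megabytes


def cswitches_per_mb(cswitches, megabytes):
    """Context switches per megabyte received."""
    return cswitches / megabytes
```

With the first run of each mode, `cpu_usec_per_mb(0.034, 3.778, 32768)` gives about 116.3 and `cpu_usec_per_mb(0.024, 1.466, 32768)` about 45.5, matching the "usec per MB" column, so the mmap() path uses roughly 40% of the CPU time of the copying path at nearly the same throughput.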
    • tcp: avoid extra wakeups for SO_RCVLOWAT users · 03f45c88
      Eric Dumazet authored
      SO_RCVLOWAT is properly handled in tcp_poll(), so that POLLIN is only
      generated when enough bytes are available in receive queue, after
      David's change (commit c7004482 "tcp: Respect SO_RCVLOWAT in tcp_poll().")
      
      But TCP still calls sk->sk_data_ready() for each chunk added in receive
      queue, meaning the thread is awakened, and goes back to sleep shortly after.
      
      Tested:
      
      tcp_mmap test program, receiving 32768 MB of data with SO_RCVLOWAT set to 512KB
      
      -> Should get ~2 wakeups (c-switches) per MB, regardless of how many
      (tiny or big) packets were received.
      
      High speed (mostly full size GRO packets)
      
      received 32768 MB (100 % mmap'ed) in 8.03112 s, 34.2266 Gbit,
        cpu usage user:0.037 sys:1.404, 43.9758 usec per MB, 65497 c-switches
      
      received 32768 MB (99.9954 % mmap'ed) in 7.98453 s, 34.4263 Gbit,
        cpu usage user:0.03 sys:1.422, 44.3115 usec per MB, 65485 c-switches
      
      Low speed (sender is ratelimited and sends 1-MSS at a time, so GRO is not helping)
      
      received 22474.5 MB (100 % mmap'ed) in 6015.35 s, 0.0313414 Gbit,
        cpu usage user:0.05 sys:1.586, 72.7952 usec per MB, 44950 c-switches
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      03f45c88
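The poll() behaviour described above can be observed from user space. The following is a minimal sketch, not taken from the patch or from the tcp_mmap program: it assumes a Linux host with loopback TCP available, and the helper name and its parameters are hypothetical. It sets SO_RCVLOWAT on the receiving side of a loopback connection and checks whether poll() reports POLLIN before and after the threshold is reached.

```python
import select
import socket


def poll_with_rcvlowat(lowat, first_chunk, second_chunk, timeout_ms=200):
    """Return (ready_after_first, ready_after_second) for a loopback TCP pair.

    The receiver sets SO_RCVLOWAT to `lowat`; the sender then writes
    `first_chunk` bytes followed by `second_chunk` bytes.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(1)
    sender = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sender.connect(listener.getsockname())
    receiver, _ = listener.accept()
    try:
        # Avoid Nagle delaying the second small write.
        sender.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        receiver.setsockopt(socket.SOL_SOCKET, socket.SO_RCVLOWAT, lowat)

        poller = select.poll()
        poller.register(receiver, select.POLLIN)

        sender.sendall(b"x" * first_chunk)
        ready_first = bool(poller.poll(timeout_ms))

        sender.sendall(b"x" * second_chunk)
        ready_second = bool(poller.poll(5 * timeout_ms))
        return ready_first, ready_second
    finally:
        for sock in (sender, receiver, listener):
            sock.close()
```

On a kernel where tcp_poll() honours SO_RCVLOWAT, calling `poll_with_rcvlowat(1024, 100, 1000)` should report the socket as not readable after the first 100 bytes and readable once 1100 bytes are queued. The sketch only shows the user-visible POLLIN behaviour; the number of internal wakeups, which is what this commit reduces, is not visible from this code.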
    • tcp: fix delayed acks behavior for SO_RCVLOWAT · 796f82ea
      Eric Dumazet authored
      We should not delay acks if there are not enough bytes
      in receive queue to satisfy SO_RCVLOWAT.
      
      Since [E]POLLIN event is not going to be generated, there is little
      hope for a delayed ack to be useful.
      
      In fact, delaying ACK prevents sender from completing
      the transfer.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      796f82ea
    • tcp: fix SO_RCVLOWAT and RCVBUF autotuning · d1361840
      Eric Dumazet authored
      Applications might use SO_RCVLOWAT on TCP socket hoping to receive
      one [E]POLLIN event only when a given amount of bytes are ready in socket
      receive queue.
      
      Problem is that receive autotuning is not aware of this constraint,
      meaning sk_rcvbuf might be too small to allow all bytes to be stored.
      
      Add a new (struct proto_ops)->set_rcvlowat method so that a protocol
      can override the default setsockopt(SO_RCVLOWAT) behavior.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d1361840
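One way to see how SO_RCVLOWAT and the receive buffer interact on a given kernel is to read SO_RCVBUF before and after setting the low-water mark. This is a hedged sketch with a hypothetical helper name, assuming a Linux host; it makes no claim about the exact values a particular kernel version reports, only that both options can be read back with getsockopt().

```python
import socket


def rcvbuf_around_rcvlowat(lowat):
    """Return (rcvbuf_before, effective_lowat, rcvbuf_after) for a new TCP socket."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        before = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVLOWAT, lowat)
        # Read the value back: the protocol may adjust what was requested.
        effective = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVLOWAT)
        after = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
        return before, effective, after
```

For example, `rcvbuf_around_rcvlowat(512 * 1024)` uses the same 512KB threshold as the tcp_mmap tests in this series. If the reported receive buffer stays smaller than the low-water mark, the situation described in the commit message applies: the queue cannot hold enough bytes to ever satisfy SO_RCVLOWAT.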
    • tc-testing: add sample action tests · 10b19aea
      Roman Mashak authored
      Signed-off-by: Roman Mashak <mrv@mojatatu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      10b19aea
    • ipv6: remove unnecessary check in addrconf_prefix_rcv_add_addr() · f85f94b8
      Lorenzo Bianconi authored
      Remove unnecessary check on update_lft variable in
      addrconf_prefix_rcv_add_addr routine since it is always set to 0.
      Moreover, remove the re-initialization of update_lft to 0.
      Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f85f94b8
    • net: socionext: reset hardware in ndo_stop · 9a00b697
      Masahisa KOJIMA authored
      When the interface goes down, the head/tail of the descriptor
      ring address is set to 0 in netsec_netdev_stop().
      But the netsec hardware still keeps the previous descriptor
      ring address, so the driver and the hardware are inconsistent
      when the interface is brought up again later.
      To address this inconsistency, call netsec_reset_hardware()
      when the interface goes down.
      
      In addition, to minimize the reset process,
      add a flag to decide whether the driver loads the netsec microcode.
      Even if the driver resets the netsec hardware, the netsec microcode
      remains resident in RAM, so it is sufficient to load the microcode
      only at initialization.
      
      This patch is critical for installation over network.
      Signed-off-by: Masahisa KOJIMA <masahisa.kojima@linaro.org>
      Fixes: 533dd11a ("net: socionext: Add Synquacer NetSec driver")
      Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9a00b697
    • net: netsec: enable tx-irq during open callback · c009f413
      Jassi Brar authored
      Enable TX-irq as well during ndo_open() as we can not count upon
      RX to arrive early enough to trigger the napi. This patch is critical
      for installation over network.
      
      Fixes: 533dd11a ("net: socionext: Add Synquacer NetSec driver")
      Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c009f413
    • net: mediatek: use of_device_get_match_data() · eda7d46d
      Ryder Lee authored
      Using of_device_get_match_data() reduces the code size a bit.
      
      Also, the only way to call mtk_probe() is to match an entry in
      of_mtk_match[], so match cannot be NULL.
      Signed-off-by: Ryder Lee <ryder.lee@mediatek.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      eda7d46d
  2. 12 Apr, 2018 10 commits
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 5d136594
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) In ip_gre tunnel, handle the conflict between TUNNEL_{SEQ,CSUM} and
          GSO/LLTX properly. From Sabrina Dubroca.
      
       2) Stop properly on error in lan78xx_read_otp(), from Phil Elwell.
      
       3) Don't uncompress in slip before rstate is initialized, from Tejaswi
          Tanikella.
      
       4) When using 1.x firmware on aquantia, issue a deinit before we
          hardware reset the chip, otherwise we break dirty wake WOL. From
          Igor Russkikh.
      
       5) Correct log check in vhost_vq_access_ok(), from Stefan Hajnoczi.
      
       6) Fix ethtool -x crashes in bnxt_en, from Michael Chan.
      
       7) Fix races in l2tp tunnel creation and duplicate tunnel detection,
          from Guillaume Nault.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (22 commits)
        l2tp: fix race in duplicate tunnel detection
        l2tp: fix races in tunnel creation
        tun: send netlink notification when the device is modified
        tun: set the flags before registering the netdevice
        lan78xx: Don't reset the interface on open
        bnxt_en: Fix NULL pointer dereference at bnxt_free_irq().
        bnxt_en: Need to include RDMA rings in bnxt_check_rings().
        bnxt_en: Support max-mtu with VF-reps
        bnxt_en: Ignore src port field in decap filter nodes
        bnxt_en: do not allow wildcard matches for L2 flows
        bnxt_en: Fix ethtool -x crash when device is down.
        vhost: return bool from *_access_ok() functions
        vhost: fix vhost_vq_access_ok() log check
        vhost: Fix vhost_copy_to_user()
        net: aquantia: oops when shutdown on already stopped device
        net: aquantia: Regression on reset with 1.x firmware
        cdc_ether: flag the Cinterion AHS8 modem by gemalto as WWAN
        slip: Check if rstate is initialized before uncompressing
        lan78xx: Avoid spurious kevent 4 "error"
        lan78xx: Correctly indicate invalid OTP
        ...
      5d136594
    • Merge tag 'for-linus-4.17-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · 67a7a8ff
      Linus Torvalds authored
      Pull xen fixes from Juergen Gross:
       "A few fixes of Xen related core code and drivers"
      
      * tag 'for-linus-4.17-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        xen/pvh: Indicate XENFEAT_linux_rsdp_unrestricted to Xen
        xen/acpi: off by one in read_acpi_id()
        xen/acpi: upload _PSD info for non Dom0 CPUs too
        x86/xen: Delay get_cpu_cap until stack canary is established
        xen: xenbus_dev_frontend: Verify body of XS_TRANSACTION_END
        xen: xenbus: Catch closing of non existent transactions
        xen: xenbus_dev_frontend: Fix XS_TRANSACTION_END handling
      67a7a8ff
    • Merge tag 'dma-mapping-4.17-2' of git://git.infradead.org/users/hch/dma-mapping · c5c177c5
      Linus Torvalds authored
      Pull dma-mapping fix from Christoph Hellwig:
       "Fix for one swiotlb regression in 4.16 from Takashi"
      
      * tag 'dma-mapping-4.17-2' of git://git.infradead.org/users/hch/dma-mapping:
        swiotlb: fix unexpected swiotlb_alloc_coherent failures
      c5c177c5
    • Merge tag 'mmc-v4.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc · d1cb7718
      Linus Torvalds authored
      Pull MMC fixes from Ulf Hansson:
       "MMC core:
         - Prevent bus reference leak in mmc_blk_init()
      
        MMC host:
         - tmio: Fix error handling when issuing CMD23
         - jz4740: Fix race condition in IRQ mask update"
      
      * tag 'mmc-v4.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
        mmc: tmio: Fix error handling when issuing CMD23
        mmc: core: Prevent bus reference leak in mmc_blk_init()
        mmc: jz4740: Fix race condition in IRQ mask update
      d1cb7718
    • Merge tag 'for_linus-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb · cb098d50
      Linus Torvalds authored
      Pull kdb updates from Jason Wessel:
      
       - fix 2038 time access issues and new compiler warnings
      
       - minor regression test cleanup
      
       - formatting fixes for end user use of kdb
      
      * tag 'for_linus-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb:
        kdb: use memmove instead of overlapping memcpy
        kdb: use ktime_get_mono_fast_ns() instead of ktime_get_ts()
        kdb: bl: don't use tab character in output
        kdb: drop newline in unknown command output
        kdb: make "mdr" command repeat
        kdb: use __ktime_get_real_seconds instead of __current_kernel_time
        misc: kgdbts: Display progress of asynchronous tests
      cb098d50
    • Merge tag 'microblaze-4.17-rc1' of git://git.monstr.eu/linux-2.6-microblaze · 07820c3b
      Linus Torvalds authored
      Pull microblaze updates from Michal Simek:
       "Use generic pci_mmap_resource_range()"
      
      * tag 'microblaze-4.17-rc1' of git://git.monstr.eu/linux-2.6-microblaze:
        microblaze: Use generic pci_mmap_resource_range()
        microblaze: Provide pgprot_device/writecombine macros for nommu
      07820c3b
    • Merge tag 'asm-generic' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic · c17b0aad
      Linus Torvalds authored
      Pull asm-generic fixes from Arnd Bergmann:
       "I have one regression fix for a minor build problem after the
        architecture removal series, plus a rework of the barriers in the
        readl/writel functions, thanks to work by Sinan Kaya:
      
        This started from a discussion on the linuxppc and rdma mailing
        lists[1]. To summarize, we decided that architectures are responsible
        to serialize readl() and writel() accesses on a device MMIO space
        relative to DMA performed by that device.
      
        This series provides a pessimistic implementation of that behavior for
        asm-generic/io.h, which is in turn used by a number of architectures
        (h8300, microblaze, nios2, openrisc, s390, sparc, um, unicore32, and
        xtensa). Some of those presumably need no extra barriers, or something
        weaker than rmb()/wmb(), and they are advised to override the new
        default for better performance.
      
        For inb()/outb(), the same barriers are used, but architectures might
        want to add another barrier to outb() here if that can guarantee
        non-posted behavior (some architectures can, others cannot do that).
      
        The readl_relaxed()/writel_relaxed() family of functions retains the
        existing behavior with no extra barriers"
      
      [1] https://lists.ozlabs.org/pipermail/linuxppc-dev/2018-March/170481.html
      
      * tag 'asm-generic' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
        io: change writeX_relaxed() to remove barriers
        io: change readX_relaxed() to remove barriers
        dts: remove cris & metag dts hard link file
        io: change inX() to have their own IO barrier overrides
        io: change outX() to have their own IO barrier overrides
        io: define stronger ordering for the default writeX() implementation
        io: define stronger ordering for the default readX() implementation
        io: define several IO & PIO barrier types for the asm-generic version
      c17b0aad
    • Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost · e241e3f2
      Linus Torvalds authored
      Pull virtio update from Michael Tsirkin:
       "This adds reporting hugepage stats to virtio-balloon"
      
      * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
        virtio_balloon: export hugetlb page allocation counts
      e241e3f2
    • Merge tag 'iommu-updates-v4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · e5c37228
      Linus Torvalds authored
      Pull IOMMU updates from Joerg Roedel:
      
       - OF_IOMMU support for the Rockchip iommu driver so that it can use
         generic DT bindings
      
       - rework of locking in the AMD IOMMU interrupt remapping code to make
         it work better in RT kernels
      
       - support for improved iotlb flushing in the AMD IOMMU driver
      
       - support for 52-bit physical and virtual addressing in the ARM-SMMU
      
       - various other small fixes and cleanups
      
      * tag 'iommu-updates-v4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (53 commits)
        iommu/io-pgtable-arm: Avoid warning with 32-bit phys_addr_t
        iommu/rockchip: Support sharing IOMMU between masters
        iommu/rockchip: Add runtime PM support
        iommu/rockchip: Fix error handling in init
        iommu/rockchip: Use OF_IOMMU to attach devices automatically
        iommu/rockchip: Use IOMMU device for dma mapping operations
        dt-bindings: iommu/rockchip: Add clock property
        iommu/rockchip: Control clocks needed to access the IOMMU
        iommu/rockchip: Fix TLB flush of secondary IOMMUs
        iommu/rockchip: Use iopoll helpers to wait for hardware
        iommu/rockchip: Fix error handling in attach
        iommu/rockchip: Request irqs in rk_iommu_probe()
        iommu/rockchip: Fix error handling in probe
        iommu/rockchip: Prohibit unbind and remove
        iommu/amd: Return proper error code in irq_remapping_alloc()
        iommu/amd: Make amd_iommu_devtable_lock a spin_lock
        iommu/amd: Drop the lock while allocating new irq remap table
        iommu/amd: Factor out setting the remap table for a devid
        iommu/amd: Use `table' instead `irt' as variable name in amd_iommu_update_ga()
        iommu/amd: Remove the special case from alloc_irq_table()
        ...
      e5c37228
    • Merge tag 'pm-4.17-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 1fe43114
      Linus Torvalds authored
      Pull more power management updates from Rafael Wysocki:
       "These include one big-ticket item which is the rework of the idle loop
        in order to prevent CPUs from spending too much time in shallow idle
        states. It reduces idle power on some systems by 10% or more and may
        improve performance of workloads in which the idle loop overhead
        matters. This has been in the works for several weeks and it has been
        tested and reviewed quite thoroughly.
      
        Also included are changes that finalize the cpufreq cleanup moving
        frequency table validation from drivers to the core, a few fixes and
        cleanups of cpufreq drivers, a cpuidle documentation update and a PM
        QoS core update to mark the expected switch fall-throughs in it.
      
        Specifics:
      
         - Rework the idle loop in order to prevent CPUs from spending too
           much time in shallow idle states by making it stop the scheduler
           tick before putting the CPU into an idle state only if the idle
           duration predicted by the idle governor is long enough.
      
           That required the code to be reordered to invoke the idle governor
           before stopping the tick, among other things (Rafael Wysocki,
           Frederic Weisbecker, Arnd Bergmann).
      
         - Add the missing description of the residency sysfs attribute to the
           cpuidle documentation (Prashanth Prakash).
      
         - Finalize the cpufreq cleanup moving frequency table validation from
           drivers to the core (Viresh Kumar).
      
         - Fix a clock leak regression in the armada-37xx cpufreq driver
           (Gregory Clement).
      
         - Fix the initialization of the CPU performance data structures for
           shared policies in the CPPC cpufreq driver (Shunyong Yang).
      
         - Clean up the ti-cpufreq, intel_pstate and CPPC cpufreq drivers a
           bit (Viresh Kumar, Rafael Wysocki).
      
         - Mark the expected switch fall-throughs in the PM QoS core (Gustavo
           Silva)"
      
      * tag 'pm-4.17-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (23 commits)
        tick-sched: avoid a maybe-uninitialized warning
        cpufreq: Drop cpufreq_table_validate_and_show()
        cpufreq: SCMI: Don't validate the frequency table twice
        cpufreq: CPPC: Initialize shared perf capabilities of CPUs
        cpufreq: armada-37xx: Fix clock leak
        cpufreq: CPPC: Don't set transition_latency
        cpufreq: ti-cpufreq: Use builtin_platform_driver()
        cpufreq: intel_pstate: Do not include debugfs.h
        PM / QoS: mark expected switch fall-throughs
        cpuidle: Add definition of residency to sysfs documentation
        time: hrtimer: Use timerqueue_iterate_next() to get to the next timer
        nohz: Avoid duplication of code related to got_idle_tick
        nohz: Gather tick_sched booleans under a common flag field
        cpuidle: menu: Avoid selecting shallow states with stopped tick
        cpuidle: menu: Refine idle state selection for running tick
        sched: idle: Select idle state before stopping the tick
        time: hrtimer: Introduce hrtimer_next_event_without()
        time: tick-sched: Split tick_nohz_stop_sched_tick()
        cpuidle: Return nohz hint from cpuidle_select()
        jiffies: Introduce USER_TICK_USEC and redefine TICK_USEC
        ...
      1fe43114
  3. 11 Apr, 2018 19 commits
    • Merge tag 'ktest-v4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest · 96973767
      Linus Torvalds authored
      Pull ktest updates from Steven Rostedt:
       "These commits have either been sitting in my INBOX or have been in my
        local tree for some time. I need to push them upstream:
      
         - Separate out config-bisect.pl from ktest.pl.
      
           This allows users to do config bisects without full ktest setup.
      
         - Email on status change.
      
           Allow the user to be emailed on test start, finish, failure, etc.
      
         - Other small fixes and enhancements"
      
      * tag 'ktest-v4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest: (24 commits)
        ktest: Take submenu into account for grub2 menus
        ktest.pl: Add MAIL_COMMAND option to define how to send email
        ktest.pl: Use run_command to execute sending mail
        ktest.pl: Allow dodie be recursive
        ktest.pl: Kill test if mailer is not supported
        ktest.pl: Add MAIL_PATH option to define where to find the mailer
        ktest.pl: No need to print no mailer is specified when mailto is not
        Ktest: add email options to sample.config
        Ktest: Use dodie for critical falures
        Ktest: Add SigInt handling
        Ktest: Add email support
        ktest.pl: Detect if a config-bisect was interrupted
        ktest.pl: Make finding config-bisect.pl dynamic
        ktest.pl: Have ktest.pl pass -r to config-bisect.pl to reset bisect
        ktest.pl: Use diffconfig if available for failed config bisects
        ktest.pl: Allow for the config-bisect.pl output to display to console
        ktest: Use config-bisect.pl in ktest.pl
        ktest: Add standalone config-bisect.pl program
        ktest: Set do_not_reboot=y for CONFIG_BISECT_TYPE=build
        ktest: Set buildonly=1 for CONFIG_BISECT_TYPE=build
        ...
      96973767
    • Merge tag 'tags/upstream-4.17-rc1' of git://git.infradead.org/linux-ubifs · 77cb51e6
      Linus Torvalds authored
      Pull UBI and UBIFS updates from Richard Weinberger:
       "Minor bug fixes and improvements"
      
      * tag 'tags/upstream-4.17-rc1' of git://git.infradead.org/linux-ubifs:
        ubi: Reject MLC NAND
        ubifs: Remove useless parameter of lpt_heap_replace
        ubifs: Constify struct ubifs_lprops in scan_for_leb_for_idx
        ubifs: remove unnecessary assignment
        ubi: Fix error for write access
        ubi: fastmap: Don't flush fastmap work on detach
        ubifs: Check ubifs_wbuf_sync() return code
      77cb51e6
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml · 375479c3
      Linus Torvalds authored
      Pull UML updates from Richard Weinberger:
      
       - a new and faster epoll based IRQ controller and NIC driver
      
       - misc fixes and janitorial updates
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
        Fix vector raw inintialization logic
        Migrate vector timers to new timer API
        um: Compile with modern headers
        um: vector: Fix an error handling path in 'vector_parse()'
        um: vector: Fix a memory allocation check
        um: vector: fix missing unlock on error in vector_net_open()
        um: Add missing EXPORT for free_irq_by_fd()
        High Performance UML Vector Network Driver
        Epoll based IRQ controller
        um: Use POSIX ucontext_t instead of struct ucontext
        um: time: Use timespec64 for persistent clock
        um: Restore symbol versions for __memcpy and memcpy
      375479c3
    • Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · 45df60cd
      Linus Torvalds authored
      Pull ARM SoC fixes from Arnd Bergmann:
       "Here is a very small set of fixes for inclusion in linux-4.17-rc1: Two
        changes for the maintainer file, and one more fix for the newly added
        npcm platform, to enable the level 2 cache controller"
      
      * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
        MAINTAINERS: Update ASPEED entry with details
        MAINTAINERS: Migrate oxnas list to groups.io
        arm: npcm: enable L2 cache in NPCM7xx architecture
      45df60cd
    • Merge tag 'nios2-v4.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2 · b82b6813
      Linus Torvalds authored
      Pull nios2 update from Ley Foon Tan:
       "Use read_persistent_clock64() instead of read_persistent_clock()"
      
      * tag 'nios2-v4.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2:
        nios2: Use read_persistent_clock64() instead of read_persistent_clock()
      b82b6813
    • Merge branch 'l2tp-tunnel-creation-fixes' · 0c84cee8
      David S. Miller authored
      Guillaume Nault says:
      
      ====================
      l2tp: tunnel creation fixes
      
      L2TP tunnel creation is racy. We need to make sure that the tunnel
      returned by l2tp_tunnel_create() isn't going to be freed while the
      caller is using it. This is done in patch #1, by separating tunnel
      creation from tunnel registration.
      
      With the tunnel registration code in place, we can now check for
      duplicate tunnels in a race-free way. This is done in patch #2, which
      incidentally removes the last use of l2tp_tunnel_find().
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0c84cee8
    • l2tp: fix race in duplicate tunnel detection · f6cd651b
      Guillaume Nault authored
      We can't use l2tp_tunnel_find() to prevent l2tp_nl_cmd_tunnel_create()
      from creating a duplicate tunnel. A tunnel can be concurrently
      registered after l2tp_tunnel_find() returns. Therefore, searching for
      duplicates must be done at registration time.
      
      Finally, remove l2tp_tunnel_find() entirely as it isn't used anywhere
      anymore.
      
      Fixes: 309795f4 ("l2tp: Add netlink control API for L2TP")
      Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f6cd651b
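The race described above is a check-then-act problem: a lookup that happens before registration can be invalidated by a concurrent registration. A minimal model of the fixed pattern, in Python rather than kernel C and with entirely hypothetical class and function names, performs the duplicate check inside the same critical section as the insertion.

```python
import threading


class DuplicateTunnel(Exception):
    """Raised when a tunnel with the same ID is already registered."""


class Tunnel:
    def __init__(self, tunnel_id):
        self.tunnel_id = tunnel_id


class TunnelRegistry:
    def __init__(self):
        self._lock = threading.Lock()
        self._tunnels = {}

    def register(self, tunnel):
        # The duplicate check and the insertion happen under one lock,
        # so no other thread can register the same ID in between.
        with self._lock:
            if tunnel.tunnel_id in self._tunnels:
                raise DuplicateTunnel(tunnel.tunnel_id)
            self._tunnels[tunnel.tunnel_id] = tunnel


def create_and_register(registry, tunnel_id):
    tunnel = Tunnel(tunnel_id)  # step 1: create a private, unpublished object
    registry.register(tunnel)   # step 2: publish it, rejecting duplicates
    return tunnel
```

A separate "find, then create" sequence would need the caller to hold the lock across both steps; folding the check into `register()` removes that requirement, which is why a standalone lookup helper is no longer needed in this model.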
    • l2tp: fix races in tunnel creation · 6b9f3423
      Guillaume Nault authored
      l2tp_tunnel_create() inserts the new tunnel into the namespace's tunnel
      list and sets the socket's ->sk_user_data field, before returning it to
      the caller. Therefore, there are two ways the tunnel can be accessed
      and freed, before the caller even had the opportunity to take a
      reference. In practice, syzbot could crash the module by closing the
      socket right after a new tunnel was returned to pppol2tp_create().
      
      This patch moves tunnel registration out of l2tp_tunnel_create(), so
      that the caller can safely hold a reference before publishing the
      tunnel. This second step is done with the new l2tp_tunnel_register()
      function, which is now responsible for associating the tunnel to its
      socket and for inserting it into the namespace's list.
      
      While moving the code to l2tp_tunnel_register(), a few modifications
      have been done. First, the socket validation tests are done in a helper
      function, for clarity. Also, modifying the socket is now done after
      having inserted the tunnel to the namespace's tunnels list. This will
      allow insertion to fail, without having to revert these modifications
      in the error path (a followup patch will check for duplicate tunnels
      before insertion). Either the socket is a kernel socket which we
      control, or it is a user-space socket for which we have a reference on
      the file descriptor. In any case, the socket isn't going to be closed
      from under us.
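
The create / hold / register ordering described above can be sketched in
plain userspace C. This is only an illustration under stated assumptions:
struct tunnel, tunnel_create(), tunnel_hold(), tunnel_put(),
tunnel_register() and tunnel_setup() are hypothetical stand-ins, not the
l2tp functions themselves, and the plain int refcount and unlocked list
ignore the locking and atomics real kernel code would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the real tunnel structure. */
struct tunnel {
	int refcount;
	bool registered;
	struct tunnel *next;
};

/* Stands in for the namespace's tunnel list. */
struct tunnel *tunnel_list;

/* Step 1: allocate only. Nothing else can reach the tunnel yet. */
struct tunnel *tunnel_create(void)
{
	struct tunnel *t = calloc(1, sizeof(*t));

	if (t)
		t->refcount = 1;	/* reference that the list will own */
	return t;
}

void tunnel_hold(struct tunnel *t)
{
	assert(t != NULL);
	t->refcount++;
}

/* Returns true when the last reference was dropped and the tunnel freed. */
bool tunnel_put(struct tunnel *t)
{
	assert(t != NULL && t->refcount > 0);
	if (--t->refcount > 0)
		return false;
	free(t);
	return true;
}

/* Step 2: publish. From here on other paths may find the tunnel. */
void tunnel_register(struct tunnel *t)
{
	assert(t != NULL);
	t->next = tunnel_list;
	tunnel_list = t;
	t->registered = true;
}

/* Caller side: take our own reference before publishing. */
struct tunnel *tunnel_setup(void)
{
	struct tunnel *t = tunnel_create();

	if (!t)
		return NULL;
	tunnel_hold(t);		/* safe: nobody else can see t yet */
	tunnel_register(t);
	return t;
}
```

With this ordering, a concurrent path that drops the list's reference
right after registration cannot free the tunnel while the caller is still
using it, because the caller's reference was taken before publication.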
      
      Reported-by: syzbot+fbeeb5c3b538e8545644@syzkaller.appspotmail.com
      Fixes: fd558d18 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts")
      Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6b9f3423
    • Sabrina Dubroca's avatar
      tun: send netlink notification when the device is modified · 83c1f36f
      Sabrina Dubroca authored
      I added dumping of link information about tun devices over netlink in
      commit 1ec010e7 ("tun: export flags, uid, gid, queue information
      over netlink"), but didn't add the missing netlink notifications when
      the device's exported properties change.
      
      This patch adds notifications when owner/group or flags are modified,
      when queues are attached/detached, and when a tun fd is closed.
      Reported-by: Thomas Haller <thaller@redhat.com>
      Fixes: 1ec010e7 ("tun: export flags, uid, gid, queue information over netlink")
      Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      83c1f36f
    • Sabrina Dubroca's avatar
      tun: set the flags before registering the netdevice · 9fffc5c6
      Sabrina Dubroca authored
      Otherwise, register_netdevice advertises the creation of the device with
      the default flags, instead of what the user requested.
      Reported-by: Thomas Haller <thaller@redhat.com>
      Fixes: 1ec010e7 ("tun: export flags, uid, gid, queue information over netlink")
      Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9fffc5c6
    • Phil Elwell's avatar
      lan78xx: Don't reset the interface on open · 47b99865
      Phil Elwell authored
      Commit 92571a1a ("lan78xx: Connect phy early") moves the PHY
      initialisation into lan78xx_probe, but lan78xx_open subsequently calls
      lan78xx_reset. As well as forcing a second round of link negotiation,
      this reset frequently prevents the phy interrupt from being generated
      (even though the link is up), rendering the interface unusable.
      
      Fix this issue by removing the lan78xx_reset call from lan78xx_open.
      
      Fixes: 92571a1a ("lan78xx: Connect phy early")
      Signed-off-by: Phil Elwell <phil@raspberrypi.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      47b99865
    • David S. Miller's avatar
      Merge branch 'bnxt_en-Fixes-for-net' · 9cf74f59
      David S. Miller authored
      Michael Chan says:
      
      ====================
      bnxt_en: Fixes for net.
      
      This bug fix series includes NULL pointer fixes in ethtool -x code path
      and in the error clean up path when freeing IRQs, a ring accounting bug
      that missed rings used by the RDMA driver, and 3 bug fixes related to TC
      Flower and VF-reps.
      
      v2: Fixed commit message of patch 4.  Changed the pound sign to $ sign
      in front of the ip command.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9cf74f59
    • Michael Chan's avatar
      bnxt_en: Fix NULL pointer dereference at bnxt_free_irq(). · cb98526b
      Michael Chan authored
      When open fails during ethtool -L ring change, for example, the driver
      may crash at bnxt_free_irq() because bp->bnapi is NULL.
      
      If we fail to allocate all the new rings, bnxt_open_nic() will free
      all the memory including bp->bnapi.  A subsequent call to bnxt_close_nic()
      will then try to dereference bp->bnapi in bnxt_free_irq().
      
      Fix it by checking for !bp->bnapi in bnxt_free_irq().
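
The guard described above can be illustrated with a small userspace
sketch. The names here (struct dev_ctx, struct napi_ctx, free_irqs()) are
hypothetical and only mirror the shape of the problem: a cleanup routine
that may run after an earlier failure has already freed the per-ring
array. It is not the driver's code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-ring state. */
struct napi_ctx {
	int irq_requested;
};

/* Hypothetical device state; bnapi is NULL after a failed open. */
struct dev_ctx {
	struct napi_ctx **bnapi;
	int nr_rings;
};

/* Releases every requested IRQ and returns how many were released. */
int free_irqs(struct dev_ctx *bp)
{
	int released = 0;

	assert(bp != NULL);

	/* Nothing to walk if the open path already freed everything. */
	if (!bp->bnapi)
		return 0;

	for (int i = 0; i < bp->nr_rings; i++) {
		struct napi_ctx *n = bp->bnapi[i];

		if (n && n->irq_requested) {
			n->irq_requested = 0;
			released++;
		}
	}
	return released;
}
```

Checking once at the top keeps the close path safe to call regardless of
how far the preceding open got.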
      
      Fixes: e5811b8c ("bnxt_en: Add IRQ remapping logic.")
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cb98526b
    • Michael Chan's avatar
      bnxt_en: Need to include RDMA rings in bnxt_check_rings(). · 11c3ec7b
      Michael Chan authored
      With recent changes to reserve both L2 and RDMA rings, we need to include
      the RDMA rings in bnxt_check_rings().  Otherwise we will under-estimate
      the rings we need during ethtool -L, which may lead to failure.
      
      Fixes: fbcfc8e4 ("bnxt_en: Reserve completion rings and MSIX for bnxt_re RDMA driver.")
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      11c3ec7b
    • Sriharsha Basavapatna's avatar
      bnxt_en: Support max-mtu with VF-reps · 9d96465b
      Sriharsha Basavapatna authored
      When a VF is configured with a bigger mtu (> 1500), any packets that
      are punted to the VF-rep (slow-path) get dropped by OVS kernel-datapath
      with the following message: "dropped over-mtu packet". Fix this by
      returning the max-mtu value for a VF-rep derived from its corresponding VF.
      VF-rep's mtu can be changed using 'ip' command as shown in this example:
      
      	$ ip link set bnxt0_pf0vf0 mtu 9000
      Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9d96465b
    • Sriharsha Basavapatna's avatar
      bnxt_en: Ignore src port field in decap filter nodes · 479ca3bf
      Sriharsha Basavapatna authored
      The driver currently uses src port field (along with other fields) in the
      decap tunnel key, while looking up and adding tunnel nodes. This leads to
      redundant cfa_decap_filter_alloc() requests to the FW and flow-miss in the
      flow engine. Fix this by ignoring the src port field in decap tunnel nodes.
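
As a rough illustration of the idea, the sketch below compares two tunnel
keys while leaving the source port out of the comparison, so that flows
differing only in source port resolve to the same node. The struct layout
and the names tun_key, decap_lookup_key() and decap_key_equal() are
assumptions made for this example; they are not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tunnel key; field names are illustrative only. */
struct tun_key {
	uint32_t src_ip;
	uint32_t dst_ip;
	uint16_t src_port;
	uint16_t dst_port;
	uint32_t tun_id;
};

/* Normalise a key for lookup or insertion by clearing the source port. */
struct tun_key decap_lookup_key(struct tun_key k)
{
	k.src_port = 0;
	return k;
}

/* Field-by-field comparison that deliberately skips src_port. */
bool decap_key_equal(const struct tun_key *a, const struct tun_key *b)
{
	assert(a != NULL && b != NULL);
	return a->src_ip == b->src_ip &&
	       a->dst_ip == b->dst_ip &&
	       a->dst_port == b->dst_port &&
	       a->tun_id == b->tun_id;
}
```

Either approach (normalising the key or skipping the field when
comparing) avoids creating one node per source port value.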
      
      Fixes: f484f678 ("bnxt_en: add hwrm FW cmds for cfa_encap_record and decap_filter")
      Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      479ca3bf
    • Andy Gospodarek's avatar
      bnxt_en: do not allow wildcard matches for L2 flows · e85a9be9
      Andy Gospodarek authored
      Before this patch the following commands would succeed as far as the
      user was concerned:
      
      $ tc qdisc add dev p1p1 ingress
      $ tc filter add dev p1p1 parent ffff: protocol all \
      	flower skip_sw action drop
      $ tc filter add dev p1p1 parent ffff: protocol ipv4 \
      	flower skip_sw src_mac 00:02:00:00:00:01/44 action drop
      
      The current flow offload infrastructure used does not support wildcard
      matching for ethernet headers, so do not allow the second or third
      commands to succeed.  If a user wants to drop traffic on that interface
      the protocol and MAC addresses need to be specified explicitly:
      
      $ tc qdisc add dev p1p1 ingress
      $ tc filter add dev p1p1 parent ffff: protocol arp \
      	flower skip_sw action drop
      $ tc filter add dev p1p1 parent ffff: protocol ipv4 \
      	flower skip_sw action drop
      ...
      $ tc filter add dev p1p1 parent ffff: protocol ipv4 \
      	flower skip_sw src_mac 00:02:00:00:00:01 action drop
      $ tc filter add dev p1p1 parent ffff: protocol ipv4 \
      	flower skip_sw src_mac 00:02:00:00:00:02 action drop
      ...
      
      There are also checks for VLAN parameters in this patch as other callers
      may wildcard those parameters even if tc does not.  Using different
      flow infrastructure could allow this to work in the future for L2 flows,
      but for now it does not.
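
One way to express such a check is sketched below in userspace C. It
assumes a filter is acceptable only when the protocol is specific and the
MAC mask is either entirely unset (field not matched) or entirely set
(exact match); a /44 mask such as the one in the example above is partial
and is rejected. The names mask_all_bytes() and l2_flow_supported() are
hypothetical, and the driver's real checks (including the VLAN ones) may
be structured differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAC_LEN   6
#define PROTO_ALL 0x0003	/* same value as Linux ETH_P_ALL */

/* True when every byte of the mask equals 'want'. */
bool mask_all_bytes(const uint8_t *mask, size_t len, uint8_t want)
{
	assert(mask != NULL);
	for (size_t i = 0; i < len; i++)
		if (mask[i] != want)
			return false;
	return true;
}

/*
 * Accept a flow only if the protocol is specific and the MAC mask is
 * either all zeroes (MAC not matched) or all ones (exact match).
 */
bool l2_flow_supported(uint16_t proto, const uint8_t mac_mask[MAC_LEN])
{
	if (proto == PROTO_ALL)
		return false;
	return mask_all_bytes(mac_mask, MAC_LEN, 0x00) ||
	       mask_all_bytes(mac_mask, MAC_LEN, 0xff);
}
```

Rejecting the filter up front lets the user see the failure, instead of
having an unsupported match silently accepted.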
      
      Fixes: 2ae7408f ("bnxt_en: bnxt: add TC flower filter offload support")
      Signed-off-by: Andy Gospodarek <gospo@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e85a9be9
    • Michael Chan's avatar
      bnxt_en: Fix ethtool -x crash when device is down. · 7991cb9c
      Michael Chan authored
      Fix ethtool .get_rxfh() crash by checking for valid indirection table
      address before copying the data.
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7991cb9c
    • Linus Torvalds's avatar
      Merge branch 'akpm' (patches from Andrew) · 8837c70d
      Linus Torvalds authored
      Merge more updates from Andrew Morton:
      
       - almost all of the rest of MM
      
       - kasan updates
      
       - lots of procfs work
      
       - misc things
      
       - lib/ updates
      
       - checkpatch
      
       - rapidio
      
       - ipc/shm updates
      
       - the start of willy's XArray conversion
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (140 commits)
        page cache: use xa_lock
        xarray: add the xa_lock to the radix_tree_root
        fscache: use appropriate radix tree accessors
        export __set_page_dirty
        unicore32: turn flush_dcache_mmap_lock into a no-op
        arm64: turn flush_dcache_mmap_lock into a no-op
        mac80211_hwsim: use DEFINE_IDA
        radix tree: use GFP_ZONEMASK bits of gfp_t for flags
        linux/const.h: refactor _BITUL and _BITULL a bit
        linux/const.h: move UL() macro to include/linux/const.h
        linux/const.h: prefix include guard of uapi/linux/const.h with _UAPI
        xen, mm: allow deferred page initialization for xen pv domains
        elf: enforce MAP_FIXED on overlaying elf segments
        fs, elf: drop MAP_FIXED usage from elf_map
        mm: introduce MAP_FIXED_NOREPLACE
        MAINTAINERS: update bouncing aacraid@adaptec.com addresses
        fs/dcache.c: add cond_resched() in shrink_dentry_list()
        include/linux/kfifo.h: fix comment
        ipc/shm.c: shm_split(): remove unneeded test for NULL shm_file_data.vm_ops
        kernel/sysctl.c: add kdoc comments to do_proc_do{u}intvec_minmax_conv_param
        ...
      8837c70d