1. 19 Nov, 2021 24 commits
  2. 18 Nov, 2021 16 commits
    • Merge tag 'net-5.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 8d0112ac
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Including fixes from bpf, mac80211.
      
        Current release - regressions:
      
         - devlink: don't throw an error if flash notification sent before
           devlink visible
      
         - page_pool: Revert "page_pool: disable dma mapping support...",
           turns out there are active arches who need it
      
        Current release - new code bugs:
      
         - amt: cancel delayed_work synchronously in amt_fini()
      
        Previous releases - regressions:
      
         - xsk: fix crash on double free in buffer pool
      
         - bpf: fix inner map state pruning regression causing program
           rejections
      
         - mac80211: drop check for DONT_REORDER in __ieee80211_select_queue,
           preventing mis-selecting the best effort queue
      
         - mac80211: do not access the IV when it was stripped
      
         - mac80211: fix radiotap header generation, off-by-one
      
         - nl80211: fix getting radio statistics in survey dump
      
         - e100: fix device suspend/resume
      
        Previous releases - always broken:
      
         - tcp: fix uninitialized access in skb frags array for Rx zerocopy
      
         - bpf: fix toctou on read-only map's constant scalar tracking
      
         - bpf: forbid bpf_ktime_get_coarse_ns and bpf_timer_* in tracing
           progs
      
         - tipc: only accept encrypted MSG_CRYPTO msgs
      
         - smc: transfer remaining wait queue entries during fallback, fix
           missing wake ups
      
         - udp: validate checksum in udp_read_sock() (when sockmap is used)
      
         - sched: act_mirred: drop dst for the direction from egress to
           ingress
      
         - virtio_net_hdr_to_skb: count transport header in UFO, prevent
           allowing bad skbs into the stack
      
         - nfc: reorder the logic in nfc_{un,}register_device, fix unregister
      
         - ipsec: check return value of ipv6_skip_exthdr
      
         - usb: r8152: add MAC passthrough support for more Lenovo Docks"
      
      * tag 'net-5.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (96 commits)
        ptp: ocp: Fix a couple NULL vs IS_ERR() checks
        net: ethernet: dec: tulip: de4x5: fix possible array overflows in type3_infoblock()
        net: tulip: de4x5: fix the problem that the array 'lp->phy[8]' may be out of bound
        ipv6: check return value of ipv6_skip_exthdr
        e100: fix device suspend/resume
        devlink: Don't throw an error if flash notification sent before devlink visible
        page_pool: Revert "page_pool: disable dma mapping support..."
        ethernet: hisilicon: hns: hns_dsaf_misc: fix a possible array overflow in hns_dsaf_ge_srst_by_port()
        octeontx2-af: debugfs: don't corrupt user memory
        NFC: add NCI_UNREG flag to eliminate the race
        NFC: reorder the logic in nfc_{un,}register_device
        NFC: reorganize the functions in nci_request
        tipc: check for null after calling kmemdup
        i40e: Fix display error code in dmesg
        i40e: Fix creation of first queue by omitting it if is not power of two
        i40e: Fix warning message and call stack during rmmod i40e driver
        i40e: Fix ping is lost after configuring ADq on VF
        i40e: Fix changing previously set num_queue_pairs for PFs
        i40e: Fix NULL ptr dereference on VSI filter sync
        i40e: Fix correct max_pkt_size on VF RX queue
        ...
      8d0112ac
    • Merge tag 'for-5.16-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 6fdf8864
      Linus Torvalds authored
      Pull btrfs fixes from David Sterba:
       "Several fixes and one old ioctl deprecation. Namely there's a fix for
        crashes/warnings with lzo compression that was suspected to be caused
        by first pull merge resolution, but it was a different bug.
      
        Summary:
      
         - regression fix for a crash in lzo due to missing boundary checks of
           the page array
      
         - fix crashes on ARM64 due to missing barriers when synchronizing
           status bits between work queues
      
         - silence lockdep when reading chunk tree during mount
      
         - fix false positive warning in integrity checker on devices with
           disabled write caching
      
         - fix signedness of bitfields in scrub
      
         - start deprecation of balance v1 ioctl"
      
      * tag 'for-5.16-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        btrfs: deprecate BTRFS_IOC_BALANCE ioctl
        btrfs: make 1-bit bit-fields of scrub_page unsigned int
        btrfs: check-integrity: fix a warning on write caching disabled disk
        btrfs: silence lockdep when reading chunk tree during mount
        btrfs: fix memory ordering between normal and ordered work functions
        btrfs: fix a out-of-bound access in copy_compressed_data_to_page()
      6fdf8864
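The bit-field signedness fix in this pull touches a general C pitfall: a 1-bit signed bit-field can represent only 0 and -1, so a stored 1 does not read back as 1. A minimal sketch of the effect follows; the struct, field, and function names are hypothetical and are not taken from the btrfs source.

```c
/* Hypothetical flag structs; not the actual btrfs scrub_page layout. */
struct flags_signed {
    signed int flag : 1;    /* 1-bit signed: representable values are 0 and -1 */
};

struct flags_unsigned {
    unsigned int flag : 1;  /* 1-bit unsigned: representable values are 0 and 1 */
};

/* Store v in the signed 1-bit field and read it back. */
static int roundtrip_signed(int v)
{
    struct flags_signed f;

    f.flag = v;             /* out-of-range conversion is implementation-defined */
    return f.flag;
}

/* Store v in the unsigned 1-bit field and read it back. */
static int roundtrip_unsigned(int v)
{
    struct flags_unsigned f;

    f.flag = v;             /* keeps only the lowest bit of v */
    return f.flag;
}
```

With the unsigned variant a stored 1 compares equal to 1. With the signed variant it cannot, which is why code such as `if (f.flag == 1)` silently misbehaves.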
    • Merge tag 'fs_for_v5.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs · db850a9b
      Linus Torvalds authored
      Pull UDF fix from Jan Kara:
       "A fix for a long-standing UDF bug where we were not properly
        validating directory position inside readdir"
      
      * tag 'fs_for_v5.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
        udf: Fix crash after seekdir
      db850a9b
    • Merge tag 'fs.idmapped.v5.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux · 7cf7eed1
      Linus Torvalds authored
      Pull setattr idmapping fix from Christian Brauner:
       "This contains a simple fix for setattr. When determining the validity
        of the attributes the ia_{g,u}id fields contain the value that will be
        written to inode->i_{g,u}id. When the {g,u}id attribute of the file
        isn't altered and the caller's fs{g,u}id matches the current {g,u}id
        attribute the attribute change is allowed.
      
        The value in ia_{g,u}id does already account for idmapped mounts and
        will have taken the relevant idmapping into account. So in order to
        verify that the {g,u}id attribute isn't changed we simply need to
        compare the ia_{g,u}id value against the inode's i_{g,u}id value.
      
        This only has any meaning for idmapped mounts as idmapping helpers are
        idempotent without them. And for idmapped mounts this really only has
        a meaning when circular idmappings are used, i.e. mappings where e.g.
        id 1000 is mapped to id 1001 and id 1001 is mapped to id 1000. Such
        circular mappings can e.g. be useful when sharing the same home
        directory between multiple users at the same time.
      
        Before this patch we could end up denying legitimate attribute changes
        and allowing invalid attribute changes when circular mappings are
        used. To even get into this situation the caller must've been
        privileged both to create that mapping and to create that idmapped
        mount.
      
        This hasn't been seen in the wild anywhere but came up when expanding
        the fstest suite during work on a series of hardening patches. All
        idmapped fstests pass without any regressions and we're adding new
        tests to verify the behavior of circular mappings.
      
        The new tests can be found at [1]"
      
      Link: https://lore.kernel.org/linux-fsdevel/20211109145713.1868404-2-brauner@kernel.org [1]
      
      * tag 'fs.idmapped.v5.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
        fs: handle circular mappings correctly
      7cf7eed1
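The effect of a circular mapping on the "is the id unchanged?" check can be illustrated with a toy model. This is only a sketch under stated assumptions: `map_id()` is a hypothetical stand-in for an idmapping that swaps 1000 and 1001, and the two check functions are simplified illustrations, not the kernel's setattr code.

```c
#include <stdbool.h>

/* Hypothetical circular idmapping: 1000 <-> 1001, everything else unchanged. */
static unsigned int map_id(unsigned int id)
{
    if (id == 1000)
        return 1001;
    if (id == 1001)
        return 1000;
    return id;
}

/*
 * The value to be written (ia_id) has already been translated, so
 * "unchanged" means it equals the id currently stored in the inode.
 */
static bool unchanged_direct(unsigned int ia_id, unsigned int stored_id)
{
    return ia_id == stored_id;
}

/* Illustrative faulty variant: translates the stored id before comparing. */
static bool unchanged_translated(unsigned int ia_id, unsigned int stored_id)
{
    return ia_id == map_id(stored_id);
}
```

With a stored id of 1000, the faulty variant reports a write of 1000 as a change and a write of 1001 as no change, which mirrors the "denying legitimate changes and allowing invalid changes" symptom described in the message. Without a circular mapping both variants agree.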
    • Merge tag 'for-5.16/parisc-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux · a6a6d227
      Linus Torvalds authored
      Pull parisc fixes from Helge Deller:
       "parisc bug and warning fixes and wire up futex_waitv.
      
        Fix some warnings which showed up with allmodconfig builds, a revert
        of a change to the sigreturn trampoline which broke signal handling,
        wire up futex_waitv and add CONFIG_PRINTK_TIME=y to 32bit defconfig"
      
      * tag 'for-5.16/parisc-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
        parisc: Enable CONFIG_PRINTK_TIME=y in 32bit defconfig
        Revert "parisc: Reduce sigreturn trampoline to 3 instructions"
        parisc: Wrap assembler related defines inside __ASSEMBLY__
        parisc: Wire up futex_waitv
        parisc: Include stringify.h to avoid build error in crypto/api.c
        parisc/sticon: fix reverse colors
      a6a6d227
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · c46e8ece
      Linus Torvalds authored
      Pull KVM fixes from Paolo Bonzini:
       "Selftest changes:
      
         - Cleanups for the perf test infrastructure and mapping hugepages
      
         - Avoid contention on mmap_sem when the guests start to run
      
         - Add event channel upcall support to xen_shinfo_test
      
        x86 changes:
      
         - Fixes for Xen emulation
      
         - Kill kvm_map_gfn() / kvm_unmap_gfn() and broken gfn_to_pfn_cache
      
         - Fixes for migration of 32-bit nested guests on 64-bit hypervisor
      
         - Compilation fixes
      
         - More SEV cleanups
      
        Generic:
      
         - Cap the return value of KVM_CAP_NR_VCPUS to both KVM_CAP_MAX_VCPUS
           and num_online_cpus(). Most architectures were only using one of
           the two"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (42 commits)
        KVM: x86: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
        KVM: s390: Cap KVM_CAP_NR_VCPUS by num_online_cpus()
        KVM: RISC-V: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
        KVM: PPC: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
        KVM: MIPS: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
        KVM: arm64: Cap KVM_CAP_NR_VCPUS by kvm_arm_default_max_vcpus()
        KVM: x86: Assume a 64-bit hypercall for guests with protected state
        selftests: KVM: Add /x86_64/sev_migrate_tests to .gitignore
        riscv: kvm: fix non-kernel-doc comment block
        KVM: SEV: Fix typo in and tweak name of cmd_allowed_from_miror()
        KVM: SEV: Drop a redundant setting of sev->asid during initialization
        KVM: SEV: WARN if SEV-ES is marked active but SEV is not
        KVM: SEV: Set sev_info.active after initial checks in sev_guest_init()
        KVM: SEV: Disallow COPY_ENC_CONTEXT_FROM if target has created vCPUs
        KVM: Kill kvm_map_gfn() / kvm_unmap_gfn() and gfn_to_pfn_cache
        KVM: nVMX: Use a gfn_to_hva_cache for vmptrld
        KVM: nVMX: Use kvm_read_guest_offset_cached() for nested VMCS check
        KVM: x86/xen: Use sizeof_field() instead of open-coding it
        KVM: nVMX: Use kvm_{read,write}_guest_cached() for shadow_vmcs12
        KVM: x86/xen: Fix get_attr of KVM_XEN_ATTR_TYPE_SHARED_INFO
        ...
      c46e8ece
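The "Generic" item above amounts to reporting the smaller of two limits. A minimal sketch, assuming the two parameters stand in for the result of num_online_cpus() and the value reported for KVM_CAP_MAX_VCPUS; the function name is hypothetical:

```c
/*
 * Hypothetical helper: the recommended vCPU count is capped by both the
 * number of online host CPUs and the maximum vCPU count.
 */
static unsigned int recommended_vcpus(unsigned int online_cpus,
                                      unsigned int max_vcpus)
{
    return online_cpus < max_vcpus ? online_cpus : max_vcpus;
}
```

On a host with more online CPUs than the vCPU maximum, the maximum wins. On a small host, the online CPU count wins.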
    • Merge tag 'docs-5.16-2' of git://git.lwn.net/linux · 4ae275bc
      Linus Torvalds authored
      Pull documentation fixes from Jonathan Corbet:
       "A handful of documentation fixes for 5.16"
      
      * tag 'docs-5.16-2' of git://git.lwn.net/linux:
        Documentation/process: fix a cross reference
        Documentation: update vcpu-requests.rst reference
        docs: accounting: update delay-accounting.rst reference
        libbpf: update index.rst reference
        docs: filesystems: Fix grammatical error "with" to "which"
        doc/zh_CN: fix a translation error in management-style
        docs: ftrace: fix the wrong path of tracefs
        Documentation: arm: marvell: Fix link to armada_1000_pb.pdf document
        Documentation: arm: marvell: Put Armada XP section between Armada 370 and 375
        Documentation: arm: marvell: Add some links to homepage / product infos
        docs: Update Sphinx requirements
      4ae275bc
    • Merge tag 'printk-for-5.16-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux · 7d5775d4
      Linus Torvalds authored
      Pull printk fixes from Petr Mladek:
      
       - Try to flush backtraces from other CPUs also on the local one. This
         was a regression caused by printk_safe buffers removal.
      
       - Remove header dependency warning.
      
      * tag 'printk-for-5.16-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
        printk: Remove printk.h inclusion in percpu.h
        printk: restore flushing of NMI buffers on remote CPUs after NMI backtraces
      7d5775d4
    • ptp: ocp: Fix a couple NULL vs IS_ERR() checks · c7521d3a
      Dan Carpenter authored
      The ptp_ocp_get_mem() function does not return NULL, it returns error
      pointers.
      
      Fixes: 773bda96 ("ptp: ocp: Expose various resources on the timecard.")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c7521d3a
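The class of bug fixed here can be shown with a userspace sketch of the error-pointer convention. The lowercase helpers below are simplified stand-ins for the kernel's ERR_PTR(), IS_ERR() and PTR_ERR(); `get_mem_sketch()` is a hypothetical function that, like the one described above, returns an error pointer and never NULL.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095

/* Userspace stand-ins for the kernel's ERR_PTR(), IS_ERR() and PTR_ERR(). */
static void *err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

static int is_err(const void *p)
{
    /* Error pointers occupy the top SKETCH_MAX_ERRNO addresses. */
    return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

static long ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

static char fake_region[16];

/* Hypothetical resource lookup: returns an error pointer, never NULL. */
static void *get_mem_sketch(int fail)
{
    if (fail)
        return err_ptr(-ENOMEM);
    return fake_region;
}

/* Wrong: a NULL test never fires for an error pointer. */
static long check_with_null(const void *p)
{
    return p == NULL ? -ENOMEM : 0;
}

/* Right: test with is_err() and propagate the encoded errno. */
static long check_with_is_err(const void *p)
{
    return is_err(p) ? ptr_err(p) : 0;
}
```

In the failing case `check_with_null()` reports success and the caller would go on to dereference an invalid pointer, while `check_with_is_err()` returns the encoded error.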
    • Merge branch 'lan78xx-napi' · bb8cecf8
      David S. Miller authored
      John Efstathiades says:
      
      ===================
      lan78xx NAPI Performance Improvements
      
      This patch set introduces a set of changes to the lan78xx driver
      that were originally developed as part of an investigation into
      the performance of TCP and UDP transfers on an Android system.
      The changes increase the throughput of both UDP and TCP transfers
      and reduce the overall CPU load.
      
      These improvements are also seen on a standard Linux kernel. Typical
      results are included at the end of this document.
      
      The changes to the driver evolved over time. The patches presented
      here attempt to organise the changes into coherent blocks that
      affect logically connected parts of the driver. The patches do not
      reflect the way in which the code evolved during the performance
      investigation.
      
      Each patch produces a working driver that has an incremental
      improvement but patches 2, 3 and 6 should be considered a single
      update.
      
      The changes affect the following parts of the driver:
      
      1. Deferred URB processing
      
      The deferred URB processing that was originally done by a tasklet
      is now done by a NAPI polling routine. The NAPI cycle has a fixed
      work budget that controls how many received frames are passed to
      the network stack.
      
      Patch 6 introduces the NAPI polling but depends on preceding patches.
      
      The new NAPI polling routine is also responsible for submitting
      Rx and Tx URBs to the USB host controller.
      
      Moving the URB processing to a NAPI-based system "smoothed"
      incoming and outgoing data flows on the Android system under
      investigation. However, taken in isolation, moving from a tasklet
      approach to a NAPI approach made little or no difference to the
      overall performance.
      
      2. URB buffer management
      
      The driver creates a pool of Tx and a pool of Rx URB buffers. Each
      buffer is large enough to accommodate a packet with the maximum MTU
      data. URBs are allocated from these pools as required.
      
      Patch 2 introduces the new Tx buffer pool.
      Patch 3 introduces the new Rx buffer pool.
      
      3. Tx pending data
      
      SKBs containing data to be transmitted are added to a queue. The
      driver tracks free Tx URBs and the corresponding free Tx URB space.
      When new Tx URBs are submitted, pending data is copied into the
      URB buffer until the URB buffer is filled or there is no more
      pending data. This maximises utilisation of the LAN78xx internal
      USB and network frame buffers.
      
      New Tx URBs are submitted to the USB host controller as part of the
      NAPI polling cycle.
      
      Patch 2 introduces these changes.
      
      4. Rx URB completion
      
      A new URB is no longer submitted as part of the URB completion
      callback.
      New URBs are submitted during the NAPI polling cycle.
      
      Patch 3 introduces these changes.
      
      5. Rx URB processing
      
      Completed URBs are put on to a queue for processing (as is done in the
      current driver). Network packets in completed URBs are copied from
      the URB buffer into dynamically allocated SKBs and passed to
      the network stack.
      
      The emptied URBs are resubmitted to the USB host controller.
      
      Patch 3 introduces this change. Patch 6 updates the change to use
      NAPI SKBs.
      
      Each packet passed to the network stack is a single NAPI work item.
      If the NAPI work budget is exhausted the remaining packets in the
      URB are put onto an overflow queue that is processed at the start
      of the next NAPI cycle.
      
      Patch 6 introduces this change.
      
      6. Driver-specific hard_header_len
      
      The driver-specific hard_header_len adjustment was removed as it
      broke generic receive offload (GRO) processing. Moreover, it was no
      longer required due to the change in Tx pending data management (see
      point 3. above).
      
      Patch 5 introduces this change.
      
      The modification has been tested on four different target machines:
      
      Target           |    CPU     |   ARCH  | cores | kernel |  RAM  |
      -----------------+------------+---------+-------+--------+-------|
      Raspberry Pi 4B  | Cortex-A72 | aarch64 |   4   | 64-bit |  2 GB |
      Nitrogen8M SBC   | Cortex-A53 | aarch64 |   4   | 64-bit |  2 GB |
      Compaq Presario  | Pentium D  | i686    |   2   | 32-bit |  4 GB |
      Dell T3620       | Core i3    | x86_64  |  2+2  | 64-bit | 16 GB |
      
      The targets, apart from the Compaq, each have an on-chip USB3 host
      controller. A PCIe-based USB3 host controller card was added to the
      Compaq to provide the necessary USB3 host interface.
      
      The network throughput was measured using iperf3. The peer device was
      a second Dell T3620 fitted with an Intel i210 network interface. The
      target machine and the peer device were connected via a Netgear GS105
      gigabit switch.
      
      The CPU load was measured using mpstat running on the target machine.
      
      The tables below summarise the throughput and CPU load improvements
      achieved by the updated driver.
      
      The bandwidth is the average bandwidth reported by iperf3 at the end
      of a 60-second test.
      
      The percentage idle figure is the average idle reported across all
      CPU cores on the target machine for the duration of the test.
      
      TCP Rx (target receiving, peer transmitting)
      
                       |   Standard Driver  |   NAPI Driver      |
      Target           | Bandwidth | % Idle | Bandwidth | % Idle |
      -----------------+-----------+--------+-----------+--------|
      RPi4 Model B     |    941    |  74.9  |    941    |  91.5  |
      Nitrogen8M       |    941    |  76.2  |    941    |  92.7  |
      Compaq Presario  |    941    |  44.5  |    941    |  82.1  |
      Dell T3620       |    941    |  88.9  |    941    |  98.3  |
      
      TCP Tx (target transmitting, peer receiving)
      
                       |   Standard Driver  |   NAPI Driver      |
      Target           | Bandwidth | % Idle | Bandwidth | % Idle |
      -----------------+-----------+--------+-----------+--------|
      RPi4 Model B     |    683    |  80.1  |    942    |  97.6  |
      Nitrogen8M       |    942    |  97.8  |    942    |  97.3  |
      Compaq Presario  |    939    |  80.0  |    942    |  91.2  |
      Dell T3620       |    942    |  95.3  |    942    |  97.6  |
      
      UDP Rx (target receiving, peer transmitting)
      
                       |   Standard Driver  |   NAPI Driver      |
      Target           | Bandwidth | % Idle | Bandwidth | % Idle |
      -----------------+-----------+--------+-----------+--------|
      RPi4 Model B     |     -     |    -   | 958 (0%)  |  76.2  |
      Nitrogen8M       | 690 (25%) |  57.7  | 937 (0%)  |  68.5  |
      Compaq Presario  | 958 (0%)  |  50.2  | 958 (0%)  |  61.6  |
      Dell T3620       | 958 (0%)  |  89.6  | 958 (0%)  |  85.3  |
      
      The figure in brackets is the percentage packet loss.
      
      UDP Tx (target transmitting, peer receiving)
      
                       |   Standard Driver  |   NAPI Driver      |
      Target           | Bandwidth | % Idle | Bandwidth | % Idle |
      -----------------+-----------+--------+-----------+--------|
      RPi4 Model B     |    370    |  75.0  |    886    |  78.9  |
      Nitrogen8M       |    710    |  75.0  |    958    |  85.3  |
      Compaq Presario  |    958    |  65.5  |    958    |  76.6  |
      Dell T3620       |    958    |  97.0  |    958    |  97.3  |
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bb8cecf8
    • lan78xx: Introduce NAPI polling support · ec4c7e12
      John Efstathiades authored
      This patch introduces a NAPI-style approach for processing completed
      Rx URBs that contributes to improving driver throughput and reducing
      CPU load.
      
      Packets in completed URBs are copied to NAPI SKBs and passed to the
      network stack for processing. Each frame passed to the stack is one
      work item in the NAPI budget.
      
      If the NAPI budget is consumed and frames remain, they are added to
      an overflow queue that is processed at the start of the next NAPI
      polling cycle.
      
      The NAPI handler is also responsible for copying pending Tx data to
      Tx URBs and submitting them to the USB host controller for
      transmission.
      Signed-off-by: John Efstathiades <john.efstathiades@pebblebay.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ec4c7e12
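The budget and overflow behaviour described in this commit message can be modelled with a small counter-based sketch. It tracks only frame counts; `struct rx_state` and `napi_poll_sketch()` are hypothetical names, and the real driver operates on SKB queues rather than integers.

```c
/* Frames left over from the previous polling cycle (the overflow queue). */
struct rx_state {
    int overflow;
};

/*
 * One polling cycle: drain the overflow queue first, then take newly
 * completed frames until the budget is used up. Frames that do not fit
 * are carried over to the next cycle. Returns the number of frames passed
 * to the stack, which never exceeds budget.
 */
static int napi_poll_sketch(struct rx_state *st, int new_frames, int budget)
{
    int done = 0;
    int take;

    /* Overflow from the previous cycle is processed first. */
    take = st->overflow < budget ? st->overflow : budget;
    st->overflow -= take;
    done += take;

    /* Each new frame consumes one unit of the remaining budget. */
    take = new_frames < budget - done ? new_frames : budget - done;
    done += take;

    /* Whatever did not fit waits for the next cycle. */
    st->overflow += new_frames - take;

    return done;
}
```

For example, with a budget of 64 and 100 completed frames, one cycle delivers 64 frames and leaves 36 on the overflow queue, which the next cycle handles before any new frames.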
    • lan78xx: Remove hardware-specific header update · 0dd87266
      John Efstathiades authored
      Remove hardware-specific header length adjustment as it is no longer
      required. It also breaks generic receive offload (GRO) processing of
      received TCP frames that results in a TCP ACK being sent for each
      received frame.
      Signed-off-by: John Efstathiades <john.efstathiades@pebblebay.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0dd87266
    • lan78xx: Re-order rx_submit() to remove forward declaration · 9d2da721
      John Efstathiades authored
      Move position of rx_submit() to remove forward declaration of
      rx_complete() which is now no longer required.
      Signed-off-by: John Efstathiades <john.efstathiades@pebblebay.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9d2da721
    • lan78xx: Introduce Rx URB processing improvements · c450a8eb
      John Efstathiades authored
      This patch introduces a new approach to allocating and managing
      Rx URBs that contributes to improving driver throughput and reducing
      CPU load.
      
      A pool of Rx URBs is created during driver instantiation. All the
      URBs are initially submitted to the USB host controller for
      processing.
      
      The default URB buffer size is different for each USB bus speed.
      The chosen sizes provide good USB utilisation with little impact on
      overall packet latency.
      
      Completed URBs are processed in the driver bottom half. The URB
      buffer contents are copied to a dynamically allocated SKB, which is
      then passed to the network stack. The URB is then re-submitted to
      the USB host controller.
      
      NOTE: the call to skb_copy() in rx_process() that copies the URB
      contents to a new SKB is a temporary change to make this patch work
      in its own right. This call will be removed when the NAPI processing
      is introduced by patch 6 in this patch set.
      Signed-off-by: John Efstathiades <john.efstathiades@pebblebay.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c450a8eb
    • lan78xx: Introduce Tx URB processing improvements · d383216a
      John Efstathiades authored
      This patch introduces a new approach to allocating and managing
      Tx URBs that contributes to improving driver throughput and reducing
      CPU load.
      
      A pool of Tx URBs is created during driver instantiation. A URB is
      allocated from the pool when there is data to transmit. The URB is
      released back to the pool when the data has been transmitted by the
      device.
      
      The default URB buffer size is different for each USB bus speed.
      The chosen sizes provide good USB utilisation with little impact on
      overall packet latency.
      
      SKBs to be transmitted are added to a pending queue for processing.
      The driver tracks the available Tx URB buffer space and copies as
      much pending data as possible into each free URB. Each full URB
      is then submitted to the USB host controller for transmission.
      Signed-off-by: John Efstathiades <john.efstathiades@pebblebay.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d383216a
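The "copy as much pending data as possible into each free URB" step can be sketched as a simple packing loop. This is an illustration only: it assumes whole packets are copied, ignores any per-packet header or alignment the hardware may require, and uses hypothetical names (`fill_urb_sketch`, `struct fill_result`) that do not come from the driver.

```c
#include <stddef.h>

/* How many pending packets fit in one URB buffer, and how many bytes they use. */
struct fill_result {
    size_t packets;
    size_t bytes;
};

/*
 * Walk the pending queue in order and stop at the first packet that no
 * longer fits, or when the queue is empty.
 */
static struct fill_result fill_urb_sketch(const size_t *pkt_len,
                                          size_t n_pending,
                                          size_t urb_size)
{
    struct fill_result r = { 0, 0 };

    while (r.packets < n_pending &&
           r.bytes + pkt_len[r.packets] <= urb_size) {
        r.bytes += pkt_len[r.packets];
        r.packets++;
    }

    return r;
}
```

With three pending 600-byte packets and a 1500-byte buffer, two packets (1200 bytes) go into the first URB and the third waits for the next free URB.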