1. 07 May, 2020 30 commits
    • Linus Torvalds's avatar
      Merge tag 'trace-v5.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · 192ffb75
      Linus Torvalds authored
      Pull tracing fixes from Steven Rostedt:
      
       - Fix bootconfig causing kernels to fail with CONFIG_BLK_DEV_RAM
         enabled
      
       - Fix allocation leaks in bootconfig tool
      
       - Fix a double initialization of a variable
      
       - Fix API bootconfig usage from kprobe boot time events
      
       - Reject NULL location for kprobes
      
       - Fix crash caused by preempt delay module not cleaning up kthread
         correctly
      
       - Add vmalloc_sync_mappings() to prevent x86_64 page faults from
         recursively faulting from tracing page faults
      
       - Fix comment in gpu/trace kerneldoc header
      
       - Fix documentation of how to create a trace event class
      
       - Make the local tracing_snapshot_instance_cond() function static
      
      * tag 'trace-v5.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        tools/bootconfig: Fix resource leak in apply_xbc()
        tracing: Make tracing_snapshot_instance_cond() static
        tracing: Fix doc mistakes in trace sample
        gpu/trace: Minor comment updates for gpu_mem_total tracepoint
        tracing: Add a vmalloc_sync_mappings() for safe measure
        tracing: Wait for preempt irq delay thread to finish
        tracing/kprobes: Reject new event if loc is NULL
        tracing/boottime: Fix kprobe event API usage
        tracing/kprobes: Fix a double initialization typo
        bootconfig: Fix to remove bootconfig data from initrd while boot
      192ffb75
    • Linus Torvalds's avatar
      Merge tag 'linux-kselftest-5.7-rc5' of... · 9ecc4d77
      Linus Torvalds authored
      Merge tag 'linux-kselftest-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
      
      Pull kselftest fixes from Shuah Khan:
       "ftrace test fixes and a fix to kvm Makefile for relocatable
        native/cross builds and installs"
      
      * tag 'linux-kselftest-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
        selftests: fix kvm relocatable native/cross builds and installs
        selftests/ftrace: Make XFAIL green color
        ftrace/selftest: make unresolved cases cause failure if --fail-unresolved set
        ftrace/selftests: workaround cgroup RT scheduling issues
      9ecc4d77
    • Yunfeng Ye's avatar
      tools/bootconfig: Fix resource leak in apply_xbc() · 88426044
      Yunfeng Ye authored
      Fix the @data and @fd allocations that are leaked in the error path of
      apply_xbc().
      
      Link: http://lkml.kernel.org/r/583a49c9-c27a-931d-e6c2-6f63a4b18bea@huawei.comAcked-by: default avatarMasami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: default avatarYunfeng Ye <yeyunfeng@huawei.com>
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      88426044
    • Zou Wei's avatar
      tracing: Make tracing_snapshot_instance_cond() static · 192b7993
      Zou Wei authored
      Fix the following sparse warning:
      
      kernel/trace/trace.c:950:6: warning: symbol 'tracing_snapshot_instance_cond'
      was not declared. Should it be static?
      
      Link: http://lkml.kernel.org/r/1587614905-48692-1-git-send-email-zou_wei@huawei.comReported-by: default avatarHulk Robot <hulkci@huawei.com>
      Signed-off-by: default avatarZou Wei <zou_wei@huawei.com>
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      192b7993
    • Wei Yang's avatar
      tracing: Fix doc mistakes in trace sample · f094a233
      Wei Yang authored
      As the example below shows, DECLARE_EVENT_CLASS() is used instead of
      DEFINE_EVENT_CLASS().
      
      Link: http://lkml.kernel.org/r/20200428214959.11259-1-richard.weiyang@gmail.comSigned-off-by: default avatarWei Yang <richard.weiyang@gmail.com>
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      f094a233
    • Yiwei Zhang's avatar
      gpu/trace: Minor comment updates for gpu_mem_total tracepoint · 386c82a7
      Yiwei Zhang authored
      This change updates the improper comment for the 'size' attribute in the
      tracepoint definition. Most gfx drivers pre-fault in physical pages
      instead of making virtual allocations. So we drop the 'Virtual' keyword
      here and leave this to the implementations.
      
      Link: http://lkml.kernel.org/r/20200428220825.169606-1-zzyiwei@google.comSigned-off-by: default avatarYiwei Zhang <zzyiwei@google.com>
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      386c82a7
    • Steven Rostedt (VMware)'s avatar
      tracing: Add a vmalloc_sync_mappings() for safe measure · 11f5efc3
      Steven Rostedt (VMware) authored
      x86_64 lazily maps in the vmalloc pages, and the way this works with per_cpu
      areas can be complex, to say the least. Mappings may happen at boot up, and
      if nothing synchronizes the page tables, those page mappings may not be
      synced till they are used. This causes issues for anything that might touch
      one of those mappings in the path of the page fault handler. When one of
      those unmapped mappings is touched in the page fault handler, it will cause
      another page fault, which in turn will cause a page fault, and leave us in
      a loop of page faults.
      
      Commit 763802b5 ("x86/mm: split vmalloc_sync_all()") split
      vmalloc_sync_all() into vmalloc_sync_unmappings() and
      vmalloc_sync_mappings(), as on system exit, it did not need to do a full
      sync on x86_64 (although it still needed to be done on x86_32). By chance,
      the vmalloc_sync_all() would synchronize the page mappings done at boot up
      and prevent the per cpu area from being a problem for tracing in the page
      fault handler. But when that synchronization in the exit of a task became a
      nop, it caused the problem to appear.
      
      Link: https://lore.kernel.org/r/20200429054857.66e8e333@oasis.local.home
      
      Cc: stable@vger.kernel.org
      Fixes: 737223fb ("tracing: Consolidate buffer allocation code")
      Reported-by: default avatar"Tzvetomir Stoyanov (VMware)" <tz.stoyanov@gmail.com>
      Suggested-by: default avatarJoerg Roedel <jroedel@suse.de>
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      11f5efc3
    • Steven Rostedt (VMware)'s avatar
      tracing: Wait for preempt irq delay thread to finish · d16a8c31
      Steven Rostedt (VMware) authored
      Running on a slower machine, it is possible that the preempt delay kernel
      thread may still be executing if the module was immediately removed after
      added, and this can cause the kernel to crash as the kernel thread might be
      executing after its code has been removed.
      
      There's no reason that the caller of the code shouldn't just wait for the
      delay thread to finish, as the thread can also be created by a trigger in
      the sysfs code, which also has the same issues.
      
      Link: http://lore.kernel.org/r/5EA2B0C8.2080706@cn.fujitsu.com
      
      Cc: stable@vger.kernel.org
      Fixes: 79393723 ("lib: Add module for testing preemptoff/irqsoff latency tracers")
      Reported-by: default avatarXiao Yang <yangx.jy@cn.fujitsu.com>
      Reviewed-by: default avatarXiao Yang <yangx.jy@cn.fujitsu.com>
      Reviewed-by: default avatarJoel Fernandes <joel@joelfernandes.org>
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      d16a8c31
    • Linus Torvalds's avatar
      Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 6e7f2eac
      Linus Torvalds authored
      Pull arm64 fix from Catalin Marinas:
       "Avoid potential NULL dereference in huge_pte_alloc() on pmd_alloc()
        failure"
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64: hugetlb: avoid potential NULL dereference
      6e7f2eac
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 8c16ec94
      Linus Torvalds authored
      Pull kvm fixes from Paolo Bonzini:
       "Bugfixes, mostly for ARM and AMD, and more documentation.
      
        Slightly bigger than usual because I couldn't send out what was
        pending for rc4, but there is nothing worrisome going on. I have more
        fixes pending for guest debugging support (gdbstub) but I will send
        them next week"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (22 commits)
        KVM: X86: Declare KVM_CAP_SET_GUEST_DEBUG properly
        KVM: selftests: Fix build for evmcs.h
        kvm: x86: Use KVM CPU capabilities to determine CR4 reserved bits
        KVM: VMX: Explicitly clear RFLAGS.CF and RFLAGS.ZF in VM-Exit RSB path
        docs/virt/kvm: Document configuring and running nested guests
        KVM: s390: Remove false WARN_ON_ONCE for the PQAP instruction
        kvm: ioapic: Restrict lazy EOI update to edge-triggered interrupts
        KVM: x86: Fixes posted interrupt check for IRQs delivery modes
        KVM: SVM: fill in kvm_run->debug.arch.dr[67]
        KVM: nVMX: Replace a BUG_ON(1) with BUG() to squash clang warning
        KVM: arm64: Fix 32bit PC wrap-around
        KVM: arm64: vgic-v4: Initialize GICv4.1 even in the absence of a virtual ITS
        KVM: arm64: Save/restore sp_el0 as part of __guest_enter
        KVM: arm64: Delete duplicated label in invalid_vector
        KVM: arm64: vgic-its: Fix memory leak on the error path of vgic_add_lpi()
        KVM: arm64: vgic-v3: Retire all pending LPIs on vcpu destroy
        KVM: arm: vgic-v2: Only use the virtual state when userspace accesses pending bits
        KVM: arm: vgic: Only use the virtual state when userspace accesses enable bits
        KVM: arm: vgic: Synchronize the whole guest on GIC{D,R}_I{S,C}ACTIVER read
        KVM: arm64: PSCI: Forbid 64bit functions for 32bit guests
        ...
      8c16ec94
    • Linus Torvalds's avatar
      Merge tag 'configfs-for-5.7' of git://git.infradead.org/users/hch/configfs · de268ccb
      Linus Torvalds authored
      Pull configfs fix from Christoph Hellwig:
       "Fix a refcount leak in configfs_rmdir (Xiyu Yang)"
      
      * tag 'configfs-for-5.7' of git://git.infradead.org/users/hch/configfs:
        configfs: fix config_item refcnt leak in configfs_rmdir()
      de268ccb
    • Mark Rutland's avatar
      arm64: hugetlb: avoid potential NULL dereference · 027d0c71
      Mark Rutland authored
      The static analyzer in GCC 10 spotted that in huge_pte_alloc() we may
      pass a NULL pmdp into pte_alloc_map() when pmd_alloc() returns NULL:
      
      |   CC      arch/arm64/mm/pageattr.o
      |   CC      arch/arm64/mm/hugetlbpage.o
      |                  from arch/arm64/mm/hugetlbpage.c:10:
      | arch/arm64/mm/hugetlbpage.c: In function ‘huge_pte_alloc’:
      | ./arch/arm64/include/asm/pgtable-types.h:28:24: warning: dereference of NULL ‘pmdp’ [CWE-690] [-Wanalyzer-null-dereference]
      | ./arch/arm64/include/asm/pgtable.h:436:26: note: in expansion of macro ‘pmd_val’
      | arch/arm64/mm/hugetlbpage.c:242:10: note: in expansion of macro ‘pte_alloc_map’
      |     |arch/arm64/mm/hugetlbpage.c:232:10:
      |     |./arch/arm64/include/asm/pgtable-types.h:28:24:
      | ./arch/arm64/include/asm/pgtable.h:436:26: note: in expansion of macro ‘pmd_val’
      | arch/arm64/mm/hugetlbpage.c:242:10: note: in expansion of macro ‘pte_alloc_map’
      
      This can only occur when the kernel cannot allocate a page, and so is
      unlikely to happen in practice before other systems start failing.
      
      We can avoid this by bailing out if pmd_alloc() fails, as we do earlier
      in the function if pud_alloc() fails.
      
      Fixes: 66b3923a ("arm64: hugetlb: add support for PTE contiguous bit")
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Reported-by: default avatarKyrill Tkachov <kyrylo.tkachov@arm.com>
      Cc: <stable@vger.kernel.org> # 4.5.x-
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      027d0c71
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · a811c1fa
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Fix reference count leaks in various parts of batman-adv, from Xiyu
          Yang.
      
       2) Update NAT checksum even when it is zero, from Guillaume Nault.
      
       3) sk_psock reference count leak in tls code, also from Xiyu Yang.
      
       4) Sanity check TCA_FQ_CODEL_DROP_BATCH_SIZE netlink attribute in
          fq_codel, from Eric Dumazet.
      
       5) Fix panic in choke_reset(), also from Eric Dumazet.
      
       6) Fix VLAN accel handling in bnxt_fix_features(), from Michael Chan.
      
       7) Disallow out of range quantum values in sch_sfq, from Eric Dumazet.
      
       8) Fix crash in x25_disconnect(), from Yue Haibing.
      
       9) Don't pass pointer to local variable back to the caller in
          nf_osf_hdr_ctx_init(), from Arnd Bergmann.
      
      10) Wireguard should use the ECN decap helper functions, from Toke
          Høiland-Jørgensen.
      
      11) Fix command entry leak in mlx5 driver, from Moshe Shemesh.
      
      12) Fix uninitialized variable access in mptcp's
          subflow_syn_recv_sock(), from Paolo Abeni.
      
      13) Fix unnecessary out-of-order ingress frame ordering in macsec, from
          Scott Dial.
      
      14) IPv6 needs to use a global serial number for dst validation just
          like ipv4, from David Ahern.
      
      15) Fix up PTP_1588_CLOCK deps, from Clay McClure.
      
      16) Missing NLM_F_MULTI flag in gtp driver netlink messages, from
          Yoshiyuki Kurauchi.
      
      17) Fix a regression in that dsa user port errors should not be fatal,
          from Florian Fainelli.
      
      18) Fix iomap leak in enetc driver, from Dejin Zheng.
      
      19) Fix use after free in lec_arp_clear_vccs(), from Cong Wang.
      
      20) Initialize protocol value earlier in neigh code paths when
          generating events, from Roman Mashak.
      
      21) netdev_update_features() must be called with RTNL mutex in macsec
          driver, from Antoine Tenart.
      
      22) Validate untrusted GSO packets even more strictly, from Willem de
          Bruijn.
      
      23) Wireguard decrypt worker needs a cond_resched(), from Jason
          Donenfeld.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (111 commits)
        net: flow_offload: skip hw stats check for FLOW_ACTION_HW_STATS_DONT_CARE
        MAINTAINERS: put DYNAMIC INTERRUPT MODERATION in proper order
        wireguard: send/receive: use explicit unlikely branch instead of implicit coalescing
        wireguard: selftests: initalize ipv6 members to NULL to squelch clang warning
        wireguard: send/receive: cond_resched() when processing worker ringbuffers
        wireguard: socket: remove errant restriction on looping to self
        wireguard: selftests: use normal kernel stack size on ppc64
        net: ethernet: ti: am65-cpsw-nuss: fix irqs type
        ionic: Use debugfs_create_bool() to export bool
        net: dsa: Do not leave DSA master with NULL netdev_ops
        net: dsa: remove duplicate assignment in dsa_slave_add_cls_matchall_mirred
        net: stricter validation of untrusted gso packets
        seg6: fix SRH processing to comply with RFC8754
        net: mscc: ocelot: ANA_AUTOAGE_AGE_PERIOD holds a value in seconds, not ms
        net: dsa: ocelot: the MAC table on Felix is twice as large
        net: dsa: sja1105: the PTP_CLK extts input reacts on both edges
        selftests: net: tcp_mmap: fix SO_RCVLOWAT setting
        net: hsr: fix incorrect type usage for protocol variable
        net: macsec: fix rtnl locking issue
        net: mvpp2: cls: Prevent buffer overflow in mvpp2_ethtool_cls_rule_del()
        ...
      a811c1fa
    • Pablo Neira Ayuso's avatar
      net: flow_offload: skip hw stats check for FLOW_ACTION_HW_STATS_DONT_CARE · 16f80360
      Pablo Neira Ayuso authored
      This patch adds FLOW_ACTION_HW_STATS_DONT_CARE which tells the driver
      that the frontend does not need counters, this hw stats type request
      never fails. The FLOW_ACTION_HW_STATS_DISABLED type explicitly requests
      the driver to disable the stats, however, if the driver cannot disable
      counters, it bails out.
      
      TCA_ACT_HW_STATS_* maintains the 1:1 mapping with FLOW_ACTION_HW_STATS_*
      except by disabled which is mapped to FLOW_ACTION_HW_STATS_DISABLED
      (this is 0 in tc). Add tc_act_hw_stats() to perform the mapping between
      TCA_ACT_HW_STATS_* and FLOW_ACTION_HW_STATS_*.
      
      Fixes: 319a1d19 ("flow_offload: check for basic action hw stats type")
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      16f80360
    • Lukas Bulwahn's avatar
      MAINTAINERS: put DYNAMIC INTERRUPT MODERATION in proper order · b0956956
      Lukas Bulwahn authored
      Commit 9b038086 ("docs: networking: convert DIM to RST") added a new
      file entry to DYNAMIC INTERRUPT MODERATION to the end, and not following
      alphabetical order.
      
      So, ./scripts/checkpatch.pl -f MAINTAINERS complains:
      
        WARNING: Misordered MAINTAINERS entry - list file patterns in alphabetic
        order
        #5966: FILE: MAINTAINERS:5966:
        +F:      lib/dim/
        +F:      Documentation/networking/net_dim.rst
      
      Reorder the file entries to keep MAINTAINERS nicely ordered.
      Signed-off-by: default avatarLukas Bulwahn <lukas.bulwahn@gmail.com>
      Acked-by: default avatarJakub Kicinski <kuba@kernel.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b0956956
    • David S. Miller's avatar
      Merge branch 'wireguard-fixes' · d3f3e6ac
      David S. Miller authored
      Jason A. Donenfeld says:
      
      ====================
      wireguard fixes for 5.7-rc5
      
      With Ubuntu and Debian having backported this into their kernels, we're
      finally seeing testing from places we hadn't seen prior, which is nice.
      With that comes more fixes:
      
      1) The CI for PPC64 was running with extremely small stacks for 64-bit,
         causing spurious crashes in surprising places.
      
      2) There's was an old leftover routing loop restriction, which no longer
         makes sense given the queueing architecture, and was causing problems
         for people who really did want nested routing.
      
      3) Not yielding our kthread on CONFIG_PREEMPT_VOLUNTARY systems caused
         RCU stalls and other issues, reported by Wang Jian, with the fix
         suggested by Sultan Alsawaf.
      
      4) Clang spewed warnings in a selftest for CONFIG_IPV6=n, reported by
         Arnd Bergmann.
      
      5) A complicated if statement was simplified to an assignment while also
         making the likely/unlikely hinting more correct and simple, and
         increasing readability, suggested by Sultan.
      
      Patches (2) and (3) have Fixes: lines and are probably good candidates
      for stable.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d3f3e6ac
    • Jason A. Donenfeld's avatar
      wireguard: send/receive: use explicit unlikely branch instead of implicit coalescing · 243f2148
      Jason A. Donenfeld authored
      It's very unlikely that send will become true. It's nearly always false
      between 0 and 120 seconds of a session, and in most cases becomes true
      only between 120 and 121 seconds before becoming false again. So,
      unlikely(send) is clearly the right option here.
      
      What happened before was that we had this complex boolean expression
      with multiple likely and unlikely clauses nested. Since this is
      evaluated left-to-right anyway, the whole thing got converted to
      unlikely. So, we can clean this up to better represent what's going on.
      
      The generated code is the same.
      Suggested-by: default avatarSultan Alsawaf <sultan@kerneltoast.com>
      Signed-off-by: default avatarJason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      243f2148
    • Jason A. Donenfeld's avatar
      wireguard: selftests: initalize ipv6 members to NULL to squelch clang warning · 4fed818e
      Jason A. Donenfeld authored
      Without setting these to NULL, clang complains in certain
      configurations that have CONFIG_IPV6=n:
      
      In file included from drivers/net/wireguard/ratelimiter.c:223:
      drivers/net/wireguard/selftest/ratelimiter.c:173:34: error: variable 'skb6' is uninitialized when used here [-Werror,-Wuninitialized]
                      ret = timings_test(skb4, hdr4, skb6, hdr6, &test_count);
                                                     ^~~~
      drivers/net/wireguard/selftest/ratelimiter.c:123:29: note: initialize the variable 'skb6' to silence this warning
              struct sk_buff *skb4, *skb6;
                                         ^
                                          = NULL
      drivers/net/wireguard/selftest/ratelimiter.c:173:40: error: variable 'hdr6' is uninitialized when used here [-Werror,-Wuninitialized]
                      ret = timings_test(skb4, hdr4, skb6, hdr6, &test_count);
                                                           ^~~~
      drivers/net/wireguard/selftest/ratelimiter.c:125:22: note: initialize the variable 'hdr6' to silence this warning
              struct ipv6hdr *hdr6;
                                  ^
      
      We silence this warning by setting the variables to NULL as the warning
      suggests.
      Reported-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarJason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4fed818e
    • Jason A. Donenfeld's avatar
      wireguard: send/receive: cond_resched() when processing worker ringbuffers · 4005f5c3
      Jason A. Donenfeld authored
      Users with pathological hardware reported CPU stalls on CONFIG_
      PREEMPT_VOLUNTARY=y, because the ringbuffers would stay full, meaning
      these workers would never terminate. That turned out not to be okay on
      systems without forced preemption, which Sultan observed. This commit
      adds a cond_resched() to the bottom of each loop iteration, so that
      these workers don't hog the core. Note that we don't need this on the
      napi poll worker, since that terminates after its budget is expended.
      Suggested-by: default avatarSultan Alsawaf <sultan@kerneltoast.com>
      Reported-by: default avatarWang Jian <larkwang@gmail.com>
      Fixes: e7096c13 ("net: WireGuard secure network tunnel")
      Signed-off-by: default avatarJason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4005f5c3
    • Jason A. Donenfeld's avatar
      wireguard: socket: remove errant restriction on looping to self · b673e24a
      Jason A. Donenfeld authored
      It's already possible to create two different interfaces and loop
      packets between them. This has always been possible with tunnels in the
      kernel, and isn't specific to wireguard. Therefore, the networking stack
      already needs to deal with that. At the very least, the packet winds up
      exceeding the MTU and is discarded at that point. So, since this is
      already something that happens, there's no need to forbid the not very
      exceptional case of routing a packet back to the same interface; this
      loop is no different than others, and we shouldn't special case it, but
      rather rely on generic handling of loops in general. This also makes it
      easier to do interesting things with wireguard such as onion routing.
      
      At the same time, we add a selftest for this, ensuring that both onion
      routing works and infinite routing loops do not crash the kernel. We
      also add a test case for wireguard interfaces nesting packets and
      sending traffic between each other, as well as the loop in this case
      too. We make sure to send some throughput-heavy traffic for this use
      case, to stress out any possible recursion issues with the locks around
      workqueues.
      
      Fixes: e7096c13 ("net: WireGuard secure network tunnel")
      Signed-off-by: default avatarJason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b673e24a
    • Jason A. Donenfeld's avatar
      wireguard: selftests: use normal kernel stack size on ppc64 · a0fd7cc8
      Jason A. Donenfeld authored
      While at some point it might have made sense to be running these tests
      on ppc64 with 4k stacks, the kernel hasn't actually used 4k stacks on
      64-bit powerpc in a long time, and more interesting things that we test
      don't really work when we deviate from the default (16k). So, we stop
      pushing our luck in this commit, and return to the default instead of
      the minimum.
      Signed-off-by: default avatarJason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a0fd7cc8
    • Grygorii Strashko's avatar
      net: ethernet: ti: am65-cpsw-nuss: fix irqs type · 6f5c27f9
      Grygorii Strashko authored
      The K3 INTA driver, which is source TX/RX IRQs for CPSW NUSS, defines IRQs
      triggering type as EDGE by default, but triggering type for CPSW NUSS TX/RX
      IRQs has to be LEVEL as the EDGE triggering type may cause unnecessary IRQs
      triggering and NAPI scheduling for empty queues. It was discovered with
      RT-kernel.
      
      Fix it by explicitly specifying CPSW NUSS TX/RX IRQ type as
      IRQF_TRIGGER_HIGH.
      
      Fixes: 93a76530 ("net: ethernet: ti: introduce am65x/j721e gigabit eth subsystem driver")
      Signed-off-by: default avatarGrygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6f5c27f9
    • Geert Uytterhoeven's avatar
      ionic: Use debugfs_create_bool() to export bool · 0735ccc9
      Geert Uytterhoeven authored
      Currently bool ionic_cq.done_color is exported using
      debugfs_create_u8(), which requires a cast, preventing further compiler
      checks.
      
      Fix this by switching to debugfs_create_bool(), and dropping the cast.
      Signed-off-by: default avatarGeert Uytterhoeven <geert+renesas@glider.be>
      Acked-by: default avatarShannon Nelson <snelson@pensando.io>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0735ccc9
    • Florian Fainelli's avatar
      net: dsa: Do not leave DSA master with NULL netdev_ops · 050569fc
      Florian Fainelli authored
      When ndo_get_phys_port_name() for the CPU port was added we introduced
      an early check for when the DSA master network device in
      dsa_master_ndo_setup() already implements ndo_get_phys_port_name(). When
      we perform the teardown operation in dsa_master_ndo_teardown() we would
      not be checking that cpu_dp->orig_ndo_ops was successfully allocated and
      non-NULL initialized.
      
      With network device drivers such as virtio_net, this leads to a NPD as
      soon as the DSA switch hanging off of it gets torn down because we are
      now assigning the virtio_net device's netdev_ops a NULL pointer.
      
      Fixes: da7b9e9b ("net: dsa: Add ndo_get_phys_port_name() for CPU port")
      Reported-by: default avatarAllen Pais <allen.pais@oracle.com>
      Signed-off-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Tested-by: default avatarAllen Pais <allen.pais@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      050569fc
    • Vladimir Oltean's avatar
      net: dsa: remove duplicate assignment in dsa_slave_add_cls_matchall_mirred · 65722159
      Vladimir Oltean authored
      This was caused by a poor merge conflict resolution on my side. The
      "act = &cls->rule->action.entries[0];" assignment was already present in
      the code prior to the patch mentioned below.
      
      Fixes: e13c2075 ("net: dsa: refactor matchall mirred action to separate function")
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      65722159
    • Willem de Bruijn's avatar
      net: stricter validation of untrusted gso packets · 9274124f
      Willem de Bruijn authored
      Syzkaller again found a path to a kernel crash through bad gso input:
      a packet with transport header extending beyond skb_headlen(skb).
      
      Tighten validation at kernel entry:
      
      - Verify that the transport header lies within the linear section.
      
          To avoid pulling linux/tcp.h, verify just sizeof tcphdr.
          tcp_gso_segment will call pskb_may_pull (th->doff * 4) before use.
      
      - Match the gso_type against the ip_proto found by the flow dissector.
      
      Fixes: bfd5f4a3 ("packet: Add GSO/csum offload support.")
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9274124f
    • Ahmed Abdelsalam's avatar
      seg6: fix SRH processing to comply with RFC8754 · 0cb7498f
      Ahmed Abdelsalam authored
      The Segment Routing Header (SRH) which defines the SRv6 dataplane is defined
      in RFC8754.
      
      RFC8754 (section 4.1) defines the SR source node behavior which encapsulates
      packets into an outer IPv6 header and SRH. The SR source node encodes the
      full list of Segments that defines the packet path in the SRH. Then, the
      first segment from list of Segments is copied into the Destination address
      of the outer IPv6 header and the packet is sent to the first hop in its path
      towards the destination.
      
      If the Segment list has only one segment, the SR source node can omit the SRH
      as he only segment is added in the destination address.
      
      RFC8754 (section 4.1.1) defines the Reduced SRH, when a source does not
      require the entire SID list to be preserved in the SRH. A reduced SRH does
      not contain the first segment of the related SR Policy (the first segment is
      the one already in the DA of the IPv6 header), and the Last Entry field is
      set to n-2, where n is the number of elements in the SR Policy.
      
      RFC8754 (section 4.3.1.1) defines the SRH processing and the logic to
      validate the SRH (S09, S10, S11) which works for both reduced and
      non-reduced behaviors.
      
      This patch updates seg6_validate_srh() to validate the SRH as per RFC8754.
      Signed-off-by: default avatarAhmed Abdelsalam <ahabdels@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0cb7498f
    • David S. Miller's avatar
      Merge branch 'FDB-fixes-for-Felix-and-Ocelot-switches' · 6e0ddb65
      David S. Miller authored
      Vladimir Oltean says:
      
      ====================
      FDB fixes for Felix and Ocelot switches
      
      This series fixes the following problems:
      - Dynamically learnt addresses never expiring (neither for Ocelot nor
        for Felix)
      - Half of the FDB not visible in 'bridge fdb show' (for Felix only)
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6e0ddb65
    • Vladimir Oltean's avatar
      net: mscc: ocelot: ANA_AUTOAGE_AGE_PERIOD holds a value in seconds, not ms · c0d7eccb
      Vladimir Oltean authored
      One may notice that automatically-learnt entries 'never' expire, even
      though the bridge configures the address age period at 300 seconds.
      
      Actually the value written to hardware corresponds to a time interval
      1000 times higher than intended, i.e. 83 hours.
      
      Fixes: a556c76a ("net: mscc: Add initial Ocelot switch support")
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: default avatarFlorian Faineli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c0d7eccb
    • Vladimir Oltean's avatar
      net: dsa: ocelot: the MAC table on Felix is twice as large · 21ce7f3e
      Vladimir Oltean authored
      When running 'bridge fdb dump' on Felix, sometimes learnt and static MAC
      addresses would appear, sometimes they wouldn't.
      
      Turns out, the MAC table has 4096 entries on VSC7514 (Ocelot) and 8192
      entries on VSC9959 (Felix), so the existing code from the Ocelot common
      library only dumped half of Felix's MAC table. They are both organized
      as a 4-way set-associative TCAM, so we just need a single variable
      indicating the correct number of rows.
      
      Fixes: 56051948 ("net: dsa: ocelot: add driver for Felix switch family")
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      21ce7f3e
  2. 06 May, 2020 10 commits