1. 07 May, 2020 21 commits
    • Peter Zijlstra's avatar
      livepatch: Remove .klp.arch · 1d05334d
      Peter Zijlstra authored
      After the previous patch, vmlinux-specific KLP relocations are now
      applied early during KLP module load.  This means that .klp.arch
      sections are no longer needed for *vmlinux-specific* KLP relocations.
      
      One might think they're still needed for *module-specific* KLP
      relocations.  If a to-be-patched module is loaded *after* its
      corresponding KLP module is loaded, any corresponding KLP relocations
      will be delayed until the to-be-patched module is loaded.  If any
      special sections (.parainstructions, for example) rely on those
      relocations, their initializations (apply_paravirt) need to be done
      afterwards.  Thus the apparent need for arch_klp_init_object_loaded()
      and its corresponding .klp.arch sections -- it allows some of the
      special section initializations to be done at a later time.
      
      But... if you look closer, that dependency between the special sections
      and the module-specific KLP relocations doesn't actually exist in
      reality.  Looking at the contents of the .altinstructions and
      .parainstructions sections, there's not a realistic scenario in which a
      KLP module's .altinstructions or .parainstructions section needs to
      access a symbol in a to-be-patched module.  It might need to access a
      local symbol or even a vmlinux symbol; but not another module's symbol.
      When a special section needs to reference a local or vmlinux symbol, a
      normal rela can be used instead of a KLP rela.
      
      Since the special section initializations don't actually have any real
      dependency on module-specific KLP relocations, .klp.arch and
      arch_klp_init_object_loaded() no longer have a reason to exist.  So
      remove them.
      
      As Peter said much more succinctly:
      
        So the reason for .klp.arch was that .klp.rela.* stuff would overwrite
        paravirt instructions. If that happens you're doing it wrong. Those
        RELAs are core kernel, not module, and thus should've happened in
        .rela.* sections at patch-module loading time.
      
        Reverting this removes the two apply_{paravirt,alternatives}() calls
        from the late patching path, and means we don't have to worry about
        them when removing module_disable_ro().
      
      [ jpoimboe: Rewrote patch description.  Tweaked klp_init_object_loaded()
      	    error path. ]
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Acked-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: default avatarJoe Lawrence <joe.lawrence@redhat.com>
      Acked-by: default avatarMiroslav Benes <mbenes@suse.cz>
      Signed-off-by: default avatarJiri Kosina <jkosina@suse.cz>
      1d05334d
    • Josh Poimboeuf's avatar
      livepatch: Apply vmlinux-specific KLP relocations early · 7c8e2bdd
      Josh Poimboeuf authored
      KLP relocations are livepatch-specific relocations which are applied to
      a KLP module's text or data.  They exist for two reasons:
      
        1) Unexported symbols: replacement functions often need to access
           unexported symbols (e.g. static functions), which "normal"
           relocations don't allow.
      
        2) Late module patching: this is the ability for a KLP module to
           bypass normal module dependencies, such that the KLP module can be
           loaded *before* a to-be-patched module.  This means that
           relocations which need to access symbols in the to-be-patched
           module might need to be applied to the KLP module well after it has
           been loaded.
      
      Non-late-patched KLP relocations are applied from the KLP module's init
      function.  That usually works fine, unless the patched code wants to use
      alternatives, paravirt patching, jump tables, or some other special
      section which needs relocations.  Then we run into ordering issues and
      crashes.
      
      In order for those special sections to work properly, the KLP
      relocations should be applied *before* the special section init code
      runs, such as apply_paravirt(), apply_alternatives(), or
      jump_label_apply_nops().
      
      You might think the obvious solution would be to move the KLP relocation
      initialization earlier, but it's not necessarily that simple.  The
      problem is the above-mentioned late module patching, for which KLP
      relocations can get applied well after the KLP module is loaded.
      
      To "fix" this issue in the past, we created .klp.arch sections:
      
        .klp.arch.{module}..altinstructions
        .klp.arch.{module}..parainstructions
      
      Those sections allow KLP late module patching code to call
      apply_paravirt() and apply_alternatives() after the module-specific KLP
      relocations (.klp.rela.{module}.{section}) have been applied.
      
      But that has a lot of drawbacks, including code complexity, the need for
      arch-specific code, and the (per-arch) danger that we missed some
      special section -- for example the __jump_table section which is used
      for jump labels.
      
      It turns out there's a simpler and more functional approach.  There are
      two kinds of KLP relocation sections:
      
        1) vmlinux-specific KLP relocation sections
      
           .klp.rela.vmlinux.{sec}
      
           These are relocations (applied to the KLP module) which reference
           unexported vmlinux symbols.
      
        2) module-specific KLP relocation sections
      
           .klp.rela.{module}.{sec}:
      
           These are relocations (applied to the KLP module) which reference
           unexported or exported module symbols.
      
      Up until now, these have been treated the same.  However, they're
      inherently different.
      
      Because of late module patching, module-specific KLP relocations can be
      applied very late, thus they can create the ordering headaches described
      above.
      
      But vmlinux-specific KLP relocations don't have that problem.  There's
      nothing to prevent them from being applied earlier.  So apply them at
      the same time as normal relocations, when the KLP module is being
      loaded.
      
      This means that for vmlinux-specific KLP relocations, we no longer have
      any ordering issues.  vmlinux-referencing jump labels, alternatives, and
      paravirt patching will work automatically, without the need for the
      .klp.arch hacks.
      
      All that said, for module-specific KLP relocations, the ordering
      problems still exist and we *do* still need .klp.arch.  Or do we?  Stay
      tuned.
      Suggested-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Acked-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: default avatarJoe Lawrence <joe.lawrence@redhat.com>
      Acked-by: default avatarMiroslav Benes <mbenes@suse.cz>
      Acked-by: default avatarJessica Yu <jeyu@kernel.org>
      Signed-off-by: default avatarJiri Kosina <jkosina@suse.cz>
      7c8e2bdd
    • Josh Poimboeuf's avatar
      livepatch: Disallow vmlinux.ko · dcf550e5
      Josh Poimboeuf authored
      This is purely a theoretical issue, but if there were a module named
      vmlinux.ko, the livepatch relocation code wouldn't be able to
      distinguish between vmlinux-specific and vmlinux.o-specific KLP
      relocations.
      
      If CONFIG_LIVEPATCH is enabled, don't allow a module named vmlinux.ko.
      Suggested-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Acked-by: default avatarMiroslav Benes <mbenes@suse.cz>
      Acked-by: default avatarJoe Lawrence <joe.lawrence@redhat.com>
      Signed-off-by: default avatarJiri Kosina <jkosina@suse.cz>
      dcf550e5
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · a811c1fa
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Fix reference count leaks in various parts of batman-adv, from Xiyu
          Yang.
      
       2) Update NAT checksum even when it is zero, from Guillaume Nault.
      
       3) sk_psock reference count leak in tls code, also from Xiyu Yang.
      
       4) Sanity check TCA_FQ_CODEL_DROP_BATCH_SIZE netlink attribute in
          fq_codel, from Eric Dumazet.
      
       5) Fix panic in choke_reset(), also from Eric Dumazet.
      
       6) Fix VLAN accel handling in bnxt_fix_features(), from Michael Chan.
      
       7) Disallow out of range quantum values in sch_sfq, from Eric Dumazet.
      
       8) Fix crash in x25_disconnect(), from Yue Haibing.
      
       9) Don't pass pointer to local variable back to the caller in
          nf_osf_hdr_ctx_init(), from Arnd Bergmann.
      
      10) Wireguard should use the ECN decap helper functions, from Toke
          Høiland-Jørgensen.
      
      11) Fix command entry leak in mlx5 driver, from Moshe Shemesh.
      
      12) Fix uninitialized variable access in mptcp's
          subflow_syn_recv_sock(), from Paolo Abeni.
      
      13) Fix unnecessary out-of-order ingress frame ordering in macsec, from
          Scott Dial.
      
      14) IPv6 needs to use a global serial number for dst validation just
          like ipv4, from David Ahern.
      
      15) Fix up PTP_1588_CLOCK deps, from Clay McClure.
      
      16) Missing NLM_F_MULTI flag in gtp driver netlink messages, from
          Yoshiyuki Kurauchi.
      
      17) Fix a regression in that dsa user port errors should not be fatal,
          from Florian Fainelli.
      
      18) Fix iomap leak in enetc driver, from Dejin Zheng.
      
      19) Fix use after free in lec_arp_clear_vccs(), from Cong Wang.
      
      20) Initialize protocol value earlier in neigh code paths when
          generating events, from Roman Mashak.
      
      21) netdev_update_features() must be called with RTNL mutex in macsec
          driver, from Antoine Tenart.
      
      22) Validate untrusted GSO packets even more strictly, from Willem de
          Bruijn.
      
      23) Wireguard decrypt worker needs a cond_resched(), from Jason
          Donenfeld.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (111 commits)
        net: flow_offload: skip hw stats check for FLOW_ACTION_HW_STATS_DONT_CARE
        MAINTAINERS: put DYNAMIC INTERRUPT MODERATION in proper order
        wireguard: send/receive: use explicit unlikely branch instead of implicit coalescing
        wireguard: selftests: initalize ipv6 members to NULL to squelch clang warning
        wireguard: send/receive: cond_resched() when processing worker ringbuffers
        wireguard: socket: remove errant restriction on looping to self
        wireguard: selftests: use normal kernel stack size on ppc64
        net: ethernet: ti: am65-cpsw-nuss: fix irqs type
        ionic: Use debugfs_create_bool() to export bool
        net: dsa: Do not leave DSA master with NULL netdev_ops
        net: dsa: remove duplicate assignment in dsa_slave_add_cls_matchall_mirred
        net: stricter validation of untrusted gso packets
        seg6: fix SRH processing to comply with RFC8754
        net: mscc: ocelot: ANA_AUTOAGE_AGE_PERIOD holds a value in seconds, not ms
        net: dsa: ocelot: the MAC table on Felix is twice as large
        net: dsa: sja1105: the PTP_CLK extts input reacts on both edges
        selftests: net: tcp_mmap: fix SO_RCVLOWAT setting
        net: hsr: fix incorrect type usage for protocol variable
        net: macsec: fix rtnl locking issue
        net: mvpp2: cls: Prevent buffer overflow in mvpp2_ethtool_cls_rule_del()
        ...
      a811c1fa
    • Pablo Neira Ayuso's avatar
      net: flow_offload: skip hw stats check for FLOW_ACTION_HW_STATS_DONT_CARE · 16f80360
      Pablo Neira Ayuso authored
      This patch adds FLOW_ACTION_HW_STATS_DONT_CARE which tells the driver
      that the frontend does not need counters, this hw stats type request
      never fails. The FLOW_ACTION_HW_STATS_DISABLED type explicitly requests
      the driver to disable the stats, however, if the driver cannot disable
      counters, it bails out.
      
      TCA_ACT_HW_STATS_* maintains the 1:1 mapping with FLOW_ACTION_HW_STATS_*
      except by disabled which is mapped to FLOW_ACTION_HW_STATS_DISABLED
      (this is 0 in tc). Add tc_act_hw_stats() to perform the mapping between
      TCA_ACT_HW_STATS_* and FLOW_ACTION_HW_STATS_*.
      
      Fixes: 319a1d19 ("flow_offload: check for basic action hw stats type")
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      16f80360
    • Lukas Bulwahn's avatar
      MAINTAINERS: put DYNAMIC INTERRUPT MODERATION in proper order · b0956956
      Lukas Bulwahn authored
      Commit 9b038086 ("docs: networking: convert DIM to RST") added a new
      file entry to DYNAMIC INTERRUPT MODERATION to the end, and not following
      alphabetical order.
      
      So, ./scripts/checkpatch.pl -f MAINTAINERS complains:
      
        WARNING: Misordered MAINTAINERS entry - list file patterns in alphabetic
        order
        #5966: FILE: MAINTAINERS:5966:
        +F:      lib/dim/
        +F:      Documentation/networking/net_dim.rst
      
      Reorder the file entries to keep MAINTAINERS nicely ordered.
      Signed-off-by: default avatarLukas Bulwahn <lukas.bulwahn@gmail.com>
      Acked-by: default avatarJakub Kicinski <kuba@kernel.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b0956956
    • David S. Miller's avatar
      Merge branch 'wireguard-fixes' · d3f3e6ac
      David S. Miller authored
      Jason A. Donenfeld says:
      
      ====================
      wireguard fixes for 5.7-rc5
      
      With Ubuntu and Debian having backported this into their kernels, we're
      finally seeing testing from places we hadn't seen prior, which is nice.
      With that comes more fixes:
      
      1) The CI for PPC64 was running with extremely small stacks for 64-bit,
         causing spurious crashes in surprising places.
      
      2) There's was an old leftover routing loop restriction, which no longer
         makes sense given the queueing architecture, and was causing problems
         for people who really did want nested routing.
      
      3) Not yielding our kthread on CONFIG_PREEMPT_VOLUNTARY systems caused
         RCU stalls and other issues, reported by Wang Jian, with the fix
         suggested by Sultan Alsawaf.
      
      4) Clang spewed warnings in a selftest for CONFIG_IPV6=n, reported by
         Arnd Bergmann.
      
      5) A complicated if statement was simplified to an assignment while also
         making the likely/unlikely hinting more correct and simple, and
         increasing readability, suggested by Sultan.
      
      Patches (2) and (3) have Fixes: lines and are probably good candidates
      for stable.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d3f3e6ac
    • Jason A. Donenfeld's avatar
      wireguard: send/receive: use explicit unlikely branch instead of implicit coalescing · 243f2148
      Jason A. Donenfeld authored
      It's very unlikely that send will become true. It's nearly always false
      between 0 and 120 seconds of a session, and in most cases becomes true
      only between 120 and 121 seconds before becoming false again. So,
      unlikely(send) is clearly the right option here.
      
      What happened before was that we had this complex boolean expression
      with multiple likely and unlikely clauses nested. Since this is
      evaluated left-to-right anyway, the whole thing got converted to
      unlikely. So, we can clean this up to better represent what's going on.
      
      The generated code is the same.
      Suggested-by: default avatarSultan Alsawaf <sultan@kerneltoast.com>
      Signed-off-by: default avatarJason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      243f2148
    • Jason A. Donenfeld's avatar
      wireguard: selftests: initalize ipv6 members to NULL to squelch clang warning · 4fed818e
      Jason A. Donenfeld authored
      Without setting these to NULL, clang complains in certain
      configurations that have CONFIG_IPV6=n:
      
      In file included from drivers/net/wireguard/ratelimiter.c:223:
      drivers/net/wireguard/selftest/ratelimiter.c:173:34: error: variable 'skb6' is uninitialized when used here [-Werror,-Wuninitialized]
                      ret = timings_test(skb4, hdr4, skb6, hdr6, &test_count);
                                                     ^~~~
      drivers/net/wireguard/selftest/ratelimiter.c:123:29: note: initialize the variable 'skb6' to silence this warning
              struct sk_buff *skb4, *skb6;
                                         ^
                                          = NULL
      drivers/net/wireguard/selftest/ratelimiter.c:173:40: error: variable 'hdr6' is uninitialized when used here [-Werror,-Wuninitialized]
                      ret = timings_test(skb4, hdr4, skb6, hdr6, &test_count);
                                                           ^~~~
      drivers/net/wireguard/selftest/ratelimiter.c:125:22: note: initialize the variable 'hdr6' to silence this warning
              struct ipv6hdr *hdr6;
                                  ^
      
      We silence this warning by setting the variables to NULL as the warning
      suggests.
      Reported-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarJason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4fed818e
    • Jason A. Donenfeld's avatar
      wireguard: send/receive: cond_resched() when processing worker ringbuffers · 4005f5c3
      Jason A. Donenfeld authored
      Users with pathological hardware reported CPU stalls on CONFIG_
      PREEMPT_VOLUNTARY=y, because the ringbuffers would stay full, meaning
      these workers would never terminate. That turned out not to be okay on
      systems without forced preemption, which Sultan observed. This commit
      adds a cond_resched() to the bottom of each loop iteration, so that
      these workers don't hog the core. Note that we don't need this on the
      napi poll worker, since that terminates after its budget is expended.
      Suggested-by: default avatarSultan Alsawaf <sultan@kerneltoast.com>
      Reported-by: default avatarWang Jian <larkwang@gmail.com>
      Fixes: e7096c13 ("net: WireGuard secure network tunnel")
      Signed-off-by: default avatarJason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4005f5c3
    • Jason A. Donenfeld's avatar
      wireguard: socket: remove errant restriction on looping to self · b673e24a
      Jason A. Donenfeld authored
      It's already possible to create two different interfaces and loop
      packets between them. This has always been possible with tunnels in the
      kernel, and isn't specific to wireguard. Therefore, the networking stack
      already needs to deal with that. At the very least, the packet winds up
      exceeding the MTU and is discarded at that point. So, since this is
      already something that happens, there's no need to forbid the not very
      exceptional case of routing a packet back to the same interface; this
      loop is no different than others, and we shouldn't special case it, but
      rather rely on generic handling of loops in general. This also makes it
      easier to do interesting things with wireguard such as onion routing.
      
      At the same time, we add a selftest for this, ensuring that both onion
      routing works and infinite routing loops do not crash the kernel. We
      also add a test case for wireguard interfaces nesting packets and
      sending traffic between each other, as well as the loop in this case
      too. We make sure to send some throughput-heavy traffic for this use
      case, to stress out any possible recursion issues with the locks around
      workqueues.
      
      Fixes: e7096c13 ("net: WireGuard secure network tunnel")
      Signed-off-by: default avatarJason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b673e24a
    • Jason A. Donenfeld's avatar
      wireguard: selftests: use normal kernel stack size on ppc64 · a0fd7cc8
      Jason A. Donenfeld authored
      While at some point it might have made sense to be running these tests
      on ppc64 with 4k stacks, the kernel hasn't actually used 4k stacks on
      64-bit powerpc in a long time, and more interesting things that we test
      don't really work when we deviate from the default (16k). So, we stop
      pushing our luck in this commit, and return to the default instead of
      the minimum.
      Signed-off-by: default avatarJason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a0fd7cc8
    • Grygorii Strashko's avatar
      net: ethernet: ti: am65-cpsw-nuss: fix irqs type · 6f5c27f9
      Grygorii Strashko authored
      The K3 INTA driver, which is source TX/RX IRQs for CPSW NUSS, defines IRQs
      triggering type as EDGE by default, but triggering type for CPSW NUSS TX/RX
      IRQs has to be LEVEL as the EDGE triggering type may cause unnecessary IRQs
      triggering and NAPI scheduling for empty queues. It was discovered with
      RT-kernel.
      
      Fix it by explicitly specifying CPSW NUSS TX/RX IRQ type as
      IRQF_TRIGGER_HIGH.
      
      Fixes: 93a76530 ("net: ethernet: ti: introduce am65x/j721e gigabit eth subsystem driver")
      Signed-off-by: default avatarGrygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6f5c27f9
    • Geert Uytterhoeven's avatar
      ionic: Use debugfs_create_bool() to export bool · 0735ccc9
      Geert Uytterhoeven authored
      Currently bool ionic_cq.done_color is exported using
      debugfs_create_u8(), which requires a cast, preventing further compiler
      checks.
      
      Fix this by switching to debugfs_create_bool(), and dropping the cast.
      Signed-off-by: default avatarGeert Uytterhoeven <geert+renesas@glider.be>
      Acked-by: default avatarShannon Nelson <snelson@pensando.io>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0735ccc9
    • Florian Fainelli's avatar
      net: dsa: Do not leave DSA master with NULL netdev_ops · 050569fc
      Florian Fainelli authored
      When ndo_get_phys_port_name() for the CPU port was added we introduced
      an early check for when the DSA master network device in
      dsa_master_ndo_setup() already implements ndo_get_phys_port_name(). When
      we perform the teardown operation in dsa_master_ndo_teardown() we would
      not be checking that cpu_dp->orig_ndo_ops was successfully allocated and
      non-NULL initialized.
      
      With network device drivers such as virtio_net, this leads to a NPD as
      soon as the DSA switch hanging off of it gets torn down because we are
      now assigning the virtio_net device's netdev_ops a NULL pointer.
      
      Fixes: da7b9e9b ("net: dsa: Add ndo_get_phys_port_name() for CPU port")
      Reported-by: default avatarAllen Pais <allen.pais@oracle.com>
      Signed-off-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Tested-by: default avatarAllen Pais <allen.pais@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      050569fc
    • Vladimir Oltean's avatar
      net: dsa: remove duplicate assignment in dsa_slave_add_cls_matchall_mirred · 65722159
      Vladimir Oltean authored
      This was caused by a poor merge conflict resolution on my side. The
      "act = &cls->rule->action.entries[0];" assignment was already present in
      the code prior to the patch mentioned below.
      
      Fixes: e13c2075 ("net: dsa: refactor matchall mirred action to separate function")
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      65722159
    • Willem de Bruijn's avatar
      net: stricter validation of untrusted gso packets · 9274124f
      Willem de Bruijn authored
      Syzkaller again found a path to a kernel crash through bad gso input:
      a packet with transport header extending beyond skb_headlen(skb).
      
      Tighten validation at kernel entry:
      
      - Verify that the transport header lies within the linear section.
      
          To avoid pulling linux/tcp.h, verify just sizeof tcphdr.
          tcp_gso_segment will call pskb_may_pull (th->doff * 4) before use.
      
      - Match the gso_type against the ip_proto found by the flow dissector.
      
      Fixes: bfd5f4a3 ("packet: Add GSO/csum offload support.")
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9274124f
    • Ahmed Abdelsalam's avatar
      seg6: fix SRH processing to comply with RFC8754 · 0cb7498f
      Ahmed Abdelsalam authored
      The Segment Routing Header (SRH) which defines the SRv6 dataplane is defined
      in RFC8754.
      
      RFC8754 (section 4.1) defines the SR source node behavior which encapsulates
      packets into an outer IPv6 header and SRH. The SR source node encodes the
      full list of Segments that defines the packet path in the SRH. Then, the
      first segment from list of Segments is copied into the Destination address
      of the outer IPv6 header and the packet is sent to the first hop in its path
      towards the destination.
      
      If the Segment list has only one segment, the SR source node can omit the SRH
      as he only segment is added in the destination address.
      
      RFC8754 (section 4.1.1) defines the Reduced SRH, when a source does not
      require the entire SID list to be preserved in the SRH. A reduced SRH does
      not contain the first segment of the related SR Policy (the first segment is
      the one already in the DA of the IPv6 header), and the Last Entry field is
      set to n-2, where n is the number of elements in the SR Policy.
      
      RFC8754 (section 4.3.1.1) defines the SRH processing and the logic to
      validate the SRH (S09, S10, S11) which works for both reduced and
      non-reduced behaviors.
      
      This patch updates seg6_validate_srh() to validate the SRH as per RFC8754.
      Signed-off-by: default avatarAhmed Abdelsalam <ahabdels@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0cb7498f
    • David S. Miller's avatar
      Merge branch 'FDB-fixes-for-Felix-and-Ocelot-switches' · 6e0ddb65
      David S. Miller authored
      Vladimir Oltean says:
      
      ====================
      FDB fixes for Felix and Ocelot switches
      
      This series fixes the following problems:
      - Dynamically learnt addresses never expiring (neither for Ocelot nor
        for Felix)
      - Half of the FDB not visible in 'bridge fdb show' (for Felix only)
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6e0ddb65
    • Vladimir Oltean's avatar
      net: mscc: ocelot: ANA_AUTOAGE_AGE_PERIOD holds a value in seconds, not ms · c0d7eccb
      Vladimir Oltean authored
      One may notice that automatically-learnt entries 'never' expire, even
      though the bridge configures the address age period at 300 seconds.
      
      Actually the value written to hardware corresponds to a time interval
      1000 times higher than intended, i.e. 83 hours.
      
      Fixes: a556c76a ("net: mscc: Add initial Ocelot switch support")
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: default avatarFlorian Faineli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c0d7eccb
    • Vladimir Oltean's avatar
      net: dsa: ocelot: the MAC table on Felix is twice as large · 21ce7f3e
      Vladimir Oltean authored
      When running 'bridge fdb dump' on Felix, sometimes learnt and static MAC
      addresses would appear, sometimes they wouldn't.
      
      Turns out, the MAC table has 4096 entries on VSC7514 (Ocelot) and 8192
      entries on VSC9959 (Felix), so the existing code from the Ocelot common
      library only dumped half of Felix's MAC table. They are both organized
      as a 4-way set-associative TCAM, so we just need a single variable
      indicating the correct number of rows.
      
      Fixes: 56051948 ("net: dsa: ocelot: add driver for Felix switch family")
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      21ce7f3e
  2. 06 May, 2020 9 commits
  3. 05 May, 2020 8 commits
  4. 04 May, 2020 2 commits
    • Qiushi Wu's avatar
      nfp: abm: fix a memory leak bug · bd4af432
      Qiushi Wu authored
      In function nfp_abm_vnic_set_mac, pointer nsp is allocated by nfp_nsp_open.
      But when nfp_nsp_has_hwinfo_lookup fail, the pointer is not released,
      which can lead to a memory leak bug. Fix this issue by adding
      nfp_nsp_close(nsp) in the error path.
      
      Fixes: f6e71efd ("nfp: abm: look up MAC addresses via management FW")
      Signed-off-by: default avatarQiushi Wu <wu000273@umn.edu>
      Acked-by: default avatarJakub Kicinski <kuba@kernel.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      bd4af432
    • Cong Wang's avatar
      atm: fix a memory leak of vcc->user_back · 8d9f73c0
      Cong Wang authored
      In lec_arp_clear_vccs() only entry->vcc is freed, but vcc
      could be installed on entry->recv_vcc too in lec_vcc_added().
      
      This fixes the following memory leak:
      
      unreferenced object 0xffff8880d9266b90 (size 16):
        comm "atm2", pid 425, jiffies 4294907980 (age 23.488s)
        hex dump (first 16 bytes):
          00 00 00 00 00 00 00 00 00 00 00 00 6b 6b 6b a5  ............kkk.
        backtrace:
          [<(____ptrval____)>] kmem_cache_alloc_trace+0x10e/0x151
          [<(____ptrval____)>] lane_ioctl+0x4b3/0x569
          [<(____ptrval____)>] do_vcc_ioctl+0x1ea/0x236
          [<(____ptrval____)>] svc_ioctl+0x17d/0x198
          [<(____ptrval____)>] sock_do_ioctl+0x47/0x12f
          [<(____ptrval____)>] sock_ioctl+0x2f9/0x322
          [<(____ptrval____)>] vfs_ioctl+0x1e/0x2b
          [<(____ptrval____)>] ksys_ioctl+0x61/0x80
          [<(____ptrval____)>] __x64_sys_ioctl+0x16/0x19
          [<(____ptrval____)>] do_syscall_64+0x57/0x65
          [<(____ptrval____)>] entry_SYSCALL_64_after_hwframe+0x49/0xb3
      
      Cc: Gengming Liu <l.dmxcsnsbh@gmail.com>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8d9f73c0