1. 17 Feb, 2020 5 commits
    • net: fib_rules: Correctly set table field when table number exceeds 8 bits · 540e585a
      Jethro Beekman authored
      In 709772e6, RT_TABLE_COMPAT was added to
      allow legacy software to deal with routing table numbers >= 256, but the
      same change to FIB rule queries was overlooked.
      Signed-off-by: Jethro Beekman <jethro@fortanix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
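The fallback described in this commit can be sketched as a small Python model. This is not the kernel code: the function name `legacy_table_field` is hypothetical, and the sketch assumes that the full table number is reported in a separate 32-bit attribute while the 8-bit header field carries RT_TABLE_COMPAT (252 in the kernel's uapi headers) whenever the number does not fit.

```python
# Illustrative model only; legacy_table_field is a hypothetical name.
RT_TABLE_COMPAT = 252  # value from include/uapi/linux/rtnetlink.h


def legacy_table_field(table_id: int) -> int:
    """Return the value to place in the 8-bit 'table' header field."""
    if not 0 <= table_id <= 0xFFFFFFFF:
        raise ValueError("table id must fit in 32 bits")
    # Numbers below 256 fit the legacy field directly. Larger ones are
    # replaced by the RT_TABLE_COMPAT marker (assumption: the real number
    # is carried in a separate 32-bit attribute).
    return table_id if table_id < 256 else RT_TABLE_COMPAT
```

With this model, table 100 is reported unchanged, while table 1000 is reported as RT_TABLE_COMPAT instead of being truncated to its low 8 bits.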
    • net/rds: Track user mapped pages through special API · 0d4597c8
      Leon Romanovsky authored
      Convert net/rds to use the newly introduced pin_user_pages() API,
      which properly sets FOLL_PIN. Setting FOLL_PIN is now required for
      code that requires tracking of pinned pages.
      
      Note that this effectively changes the code's behavior: it now
      ultimately calls set_page_dirty_lock(), instead of set_page_dirty().
      This is probably more accurate.
      
      As Christoph Hellwig put it, "set_page_dirty() is only safe if we are
      dealing with a file backed page where we have reference on the inode it
      hangs off." [1]
      
      [1] https://lore.kernel.org/r/20190723153640.GB720@lst.de
      
      Cc: Hans Westgaard Ry <hans.westgaard.ry@oracle.com>
      Cc: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: John Hubbard <jhubbard@nvidia.com>
      Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: Fix nlmsg_flags when splitting a multipath route · afecdb37
      Benjamin Poirier authored
      When splitting an RTA_MULTIPATH request into multiple routes and adding the
      second and later components, we must not simply remove NLM_F_REPLACE but
      instead replace it by NLM_F_CREATE. Otherwise, it may look like the netlink
      message was malformed.
      
      For example,
      	ip route add 2001:db8::1/128 dev dummy0
      	ip route change 2001:db8::1/128 nexthop via fe80::30:1 dev dummy0 \
      		nexthop via fe80::30:2 dev dummy0
      results in the following warnings:
      [ 1035.057019] IPv6: RTM_NEWROUTE with no NLM_F_CREATE or NLM_F_REPLACE
      [ 1035.057517] IPv6: NLM_F_CREATE should be set when creating new route
      
      This patch makes the nlmsg sequence look equivalent for __ip6_ins_rt() to
      what it would get if the multipath route had been added in multiple netlink
      operations:
      	ip route add 2001:db8::1/128 dev dummy0
      	ip route change 2001:db8::1/128 nexthop via fe80::30:1 dev dummy0
      	ip route append 2001:db8::1/128 nexthop via fe80::30:2 dev dummy0
      
      Fixes: 27596472 ("ipv6: fix ECMP route replacement")
      Signed-off-by: Benjamin Poirier <bpoirier@cumulusnetworks.com>
      Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
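The flag handling described in this commit can be modelled in a few lines of Python. This is a sketch rather than the kernel implementation: the helper name `flags_for_later_nexthop` is hypothetical, and only the two flags named in the commit message are modelled (values taken from include/uapi/linux/netlink.h).

```python
# Netlink flag values from include/uapi/linux/netlink.h.
NLM_F_REPLACE = 0x100
NLM_F_CREATE = 0x400


def flags_for_later_nexthop(nlmsg_flags: int) -> int:
    """Flags for the second and later components of a split multipath route.

    Hypothetical helper name, used for illustration only.
    """
    # Clearing NLM_F_REPLACE alone can leave neither CREATE nor REPLACE
    # set, which is the situation the quoted warnings complain about.
    # Setting NLM_F_CREATE makes the later components look like additions.
    return (nlmsg_flags & ~NLM_F_REPLACE) | NLM_F_CREATE
```

For a request that carries only NLM_F_REPLACE, the later components end up with only NLM_F_CREATE set, so they no longer look like malformed messages.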
    • ipv6: Fix route replacement with dev-only route · e404b8c7
      Benjamin Poirier authored
      After commit 27596472 ("ipv6: fix ECMP route replacement") it is no
      longer possible to replace an ECMP-able route by a non ECMP-able route.
      For example,
      	ip route add 2001:db8::1/128 via fe80::1 dev dummy0
      	ip route replace 2001:db8::1/128 dev dummy0
      does not work as expected.
      
      Tweak the replacement logic so that point 3 in the log of the above commit
      becomes:
      3. If the new route is not ECMP-able, and no matching non-ECMP-able route
      exists, replace matching ECMP-able route (if any) or add the new route.
      
      We can now summarize the entire replace semantics to:
      When doing a replace, prefer replacing a matching route of the same
      "ECMP-able-ness" as the replace argument. If there is no such candidate,
      fall back to the first route found.
      
      Fixes: 27596472 ("ipv6: fix ECMP route replacement")
      Signed-off-by: Benjamin Poirier <bpoirier@cumulusnetworks.com>
      Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
      Signed-off-by: David S. Miller <davem@davemloft.net>
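The summarized replace semantics can be illustrated with a short Python model. It is a sketch under stated assumptions, not the kernel's fib6 code: the `Route` type and `pick_replace_target` function are hypothetical, and "matching" is taken to mean the routes already found for the same prefix, in lookup order.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Route:
    name: str
    ecmp_able: bool  # in the commit's example, the "via" route is ECMP-able


def pick_replace_target(
    matching: Sequence[Route], new_is_ecmp_able: bool
) -> Optional[Route]:
    """Choose which existing route a replace request should overwrite."""
    # Prefer a route of the same "ECMP-able-ness" as the new route.
    for route in matching:
        if route.ecmp_able == new_is_ecmp_able:
            return route
    # Otherwise fall back to the first route found. None means there is
    # nothing to replace, so the caller would add the new route.
    return matching[0] if matching else None
```

In the failing example from the commit message, the only existing route is ECMP-able and the new one is dev-only, so the fallback branch selects the existing route for replacement.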
    • selftests: forwarding: use proto icmp for {gretap, ip6gretap}_mac testing · e8023b03
      Hangbin Liu authored
      For the tc ip_proto filter, when we extract the flow via __skb_flow_dissect()
      without the FLOW_DISSECTOR_F_STOP_AT_ENCAP flag, dissection continues into
      the inner protocol.
      
      So for GRE + ICMP messages, we should match on the inner ICMP protocol,
      not on GRE.
      
      In the mirror_gre.sh test, capturing ICMP messages on $h3 may confuse
      users, since the flow there is GRE. So I moved the capture device to
      h3-gt{4,6} and capture only ICMP messages.
      
      Before the fix:
      ]# ./mirror_gre.sh
      TEST: ingress mirror to gretap (skip_hw)                            [ OK ]
      TEST: egress mirror to gretap (skip_hw)                             [ OK ]
      TEST: ingress mirror to ip6gretap (skip_hw)                         [ OK ]
      TEST: egress mirror to ip6gretap (skip_hw)                          [ OK ]
      TEST: ingress mirror to gretap: envelope MAC (skip_hw)              [FAIL]
       Expected to capture 10 packets, got 0.
      TEST: egress mirror to gretap: envelope MAC (skip_hw)               [FAIL]
       Expected to capture 10 packets, got 0.
      TEST: ingress mirror to ip6gretap: envelope MAC (skip_hw)           [FAIL]
       Expected to capture 10 packets, got 0.
      TEST: egress mirror to ip6gretap: envelope MAC (skip_hw)            [FAIL]
       Expected to capture 10 packets, got 0.
      TEST: two simultaneously configured mirrors (skip_hw)               [ OK ]
      WARN: Could not test offloaded functionality
      
      After fix:
      ]# ./mirror_gre.sh
      TEST: ingress mirror to gretap (skip_hw)                            [ OK ]
      TEST: egress mirror to gretap (skip_hw)                             [ OK ]
      TEST: ingress mirror to ip6gretap (skip_hw)                         [ OK ]
      TEST: egress mirror to ip6gretap (skip_hw)                          [ OK ]
      TEST: ingress mirror to gretap: envelope MAC (skip_hw)              [ OK ]
      TEST: egress mirror to gretap: envelope MAC (skip_hw)               [ OK ]
      TEST: ingress mirror to ip6gretap: envelope MAC (skip_hw)           [ OK ]
      TEST: egress mirror to ip6gretap: envelope MAC (skip_hw)            [ OK ]
      TEST: two simultaneously configured mirrors (skip_hw)               [ OK ]
      WARN: Could not test offloaded functionality
      
      Fixes: ba8d3987 ("selftests: forwarding: Add test for mirror to gretap")
      Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
      Reviewed-by: Petr Machata <pmachata@gmail.com>
      Tested-by: Petr Machata <pmachata@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
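The dissection behaviour this commit relies on can be pictured with a toy Python model. It is a heavily simplified sketch, not the real flow dissector: the function name `dissected_ip_proto` is hypothetical, a packet is reduced to a list of IP protocol numbers from outermost to innermost, and GRE is the only encapsulation modelled.

```python
IPPROTO_ICMP = 1
IPPROTO_GRE = 47
ENCAP_PROTOS = {IPPROTO_GRE}  # assumption: only GRE is treated as a tunnel


def dissected_ip_proto(proto_chain, stop_at_encap=False):
    """Return the protocol a filter would see for the given header chain.

    proto_chain lists IP protocol numbers from outermost to innermost.
    """
    proto = None
    for proto in proto_chain:
        if proto not in ENCAP_PROTOS:
            break  # not a tunnel header, nothing further to unwrap
        if stop_at_encap:
            break  # report the tunnel protocol itself
    return proto
```

For a GRE packet carrying ICMP, the model reports ICMP unless `stop_at_encap` is set, which mirrors why the test has to match on the inner ICMP protocol.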
  2. 14 Feb, 2020 24 commits
  3. 13 Feb, 2020 11 commits
    • Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · b19e8c68
      Linus Torvalds authored
      Pull arm64 fixes from Will Deacon:
       "Summary below, but it's all reasonably straightforward. There are some
        more fixes on the horizon, but nothing disastrous yet.
      
        Summary:
      
         - Fix build when KASLR is enabled but CONFIG_ARCH_RANDOM is not set
      
         - Fix context-switching of SSBS state on systems that implement it
      
         - Fix spinlock compiler warning introduced during the merge window
      
         - Fix incorrect header inclusion (linux/clk-provider.h)
      
         - Use SYSCTL_{ZERO,ONE} instead of rolling our own static variables
      
         - Don't scream if optional SMMUv3 PMU irq is missing
      
         - Remove some unused function prototypes"
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64: time: Replace <linux/clk-provider.h> by <linux/of_clk.h>
        arm64: Fix CONFIG_ARCH_RANDOM=n build
        perf/smmuv3: Use platform_get_irq_optional() for wired interrupt
        arm64/spinlock: fix a -Wunused-function warning
        arm64: ssbs: Fix context-switch when SSBS is present on all CPUs
        arm64: use shared sysctl constants
        arm64: Drop do_el0_ia_bp_hardening() & do_sp_pc_abort() declarations
    • Merge tag 'gpio-v5.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio · 1d40890a
      Linus Torvalds authored
      Pull GPIO fixes from Linus Walleij:
      
       - Revert two patches to gpio_do_set_config() and implement the proper
         solution that works, also drop an unnecessary call in set_config()
      
       - Fix up the lockdep class for hierarchical IRQ domains.
      
       - Remove some bridge code for line directions.
      
       - Fix a register access bug in the Xilinx driver.
      
      * tag 'gpio-v5.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
        gpio: sifive: fix static checker warning
        spmi: pmic-arb: Set lockdep class for hierarchical irq domains
        gpio: xilinx: Fix bug where the wrong GPIO register is written to
        gpiolib: remove unnecessary argument from set_config call
        gpio: bd71828: Remove unneeded defines for GPIO_LINE_DIRECTION_IN/OUT
        MAINTAINERS: Sort entries in database for GPIO
        gpiolib: fix gpio_do_set_config()
        Revert "gpiolib: remove set but not used variable 'config'"
        Revert "gpiolib: Remove duplicated function gpio_do_set_config()"
    • Merge branch 'icmp-account-for-NAT-when-sending-icmps-from-ndo-layer' · 803381f9
      David S. Miller authored
      Jason A. Donenfeld says:
      
      ====================
      icmp: account for NAT when sending icmps from ndo layer
      
      The ICMP routines use the source address for two reasons:
      
      1. Rate-limiting ICMP transmissions based on source address, so
         that one source address cannot provoke a flood of replies. If
         the source address is wrong, the rate limiting will be
         incorrectly applied.
      
      2. Choosing the interface and hence new source address of the
         generated ICMP packet. If the original packet source address
         is wrong, ICMP replies will be sent from the wrong source
         address, resulting in either a misdelivery, infoleak, or just
         general network admin confusion.
      
      Most of the time, the icmp_send and icmpv6_send routines can just reach
      down into the skb's IP header to determine the saddr. However, if
      icmp_send or icmpv6_send is being called from a network device driver --
      there are a few in the tree -- then it's possible that by the time
      icmp_send or icmpv6_send looks at the packet, the packet's source
      address has already been transformed by SNAT or MASQUERADE or some other
      transformation that CONNTRACK knows about. In this case, the packet's
      source address is most certainly the *wrong* source address to be used
      for the purpose of ICMP replies.
      
      Rather, the source address we want to use for ICMP replies is the
      original one, from before the transformation occurred.
      
      Fortunately, it's very easy to just ask CONNTRACK if it knows about this
      packet, and if so, how to fix it up. The saddr is the only field in the
      header we need to fix up, for the purposes of the subsequent processing
      in the icmp_send and icmpv6_send functions, so we do the lookup very
      early on, so that the rest of the ICMP machinery can progress as usual.
      
      Changes v3->v4:
      - Add back the skb_shared checking, since the previous assumption isn't
        actually true [Eric]. This implies dropping the additional patches v3 had
        for removing skb_share_check from various drivers. We can revisit that
        general set of ideas later, but that's probably better suited as a net-next
        patchset rather than this stable one which is geared at fixing bugs. So,
        this implements things in the safe conservative way.
      
      Changes v2->v3:
      - Add selftest to ensure this actually does what we want and never regresses.
      - Check the size of the skb header before operating on it.
      - Use skb_ensure_writable to ensure we can modify the cloned skb [Florian].
      - Conditionalize this on IPS_SRC_NAT so we don't do anything unnecessarily
        [Florian].
      - It turns out that since we're calling these from the xmit path,
        skb_share_check isn't required, so remove that [Florian]. This simplifies the
        code a bit too. **The supposition here is that skbs passed to ndo_start_xmit
        are _never_ shared. If this is not correct NOW IS THE TIME TO PIPE UP, for
        doom awaits us later.**
      - While investigating the shared skb business, several drivers appeared to be
        calling it incorrectly in the xmit path, so this series also removes those
        unnecessary calls, based on the supposition mentioned in the previous point.
      
      Changes v1->v2:
      - icmpv6 takes subtly different types than icmpv4, like u32 instead of be32,
        u8 instead of int.
      - Since we're technically writing to the skb, we need to make sure it's not
        a shared one [Dave, 2017].
      - Restore the original skb data after icmp_send returns. All current users
        are freeing the packet right after, so it doesn't matter, but future users
        might not.
      - Remove superfluous route lookup in sunvnet [Dave].
      - Use NF_NAT instead of NF_CONNTRACK for condition [Florian].
      - Include this cover letter [Dave].
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
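The idea in this cover letter, swapping in the pre-NAT source address just for the duration of the ICMP call, can be sketched as a Python model. This is an illustration under stated assumptions rather than the kernel helper: `Packet`, `ConntrackEntry` and `icmp_ndo_send_model` are hypothetical names, and the shared-skb and header-size checks mentioned in the changelog are left out.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ConntrackEntry:
    original_saddr: str  # source address before NAT rewrote it
    src_nat: bool        # models the IPS_SRC_NAT condition


@dataclass
class Packet:
    saddr: str
    conntrack: Optional[ConntrackEntry] = None


def icmp_ndo_send_model(pkt: Packet, icmp_send: Callable[[Packet], None]) -> None:
    """Call icmp_send with the pre-NAT source address, then restore it."""
    ct = pkt.conntrack
    if ct is None or not ct.src_nat:
        # Conntrack does not know the packet, or no source NAT was applied.
        icmp_send(pkt)
        return
    translated = pkt.saddr
    pkt.saddr = ct.original_saddr  # the address rate limiting should see
    try:
        icmp_send(pkt)
    finally:
        pkt.saddr = translated  # restore, as the v1->v2 notes describe
```

Passing a callback that records `pkt.saddr` shows the effect: the callback observes the original address, while the packet still carries the translated one after the call returns.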
    • xfrm: interface: use icmp_ndo_send helper · 45942ba8
      Jason A. Donenfeld authored
      Because xfrmi is calling icmp from network device context, it should use
      the ndo helper so that the rate limiting applies correctly.
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • wireguard: device: use icmp_ndo_send helper · a12d7f3c
      Jason A. Donenfeld authored
      Because wireguard is calling icmp from network device context, it should
      use the ndo helper so that the rate limiting applies correctly.  This
      commit adds a small test to the wireguard test suite to ensure that the
      new functions continue doing the right thing in the context of
      wireguard. It does this by setting up a condition that will definitely
      evoke an icmp error message from the driver, but along a nat'd path.
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sunvnet: use icmp_ndo_send helper · 67c9a7e1
      Jason A. Donenfeld authored
      Because sunvnet is calling icmp from network device context, it should use
      the ndo helper so that the rate limiting applies correctly. While we're
      at it, doing the additional route lookup before calling icmp_ndo_send is
      superfluous, since this is the job of the icmp code in the first place.
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Cc: Shannon Nelson <shannon.nelson@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • gtp: use icmp_ndo_send helper · e0fce6f9
      Jason A. Donenfeld authored
      Because gtp is calling icmp from network device context, it should use
      the ndo helper so that the rate limiting applies correctly.
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Cc: Harald Welte <laforge@gnumonks.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • icmp: introduce helper for nat'd source address in network device context · 0b41713b
      Jason A. Donenfeld authored
      This introduces a helper function to be called only by network drivers
      that wraps calls to icmp[v6]_send in a conntrack transformation, in case
      NAT has been used. We don't want to pollute the non-driver path, though,
      so we introduce this as a helper to be called by places that actually
      make use of this, as suggested by Florian.
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Cc: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 64ae1342
      Linus Torvalds authored
      Pull crypto fix from Herbert Xu:
       "This fixes a Kconfig anomaly when lib/crypto is enabled without Crypto
        API"
      
      * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
        crypto: Kconfig - allow tests to be disabled when manager is disabled
    • Merge branch 'skip_sw-skip_hw-validation' · 07134cf6
      David S. Miller authored
      Davide Caratti says:
      
      ====================
      add missing validation of 'skip_hw/skip_sw'
      
      ensure that all classifiers currently supporting HW offload
      validate the 'flags' parameter provided by user:
      
      - patch 1/2 fixes cls_matchall
      - patch 2/2 fixes cls_flower
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/sched: flower: add missing validation of TCA_FLOWER_FLAGS · e2debf08
      Davide Caratti authored
      unlike other classifiers that can be offloaded (i.e. users can set flags
      like 'skip_hw' and 'skip_sw'), 'cls_flower' doesn't validate the size of
      netlink attribute 'TCA_FLOWER_FLAGS' provided by user: add a proper entry
      to fl_policy.
      
      Fixes: 5b33f488 ("net/flower: Introduce hardware offload support")
      Signed-off-by: Davide Caratti <dcaratti@redhat.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
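The kind of size check a policy entry provides can be illustrated with a small Python model. This is a sketch only, not the kernel's netlink validation code: `parse_flower_flags` is a hypothetical name, and the sketch assumes the flags attribute is a 32-bit value in host byte order. The two flag bits are the values defined in include/uapi/linux/pkt_cls.h.

```python
import struct

TCA_CLS_FLAGS_SKIP_HW = 1 << 0
TCA_CLS_FLAGS_SKIP_SW = 1 << 1

U32_LEN = 4  # payload size expected for a 32-bit attribute


def parse_flower_flags(payload: bytes) -> int:
    """Decode a flags attribute payload, rejecting undersized input."""
    # Without a size check, a short payload would be read past its end.
    if len(payload) < U32_LEN:
        raise ValueError(
            f"flags attribute too short: {len(payload)} < {U32_LEN} bytes"
        )
    # "=I" reads an unsigned 32-bit integer in native byte order.
    return struct.unpack_from("=I", payload)[0]
```

A well-formed payload such as `struct.pack("=I", TCA_CLS_FLAGS_SKIP_HW)` decodes to the flag value, while a one-byte payload is rejected before any decoding happens.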