1. 27 Aug, 2020 16 commits
    • Merge branch 's390-qeth-next' · 44771ea5
      David S. Miller authored
      Julian Wiedmann says:
      
      ====================
      s390/qeth: updates 2020-08-27
      
      Please apply the following patch series for qeth to netdev's net-next tree.
      
      Patch 8 makes some improvements to how we handle HW address events,
      avoiding some uncertainty around processing stale events after we
      switched off the feature.
      Apart from that, it's all straightforward cleanups.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • s390/qeth: strictly order bridge address events · 9d6a569a
      Julian Wiedmann authored
      The current code for bridge address events has two shortcomings in its
      control sequence:
      
      1. after disabling address events via PNSO, we don't flush the remaining
         events from the event_wq. So if the feature is re-enabled fast
         enough, stale events could leak over.
      2. PNSO and the events' arrival via the READ ccw device are unordered.
         So even if we flushed the workqueue, it's difficult to say whether
         the READ device might produce more events onto the workqueue
         afterwards.
      
      Fix this by
      1. explicitly fencing off the events in the READ device's event
         handler once we no longer care. This ensures that once we flush
         the workqueue, it doesn't get additional address events.
      2. flushing the workqueue after disabling the events & fencing them
         off. As the code that triggers the flush will typically hold the
         sbp_lock, we need to rework the worker code to avoid a deadlock
         here in case of a 'notifications-stopped' event. In case of lock
         contention, requeue such an event with a delay. We'll eventually
         acquire the lock, or spot that the feature has been disabled and
         the event can thus be discarded.
      
      This leaves the theoretical race that a stale event could arrive
      _after_ we re-enabled ourselves to receive events again. Such an event
      would be impossible to distinguish from a 'good' event, nothing we can
      do about it.
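      
      The control sequence described above can be sketched as a small,
      single-threaded user-space model. This is only an illustration under
      stated assumptions, not the driver's code: struct dev, the ev_type
      values and all dev_*() helpers are hypothetical, a plain array stands
      in for event_wq, and a boolean stands in for sbp_lock.
      
      ```c
      #include <stdbool.h>
      #include <stddef.h>
      
      #define MAX_EVENTS 16
      
      /* Hypothetical event kinds: an address event, or "notifications stopped". */
      enum ev_type { EV_ADDR, EV_STOPPED };
      
      struct dev {
          bool events_enabled;            /* the fence checked by the event handler */
          bool lock_held;                 /* stands in for sbp_lock */
          enum ev_type queue[MAX_EVENTS]; /* stands in for event_wq */
          size_t queued;
          size_t handled;
          size_t discarded;
      };
      
      /* Event handler: once fenced off, nothing new reaches the queue. */
      static bool dev_event_arrived(struct dev *d, enum ev_type t)
      {
          if (!d->events_enabled || d->queued == MAX_EVENTS)
              return false;
          d->queue[d->queued++] = t;
          return true;
      }
      
      /* Worker body for one event. Returns false if the event must be requeued. */
      static bool dev_process_one(struct dev *d, enum ev_type t)
      {
          if (!d->events_enabled) {       /* feature disabled: stale event, drop it */
              d->discarded++;
              return true;
          }
          if (t == EV_STOPPED) {
              if (d->lock_held)           /* trylock failed: never block here */
                  return false;
              d->lock_held = true;
              d->events_enabled = false;  /* record that notifications stopped */
              d->lock_held = false;
          }
          d->handled++;
          return true;
      }
      
      /* One pass over the queue; returns how many events were requeued. */
      static size_t dev_run_pending(struct dev *d)
      {
          size_t n = d->queued;
          size_t requeued = 0;
      
          d->queued = 0;
          for (size_t i = 0; i < n; i++) {
              enum ev_type t = d->queue[i];
      
              if (!dev_process_one(d, t)) {
                  d->queue[d->queued++] = t;
                  requeued++;
              }
          }
          return requeued;
      }
      
      /* Disable path: fence first, then flush while still holding the lock. */
      static void dev_disable_events(struct dev *d)
      {
          d->lock_held = true;
          d->events_enabled = false;      /* 1. fence off further events */
          dev_run_pending(d);             /* 2. flush: workers only discard now */
          d->lock_held = false;
      }
      ```
      
      Because the worker checks the fence before it tries to take the lock,
      the flush in dev_disable_events() cannot wait on a worker that in turn
      waits for the lock held by the flusher.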
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • s390/qeth: unify structs for bridge port state · 65b0494e
      Julian Wiedmann authored
      The data returned from IPA_SBP_QUERY_BRIDGE_PORTS and
      IPA_SBP_BRIDGE_PORT_STATE_CHANGE has the same format. Use a single
      struct definition for it.
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • s390/qeth: copy less data from bridge state events · 61c6f217
      Julian Wiedmann authored
      Current code copies _all_ entries from the event into a worker, when we
      later only need specific data from the first entry.
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • s390/qeth: don't let HW override the configured port role · a04f0eca
      Julian Wiedmann authored
      The only time that our Bridgeport role should change is when we change
      the configuration ourselves. In that case we also adjust our internal
      state tracking, so there is no need to do it again when we receive the
      corresponding event.
      
      Removing the locked section helps a subsequent patch that needs to flush
      the workqueue while under sbp_lock.
      
      It would be nice to raise a warning here in case HW does weird things
      after all, but this could end up generating false-positives when we
      change the configuration ourselves.
      Suggested-by: Alexandra Winter <wintera@linux.ibm.com>
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • s390/qeth: don't disable address events during initialization · 16379503
      Julian Wiedmann authored
      A newly initialized device is disabled for address events, so there's
      no need to explicitly disable them.
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • s390/qeth: make queue lock a proper spinlock · a1668474
      Julian Wiedmann authored
      queue->state is a ternary spinlock in disguise, used by
      OSA's TX completion path to lock the Output Queue and flush any pending
      packets on it to the device. If the Queue is already locked by our TX
      code, setting the lock word to QETH_OUT_Q_LOCKED_FLUSH lets the TX
      completion code move on - the TX path will later take care of things
      when it unlocks the Queue.
      
      This sort of DIY locking is a non-starter of course, just let the
      TX completion path block on the spinlock when necessary. If that ends up
      causing additional latency due to lock contention, then converting
      the OSA path to use xmit_more is the right way to go forward.
      
      Also slightly expand the locked section and capture all of
      qeth_do_send_packet(), so that the update for the 'bufs_pack' statistics
      is done race-free.
      
      While reworking the TX completion path's code, remove a barrier() that
      doesn't make any sense.
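      
      To illustrate what a three-state lock word of this kind can look like,
      here is a user-space sketch built on C11 atomics. It is a simplified
      model and an assumption on my part, not the code that this patch
      removes: struct out_queue, the Q_* constants and the q_*() helpers are
      hypothetical names.
      
      ```c
      #include <stdatomic.h>
      
      /* Hypothetical lock-word values; LOCKED_FLUSH means "locked, and the
       * owner has been asked to flush before it releases the queue". */
      enum { Q_UNLOCKED = 0, Q_LOCKED = 1, Q_LOCKED_FLUSH = 2 };
      
      struct out_queue {
          atomic_int state;
          int pending;    /* packets waiting to be flushed */
          int flushed;    /* packets handed to the device so far */
      };
      
      static void q_flush(struct out_queue *q)
      {
          q->flushed += q->pending;
          q->pending = 0;
      }
      
      /* TX path: spin until the lock word goes from UNLOCKED to LOCKED. */
      static void q_tx_lock(struct out_queue *q)
      {
          int expected = Q_UNLOCKED;
      
          while (!atomic_compare_exchange_strong(&q->state, &expected, Q_LOCKED))
              expected = Q_UNLOCKED;
      }
      
      /* TX path unlock: LOCKED_FLUSH -> LOCKED means a flush was requested
       * while we held the queue, so flush and retry; LOCKED -> UNLOCKED
       * releases the queue. */
      static void q_tx_unlock(struct out_queue *q)
      {
          while (atomic_fetch_sub(&q->state, 1) - 1 != Q_UNLOCKED)
              q_flush(q);
      }
      
      /* TX completion path: never waits. Either it grabs the queue itself,
       * or it leaves a flush request behind for the current owner. */
      static void q_completion(struct out_queue *q)
      {
          if (atomic_exchange(&q->state, Q_LOCKED_FLUSH) == Q_UNLOCKED) {
              q_flush(q);
              atomic_store(&q->state, Q_UNLOCKED);
          }
      }
      ```
      
      With an ordinary lock, q_completion() would simply block until the TX
      path is done and then flush, which is the behaviour the commit message
      asks for.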
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • s390/qeth: use to_delayed_work() · beaadcc6
      Julian Wiedmann authored
      Avoid poking around in the delayed_work struct's internals.
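      
      For readers unfamiliar with the helper: in the kernel,
      to_delayed_work() is a thin container_of() wrapper that maps a
      work_struct pointer back to its enclosing delayed_work. The user-space
      mock below shows the idea. The struct layouts are simplified
      stand-ins, and struct my_event and my_event_id_from_work() are
      hypothetical.
      
      ```c
      #include <stddef.h>
      
      /* Simplified stand-ins for the kernel structures. */
      struct work_struct { int pending; };
      struct timer_stub { unsigned long expires; };
      
      struct delayed_work {
          struct work_struct work;
          struct timer_stub timer;
      };
      
      #define container_of(ptr, type, member) \
          ((type *)((char *)(ptr) - offsetof(type, member)))
      
      /* Same shape as the kernel helper: no knowledge of the layout leaks
       * into the caller. */
      static inline struct delayed_work *to_delayed_work(struct work_struct *work)
      {
          return container_of(work, struct delayed_work, work);
      }
      
      /* Hypothetical driver object that embeds a delayed_work. */
      struct my_event {
          int id;
          struct delayed_work dwork;
      };
      
      /* A worker callback only receives the work_struct pointer. */
      static int my_event_id_from_work(struct work_struct *work)
      {
          struct delayed_work *dwork = to_delayed_work(work);
          struct my_event *ev = container_of(dwork, struct my_event, dwork);
      
          return ev->id;
      }
      ```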
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • s390/qeth: clean up qeth_l3_send_setdelmc()'s declaration · b14912eb
      Julian Wiedmann authored
      Clarify that the 'ipacmd' parameter is an enum, and thus compatible with
      what qeth_ipa_alloc_cmd() expects as input.
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: fix use-after-free in tipc_bcast_get_mode · fdeba99b
      Hoang Huu Le authored
      Syzbot has reported the following issues:
      
      ==================================================================
      BUG: KASAN: use-after-free in tipc_bcast_get_mode+0x3ab/0x400 net/tipc/bcast.c:759
      Read of size 1 at addr ffff88805e6b3571 by task kworker/0:6/3850
      
      CPU: 0 PID: 3850 Comm: kworker/0:6 Not tainted 5.8.0-rc7-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Workqueue: events tipc_net_finalize_work
      
      Thread 1's call trace:
      [...]
        kfree+0x103/0x2c0 mm/slab.c:3757 <- bcbase releasing
        tipc_bcast_stop+0x1b0/0x2f0 net/tipc/bcast.c:721
        tipc_exit_net+0x24/0x270 net/tipc/core.c:112
      [...]
      
      Thread 2's call trace:
      [...]
        tipc_bcast_get_mode+0x3ab/0x400 net/tipc/bcast.c:759 <- bcbase has already been freed by Thread 1
        tipc_node_broadcast+0x9e/0xcc0 net/tipc/node.c:1744
        tipc_nametbl_publish+0x60b/0x970 net/tipc/name_table.c:752
        tipc_net_finalize net/tipc/net.c:141 [inline]
        tipc_net_finalize+0x1fa/0x310 net/tipc/net.c:131
        tipc_net_finalize_work+0x55/0x80 net/tipc/net.c:150
      [...]
      
      ==================================================================
      BUG: KASAN: use-after-free in tipc_named_reinit+0xef/0x290 net/tipc/name_distr.c:344
      Read of size 8 at addr ffff888052ab2000 by task kworker/0:13/30628
      CPU: 0 PID: 30628 Comm: kworker/0:13 Not tainted 5.8.0-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Workqueue: events tipc_net_finalize_work
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x1f0/0x31e lib/dump_stack.c:118
       print_address_description+0x66/0x5a0 mm/kasan/report.c:383
       __kasan_report mm/kasan/report.c:513 [inline]
       kasan_report+0x132/0x1d0 mm/kasan/report.c:530
       tipc_named_reinit+0xef/0x290 net/tipc/name_distr.c:344
       tipc_net_finalize+0x85/0xe0 net/tipc/net.c:138
       tipc_net_finalize_work+0x50/0x70 net/tipc/net.c:150
       process_one_work+0x789/0xfc0 kernel/workqueue.c:2269
       worker_thread+0xaa4/0x1460 kernel/workqueue.c:2415
       kthread+0x37e/0x3a0 drivers/block/aoe/aoecmd.c:1234
       ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293
      [...]
      Freed by task 14058:
       save_stack mm/kasan/common.c:48 [inline]
       set_track mm/kasan/common.c:56 [inline]
       kasan_set_free_info mm/kasan/common.c:316 [inline]
       __kasan_slab_free+0x114/0x170 mm/kasan/common.c:455
       __cache_free mm/slab.c:3426 [inline]
       kfree+0x10a/0x220 mm/slab.c:3757
       tipc_exit_net+0x29/0x50 net/tipc/core.c:113
       ops_exit_list net/core/net_namespace.c:186 [inline]
       cleanup_net+0x708/0xba0 net/core/net_namespace.c:603
       process_one_work+0x789/0xfc0 kernel/workqueue.c:2269
       worker_thread+0xaa4/0x1460 kernel/workqueue.c:2415
       kthread+0x37e/0x3a0 drivers/block/aoe/aoecmd.c:1234
       ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293
      
      Fix it by calling flush_scheduled_work() to make sure that
      tipc_net_finalize_work() has finished before the bcbase object is
      released.
      
      Reported-by: syzbot+6ea1f7a8df64596ef4d7@syzkaller.appspotmail.com
      Reported-by: syzbot+e9cc557752ab126c1b99@syzkaller.appspotmail.com
      Acked-by: Jon Maloy <jmaloy@redhat.com>
      Signed-off-by: Hoang Huu Le <hoang.h.le@dektech.com.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'Move-MDIO-drivers-into-their-own-directory' · ef734763
      David S. Miller authored
      Andrew Lunn says:
      
      ====================
      Move MDIO drivers into their own directory
      
      The phy subdirectory is getting cluttered. It has both PHY drivers and
      MDIO drivers, plus a stray switch driver. Soon more PCS drivers are
      likely to appear.
      
      Move MDIO and PCS drivers into new directories. This requires fixing
      up the xgene driver which uses a relative include path.
      
      v2:
      Move the subdirs to drivers/net, rather than drivers/net/phy.
      
      v3:
      Add subdirectories under include/linux for mdio and pcs
      
      v4:
      there->their
      include path fix
      No new kconfig prompts
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: Sort Kconfig and Makefile · 0457eb26
      Andrew Lunn authored
      Sort the Kconfig based on the text shown in make menuconfig and sort
      the Makefile by CONFIG symbol.
      Acked-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mdio: Move MDIO drivers into a new subdirectory · a9770eac
      Andrew Lunn authored
      Move all the MDIO drivers and multiplexers into drivers/net/mdio.  The
      mdio core is however left in the phy directory, due to mutual
      dependencies between the MDIO core and the PHY core.
      
      Take this opportunity to sort the Kconfig based on the menuconfig
      strings, and move the multiplexers to the end with a separating
      comment.
      
      v2:
      Fix typo in commit message
      Acked-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: xgene: Move shared header file into include/linux · 232e15e1
      Andrew Lunn authored
      This header file is currently included into the ethernet driver via a
      relative path into the PHY subsystem. This is bad practice, and causes
      issues for the upcoming move of the MDIO driver. Move the header file
      into include/linux to clean this up.
      
      v2:
      Move header to include/linux/mdio
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/phy/mdio-i2c: Move header file to include/linux/mdio · fcba68bd
      Andrew Lunn authored
      In preparation for moving all MDIO drivers into drivers/net/mdio, move
      the mdio-i2c header file into include/linux/mdio so it can be used by
      both the MDIO driver and the SFP code which instantiates I2C MDIO
      busses.
      
      v2:
      Add include/linux/mdio
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: pcs: Move XPCS into new PCS subdirectory · 2fa4e4b7
      Andrew Lunn authored
      Create drivers/net/pcs and move the Synopsys DesignWare XPCS into the
      new directory. Move the header file into a subdirectory,
      include/linux/pcs.
      
      Start a naming convention in which all PCS files use the prefix pcs-,
      and rename the XPCS files to fit.
      
      v2:
      Add include/linux/pcs
      
      v4:
      Fix include path in stmmac.
      Remove PCS_DEVICES to avoid new prompts
      
      Cc: Jose Abreu <Jose.Abreu@synopsys.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 26 Aug, 2020 24 commits
    • Merge branch 'drivers-net-constify-static-ops-variables' · f0966581
      David S. Miller authored
      Rikard Falkeborn says:
      
      ====================
      drivers/net: constify static ops-variables
      
      This series constifies a number of static ops variables, to allow the
      compiler to put them in read-only memory. Compile-tested only.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ath11k: constify ath11k_thermal_ops · 31ffcb10
      Rikard Falkeborn authored
      The only usage of ath11k_thermal_ops is to pass its address to
      thermal_cooling_device_register() which takes a const pointer. Make it
      const to allow the compiler to put it in read-only memory.
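      
      The constification pattern that this series applies can be sketched
      in plain C as follows. All names here (struct demo_ops,
      demo_default_ops, demo_register() and so on) are made up for
      illustration and do not correspond to any kernel API. The point is
      only that a static ops table can be declared const when every
      consumer takes a const pointer.
      
      ```c
      #include <stddef.h>
      
      /* Hypothetical ops table: a bundle of function pointers. */
      struct demo_ops {
          int (*get_state)(const void *priv);
          int (*set_state)(void *priv, int state);
      };
      
      /* The consumer only ever stores a pointer to const ops. */
      struct demo_device {
          const struct demo_ops *ops;
          void *priv;
      };
      
      static int demo_get(const void *priv)
      {
          return *(const int *)priv;
      }
      
      static int demo_set(void *priv, int state)
      {
          *(int *)priv = state;
          return 0;
      }
      
      /* static + const: nothing writes to the table after initialization,
       * so the compiler is free to place it in read-only memory. */
      static const struct demo_ops demo_default_ops = {
          .get_state = demo_get,
          .set_state = demo_set,
      };
      
      /* Registration takes a const pointer, so passing a const table
       * needs no cast. */
      static void demo_register(struct demo_device *dev,
                                const struct demo_ops *ops, void *priv)
      {
          dev->ops = ops;
          dev->priv = priv;
      }
      ```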
      Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: mscc: macsec: constify vsc8584_macsec_ops · 73a9df4c
      Rikard Falkeborn authored
      The only usage of vsc8584_macsec_ops is to assign its address to the
      macsec_ops field in the phydev struct, which is a const pointer. Make it
      const to allow the compiler to put it in read-only memory.
      Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: at803x: constify static regulator_ops · 3faaf539
      Rikard Falkeborn authored
      The only usage of vddio_regulator_ops and vddh_regulator_ops is to
      assign their address to the ops field in the regulator_desc struct,
      which is a const pointer. Make them const to allow the compiler to
      put them in read-only memory.
      Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: renesas: sh_eth: constify bb_ops · b968a44f
      Rikard Falkeborn authored
      The only usage of bb_ops is to assign its address to the ops field in
      the mdiobb_ctrl struct, which is a const pointer. Make it const to allow
      the compiler to put it in read-only memory.
      Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: ravb: constify bb_ops · 3ab4519a
      Rikard Falkeborn authored
      The only usage of bb_ops is to assign its address to the ops field in
      the mdiobb_ctrl struct, which is a const pointer. Make it const to allow
      the compiler to put it in read-only memory.
      Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: qualcomm: constify qca_serdev_ops · 715d0871
      Rikard Falkeborn authored
      The only usage of qca_serdev_ops is to pass its address to
      serdev_device_set_client_ops() which takes a const pointer. Make it
      const to allow the compiler to put it in read-only memory.
      Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ipa: remove duplicate include · d6fc1923
      Wang Hai authored
      Remove linux/notifier.h which is included more than once
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: Wang Hai <wanghai38@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'refactoring-of-ibmvnic-code' · 8396fb8d
      David S. Miller authored
      Lijun Pan says:
      
      ====================
      refactoring of ibmvnic code
      
      This patch series refactors the reset_init and init functions,
      and makes some other cosmetic changes to make the code
      easier to read and debug. v2 removes __func__ and v1's 1/5.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ibmvnic: merge ibmvnic_reset_init and ibmvnic_init · 635e442f
      Lijun Pan authored
      These two functions share the majority of their code, hence merge
      them. At the same time, add a 'reset' parameter to differentiate
      the two cases. This makes the code easier to read, and the
      difference between reset_init and regular init easier to see.
      Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ibmvnic: remove never executed if statement · 550f4d46
      Lijun Pan authored
      At the beginning of the function, from_passive_init is set false by
      "adapter->from_passive_init = false;",
      hence the if statement will never run.
      Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ibmvnic: improve ibmvnic_init and ibmvnic_reset_init · fa68bfab
      Lijun Pan authored
      When H_SEND_CRQ command returns with H_CLOSED, it means the
      server's CRQ is not ready yet. Instead of resetting immediately,
      we wait for the server to launch passive init.
      ibmvnic_init() and ibmvnic_reset_init() should also return the
      error code from ibmvnic_send_crq_init() call.
      Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ibmvnic: compare adapter->init_done_rc with more readable ibmvnic_rc_codes · 4c5f6af0
      Lijun Pan authored
      Instead of comparing (adapter->init_done_rc == 1), let it
      be (adapter->init_done_rc == PARTIALSUCCESS).
      Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'ipv4-nexthop-Various-improvements' · bf82d565
      David S. Miller authored
      Ido Schimmel says:
      
      ====================
      ipv4: nexthop: Various improvements
      
      This patch set contains various improvements that I made to the nexthop
      object code while studying it towards my upcoming changes.
      
      While patches #4 and #6 fix bugs, they are not regressions (never
      worked). They also do not strike me as critical issues, which is why I
      am targeting them at net-next.
      
      Tested with fib_nexthops.sh:
      
      Tests passed: 134
      Tests failed:   0
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests: fib_nexthops: Test IPv6 route with group after replacing IPv4 nexthops · 041bc0dc
      Ido Schimmel authored
      Test that an IPv6 route can not use a nexthop group with mixed IPv4 and
      IPv6 nexthops, but can use it after replacing the IPv4 nexthops with
      IPv6 nexthops.
      
      Output without previous patch:
      
      # ./fib_nexthops.sh -t ipv6_fcnal_runtime
      
      IPv6 functional runtime
      -----------------------
      TEST: Route add                                                     [ OK ]
      TEST: Route delete                                                  [ OK ]
      TEST: Ping with nexthop                                             [ OK ]
      TEST: Ping - multipath                                              [ OK ]
      TEST: Ping - blackhole                                              [ OK ]
      TEST: Ping - blackhole replaced with gateway                        [ OK ]
      TEST: Ping - gateway replaced by blackhole                          [ OK ]
      TEST: Ping - group with blackhole                                   [ OK ]
      TEST: Ping - group blackhole replaced with gateways                 [ OK ]
      TEST: IPv6 route with device only nexthop                           [ OK ]
      TEST: IPv6 multipath route with nexthop mix - dev only + gw         [ OK ]
      TEST: IPv6 route can not have a v4 gateway                          [ OK ]
      TEST: Nexthop replace - v6 route, v4 nexthop                        [ OK ]
      TEST: Nexthop replace of group entry - v6 route, v4 nexthop         [ OK ]
      TEST: IPv6 route can not have a group with v4 and v6 gateways       [ OK ]
      TEST: IPv6 route can not have a group with v4 and v6 gateways       [ OK ]
      TEST: IPv6 route using a group after removing v4 gateways           [ OK ]
      TEST: IPv6 route can not have a group with v4 and v6 gateways       [ OK ]
      TEST: IPv6 route can not have a group with v4 and v6 gateways       [ OK ]
      TEST: IPv6 route using a group after replacing v4 gateways          [FAIL]
      TEST: Nexthop with default route and rpfilter                       [ OK ]
      TEST: Nexthop with multipath default route and rpfilter             [ OK ]
      
      Tests passed:  21
      Tests failed:   1
      
      Output with previous patch:
      
      # ./fib_nexthops.sh -t ipv6_fcnal_runtime
      
      IPv6 functional runtime
      -----------------------
      TEST: Route add                                                     [ OK ]
      TEST: Route delete                                                  [ OK ]
      TEST: Ping with nexthop                                             [ OK ]
      TEST: Ping - multipath                                              [ OK ]
      TEST: Ping - blackhole                                              [ OK ]
      TEST: Ping - blackhole replaced with gateway                        [ OK ]
      TEST: Ping - gateway replaced by blackhole                          [ OK ]
      TEST: Ping - group with blackhole                                   [ OK ]
      TEST: Ping - group blackhole replaced with gateways                 [ OK ]
      TEST: IPv6 route with device only nexthop                           [ OK ]
      TEST: IPv6 multipath route with nexthop mix - dev only + gw         [ OK ]
      TEST: IPv6 route can not have a v4 gateway                          [ OK ]
      TEST: Nexthop replace - v6 route, v4 nexthop                        [ OK ]
      TEST: Nexthop replace of group entry - v6 route, v4 nexthop         [ OK ]
      TEST: IPv6 route can not have a group with v4 and v6 gateways       [ OK ]
      TEST: IPv6 route can not have a group with v4 and v6 gateways       [ OK ]
      TEST: IPv6 route using a group after removing v4 gateways           [ OK ]
      TEST: IPv6 route can not have a group with v4 and v6 gateways       [ OK ]
      TEST: IPv6 route can not have a group with v4 and v6 gateways       [ OK ]
      TEST: IPv6 route using a group after replacing v4 gateways          [ OK ]
      TEST: Nexthop with default route and rpfilter                       [ OK ]
      TEST: Nexthop with multipath default route and rpfilter             [ OK ]
      
      Tests passed:  22
      Tests failed:   0
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: nexthop: Correctly update nexthop group when replacing a nexthop · 885a3b15
      Ido Schimmel authored
      Each nexthop group contains an indication if it has IPv4 nexthops
      ('has_v4'). Its purpose is to prevent IPv6 routes from using groups with
      IPv4 nexthops.
      
      However, the indication is not updated when a nexthop is replaced. This
      results in the kernel wrongly rejecting IPv6 routes from pointing to
      groups that only contain IPv6 nexthops. Example:
      
      # ip nexthop replace id 1 via 192.0.2.2 dev dummy10
      # ip nexthop replace id 10 group 1
      # ip nexthop replace id 1 via 2001:db8:1::2 dev dummy10
      # ip route replace 2001:db8:10::/64 nhid 10
      Error: IPv6 routes can not use an IPv4 nexthop.
      
      Solve this by iterating over all the nexthop groups that the replaced
      nexthop is a member of and potentially update their IPv4 indication
      according to the new set of member nexthops.
      
      Avoid wasting cycles by only performing the update in case an IPv4
      nexthop is replaced by an IPv6 nexthop.
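      
      The update described above can be modelled in a few lines of
      user-space C. This is a sketch under stated assumptions rather than
      the kernel implementation: struct nh, struct nh_group and the
      nh_*() helpers are hypothetical, and the caller is assumed to pass
      in the list of groups that the replaced nexthop belongs to. Like
      the commit message, it only covers the IPv4 to IPv6 transition.
      
      ```c
      #include <stdbool.h>
      #include <stddef.h>
      
      enum nh_family { NH_V4, NH_V6 };
      
      struct nh {
          int id;
          enum nh_family family;
      };
      
      struct nh_group {
          struct nh **members;
          size_t num;
          bool has_v4;    /* cached: does any member use an IPv4 gateway? */
      };
      
      /* Recompute the cached indication from the current member set. */
      static void nh_group_update_v4(struct nh_group *g)
      {
          bool has_v4 = false;
      
          for (size_t i = 0; i < g->num; i++) {
              if (g->members[i]->family == NH_V4) {
                  has_v4 = true;
                  break;
              }
          }
          g->has_v4 = has_v4;
      }
      
      /* Replace a nexthop in place. Groups are only rescanned when an
       * IPv4 nexthop becomes an IPv6 one, which is the case that can
       * clear has_v4. Other transitions are outside this sketch. */
      static void nh_replace(struct nh *nh, enum nh_family new_family,
                             struct nh_group **groups, size_t num_groups)
      {
          enum nh_family old_family = nh->family;
      
          nh->family = new_family;
          if (old_family != NH_V4 || new_family != NH_V6)
              return;
          for (size_t i = 0; i < num_groups; i++)
              nh_group_update_v4(groups[i]);
      }
      
      /* The check an IPv6 route would apply before using the group. */
      static bool ipv6_route_can_use(const struct nh_group *g)
      {
          return !g->has_v4;
      }
      ```
      
      Without the rescan in nh_replace(), has_v4 would stay set after the
      replace and the IPv6 route from the example would still be rejected.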
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests: fib_nexthops: Test IPv6 route with group after removing IPv4 nexthops · 05290a27
      Ido Schimmel authored
      Test that an IPv6 route can not use a nexthop group with mixed IPv4 and
      IPv6 nexthops, but can use it after deleting the IPv4 nexthops.
      
      Output without previous patch:
      
      # ./fib_nexthops.sh -t ipv6_fcnal_runtime
      
      IPv6 functional runtime
      -----------------------
      TEST: Route add                                                     [ OK ]
      TEST: Route delete                                                  [ OK ]
      TEST: Ping with nexthop                                             [ OK ]
      TEST: Ping - multipath                                              [ OK ]
      TEST: Ping - blackhole                                              [ OK ]
      TEST: Ping - blackhole replaced with gateway                        [ OK ]
      TEST: Ping - gateway replaced by blackhole                          [ OK ]
      TEST: Ping - group with blackhole                                   [ OK ]
      TEST: Ping - group blackhole replaced with gateways                 [ OK ]
      TEST: IPv6 route with device only nexthop                           [ OK ]
      TEST: IPv6 multipath route with nexthop mix - dev only + gw         [ OK ]
      TEST: IPv6 route can not have a v4 gateway                          [ OK ]
      TEST: Nexthop replace - v6 route, v4 nexthop                        [ OK ]
      TEST: Nexthop replace of group entry - v6 route, v4 nexthop         [ OK ]
      TEST: IPv6 route can not have a group with v4 and v6 gateways       [ OK ]
      TEST: IPv6 route can not have a group with v4 and v6 gateways       [ OK ]
      TEST: IPv6 route using a group after deleting v4 gateways           [FAIL]
      TEST: Nexthop with default route and rpfilter                       [ OK ]
      TEST: Nexthop with multipath default route and rpfilter             [ OK ]
      
      Tests passed:  18
      Tests failed:   1
      
      Output with previous patch:
      
      # ./fib_nexthops.sh -t ipv6_fcnal_runtime
      
      IPv6 functional runtime
      -----------------------
      TEST: Route add                                                     [ OK ]
      TEST: Route delete                                                  [ OK ]
      TEST: Ping with nexthop                                             [ OK ]
      TEST: Ping - multipath                                              [ OK ]
      TEST: Ping - blackhole                                              [ OK ]
      TEST: Ping - blackhole replaced with gateway                        [ OK ]
      TEST: Ping - gateway replaced by blackhole                          [ OK ]
      TEST: Ping - group with blackhole                                   [ OK ]
      TEST: Ping - group blackhole replaced with gateways                 [ OK ]
      TEST: IPv6 route with device only nexthop                           [ OK ]
      TEST: IPv6 multipath route with nexthop mix - dev only + gw         [ OK ]
      TEST: IPv6 route can not have a v4 gateway                          [ OK ]
      TEST: Nexthop replace - v6 route, v4 nexthop                        [ OK ]
      TEST: Nexthop replace of group entry - v6 route, v4 nexthop         [ OK ]
      TEST: IPv6 route can not have a group with v4 and v6 gateways       [ OK ]
      TEST: IPv6 route can not have a group with v4 and v6 gateways       [ OK ]
      TEST: IPv6 route using a group after deleting v4 gateways           [ OK ]
      TEST: Nexthop with default route and rpfilter                       [ OK ]
      TEST: Nexthop with multipath default route and rpfilter             [ OK ]
      
      Tests passed:  19
      Tests failed:   0
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      05290a27
    • Ido Schimmel's avatar
      ipv4: nexthop: Correctly update nexthop group when removing a nexthop · 863b2558
      Ido Schimmel authored
      Each nexthop group contains an indication if it has IPv4 nexthops
      ('has_v4'). Its purpose is to prevent IPv6 routes from using groups with
      IPv4 nexthops.
      
      However, the indication is not updated when a nexthop is removed. This
      results in the kernel wrongly rejecting IPv6 routes from pointing to
      groups that only contain IPv6 nexthops. Example:
      
      # ip nexthop replace id 1 via 192.0.2.2 dev dummy10
      # ip nexthop replace id 2 via 2001:db8:1::2 dev dummy10
      # ip nexthop replace id 10 group 1/2
      # ip nexthop del id 1
      # ip route replace 2001:db8:10::/64 nhid 10
      Error: IPv6 routes can not use an IPv4 nexthop.
      
      Solve this by updating the indication according to the new set of
      member nexthops.
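
      The fix described above can be illustrated with a small userspace model.
      This is only a sketch: the types and helpers below (nh_group_model,
      nh_group_add(), nh_group_remove(), nh_group_usable_by_v6_route()) are
      hypothetical stand-ins, not the kernel's code. The point is that the
      indication is recomputed from the remaining members on every removal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
enum nh_family { NH_FAMILY_V4, NH_FAMILY_V6 };

struct nh_member {
	int id;
	enum nh_family family;
};

#define NH_GROUP_MAX 8

struct nh_group_model {
	size_t num_nh;
	bool has_v4;	/* true if any member is an IPv4 nexthop */
	struct nh_member members[NH_GROUP_MAX];
};

/* Recompute the indication from the current set of members. */
static void nh_group_update_has_v4(struct nh_group_model *g)
{
	size_t i;

	g->has_v4 = false;
	for (i = 0; i < g->num_nh; i++) {
		if (g->members[i].family == NH_FAMILY_V4) {
			g->has_v4 = true;
			break;
		}
	}
}

/* Append a member and refresh the indication. */
static void nh_group_add(struct nh_group_model *g, int id,
			 enum nh_family family)
{
	assert(g->num_nh < NH_GROUP_MAX);
	g->members[g->num_nh].id = id;
	g->members[g->num_nh].family = family;
	g->num_nh++;
	nh_group_update_has_v4(g);
}

/* Remove member 'id' if present, then refresh the indication. */
static void nh_group_remove(struct nh_group_model *g, int id)
{
	size_t i, j = 0;

	for (i = 0; i < g->num_nh; i++) {
		if (g->members[i].id != id)
			g->members[j++] = g->members[i];
	}
	g->num_nh = j;
	nh_group_update_has_v4(g);
}

/* An IPv6 route may only use a group without IPv4 nexthops. */
static bool nh_group_usable_by_v6_route(const struct nh_group_model *g)
{
	return !g->has_v4;
}
```

      In this model, a group built from one IPv4 and one IPv6 member becomes
      usable by an IPv6 route once the IPv4 member is removed, mirroring the
      command sequence in the example.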
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      863b2558
    • Ido Schimmel's avatar
      ipv4: nexthop: Remove unnecessary rtnl_dereference() · 233c6378
      Ido Schimmel authored
      The pointer is not RCU protected, so remove the unnecessary
      rtnl_dereference(). This suppresses the following warning:
      
      net/ipv4/nexthop.c:1101:24: error: incompatible types in comparison expression (different address spaces):
      net/ipv4/nexthop.c:1101:24:    struct rb_node [noderef] __rcu *
      net/ipv4/nexthop.c:1101:24:    struct rb_node *
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      233c6378
    • Ido Schimmel's avatar
      ipv4: nexthop: Use nla_put_be32() for NHA_GATEWAY · 33d80996
      Ido Schimmel authored
      The code correctly uses nla_get_be32() to get the payload of the
      attribute, but incorrectly uses nla_put_u32() to add the attribute to
      the payload. This results in the following warning:
      
      net/ipv4/nexthop.c:279:59: warning: incorrect type in argument 3 (different base types)
      net/ipv4/nexthop.c:279:59:    expected unsigned int [usertype] value
      net/ipv4/nexthop.c:279:59:    got restricted __be32 [usertype] ipv4
      
      Suppress the warning by using nla_put_be32().
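
      As a rough userspace analogy (an assumption-laden sketch, not the
      netlink API), the idea behind the endian annotation can be modelled with
      a wrapper type that keeps big-endian values apart from host-order
      integers. The names be32_sketch, be32_from_host(), be32_to_host(),
      put_be32() and get_be32() are hypothetical; sparse enforces the same
      separation through the __be32 annotation rather than a struct.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical wrapper: a 32-bit value stored in network byte order. */
typedef struct {
	uint32_t raw;
} be32_sketch;

static be32_sketch be32_from_host(uint32_t v)
{
	be32_sketch b = { htonl(v) };

	return b;
}

static uint32_t be32_to_host(be32_sketch b)
{
	return ntohl(b.raw);
}

/* Writer and reader agree on the type, so the byte order cannot be mixed up. */
static void put_be32(uint8_t *buf, be32_sketch v)
{
	assert(buf != NULL);
	memcpy(buf, &v.raw, sizeof(v.raw));
}

static be32_sketch get_be32(const uint8_t *buf)
{
	be32_sketch v;

	assert(buf != NULL);
	memcpy(&v.raw, buf, sizeof(v.raw));
	return v;
}
```

      Passing a plain uint32_t to put_be32() in this sketch fails to compile,
      which is the kind of mismatch the sparse warning above reports.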
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      33d80996
    • Ido Schimmel's avatar
      ipv4: nexthop: Reduce allocation size of 'struct nh_group' · d7d49dc7
      Ido Schimmel authored
      The struct looks as follows:
      
      struct nh_group {
      	struct nh_group		*spare; /* spare group for removals */
      	u16			num_nh;
      	bool			mpath;
      	bool			fdb_nh;
      	bool			has_v4;
      	struct nh_grp_entry	nh_entries[];
      };
      
      But its offset within 'struct nexthop' is also taken into account to
      determine the allocation size.
      
      Instead, use struct_size() to allocate only the required number of
      bytes.
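
      For illustration, the size computation can be sketched in userspace as
      below. This is not the kernel code: nh_group_sketch, nh_grp_entry_sketch,
      nh_group_size() and nh_group_alloc() are hypothetical names, and the
      entry layout is invented. The sketch mirrors what struct_size() computes
      for a struct ending in a flexible array: the header size plus the
      per-entry size times the entry count, saturating at SIZE_MAX on overflow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical entry type; the real 'struct nh_grp_entry' differs. */
struct nh_grp_entry_sketch {
	int id;
	uint8_t weight;
};

struct nh_group_sketch {
	struct nh_group_sketch *spare;
	uint16_t num_nh;
	bool mpath;
	bool fdb_nh;
	bool has_v4;
	struct nh_grp_entry_sketch nh_entries[];	/* flexible array */
};

/* Header plus num_nh entries; returns SIZE_MAX if that would overflow. */
static size_t nh_group_size(size_t num_nh)
{
	const size_t base = sizeof(struct nh_group_sketch);
	const size_t entry = sizeof(struct nh_grp_entry_sketch);

	if (num_nh > (SIZE_MAX - base) / entry)
		return SIZE_MAX;
	return base + num_nh * entry;
}

/* Allocate a zeroed group with exactly the bytes it needs. */
static struct nh_group_sketch *nh_group_alloc(size_t num_nh)
{
	struct nh_group_sketch *g;

	assert(num_nh <= UINT16_MAX);
	g = calloc(1, nh_group_size(num_nh));
	if (g)
		g->num_nh = (uint16_t)num_nh;
	return g;
}
```

      The allocation depends only on the group header and its entries; no
      offset inside an enclosing structure enters the calculation.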
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d7d49dc7
    • David S. Miller's avatar
      Merge branch 'net_prefetch-API' · 751e4251
      David S. Miller authored
      Tariq Toukan says:
      
      ====================
      net_prefetch API
      
      This patchset adds a common net API for L1 cacheline size-aware prefetch.
      
      Patch 1 introduces the common API in net and aligns the drivers to use it.
      Patches 2 and 3 add usage in mlx4 and mlx5 Eth drivers.
      
      Series generated against net-next commit:
      079f921e Merge tag 'batadv-next-for-davem-20200824' of git://git.open-mesh.org/linux-merge
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      751e4251
    • Tariq Toukan's avatar
      net/mlx4_en: RX, Add a prefetch command for small L1_CACHE_BYTES · aed4d4c6
      Tariq Toukan authored
      A single cacheline might not contain the packet header for
      small L1_CACHE_BYTES values.
      Use net_prefetch() as it issues an additional prefetch
      in this case.
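
      A minimal userspace sketch of the idea follows. It assumes a GCC or
      Clang compiler for __builtin_prefetch(); SKETCH_L1_CACHE_BYTES and
      net_prefetch_sketch() are hypothetical names, and the 128-byte threshold
      is an assumption about how large a span should cover the packet headers.
      The return value exists only so the behaviour can be observed; a real
      helper would return nothing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the architecture's L1 cacheline size. */
#ifndef SKETCH_L1_CACHE_BYTES
#define SKETCH_L1_CACHE_BYTES 64
#endif

/*
 * Prefetch the first cacheline of a buffer and, when cachelines are
 * small, the following one as well. Returns the number of prefetches
 * issued.
 */
static int net_prefetch_sketch(const void *p)
{
	int issued = 0;

	assert(p != NULL);
	__builtin_prefetch(p);
	issued++;
#if SKETCH_L1_CACHE_BYTES < 128
	/* One cacheline may not cover the headers: fetch the next one too. */
	__builtin_prefetch((const unsigned char *)p + SKETCH_L1_CACHE_BYTES);
	issued++;
#endif
	return issued;
}
```

      With the default 64-byte value the sketch issues two prefetches per
      buffer; building with -DSKETCH_L1_CACHE_BYTES=128 reduces that to one.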
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      aed4d4c6
    • Tariq Toukan's avatar
      net/mlx5e: RX, Add a prefetch command for small L1_CACHE_BYTES · e20f0dbf
      Tariq Toukan authored
      A single cacheline might not contain the packet header for
      small L1_CACHE_BYTES values.
      Use net_prefetch() as it issues an additional prefetch
      in this case.
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e20f0dbf