1. 05 Aug, 2021 5 commits
    • net: dsa: sja1105: configure the cascade ports based on topology · 30a100e6
      Vladimir Oltean authored
      The sja1105 switch family has a feature called "cascade ports" which can
      be used in topologies where multiple SJA1105/SJA1110 switches are daisy
      chained. Upstream switches set this bit for the DSA link towards the
      downstream switches. This is used when the upstream switch receives a
      control packet (PTP, STP) from a downstream switch, because if the
      source port for a control packet is marked as a cascade port, then the
      source port, switch ID and RX timestamp will not be taken again on the
      upstream switch; it is assumed that this has already been done by the
      downstream switch (the leaf port in the tree) and that the CPU has
      everything it needs to decode the information from this packet.
      
      We need to distinguish between an upstream-facing DSA link and a
      downstream-facing DSA link, because the upstream-facing DSA links are
      "host ports" for the SJA1105/SJA1110 switches, and the downstream-facing
      DSA links are "cascade ports".
      
      Note that SJA1105 supports a single cascade port, so only daisy chain
      topologies work. With SJA1110, there can be more complex topologies such
      as:
      
                          eth0
                           |
                       host port
                           |
       sw0p0    sw0p1    sw0p2    sw0p3    sw0p4
         |        |                 |        |
       cascade  cascade            user     user
        port     port              port     port
         |        |
         |        |
         |        |
         |       host
         |       port
         |        |
         |      sw1p0    sw1p1    sw1p2    sw1p3    sw1p4
         |                 |        |        |        |
         |                user     user     user     user
        host              port     port     port     port
        port
         |
       sw2p0    sw2p1    sw2p2    sw2p3    sw2p4
                  |        |        |        |
                 user     user     user     user
                 port     port     port     port
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: give preference to local CPU ports · 2c0b0325
      Vladimir Oltean authored
      Consider an "H" switch topology, where 2 switches are connected as
      follows:
      
               eth0                                                     eth1
                |                                                        |
             CPU port                                                CPU port
                |                        DSA link                        |
       sw0p0  sw0p1  sw0p2  sw0p3  sw0p4 -------- sw1p4  sw1p3  sw1p2  sw1p1  sw1p0
         |             |      |                            |      |             |
       user          user   user                         user   user          user
       port          port   port                         port   port          port
      
      basically one where each switch has its own CPU port for termination,
      but there is also a DSA link in case packets need to be forwarded in
      hardware between one switch and another.
      
      DSA insists on seeing this as a daisy chain topology, basically registering
      all network interfaces as sw0p0@eth0, ... sw1p0@eth0 and disregarding
      eth1 as a valid DSA master.
      
      This is only half the story, since when asked using dsa_port_is_cpu(),
      DSA will respond that sw1p1 is a CPU port, however one which has no
      dp->cpu_dp pointing to it. So sw1p1 is enabled, but not used.
      
      Furthermore, consider a driver for switches which support only one
      upstream port. This driver iterates through its ports and checks using
      dsa_is_upstream_port() whether the current port is an upstream one.
      For switch 1, two ports pass the "is upstream port" checks:
      
      - sw1p4 is an upstream port because it is a routing port towards the
        dedicated CPU port assigned using dsa_tree_setup_default_cpu()
      
      - sw1p1 is also an upstream port because it is a CPU port, albeit one
        that is disabled. This is because dsa_upstream_port() returns:
      
      	if (!cpu_dp)
      		return port;
      
        which means that if @dp does not have a ->cpu_dp pointer (which is a
        characteristic of CPU ports themselves as well as unused ports), then
        @dp is its own upstream port.
      
      So the driver for switch 1 rightfully says: I have two upstream ports,
      but I don't support multiple upstream ports! So let me error out, I
      don't know which one to choose and what to do with the other one.
      
      Generally I am against enforcing any default policy in the kernel in
      terms of user to CPU port assignment (like round robin or such) but this
      case is different. To solve the conundrum, one would have to:
      
      - Disable sw1p1 in the device tree or mark it as "not a CPU port" in
        order to comply with DSA's view of this topology as a daisy chain,
        where the termination traffic from switch 1 must pass through switch 0.
        This is counter-productive because it wastes 1Gbps of termination
        throughput in switch 1.
      - Disable the DSA link between sw0p4 and sw1p4 and do software
        forwarding between switch 0 and 1, and basically treat the switches as
        part of disjoint switch trees. This is counter-productive because it
        wastes 1Gbps of autonomous forwarding throughput between switch 0 and 1.
      - Treat sw0p4 and sw1p4 as user ports instead of DSA links. This could
        work, but it makes cross-chip bridging impossible. In this setup we
        would need to have 2 separate bridges, br0 spanning the ports of
        switch 0, and br1 spanning the ports of switch 1, and the "DSA links
        treated as user ports" sw0p4 (part of br0) and sw1p4 (part of br1) are
        the gateway ports between one bridge and another. This is hard to
        manage from a user's perspective, who wants to have a unified view of
        the switching fabric and the ability to transparently add ports to the
        same bridge. VLANs would also need to be explicitly managed by the
        user on these gateway ports.
      
      So it seems that the only reasonable thing to do is to make DSA prefer
      CPU ports that are local to the switch. Meaning that by default, the
      user and DSA ports of switch 0 will get assigned to the CPU port from
      switch 0 (sw0p1) and the user and DSA ports of switch 1 will get
      assigned to the CPU port from switch 1.
      
      The way this solves the problem is that sw1p4 is no longer an upstream
      port as far as switch 1 is concerned (it no longer views sw0p1 as its
      dedicated CPU port).
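
      The effect on the "is upstream port" check can be sketched with a toy
      model of the "H" topology. This is only an illustration: the toy_*
      types and helpers are hypothetical and are not the kernel's struct
      dsa_port or its helpers; the only logic borrowed from the text above
      is that a port without a ->cpu_dp is its own upstream port.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: these are NOT the kernel's struct dsa_port/dsa_switch. */
enum toy_type { TOY_USER, TOY_CPU, TOY_DSA };

struct toy_port {
	int sw;				/* index of the owning switch */
	int index;			/* port index within that switch */
	enum toy_type type;
	const struct toy_port *cpu_dp;	/* dedicated CPU port, NULL if none */
};

/* The "H" topology: two switches, CPU port on p1, DSA link on p4. */
static struct toy_port toy_ports[] = {
	{ 0, 0, TOY_USER, NULL }, { 0, 1, TOY_CPU, NULL }, { 0, 2, TOY_USER, NULL },
	{ 0, 3, TOY_USER, NULL }, { 0, 4, TOY_DSA, NULL },
	{ 1, 0, TOY_USER, NULL }, { 1, 1, TOY_CPU, NULL }, { 1, 2, TOY_USER, NULL },
	{ 1, 3, TOY_USER, NULL }, { 1, 4, TOY_DSA, NULL },
};
#define TOY_NPORTS	(sizeof(toy_ports) / sizeof(toy_ports[0]))
#define TOY_DSA_LINK	4	/* in this topology, the only route to the other switch */

/* First CPU port of switch @sw, or of the whole tree when @sw is negative. */
static const struct toy_port *toy_first_cpu(int sw)
{
	for (size_t i = 0; i < TOY_NPORTS; i++)
		if (toy_ports[i].type == TOY_CPU &&
		    (sw < 0 || toy_ports[i].sw == sw))
			return &toy_ports[i];
	return NULL;
}

/* Assign a dedicated CPU port to every user and DSA port. */
void toy_assign_cpu_ports(int prefer_local)
{
	for (size_t i = 0; i < TOY_NPORTS; i++) {
		struct toy_port *dp = &toy_ports[i];
		const struct toy_port *cpu = NULL;

		if (dp->type == TOY_CPU) {
			dp->cpu_dp = NULL;	/* CPU ports have no ->cpu_dp */
			continue;
		}
		if (prefer_local)
			cpu = toy_first_cpu(dp->sw);
		if (!cpu)
			cpu = toy_first_cpu(-1);	/* first CPU port of the tree */
		dp->cpu_dp = cpu;
	}
}

/* A port without ->cpu_dp is its own upstream port. */
static int toy_upstream_port(const struct toy_port *dp)
{
	if (!dp->cpu_dp)
		return dp->index;
	if (dp->cpu_dp->sw == dp->sw)
		return dp->cpu_dp->index;
	return TOY_DSA_LINK;
}

/* Number of ports of switch @sw that pass the "is upstream port" check. */
int toy_count_upstream_ports(int sw)
{
	int n = 0;

	for (size_t i = 0; i < TOY_NPORTS; i++)
		if (toy_ports[i].sw == sw &&
		    toy_upstream_port(&toy_ports[i]) == toy_ports[i].index)
			n++;
	return n;
}
```

      In this model, toy_assign_cpu_ports(0) reproduces the daisy chain view
      and switch 1 ends up with two upstream ports (p1 and p4), while
      toy_assign_cpu_ports(1) leaves it with p1 only.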
      
      So here we are, the first multi-CPU port that DSA supports is also
      perhaps the most uneventful one: the individual switches don't support
      multiple CPUs, however the DSA switch tree as a whole does have multiple
      CPU ports. No user space assignment of user ports to CPU ports is
      desirable, necessary, or possible.
      
      Ports that do not have a local CPU port (say there was an extra switch
      hanging off of sw0p0) default to the standard implementation of getting
      assigned to the first CPU port of the DSA switch tree. Is that good
      enough? Probably not (if the downstream switch was hanging off of switch
      1, we would most certainly prefer its CPU port to be sw1p1), but in
      order to support that use case too, we would need to traverse the
      dst->rtable in search of an optimum dedicated CPU port, one that has the
      smallest number of hops between dp->ds and dp->cpu_dp->ds. At the
      moment, the DSA routing table structure does not keep the number of hops
      between dl->dp and dl->link_dp, and while it is probably deducible,
      there is zero justification to write that code now. Let's hope DSA will
      never have to support that use case.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: rename teardown_default_cpu to teardown_cpu_ports · 0e8eb9a1
      Vladimir Oltean authored
      There is nothing specific to having a default CPU port to what
      dsa_tree_teardown_default_cpu() does. Even with multiple CPU ports,
      it would do the same thing: iterate through the ports of this switch
      tree and reset the ->cpu_dp pointer to NULL. So rename it accordingly.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ipa: fix IPA v4.9 interconnects · 0fd75f57
      Alex Elder authored
      Three interconnects are defined for IPA version 4.9, but there
      should only be two.  They should also use names that match what's
      used for other platforms (and specified in the Device Tree binding).
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mctp: remove duplicated assignment of pointer hdr · df7ba0eb
      Colin Ian King authored
      The pointer hdr is being initialized and also re-assigned with the
      same value from the call to function mctp_hdr. Static analysis reports
      that the initialized value is unused. The second assignment is
      duplicated and can be removed.
      
      Addresses-Coverity: ("Unused value").
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 04 Aug, 2021 35 commits
    • net: Replace deprecated CPU-hotplug functions. · 372bbdd5
      Sebastian Andrzej Siewior authored
      The functions get_online_cpus() and put_online_cpus() have been
      deprecated during the CPU hotplug rework. They map directly to
      cpus_read_lock() and cpus_read_unlock().
      
      Replace deprecated CPU-hotplug functions with the official version.
      The behavior remains unchanged.
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • virtio_net: Replace deprecated CPU-hotplug functions. · a0d1d0f4
      Sebastian Andrzej Siewior authored
      The functions get_online_cpus() and put_online_cpus() have been
      deprecated during the CPU hotplug rework. They map directly to
      cpus_read_lock() and cpus_read_unlock().
      
      Replace deprecated CPU-hotplug functions with the official version.
      The behavior remains unchanged.
      
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: Jason Wang <jasowang@redhat.com>
      Cc: virtualization@lists.linux-foundation.org
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • pktgen: Remove redundant clone_skb override · c2eecaa1
      Nick Richardson authored
      When the netif_receive xmit_mode is set, a line is supposed to set
      clone_skb to a default 0 value. This line is made redundant due to a
      preceding line that checks if clone_skb is more than zero and returns
      -ENOTSUPP.
      
      Overriding clone_skb to 0 does not make any difference to the behavior
      because if it was positive we return an error. So it can be either 0 or
      negative, and in both cases the behavior is the same.
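
      The argument can be illustrated with a small stand-alone sketch. The
      toy_* names are hypothetical and do not mirror pktgen's real
      structures; TOY_ENOTSUPP stands in for the kernel-internal ENOTSUPP,
      which is not available in userspace <errno.h>.

```c
#include <assert.h>

#define TOY_ENOTSUPP 524	/* stand-in for the kernel-internal ENOTSUPP */

enum toy_xmit_mode { TOY_START_XMIT, TOY_NETIF_RECEIVE };

struct toy_pktgen_dev {
	int clone_skb;
	enum toy_xmit_mode xmit_mode;
};

/* Switch the toy device to netif_receive mode. */
int toy_set_netif_receive(struct toy_pktgen_dev *dev)
{
	/* Positive values are rejected before anything else happens. */
	if (dev->clone_skb > 0)
		return -TOY_ENOTSUPP;

	/*
	 * Only clone_skb <= 0 reaches this point, so an extra
	 * "dev->clone_skb = 0;" here would not change which path is taken.
	 */
	dev->xmit_mode = TOY_NETIF_RECEIVE;
	return 0;
}
```

      A positive clone_skb is rejected and leaves the device untouched; any
      other value switches the mode without needing the override.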
      
      Remove redundant line that sets clone_skb to zero.
      Signed-off-by: Nick Richardson <richardsonnick@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ptp: ocp: Expose various resources on the timecard. · 773bda96
      Jonathan Lemon authored
      The OpenCompute timecard driver has additional functionality besides
      a clock.  Make the following resources available:
      
       - The external timestamp channels (ts0/ts1)
       - devlink support for flashing and health reporting
       - GPS and MAC serial ports
       - board serial number (obtained from i2c device)
      
      Also add watchdog functionality for when GNSS goes into holdover.
      
      The resources are collected under a timecard class directory:
      
        [jlemon@timecard ~]$ ls -g /sys/class/timecard/ocp1/
        total 0
        -r--r--r--. 1 root 4096 Aug  3 19:49 available_clock_sources
        -rw-r--r--. 1 root 4096 Aug  3 19:49 clock_source
        lrwxrwxrwx. 1 root    0 Aug  3 19:49 device -> ../../../0000:04:00.0/
        -r--r--r--. 1 root 4096 Aug  3 19:49 gps_sync
        lrwxrwxrwx. 1 root    0 Aug  3 19:49 i2c -> ../../xiic-i2c.1024/i2c-2/
        drwxr-xr-x. 2 root    0 Aug  3 19:49 power/
        lrwxrwxrwx. 1 root    0 Aug  3 19:49 pps -> ../../../../../virtual/pps/pps1/
        lrwxrwxrwx. 1 root    0 Aug  3 19:49 ptp -> ../../ptp/ptp2/
        -r--r--r--. 1 root 4096 Aug  3 19:49 serialnum
        lrwxrwxrwx. 1 root    0 Aug  3 19:49 subsystem -> ../../../../../../class/timecard/
        lrwxrwxrwx. 1 root    0 Aug  3 19:49 ttyGPS -> ../../tty/ttyS7/
        lrwxrwxrwx. 1 root    0 Aug  3 19:49 ttyMAC -> ../../tty/ttyS8/
        -rw-r--r--. 1 root 4096 Aug  3 19:39 uevent
      
      The labeling is needed at the minimum, in order to tell the serial
      devices apart.
      Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sock: allow reading and changing sk_userlocks with setsockopt · 04190bf8
      Pavel Tikhomirov authored
      SOCK_SNDBUF_LOCK and SOCK_RCVBUF_LOCK flags disable automatic socket
      buffers adjustment done by kernel (see tcp_fixup_rcvbuf() and
      tcp_sndbuf_expand()). If we've just created a new socket this adjustment
      is enabled on it, but if one changes the socket buffer size by
      setsockopt(SO_{SND,RCV}BUF*) it becomes disabled.
      
      CRIU needs to call setsockopt(SO_{SND,RCV}BUF*) on each socket on
      restore: first to increase the buffer sizes while the packet queues are
      restored, and then to restore the original buffer sizes. So after a CRIU
      restore all sockets become non-auto-adjustable, which can decrease the
      network performance of restored applications significantly.
      
      CRIU needs to be able to restore sockets with adjustment enabled or
      disabled, in the same state they were in before the dump, so let's add
      a special setsockopt for it.
      
      Let's also export SOCK_SNDBUF_LOCK and SOCK_RCVBUF_LOCK flags to uAPI so
      that using this interface one can reenable automatic socket buffer
      adjustment on their sockets.
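
      A userspace sketch of how such an interface could be used follows.
      The message above does not name the new socket option; the sketch
      assumes it is the one the kernel uAPI headers call SO_BUF_LOCK, and the
      numeric fallbacks below are assumptions for older headers (some
      architectures use different values). reenable_sndbuf_autotuning() is a
      hypothetical helper.

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

/* Fallbacks for older headers; assumed asm-generic values. */
#ifndef SO_BUF_LOCK
#define SO_BUF_LOCK 72
#endif
#ifndef SOCK_SNDBUF_LOCK
#define SOCK_SNDBUF_LOCK 1
#endif
#ifndef SOCK_RCVBUF_LOCK
#define SOCK_RCVBUF_LOCK 2
#endif

/*
 * Returns 0 if the send buffer lock was set by SO_SNDBUF and then cleared
 * again, 1 if the running kernel does not offer the option, -1 otherwise.
 */
int reenable_sndbuf_autotuning(void)
{
	int fd = socket(AF_INET, SOCK_STREAM, 0);
	int size = 65536, locks = 0, ret = -1;
	socklen_t len = sizeof(locks);

	if (fd < 0)
		return -1;

	/* Changing the buffer size by hand disables auto-adjustment. */
	if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &size, sizeof(size)) < 0)
		goto out;

	if (getsockopt(fd, SOL_SOCKET, SO_BUF_LOCK, &locks, &len) < 0) {
		ret = 1;	/* option not available here */
		goto out;
	}
	if (!(locks & SOCK_SNDBUF_LOCK))
		goto out;

	/* Clear the lock bit to hand the send buffer back to the kernel. */
	locks &= ~SOCK_SNDBUF_LOCK;
	if (setsockopt(fd, SOL_SOCKET, SO_BUF_LOCK, &locks, sizeof(locks)) < 0)
		goto out;

	len = sizeof(locks);
	if (getsockopt(fd, SOL_SOCKET, SO_BUF_LOCK, &locks, &len) < 0)
		goto out;
	ret = (locks & SOCK_SNDBUF_LOCK) ? -1 : 0;
out:
	close(fd);
	return ret;
}
```

      A restore tool would presumably save the lock bits at dump time and
      write them back after resizing the buffers, rather than clearing them
      unconditionally as this sketch does.
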
      Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tc-testing: Add control-plane selftests for sch_mq · 625af9f0
      Peilin Ye authored
      Recently we added multi-queue support to netdevsim in commit d4861fc6
      ("netdevsim: Add multi-queue support"); add a few control-plane selftests
      for sch_mq using this new feature.
      
      Use nsPlugin.py to avoid network interface name collisions.
      Reviewed-by: Cong Wang <cong.wang@bytedance.com>
      Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Revert "net: build all switchdev drivers as modules when the bridge is a module" · a54182b2
      Vladimir Oltean authored
      This reverts commit b0e81817. Explicit
      driver dependency on the bridge is no longer needed since
      switchdev_bridge_port_{,un}offload() is no longer implemented by the
      bridge driver but by switchdev.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Tested-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: make switchdev_bridge_port_{,unoffload} loosely coupled with the bridge · 957e2235
      Vladimir Oltean authored
      With the introduction of explicit offloading API in switchdev in commit
      2f5dc00f ("net: bridge: switchdev: let drivers inform which bridge
      ports are offloaded"), we started having Ethernet switch drivers calling
      directly into a function exported by net/bridge/br_switchdev.c, which is
      a function exported by the bridge driver.
      
      This means that drivers that did not have an explicit dependency on the
      bridge before, like cpsw and am65-cpsw, now do - otherwise it is not
      possible to call a symbol exported by a driver that can be built as
      module unless you are a module too.
      
      There was an attempt to solve the dependency issue in the form of commit
      b0e81817 ("net: build all switchdev drivers as modules when the
      bridge is a module"). Grygorii Strashko, however, says about it:
      
      | In my opinion, the problem is a bit bigger here than just fixing the
      | build :(
      |
      | In case, of ^cpsw the switchdev mode is kinda optional and in many
      | cases (especially for testing purposes, NFS) the multi-mac mode is
      | still preferable mode.
      |
      | There were no such tight dependency between switchdev drivers and
      | bridge core before and switchdev serviced as independent, notification
      | based layer between them, so ^cpsw still can be "Y" and bridge can be
      | "M". Now for mostly every kernel build configuration the CONFIG_BRIDGE
      | will need to be set as "Y", or we will have to update drivers to
      | support build with BRIDGE=n and maintain separate builds for
      | networking vs non-networking testing.  But is this enough?  Wouldn't
      | it cause 'chain reaction' required to add more and more "Y" options
      | (like CONFIG_VLAN_8021Q)?
      |
      | PS. Just to be sure we on the same page - ARM builds will be forced
      | (with this patch) to have CONFIG_TI_CPSW_SWITCHDEV=m and so all our
      | automation testing will just fail with omap2plus_defconfig.
      
      In the light of this, it would be desirable for some configurations to
      avoid dependencies between switchdev drivers and the bridge, and have
      the switchdev mode as completely optional within the driver.
      
      Arnd Bergmann also tried to write a patch which better expressed the
      build time dependency for Ethernet switch drivers where the switchdev
      support is optional, like cpsw/am65-cpsw, and this made the drivers
      follow the bridge (compile as module if the bridge is a module) only if
      the optional switchdev support in the driver was enabled in the first
      place:
      https://patchwork.kernel.org/project/netdevbpf/patch/20210802144813.1152762-1-arnd@kernel.org/
      
      but this still did not solve the fact that cpsw and am65-cpsw now must
      be built as modules when the bridge is a module - it just expressed
      correctly that optional dependency. But the new behavior is an apparent
      regression from Grygorii's perspective.
      
      So to support the use case where the Ethernet driver is built-in,
      NET_SWITCHDEV (a bool option) is enabled, and the bridge is a module, we
      need a framework that can handle the possible absence of the bridge from
      the running system, i.e. runtime bloatware as opposed to build-time
      bloatware.
      
      Luckily we already have this framework, since switchdev has been using
      it extensively. Events from the bridge side are transmitted to the
      driver side using notifier chains - this was originally done so that
      unrelated drivers could snoop for events emitted by the bridge towards
      ports that are implemented by other drivers (think of a switch driver
      with LAG offload that listens for switchdev events on a bonding/team
      interface that it offloads).
      
      There are also events which are transmitted from the driver side to the
      bridge side, which again are modeled using notifiers.
      SWITCHDEV_FDB_ADD_TO_BRIDGE is an example of this, and deals with
      notifying the bridge that a MAC address has been dynamically learned.
      So there is a precedent we can use for modeling the new framework.
      
      The difference compared to SWITCHDEV_FDB_ADD_TO_BRIDGE is that the work
      that the bridge needs to do when a port becomes offloaded is blocking in
      its nature: replay VLANs, MDBs etc. The calling context is indeed
      blocking (we are under rtnl_mutex), but the existing switchdev
      notification chain that the bridge is subscribed to is only the atomic
      one. So we need to subscribe the bridge to the blocking switchdev
      notification chain too.
      
      This patch:
      - keeps the driver-side perception of the switchdev_bridge_port_{,un}offload
        unchanged
      - moves the implementation of switchdev_bridge_port_{,un}offload from
        the bridge module into the switchdev module.
      - makes everybody that is subscribed to the switchdev blocking notifier
        chain "hear" offload & unoffload events
      - makes the bridge driver subscribe and handle those events
      - moves the bridge driver's handling of those events into 2 new
        functions called br_switchdev_port_{,un}offload. These functions
        contain in fact the core of the logic that was previously in
        switchdev_bridge_port_{,un}offload, just that now we go through an
        extra indirection layer to reach them.
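
      The indirection described in the list above can be sketched with a toy
      notifier chain. Everything named toy_* is hypothetical; the kernel's
      own notifier API is not used here. The point is only that the
      "switchdev" side calls whatever is subscribed and has no link-time
      reference to the "bridge" side.

```c
#include <assert.h>
#include <stddef.h>

enum toy_event { TOY_BRPORT_OFFLOADED, TOY_BRPORT_UNOFFLOADED };

typedef int (*toy_notifier_fn)(enum toy_event ev, void *info);

#define TOY_MAX_LISTENERS 4
static toy_notifier_fn toy_chain[TOY_MAX_LISTENERS];
static size_t toy_nr_listeners;

/* Subscribe a listener to the toy "blocking" chain. */
int toy_register_blocking_notifier(toy_notifier_fn fn)
{
	if (toy_nr_listeners == TOY_MAX_LISTENERS)
		return -1;
	toy_chain[toy_nr_listeners++] = fn;
	return 0;
}

/*
 * "switchdev" side: what a driver calls. It only walks the chain and
 * returns how many listeners handled the event.
 */
int toy_switchdev_bridge_port_offload(void *brport)
{
	int handled = 0;

	for (size_t i = 0; i < toy_nr_listeners; i++)
		handled += toy_chain[i](TOY_BRPORT_OFFLOADED, brport);
	return handled;
}

/* "bridge" side: lives in the optional module and subscribes when loaded. */
static int toy_replayed;

int toy_br_switchdev_event(enum toy_event ev, void *info)
{
	(void)info;
	if (ev != TOY_BRPORT_OFFLOADED)
		return 0;
	toy_replayed++;		/* stands in for replaying VLANs, MDBs etc. */
	return 1;
}

int toy_replay_count(void)
{
	return toy_replayed;
}
```

      Before the toy bridge registers, the offload call simply finds nobody
      listening; after registration the same call reaches the bridge handler.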
      
      Unlike all the other switchdev notification structures, the structure
      used to carry the bridge port information, struct
      switchdev_notifier_brport_info, does not contain a "bool handled".
      This is because in the current usage pattern, we always know that a
      switchdev bridge port offloading event will be handled by the bridge,
      because the switchdev_bridge_port_offload() call was initiated by a
      NETDEV_CHANGEUPPER event in the first place, where info->upper_dev is a
      bridge. So if the bridge wasn't loaded, then the CHANGEUPPER event
      couldn't have happened.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Tested-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'linux-can-next-for-5.15-20210804' of... · 9c0532f9
      David S. Miller authored
      Merge tag 'linux-can-next-for-5.15-20210804' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next
      
      Marc Kleine-Budde says:
      
      ====================
      pull-request: can-next 2021-08-04
      
      this is a pull request of 5 patches for net-next/master.
      
      The first patch is by me and fixes a typo in a comment in the CAN
      J1939 protocol.
      
      The next 2 patches are by Oleksij Rempel and update the CAN J1939
      protocol to send RX status updates via the error queue mechanism.
      
      The next patch is by me and adds a missing variable initialization to
      the flexcan driver (the problem was introduced in the current net-next
      cycle).
      
      The last patch is by Aswath Govindraju and adds power-domains to the
      Bosch m_can DT binding documentation.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dt-bindings: net: can: Document power-domains property · d85165b2
      Aswath Govindraju authored
      Document power-domains property for adding the Power domain provider.
      
      Link: https://lore.kernel.org/r/20210802091822.16407-1-a-govindraju@ti.com
      Signed-off-by: Aswath Govindraju <a-govindraju@ti.com>
      Acked-by: Rob Herring <robh@kernel.org>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
    • can: flexcan: flexcan_clks_enable(): add missing variable initialization · 33626669
      Marc Kleine-Budde authored
      This patch adds the missing initialization of the "err" variable in
      the flexcan_clks_enable() function.
      
      Fixes: d9cead75 ("can: flexcan: add mcf5441x support")
      Link: https://lore.kernel.org/r/20210728075428.1493568-1-mkl@pengutronix.de
      Reported-by: kernel test robot <lkp@intel.com>
      Cc: Angelo Dureghello <angelo@kernel-space.org>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
    • can: j1939: extend UAPI to notify about RX status · 5b9272e9
      Oleksij Rempel authored
      To be able to create applications with user friendly feedback, we need
      to be able to provide receive status information.
      
      A typical ETP transfer may take seconds or even hours. To give the user
      some clue or show a progress bar, the stack should push status updates.
      Same as for the TX information, the socket error queue will be used with
      the following new signals:
      - J1939_EE_INFO_RX_RTS   - received and accepted request to send signal.
      - J1939_EE_INFO_RX_DPO   - received data package offset signal
      - J1939_EE_INFO_RX_ABORT - RX session was aborted
      
      Instead of a completion signal, the user will get the data package.
      To activate these signals, the application should set
      SOF_TIMESTAMPING_RX_SOFTWARE in the SO_TIMESTAMPING socket option. This
      will avoid unpredictable application behavior for old software.
      
      Link: https://lore.kernel.org/r/20210707094854.30781-3-o.rempel@pengutronix.de
      Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
    • ipv6: exthdrs: get rid of indirect calls in ip6_parse_tlv() · 51b8f812
      Eric Dumazet authored
      As presented last month in our "BIG TCP" talk at netdev 0x15,
      we plan to use IPv6 jumbograms.
      
      One of the minor problems we talked about is the fact that
      ip6_parse_tlv() is currently using tables to list known tlvs,
      thus using potentially expensive indirect calls.
      
      While we could mitigate this cost using macros from
      indirect_call_wrapper.h, we also can get rid of the tables
      and let the compiler emit optimized code.
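
      The difference between the two dispatch styles can be sketched in a
      stand-alone toy. The handlers and the toy_* names are hypothetical and
      do not reproduce the real ip6_parse_tlv() code; the two option type
      values are the commonly documented ones for Router Alert and Jumbo
      Payload.

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_TLV_ROUTERALERT = 5, TOY_TLV_JUMBO = 194 };

/* Toy handlers: return a distinct code so callers can tell them apart. */
static int toy_ra(const unsigned char *opt)    { (void)opt; return 1; }
static int toy_jumbo(const unsigned char *opt) { (void)opt; return 2; }

/* Table-driven variant: every hit goes through a function pointer. */
struct toy_tlvtype_proc {
	int type;
	int (*func)(const unsigned char *opt);
};

static const struct toy_tlvtype_proc toy_procs[] = {
	{ TOY_TLV_ROUTERALERT, toy_ra },
	{ TOY_TLV_JUMBO, toy_jumbo },
	{ -1, NULL },
};

int toy_parse_table(const unsigned char *opt)
{
	for (const struct toy_tlvtype_proc *p = toy_procs; p->type >= 0; p++)
		if (p->type == opt[0])
			return p->func(opt);	/* indirect call */
	return 0;				/* unknown option */
}

/* Switch variant: direct calls the compiler is free to inline. */
int toy_parse_switch(const unsigned char *opt)
{
	switch (opt[0]) {
	case TOY_TLV_ROUTERALERT:
		return toy_ra(opt);
	case TOY_TLV_JUMBO:
		return toy_jumbo(opt);
	default:
		return 0;			/* unknown option */
	}
}
```

      Both variants give the same answers; only the call mechanism differs.
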
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Justin Iurman <justin.iurman@uliege.be>
      Cc: Coco Li <lixiaoyan@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'm7530-sw-fallback' · d8517985
      David S. Miller authored
      DENG Qingfang says:
      
      ====================
      mt7530 software fallback bridging fix
      
      DSA core has gained software fallback support since commit 2f5dc00f,
      but it does not work properly on mt7530. This patch series fixes the
      issues.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: mt7530: always install FDB entries with IVL and FID 1 · 73c447ca
      DENG Qingfang authored
      This reverts commit 7e777021 ("mt7530 mt7530_fdb_write only set ivl
      bit vid larger than 1").
      
      Before this series, the default value of all ports' PVID is 1, which is
      copied into the FDB entry, even if the ports are VLAN unaware. So
      `bridge fdb show` will show entries like `dev swp0 vlan 1 self` even on
      a VLAN-unaware bridge.
      
      The blamed commit does not solve that issue completely, instead it may
      cause a new issue that FDB is inaccessible in a VLAN-aware bridge with
      PVID 1.
      
      This series sets PVID to 0 on VLAN-unaware ports, so `bridge fdb show`
      will no longer print `vlan 1` on VLAN-unaware bridges, and that special
      case in fdb_write is not required anymore.
      
      Set FDB entries' filter ID to 1 to match the VLAN table.
      Signed-off-by: DENG Qingfang <dqfext@gmail.com>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: mt7530: set STP state on filter ID 1 · a9e3f62d
      DENG Qingfang authored
      As filter ID 1 is the only one used for bridges, set STP state on it.
      Signed-off-by: DENG Qingfang <dqfext@gmail.com>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: mt7530: use independent VLAN learning on VLAN-unaware bridges · 6087175b
      DENG Qingfang authored
      Consider the following bridge configuration, where bond0 is not
      offloaded:
      
               +-- br0 --+
              / /   |     \
             / /    |      \
            /  |    |     bond0
           /   |    |     /   \
         swp0 swp1 swp2 swp3 swp4
           .        .       .
           .        .       .
           A        B       C
      
      Ideally, when the switch receives a packet from swp3 or swp4, it should
      forward the packet to the CPU, according to the port matrix and unknown
      unicast flood settings.
      
      But packet loss will happen if the destination address is at one of the
      offloaded ports (swp0~2). For example, when client C sends a packet to
      A, the FDB lookup will indicate that it should be forwarded to swp0, but
      the port matrix of swp3 and swp4 is configured to only allow the CPU to
      be its destination, so it is dropped.
      
      However, this issue does not happen if the bridge is VLAN-aware. That is
      because VLAN-aware bridges use independent VLAN learning, i.e. use VID
      for FDB lookup, on offloaded ports. As swp3 and swp4 are not offloaded,
      shared VLAN learning with default filter ID of 0 is used instead. So the
      lookup for A with filter ID 0 never hits and the packet can be forwarded
      to the CPU.
      
      In the current code, only two combinations were used to toggle user
      ports' VLAN awareness: one is PCR.PORT_VLAN set to port matrix mode with
      PVC.VLAN_ATTR set to transparent port, the other is PCR.PORT_VLAN set to
      security mode with PVC.VLAN_ATTR set to user port.
      
      It turns out that only PVC.VLAN_ATTR contributes to VLAN awareness, and
      port matrix mode just skips the VLAN table lookup. The reference manual
      is somewhat misleading when describing PORT_VLAN modes. It states that
      PORT_MEM (VLAN port member) is used for destination if the VLAN table
      lookup hits, but actually **PORT_MEM & PORT_MATRIX** (bitwise AND of
      VLAN port member and port matrix) is used instead, which means we can
      have two or more separate VLAN-aware bridges with the same PVID and
      traffic won't leak between them.
      
      Therefore, to solve this, enable independent VLAN learning with PVID 0
      on VLAN-unaware bridges, by setting their PCR.PORT_VLAN to fallback
      mode, while leaving standalone ports in port matrix mode. The CPU port
      is always set to fallback mode to serve those bridges.
      
      During testing, it was found that FDB lookup with filter ID of 0 will
      also hit entries with VID 0 even with independent VLAN learning. To
      avoid that, install all VLANs with filter ID of 1.
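
The PORT_MEM & PORT_MATRIX behaviour described above can be illustrated with a small userspace model. This is only a sketch: port_bit() and vlan_hit_dest_ports() are hypothetical names, not driver functions, and the 8-bit port masks are an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns a mask with only bit n set (port n). */
static uint8_t port_bit(unsigned int n)
{
	assert(n < 8);
	return (uint8_t)(1u << n);
}

/*
 * Destination ports after a VLAN table hit, following the behaviour the
 * commit message reports: the VLAN member set is masked by the ingress
 * port's port matrix instead of being used on its own.
 */
static uint8_t vlan_hit_dest_ports(uint8_t port_mem, uint8_t port_matrix)
{
	return (uint8_t)(port_mem & port_matrix);
}
```

With one VLAN whose members are ports 0 to 3, and two bridges (ports 0-1 and ports 2-3) whose port matrices only contain their own members, a frame entering port 0 can only reach port 1, which is why traffic does not leak between bridges sharing a PVID.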
      Signed-off-by: DENG Qingfang <dqfext@gmail.com>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • DENG Qingfang's avatar
      net: dsa: mt7530: enable assisted learning on CPU port · 0b69c54c
      DENG Qingfang authored
      Consider the following bridge configuration, where bond0 is not
      offloaded:
      
               +-- br0 --+
              / /   |     \
             / /    |      \
            /  |    |     bond0
           /   |    |     /   \
         swp0 swp1 swp2 swp3 swp4
           .        .       .
           .        .       .
           A        B       C
      
      Address learning is enabled on offloaded ports (swp0~2) and the CPU
      port, so when client A sends a packet to C, the following will happen:
      
      1. The switch learns that client A can be reached at swp0.
      2. The switch probably already knows that client C can be reached at the
         CPU port, so it forwards the packet to the CPU.
      3. The bridge core knows client C can be reached at bond0, so it
         forwards the packet back to the switch.
      4. The switch learns that client A can be reached at the CPU port.
      5. The switch forwards the packet to either swp3 or swp4, according to
         the packet's tag.
      
      That makes client A's MAC address flap between swp0 and the CPU port. If
      client B sends a packet to A, it is possible that the packet is
      forwarded to the CPU. With offload_fwd_mark = 1, the bridge core won't
      forward it back to the switch, resulting in packet loss.
      
      Now that the DSA core provides assisted_learning_on_cpu_port, enable
      it and disable hardware learning on the CPU port.
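
The flapping described in steps 1 to 5 can be replayed with a toy model. This is a sketch under stated assumptions: the single-entry FDB, the port numbers and the function names are all invented for illustration and do not come from the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical port numbering for the model. */
enum { NO_PORT = -1, SWP0 = 0, CPU_PORT = 6 };

/* Single-entry FDB that only tracks where client A was last learned. */
struct fdb_entry {
	int port;
};

/* A frame from client A enters the switch on 'port'. */
static void ingress_from_a(struct fdb_entry *a, int port, bool learning)
{
	assert(port == SWP0 || port == CPU_PORT);
	if (learning)
		a->port = port;
}

/*
 * Replays one packet from A to C and returns the port on which the
 * switch believes A lives afterwards.
 */
static int replay_a_to_c(bool cpu_port_learning)
{
	struct fdb_entry a = { .port = NO_PORT };

	/* Step 1: the frame arrives on swp0, where learning is enabled. */
	ingress_from_a(&a, SWP0, true);
	/* Steps 2-3: the frame goes to the CPU and the bridge sends it back. */
	/* Step 4: the frame re-enters the switch through the CPU port. */
	ingress_from_a(&a, CPU_PORT, cpu_port_learning);

	return a.port;
}
```

With hardware learning on the CPU port the model ends with A at the CPU port; with it disabled, A stays at swp0, which is the state the commit aims for.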
      Signed-off-by: DENG Qingfang <dqfext@gmail.com>
      Reviewed-by: Vladimir Oltean <oltean@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • David S. Miller's avatar
      Merge branch 'ipa-pm-irqs' · 8eceea41
      David S. Miller authored
      Alex Elder says:
      
      ====================
      net: ipa: prepare GSI interrupts for runtime PM
      
      The last patch in this series arranges for GSI interrupts to be
      disabled when the IPA hardware is suspended.  This ensures the clock
      is always operational when a GSI interrupt fires.  Leading up to
      that are patches that rearrange the code a bit to allow this to
      be done.
      
      The first two patches aren't *directly* related.  They remove some
      flag arguments to some GSI suspend/resume related functions, using
      the version field now present in the GSI structure.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Alex Elder's avatar
      net: ipa: disable GSI interrupts while suspended · 45a42a3c
      Alex Elder authored
      Introduce new functions gsi_suspend() and gsi_resume(), which will
      disable the GSI interrupt handler after all endpoints are suspended
      and re-enable it before endpoints are resumed.  This will ensure no
      GSI interrupt handler will fire when the hardware is suspended.
      
      Here's a little further explanation.  There are seven GSI interrupt
      types, and most are disabled except when needed.
        - These two are not used (never enabled):
            GSI_INTER_EE_CH_CTRL
            GSI_INTER_EE_EV_CTRL
        - These two are only used to implement channel and event ring
          commands, and are only enabled while a command is underway:
            GSI_CH_CTRL
            GSI_EV_CTRL
        - The IEOB interrupt signals I/O completion.  It will not fire
          when a channel is stopped (or "suspended").
            GSI_IEOB
        - This interrupt is used to allocate or halt modem channels,
          and is only enabled while such a command is underway.
            GSI_GLOB_EE
          However, it is also used to signal certain errors, and this could
          occur at any time.
        - The general interrupt signals general errors, and could occur at
          any time.
            GSI_GENERAL
      
      The purpose of this change is to ensure no global or general
      interrupts fire due to errors while the hardware is suspended.
      We enable the clock on resume, and at that time we can "handle"
      (at least report) these error conditions.
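
The ordering described above can be sketched as a small state model. It is an illustration only: struct ipa_model and both functions are hypothetical, and the point at which the clock is switched off or on relative to the GSI interrupt is an assumption drawn from the statement that the clock is enabled on resume.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver state; not the real structures. */
struct ipa_model {
	bool clock_on;
	bool endpoints_active;
	bool gsi_irq_enabled;
};

/* The interrupt handler touches hardware, so it needs the clock. */
static bool irq_is_safe(const struct ipa_model *m)
{
	return m->clock_on || !m->gsi_irq_enabled;
}

static void model_suspend(struct ipa_model *m)
{
	m->endpoints_active = false;	/* suspend all endpoints first */
	assert(irq_is_safe(m));
	m->gsi_irq_enabled = false;	/* then disable the GSI interrupt */
	assert(irq_is_safe(m));
	m->clock_on = false;		/* only now may the clock stop */
	assert(irq_is_safe(m));
}

static void model_resume(struct ipa_model *m)
{
	m->clock_on = true;		/* clock first */
	assert(irq_is_safe(m));
	m->gsi_irq_enabled = true;	/* re-enable before endpoints resume */
	assert(irq_is_safe(m));
	m->endpoints_active = true;
}
```

Swapping the last two steps of model_suspend() would trip the assertion, which is the situation the patch is meant to rule out.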
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Alex Elder's avatar
      net: ipa: move gsi_irq_init() code into setup · b176f95b
      Alex Elder authored
      The GSI IRQ handler could be triggered as soon as it is registered
      with request_irq().  The handler function, gsi_isr(), touches
      hardware, meaning the IPA clock must be operational.  The IPA clock
      is not operating when the handler is registered (in gsi_irq_init()),
      so this is a problem.
      
      Move the call to request_irq() for the GSI interrupt handler into
      gsi_irq_setup(), which is called when the IPA clock is known to be
      operational (and furthermore, the GSI firmware will have been
      loaded).  Request the IRQ at the end of that function, after all
      interrupt types have been disabled and masked.
      
      Move the matching free_irq() call into gsi_irq_teardown(), and get
      rid of the now-empty gsi_irq_exit().
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Alex Elder's avatar
      net: ipa: have gsi_irq_setup() return an error code · 1657d8a4
      Alex Elder authored
      Change gsi_irq_setup() so it returns an error value, and introduce
      gsi_irq_teardown() as its inverse.  Set the interrupt type (IRQ
      rather than MSI) in gsi_irq_setup().
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Alex Elder's avatar
      net: ipa: move some GSI setup functions · a7860a5f
      Alex Elder authored
      Move gsi_irq_setup() and gsi_ring_setup() so they're defined right
      above gsi_setup() where they're called.  This is a trivial movement
      of code to prepare for upcoming patches.
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Alex Elder's avatar
      net: ipa: move version check for channel suspend/resume · 4a4ba483
      Alex Elder authored
      Change the Boolean flags passed to __gsi_channel_start() and
      __gsi_channel_stop() so they represent whether the request is being
      made to implement suspend (versus stop) or resume (versus start).
      
      Then stop or start the channel for suspend/resume requests only if
      the hardware version indicates it should be done.
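
A minimal sketch of the flag semantics described above follows. All names here are hypothetical, and the version threshold is a placeholder: the commit message does not say which hardware versions stop the channel on suspend, so the model does not try to encode the real cut-off.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder versions; the real cut-off is not given in the text above. */
enum model_version { MODEL_V_OLD = 1, MODEL_V_NEW = 2 };

struct model_channel {
	enum model_version version;
	bool started;
};

/* Placeholder predicate: does this version suspend by stopping channels? */
static bool version_stops_channel_on_suspend(enum model_version v)
{
	return v >= MODEL_V_NEW;
}

/* 'suspend' says whether this stop implements suspend rather than stop. */
static void model_channel_stop(struct model_channel *ch, bool suspend)
{
	assert(ch);
	if (suspend && !version_stops_channel_on_suspend(ch->version))
		return;		/* nothing to do for this hardware */
	ch->started = false;
}

/* 'resume' says whether this start implements resume rather than start. */
static void model_channel_start(struct model_channel *ch, bool resume)
{
	assert(ch);
	if (resume && !version_stops_channel_on_suspend(ch->version))
		return;
	ch->started = true;
}
```

The design point is that callers only state their intent (suspend versus stop), while the version decision lives in one place next to the channel code.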
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Alex Elder's avatar
      net: ipa: use gsi->version for channel suspend/resume · decfef0f
      Alex Elder authored
      The GSI layer has the IPA version now, so there's no need for
      version-specific flags to be passed from IPA.  One instance of
      this is in gsi_channel_suspend() and gsi_channel_resume(), which
      indicate whether or not the endpoint suspend is implemented by
      GSI stopping the channel.  We can make that determination based
      on gsi->version, eliminating the need for a Boolean flag in those
      functions.
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • David S. Miller's avatar
      Merge branch 'mhi-mbim' · 93bbcfee
      David S. Miller authored
      Loic Poulain says:
      
      ====================
      net: mhi: move MBIM to WWAN
      
      Implement a proper WWAN driver for the MBIM network protocol, with multi-link
      management supported through the WWAN framework (wwan rtnetlink).
      
      Until now, MBIM over MHI was supported directly in the mhi_net driver, via
      some protocol rx/tx fixup callbacks, but with only one session supported
      (no multilink muxing). We can then remove that part from mhi_net and restore
      the driver to a simpler version for 'raw' IP transfer (or QMAP via rmnet link).
      
      Note that a wwan0 link is created by default for session-id 0. Additional links
      can be managed via the ip tool:
      
          $ ip link add dev wwan0mms parentdev wwan0 type wwan linkid 1
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Loic Poulain's avatar
      net: mhi: Remove MBIM protocol · 7ffa7542
      Loic Poulain authored
      The MBIM protocol has now been integrated in a proper WWAN driver. We
      can then revert to a simpler driver for mhi_net, which is used
      for raw IP or QMAP protocol (via rmnet link).
      
      - Remove protocol management
      - Remove WWAN framework usage (only valid for mbim)
      - Remove net/mhi directory for simpler mhi_net.c file
      Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Loic Poulain's avatar
      net: wwan: Add MHI MBIM network driver · aa730a99
      Loic Poulain authored
      Add a new WWAN driver for MBIM over MHI. MBIM is a transport protocol
      for IP packets, allowing packet aggregation and muxing. Initially
      designed for the USB bus, it is also exposed through the MHI bus for
      QCOM-based PCIe WWAN modems.
      
      This driver supports the new wwan rtnetlink interface for multi-link
      management and has been tested with a Quectel EM120R-GL M.2 module.
      Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • David S. Miller's avatar
      Merge branch 'queues' · 8730379e
      David S. Miller authored
      Jakub Kicinski says:
      
      ====================
      net: add netif_set_real_num_queues() for device reconfig
      
      This short set adds a helper to make the implementation of
      two-phase NIC reconfig easier.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Jakub Kicinski's avatar
      nfp: use netif_set_real_num_queues() · e874f455
      Jakub Kicinski authored
      Avoid reconfig problems due to failures in netif_set_real_num_tx_queues()
      by using netif_set_real_num_queues().
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Jakub Kicinski's avatar
      net: add netif_set_real_num_queues() for device reconfig · 271e5b7d
      Jakub Kicinski authored
      netif_set_real_num_rx_queues() and netif_set_real_num_tx_queues()
      can fail, which breaks drivers trying to implement reconfiguration
      in a way that can't leave the device half-broken. In other words,
      those functions are incompatible with a prepare/commit approach.
      
      Luckily, setting the real number of queues can fail only if the number
      is increased, meaning that if we order operations correctly we
      can guarantee ending up with either the new config (success) or
      the old one (on error).
      
      Provide a helper implementing such logic so that drivers don't
      have to duplicate it.
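
The ordering argument above can be sketched with a userspace model. This is not the kernel helper itself: struct model_dev, the fault-injection flags and all function names are hypothetical, and only the "increases first, decreases last" logic is taken from the description.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical device model with fault injection on queue growth. */
struct model_dev {
	unsigned int rxq, txq;
	bool fail_rx_grow, fail_tx_grow;
};

/* Like the real setters in spirit: only an increase can fail. */
static int set_rxq(struct model_dev *d, unsigned int n)
{
	if (n > d->rxq && d->fail_rx_grow)
		return -ENOMEM;
	d->rxq = n;
	return 0;
}

static int set_txq(struct model_dev *d, unsigned int n)
{
	if (n > d->txq && d->fail_tx_grow)
		return -ENOMEM;
	d->txq = n;
	return 0;
}

/* Ends with either the new config (returns 0) or the old one (error). */
static int set_real_num_queues(struct model_dev *d, unsigned int txq,
			       unsigned int rxq)
{
	unsigned int old_rxq = d->rxq;
	int err, undo;

	/* Do the increases first, since they are the only steps that fail. */
	if (rxq > d->rxq) {
		err = set_rxq(d, rxq);
		if (err)
			return err;
	}
	if (txq > d->txq) {
		err = set_txq(d, txq);
		if (err) {
			/* Undo is a decrease (or no-op), so it cannot fail. */
			undo = set_rxq(d, old_rxq);
			assert(undo == 0);
			(void)undo;
			return err;
		}
	}
	/* Decreases last; by assumption these always succeed. */
	if (rxq < d->rxq) {
		undo = set_rxq(d, rxq);
		assert(undo == 0);
	}
	if (txq < d->txq) {
		undo = set_txq(d, txq);
		assert(undo == 0);
	}
	(void)undo;
	return 0;
}
```

If the TX increase fails after the RX increase succeeded, the model rolls RX back to its old value, so a caller never observes a mixed configuration.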
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Rocco Yue's avatar
      net: add extack arg for link ops · 8679c31e
      Rocco Yue authored
      Pass extack arg to validate_linkmsg and validate_link_af callbacks.
      If a netlink attribute has a reject_message, use the extended ack
      mechanism to carry the message back to user space.
      Signed-off-by: Rocco Yue <rocco.yue@mediatek.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Rao Shoaib's avatar
      af_unix: Add OOB support · 314001f0
      Rao Shoaib authored
      This patch adds OOB support for AF_UNIX sockets.
      The semantics are the same as for TCP.
      
      The last byte of a message with the OOB flag is
      treated as the OOB byte. The byte is separated into
      an skb and a pointer to the skb is stored in unix_sock.
      The pointer is used to enforce OOB semantics.
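
From user space, the behaviour described above could be exercised roughly as follows. This is a sketch, not a test from the patch: unix_oob_roundtrip() is a hypothetical helper, and it assumes a kernel built with this feature; on kernels without it the send() or recv() call is expected to fail and the helper returns -1.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Sends "abc" with MSG_OOB over an AF_UNIX stream socket pair and tries
 * to read the out-of-band byte back. Returns that byte, or -1 if any
 * call fails (for example when AF_UNIX OOB is not supported).
 */
static int unix_oob_roundtrip(void)
{
	int sv[2];
	char oob = 0;
	int ret = -1;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;
	assert(sv[0] >= 0 && sv[1] >= 0);

	/* Per the description, the last byte ('c') becomes the OOB byte. */
	if (send(sv[0], "abc", 3, MSG_OOB) == 3 &&
	    recv(sv[1], &oob, 1, MSG_OOB) == 1)
		ret = (unsigned char)oob;

	close(sv[0]);
	close(sv[1]);
	return ret;
}
```

On a kernel with this patch the helper should hand back the last byte of the message, mirroring how MSG_OOB behaves on a TCP socket.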
      Signed-off-by: Rao Shoaib <rao.shoaib@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • David S. Miller's avatar
      Merge branch 'dpaa2-switch-next' · 7e89350c
      David S. Miller authored
      Ioana Ciornei says:
      
      ====================
      dpaa2-switch: integrate the MAC endpoint support
      
      This patch set integrates the already available MAC support into the
      dpaa2-switch driver as well.
      
      The first 4 patches are fixing up some minor problems or optimizing the
      code, while the remaining ones are actually integrating the dpaa2-mac
      support into the switch driver by calling the dpaa2_mac_* provided
      functions. While at it, we also export the MAC statistics in ethtool
      like we do for dpaa2-eth.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>