1. 28 Aug, 2019 13 commits
    • net: dsa: sja1105: Clear VLAN filtering offload netdev feature · e9bf9694
      Vladimir Oltean authored
      The switch barely supports traffic I/O, and it does that by repurposing
      VLANs when there is no bridge that is taking control of them.
      
      Letting DSA declare this netdev feature as supported (see
      dsa_slave_create) would mean that VLAN sub-interfaces created on sja1105
      switch ports will be hardware offloaded. That means that
      net/8021q/vlan_core.c would install the VLAN into the filter tables of
      the switch, potentially interfering with the tag_8021q VLANs.
      
      We need to prevent that from happening and not let the 8021q core
      offload VLANs to the switch hardware tables. In vlan_filtering=0 modes
      of operation, the switch ports can pass through VLAN-tagged frames with
      no problem.
      Suggested-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: Advertise the VLAN offload netdev ability only if switch supports it · 9b236d2a
      Vladimir Oltean authored
      When adding a VLAN sub-interface on a DSA slave port, the 8021q core
      checks NETIF_F_HW_VLAN_CTAG_FILTER and, if the netdev is capable of
      filtering, calls .ndo_vlan_rx_add_vid or .ndo_vlan_rx_kill_vid to
      configure the VLAN offloading.
      
      DSA sets this up counter-intuitively: it always advertises this netdev
      feature, but the underlying driver may not actually support VLAN table
      manipulation. In that case, the DSA core is forced to ignore the error,
      because not being able to offload the VLAN is still fine - and should
      result in the creation of a non-accelerated VLAN sub-interface.
      
      Change this so that the netdev feature is only advertised for switch
      drivers that support VLAN manipulation, instead of checking for
      -EOPNOTSUPP at runtime.
      Suggested-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
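The idea in commit 9b236d2a can be sketched in plain C as follows. This is only an illustration under stated assumptions, not the kernel code: the `sketch_` types, the flag values, and the exact condition (both add and delete callbacks present) are hypothetical stand-ins for the driver operations table and for `NETIF_F_HW_VLAN_CTAG_FILTER`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bits; the real kernel flags have different values. */
#define SKETCH_F_HW_TC               (1u << 0)
#define SKETCH_F_HW_VLAN_CTAG_FILTER (1u << 1)

/* Hypothetical subset of a switch driver's operations table. */
struct sketch_switch_ops {
	int (*port_vlan_add)(int port, uint16_t vid);
	int (*port_vlan_del)(int port, uint16_t vid);
};

/* Example VLAN callback that a VLAN-capable driver might provide. */
static int sketch_vlan_noop(int port, uint16_t vid)
{
	(void)port;
	(void)vid;
	return 0;
}

/*
 * Compute the features a slave netdev advertises. The VLAN filter
 * feature is set only when the driver can actually manipulate its VLAN
 * table, so the 8021q core never asks for an offload that cannot work.
 */
static uint32_t sketch_slave_features(const struct sketch_switch_ops *ops)
{
	uint32_t features = SKETCH_F_HW_TC;

	if (ops->port_vlan_add != NULL && ops->port_vlan_del != NULL)
		features |= SKETCH_F_HW_VLAN_CTAG_FILTER;

	return features;
}
```

With this shape, a driver without VLAN callbacks simply gets a non-accelerated VLAN sub-interface, and no runtime -EOPNOTSUPP check is needed on that path.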
    • Merge branch 'net-ethernet-mediatek-convert-to-PHYLINK' · 1ddc5d94
      David S. Miller authored
      René van Dorst says:
      
      ====================
      net: ethernet: mediatek: convert to PHYLINK
      
      These patches convert the mediatek driver to the PHYLINK API.
      
      v3->v4:
      * Phylink improvements and clean-ups after review
      v2->v3:
      * Phylink improvements and clean-ups after review
      v1->v2:
      * Rebase for mt76x8 changes
      * Phylink improvements and clean-ups after review
      * The SGMII port doesn't support 2.5Gbit in SGMII mode, only in BASE-X mode.
        Refactor the code.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dt-bindings: net: ethernet: Update mt7622 docs and dts to reflect the new phylink API · bd69baaa
      René van Dorst authored
      This patch removes the recently added mediatek,physpeed property.
      Use the fixed-link property speed = <2500> to set the PHY to 2.5Gbit.
      See mt7622-bananapi-bpi-r64.dts for a working example.
      Signed-off-by: René van Dorst <opensource@vdorst.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: mediatek: Re-add support SGMII · 7e538372
      René van Dorst authored
      * Re-add SGMII support, but now with PHYLINK API support,
        so the SGMII changes are clearer
      * Move SGMII block setup from mtk_gmac_sgmii_path_setup() to
        mtk_mac_config()
      * Merge mtk_setup_hw_path() into mtk_mac_config()
      * Remove mediatek,physpeed property; fixed-link now supports any speed, so
        speed = <2500>; is now valid with PHYLINK
      * Demagic SGMII register values
      * Use phylink state to set up fixed-link mode
      Signed-off-by: René van Dorst <opensource@vdorst.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: mediatek: Add basic PHYLINK support · b8fc9f30
      René van Dorst authored
      This converts the basics to the PHYLINK API.
      SGMII support is not in this patch.
      Signed-off-by: René van Dorst <opensource@vdorst.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'net-dsa-explicit-programmation-of-VLAN-on-CPU-ports' · cb6ec975
      David S. Miller authored
      Vivien Didelot says:
      
      ====================
      net: dsa: explicit programmation of VLAN on CPU ports
      
      When a VLAN is programmed on a user port, every switch of the fabric also
      programs the CPU ports and the DSA links as part of the VLAN. To do that,
      DSA makes use of bitmaps to prepare all members of a VLAN.
      
      While this is expected for DSA links which are used as conduit between
      interconnected switches, only the dedicated CPU port of the slave must be
      programmed, not all CPU ports of the fabric. This may also cause problems in
      other corners of DSA such as the tag_8021q.c driver, which needs to program
      its ports manually, CPU port included.
      
      We need the dsa_port_vlan_{add,del} functions and their dsa_port_vid_{add,del}
      variants to simply trigger the VLAN programmation without any logic in them,
      but they may currently skip the operation based on the bridge device state.
      
      This patchset gets rid of the bitmap operations, and moves the bridge device
      check as well as the explicit programmation of CPU ports where they belong,
      in the slave code.
      
      While at it, clear the VLAN flags before programming a CPU port, as it
      doesn't make sense to forward the PVID flag for example for such ports.
      
      Changes in v2: only clear the PVID flag.
      ====================
      Tested-by: Vladimir Oltean <olteanv@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: clear VLAN PVID flag for CPU port · b9499904
      Vivien Didelot authored
      When the bridge offloads a VLAN on a slave port, we also need to
      program its dedicated CPU port as a member of the VLAN.
      
      Drivers may handle the CPU port's membership as they want. For example,
      Marvell has a special "Unmodified" mode to pass frames as is through
      such ports.
      
      Even though DSA expects the drivers to handle the CPU port membership,
      it does not make sense to program user VLANs as PVID on the CPU port.
      This patch clears this flag before programming the CPU port.
      Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
      Suggested-by: Vladimir Oltean <olteanv@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
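A minimal sketch of the flag handling described in commit b9499904, assuming the VLAN object is copied before it is modified. The `sketch_` structure and flag values are hypothetical and only stand in for the switchdev VLAN object and the bridge VLAN flags.

```c
#include <stdint.h>

/* Hypothetical VLAN flags, stand-ins for the bridge VLAN info flags. */
#define SKETCH_VLAN_PVID     (1u << 1)
#define SKETCH_VLAN_UNTAGGED (1u << 2)

/* Hypothetical, reduced form of a port VLAN object. */
struct sketch_port_vlan {
	uint16_t vid_begin;
	uint16_t vid_end;
	uint16_t flags;
};

/*
 * Build the VLAN object to program on the dedicated CPU port. The user
 * port's object is copied and only the PVID flag is cleared; every
 * other flag is passed through for the driver to interpret.
 */
static struct sketch_port_vlan
sketch_cpu_port_vlan(const struct sketch_port_vlan *user_vlan)
{
	struct sketch_port_vlan cpu_vlan = *user_vlan;

	cpu_vlan.flags &= (uint16_t)~SKETCH_VLAN_PVID;
	return cpu_vlan;
}
```

Working on a copy keeps the caller's object intact, so the user port is still programmed with its original flags.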
    • net: dsa: program VLAN on CPU port from slave · 7e1741b4
      Vivien Didelot authored
      DSA currently programs a VLAN on the CPU port implicitly after the
      related notifier is received by a switch.
      
      While we still need to do this transparent programmation of the DSA
      links in the fabric, programming the CPU port this way may cause
      problems in some corners such as the tag_8021q driver.
      
      Because the dedicated CPU port is specific to a slave, make its
      programmation explicit a few layers up, in the slave code.
      
      Note that technically, DSA links have a dedicated CPU port as well,
      but since they are only used as conduit between interconnected switches
      of a fabric, programming them transparently this way is what we want.
      Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: check bridge VLAN in slave operations · c5335d73
      Vivien Didelot authored
      The bridge VLANs are not offloaded by dsa_port_vlan_* if the port is
      not bridged or if its bridge is not VLAN aware.
      
      This is a good thing, but other corners of DSA, such as the tag_8021q
      driver, may need to program VLANs regardless of the bridge state.
      
      Also, because bridge_dev is specific to user ports anyway, move
      these checks where they belong, one layer up in the slave code.
      Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
      Suggested-by: Vladimir Oltean <olteanv@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: add slave VLAN helpers · bdcff080
      Vivien Didelot authored
      Add dsa_slave_vlan_add and dsa_slave_vlan_del helpers to handle
      SWITCHDEV_OBJ_ID_PORT_VLAN switchdev objects. Also copy the
      switchdev_obj_port_vlan structure on add since we will modify it in
      future patches.
      Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: do not skip -EOPNOTSUPP in dsa_port_vid_add · cf360866
      Vivien Didelot authored
      Currently dsa_port_vid_add returns 0 if the switch returns -EOPNOTSUPP.
      
      This function is used in the tag_8021q.c code to offload the PVID of
      ports, which would simply not work if .port_vlan_add is not supported
      by the underlying switch.
      
      Do not skip -EOPNOTSUPP in dsa_port_vid_add but only when necessary,
      that is to say in dsa_slave_vlan_rx_add_vid.
      Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
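The error-handling split described in commit cf360866 might look roughly like the sketch below. The function names mirror the ones in the message but carry a `sketch_` prefix because the signatures and bodies are hypothetical, simplified stand-ins.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical low-level helper: it reports the result unchanged,
 * including -EOPNOTSUPP when the driver has no VLAN add operation, so
 * that callers such as the tagging code can see the failure.
 */
static int sketch_port_vid_add(bool driver_has_vlan_add, int driver_ret)
{
	if (!driver_has_vlan_add)
		return -EOPNOTSUPP;
	return driver_ret;
}

/*
 * Hypothetical caller on the VLAN sub-interface path: a missing
 * offload is acceptable here, so only -EOPNOTSUPP is ignored and any
 * other error is still propagated.
 */
static int sketch_slave_vlan_rx_add_vid(bool driver_has_vlan_add,
					int driver_ret)
{
	int ret = sketch_port_vid_add(driver_has_vlan_add, driver_ret);

	if (ret && ret != -EOPNOTSUPP)
		return ret;
	return 0;
}
```

The decision to tolerate the error lives with the one caller that can tolerate it, rather than being hidden inside the shared helper.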
    • net: dsa: remove bitmap operations · e65d45cc
      Vivien Didelot authored
      The bitmap operations were introduced to simplify the switch drivers
      in the future, since most of them could implement the common VLAN and
      MDB operations (add, del, dump) with simple functions taking all target
      ports at once, and thus limiting the number of hardware accesses.
      
      Programming an MDB or VLAN this way in a single operation would clearly
      simplify the drivers a lot but would require a new get-set interface
      in DSA. The usage of such a bitmap from the stack also raised concerns
      in the past, leading to the dynamic allocation of a new ds->_bitmap
      member in the dsa_switch structure. So let's get rid of them for now.
      
      This commit nicely wraps the ds->ops->port_{mdb,vlan}_{prepare,add}
      switch operations into new dsa_switch_{mdb,vlan}_{prepare,add}
      variants not using any bitmap argument anymore.
      
      New dsa_switch_{mdb,vlan}_match helpers have been introduced to make
      clear which local port of a switch must be programmed with the target
      object. While the targeted user port is an obvious candidate, the
      DSA links must also be programmed, as well as the CPU port for VLANs.
      
      While at it, also remove local variables that are only used once.
      Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
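The matching rule stated in commit e65d45cc (the targeted user port, the DSA links, and for VLANs also the CPU port) can be sketched as below. The port-type enum, the target structure, and the function signatures are hypothetical; the real helpers take kernel structures and may differ in detail.

```c
#include <stdbool.h>

/* Hypothetical classification of a switch port. */
enum sketch_port_type {
	SKETCH_PORT_USER,
	SKETCH_PORT_DSA,	/* link between interconnected switches */
	SKETCH_PORT_CPU,
};

/* Hypothetical description of the port an object was requested on. */
struct sketch_target {
	int sw_index;
	int port;
};

/* Decide whether a local port must be programmed with an MDB entry. */
static bool sketch_switch_mdb_match(int sw_index, int port,
				    enum sketch_port_type type,
				    const struct sketch_target *target)
{
	if (sw_index == target->sw_index && port == target->port)
		return true;	/* the targeted user port itself */

	return type == SKETCH_PORT_DSA;	/* DSA links carry the traffic */
}

/* Same rule for VLANs, which additionally include the CPU port. */
static bool sketch_switch_vlan_match(int sw_index, int port,
				     enum sketch_port_type type,
				     const struct sketch_target *target)
{
	if (sketch_switch_mdb_match(sw_index, port, type, target))
		return true;

	return type == SKETCH_PORT_CPU;
}
```

A per-port predicate like this replaces the bitmap of member ports: the caller iterates over the local ports and programs each one that matches.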
  2. 27 Aug, 2019 12 commits
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 68aaf445
      David S. Miller authored
      Minor conflict in r8169, bug fix had two versions in net
      and net-next, take the net-next hunks.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'nfs-for-5.3-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · 9e8312f5
      Linus Torvalds authored
      Pull NFS client bugfixes from Trond Myklebust:
       "Highlights include:
      
        Stable fixes:
      
         - Fix a page lock leak in nfs_pageio_resend()
      
         - Ensure O_DIRECT reports an error if the bytes read/written is 0
      
         - Don't handle errors if the bind/connect succeeded
      
         - Revert "NFSv4/flexfiles: Abort I/O early if the layout segment was
           invalidated"
      
        Bugfixes:
      
         - Don't refresh attributes with mounted-on-file information
      
         - Fix return values for nfs4_file_open() and nfs_finish_open()
      
         - Fix pnfs layoutstats reporting of I/O errors
      
         - Don't use soft RPC calls for pNFS/flexfiles I/O, and don't abort
           for soft I/O errors when the user specifies a hard mount.
      
         - Various fixes to the error handling in sunrpc
      
         - Don't report writepage()/writepages() errors twice"
      
      * tag 'nfs-for-5.3-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
        NFS: remove set but not used variable 'mapping'
        NFSv2: Fix write regression
        NFSv2: Fix eof handling
        NFS: Fix writepage(s) error handling to not report errors twice
        NFS: Fix spurious EIO read errors
        pNFS/flexfiles: Don't time out requests on hard mounts
        SUNRPC: Handle connection breakages correctly in call_status()
        Revert "NFSv4/flexfiles: Abort I/O early if the layout segment was invalidated"
        SUNRPC: Handle EADDRINUSE and ENOBUFS correctly
        pNFS/flexfiles: Turn off soft RPC calls
        SUNRPC: Don't handle errors if the bind/connect succeeded
        NFS: On fatal writeback errors, we need to call nfs_inode_remove_request()
        NFS: Fix initialisation of I/O result struct in nfs_pgio_rpcsetup
        NFS: Ensure O_DIRECT reports an error if the bytes read/written is 0
        NFSv4/pnfs: Fix a page lock leak in nfs_pageio_resend()
        NFSv4: Fix return value in nfs_finish_open()
        NFSv4: Fix return values for nfs4_file_open()
        NFS: Don't refresh attributes with mounted-on-file information
    • Merge tag 'arc-5.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc · 6525771f
      Linus Torvalds authored
      Pull ARC updates from Vineet Gupta:
      
       - support for Edge Triggered IRQs in ARC IDU intc
      
       - other fixes here and there
      
      * tag 'arc-5.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
        arc: prefer __section from compiler_attributes.h
        dt-bindings: IDU-intc: Add support for edge-triggered interrupts
        dt-bindings: IDU-intc: Clean up documentation
        ARCv2: IDU-intc: Add support for edge-triggered interrupts
        ARC: unwind: Mark expected switch fall-throughs
        ARC: [plat-hsdk]: allow to switch between AXI DMAC port configurations
        ARC: fix typo in setup_dma_ops log message
        ARCv2: entry: early return from exception need not clear U & DE bits
    • Merge tag 'mfd-fixes-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd · 8d645408
      Linus Torvalds authored
      Pull MFD fix from Lee Jones:
       "Identify potentially unused functions in rk808 driver when !PM"
      
      * tag 'mfd-fixes-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd:
        mfd: rk808: Make PM function declaration static
        mfd: rk808: Mark pm functions __maybe_unused
    • Merge tag 'sound-5.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 0004654f
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "A collection of small fixes as usual:
      
         - More coverage of USB-audio descriptor sanity checks
      
         - A fix for mute LED regression on Conexant HD-audio codecs
      
         - A few device-specific fixes and quirks for USB-audio and HD-audio
      
         - A fix for (die-hard remaining) possible race in sequencer core
      
         - FireWire oxfw regression fix that was introduced in 5.3-rc1"
      
      * tag 'sound-5.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: oxfw: fix to handle correct stream for PCM playback
        ALSA: seq: Fix potential concurrent access to the deleted pool
        ALSA: usb-audio: Check mixer unit bitmap yet more strictly
        ALSA: line6: Fix memory leak at line6_init_pcm() error path
        ALSA: usb-audio: Fix invalid NULL check in snd_emuusb_set_samplerate()
        ALSA: hda/ca0132 - Add new SBZ quirk
        ALSA: usb-audio: Add implicit fb quirk for Behringer UFX1604
        ALSA: hda - Fixes inverted Conexant GPIO mic mute led
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 452a0444
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Use 32-bit index for tail calls in s390 bpf JIT, from Ilya
          Leoshkevich.
      
       2) Fix missed EPOLLOUT events in TCP, from Eric Dumazet. Same fix for
          SMC from Jason Baron.
      
       3) ipv6_mc_may_pull() should return 0 for malformed packets, not
          -EINVAL. From Stefano Brivio.
      
       4) Don't forget to unpin umem xdp pages in error path of
          xdp_umem_reg(). From Ivan Khoronzhuk.
      
       5) Fix sta object leak in mac80211, from Johannes Berg.
      
       6) Fix regression by not configuring PHYLINK on CPU port of bcm_sf2
          switches. From Florian Fainelli.
      
       7) Revert DMA sync removal from r8169 which was causing regressions on
          some MIPS Loongson platforms. From Heiner Kallweit.
      
       8) Use after free in flow dissector, from Jakub Sitnicki.
      
       9) Fix NULL derefs of net devices during ICMP processing across
          collect_md tunnels, from Hangbin Liu.
      
      10) proto_register() memory leaks, from Zhang Lin.
      
      11) Set NLM_F_MULTI flag in multipart netlink messages consistently,
          from John Fastabend.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (66 commits)
        r8152: Set memory to all 0xFFs on failed reg reads
        openvswitch: Fix conntrack cache with timeout
        ipv4: mpls: fix mpls_xmit for iptunnel
        nexthop: Fix nexthop_num_path for blackhole nexthops
        net: rds: add service level support in rds-info
        net: route dump netlink NLM_F_MULTI flag missing
        s390/qeth: reject oversized SNMP requests
        sock: fix potential memory leak in proto_register()
        MAINTAINERS: Add phylink keyword to SFF/SFP/SFP+ MODULE SUPPORT
        xfrm/xfrm_policy: fix dst dev null pointer dereference in collect_md mode
        ipv4/icmp: fix rt dst dev null pointer dereference
        openvswitch: Fix log message in ovs conntrack
        bpf: allow narrow loads of some sk_reuseport_md fields with offset > 0
        bpf: fix use after free in prog symbol exposure
        bpf: fix precision tracking in presence of bpf2bpf calls
        flow_dissector: Fix potential use-after-free on BPF_PROG_DETACH
        Revert "r8169: remove not needed call to dma_sync_single_for_device"
        ipv6: propagate ipv6_add_dev's error returns out of ipv6_find_idev
        net/ncsi: Fix the payload copying for the request coming from Netlink
        qed: Add cleanup in qed_slowpath_start()
        ...
    • NFS: remove set but not used variable 'mapping' · 99300a85
      YueHaibing authored
      Fixes gcc '-Wunused-but-set-variable' warning:
      
      fs/nfs/write.c: In function nfs_page_async_flush:
      fs/nfs/write.c:609:24: warning: variable mapping set but not used [-Wunused-but-set-variable]
      
      It is not used since commit aefb623c422e ("NFS: Fix
      writepage(s) error handling to not report errors twice")
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
    • NFSv2: Fix write regression · d33d4beb
      Trond Myklebust authored
      Ensure we update the write result count on success, since the
      RPC call itself does not do so.
      Reported-by: Jan Stancek <jstancek@redhat.com>
      Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
      Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
      Tested-by: Jan Stancek <jstancek@redhat.com>
    • NFSv2: Fix eof handling · 71affe9b
      Trond Myklebust authored
      If we received a reply from the server with a zero length read and
      no error, then that implies we are at eof.
      Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
    • mfd: rk808: Make PM function declaration static · 4d82fa67
      Lee Jones authored
      Avoids:
        ../drivers/mfd/rk808.c:771:1: warning: symbol 'rk8xx_pm_ops' \
          was not declared. Should it be static?
      
      Fixes: 5752bc43 ("mfd: rk808: Mark pm functions __maybe_unused")
      Reviewed-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Lee Jones <lee.jones@linaro.org>
    • mfd: rk808: Mark pm functions __maybe_unused · 5752bc43
      Arnd Bergmann authored
      The newly added suspend/resume functions are only used if CONFIG_PM
      is enabled:
      
      drivers/mfd/rk808.c:752:12: error: 'rk8xx_resume' defined but not used [-Werror=unused-function]
      drivers/mfd/rk808.c:732:12: error: 'rk8xx_suspend' defined but not used [-Werror=unused-function]
      
      Mark them as __maybe_unused so the compiler can silently drop them
      when they are not needed.
      
      Fixes: 586c1b41 ("mfd: rk808: Add RK817 and RK809 support")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Lee Jones <lee.jones@linaro.org>
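The pattern behind commit 5752bc43 can be illustrated outside the kernel with the GCC/Clang `unused` attribute. Everything prefixed `sketch_` or `SKETCH_` is hypothetical; in particular `SKETCH_CONFIG_PM` only imitates a kernel config option and the ops structure is a reduced stand-in.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's __maybe_unused annotation. */
#define SKETCH_MAYBE_UNUSED __attribute__((unused))

/* Hypothetical, reduced power-management operations table. */
struct sketch_pm_ops {
	int (*suspend)(void);
	int (*resume)(void);
};

/*
 * These callbacks are referenced only when SKETCH_CONFIG_PM is defined.
 * The attribute tells the compiler not to warn about an unused static
 * function otherwise, and lets it drop the code silently.
 */
static int SKETCH_MAYBE_UNUSED sketch_suspend(void)
{
	return 0;
}

static int SKETCH_MAYBE_UNUSED sketch_resume(void)
{
	return 0;
}

#ifdef SKETCH_CONFIG_PM
static const struct sketch_pm_ops sketch_dev_pm_ops = {
	.suspend = sketch_suspend,
	.resume = sketch_resume,
};
#else
/* Without power management the table is left empty. */
static const struct sketch_pm_ops sketch_dev_pm_ops = {
	.suspend = NULL,
	.resume = NULL,
};
#endif
```

Declaring the ops table `static` as well, as the follow-up commit 4d82fa67 does for the real driver, keeps the symbol private to the file.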
    • nfp: add AMDA0058 boards to firmware list · d00ee466
      Jakub Kicinski authored
      Add MODULE_FIRMWARE entries for AMDA0058 boards.
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 26 Aug, 2019 15 commits
    • r8169: improve DMA handling in rtl_rx · 3c95e501
      Heiner Kallweit authored
      Move the call to dma_sync_single_for_cpu after calling napi_alloc_skb.
      This avoids calling dma_sync_single_for_cpu w/o handing control back
      to device if the memory allocation should fail.
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
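The ordering argument in commit 3c95e501 can be shown with a small model. This is not driver code: the counters and `sketch_` functions are hypothetical stand-ins for `dma_sync_single_for_cpu`, `dma_sync_single_for_device`, and `napi_alloc_skb`, and only the order of operations is meant to match the description.

```c
#include <stdbool.h>

/* Hypothetical counters standing in for the DMA sync calls. */
struct sketch_dma_stats {
	unsigned int sync_for_cpu;
	unsigned int sync_for_device;
};

static void sketch_sync_for_cpu(struct sketch_dma_stats *stats)
{
	stats->sync_for_cpu++;
}

static void sketch_sync_for_device(struct sketch_dma_stats *stats)
{
	stats->sync_for_device++;
}

/*
 * Receive one packet by copying it out of the DMA buffer. The buffer
 * allocation (modelled by alloc_succeeds) happens first, so a failed
 * allocation returns before ownership of the DMA buffer ever moves to
 * the CPU. Returns true if the packet was copied.
 */
static bool sketch_rx_copy(struct sketch_dma_stats *stats, bool alloc_succeeds)
{
	if (!alloc_succeeds)
		return false;

	sketch_sync_for_cpu(stats);
	/* ... copy the packet data out of the DMA buffer here ... */
	sketch_sync_for_device(stats);
	return true;
}
```

In this model the two sync calls always stay paired, which is the property the commit message is after.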
    • Merge branch 'cls-hw-offload-rtnl' · 72991b56
      David S. Miller authored
      Vlad Buslov says:
      
      ====================
      Refactor cls hardware offload API to support rtnl-independent drivers
      
      Currently, all cls API hardware offloads driver callbacks require caller
      to hold rtnl lock when calling them. This patch set introduces new API
      that allows drivers to register callbacks that are not dependent on rtnl
      lock and unlocked classifiers to offload filters without obtaining rtnl
      lock first, which is intended to allow offloading tc rules in parallel.
      
      Recently, new rtnl registration flag RTNL_FLAG_DOIT_UNLOCKED was added.
      TC rule update handlers (RTM_NEWTFILTER, RTM_DELTFILTER, etc.) are
      already registered with this flag and only take rtnl lock when qdisc or
      classifier requires it. Classifiers can indicate that their ops
      callbacks don't require caller to hold rtnl lock by setting the
      TCF_PROTO_OPS_DOIT_UNLOCKED flag. Unlocked implementation of flower
      classifier is now upstreamed. However, this implementation still obtains
      rtnl lock before calling hardware offloads API.
      
      Implement following cls API changes:
      
      - Introduce new "unlocked_driver_cb" flag to struct flow_block_offload
        to allow registering and unregistering block hardware offload
        callbacks that do not require caller to hold rtnl lock. Drivers that
        don't require users of their tc offload callbacks to hold rtnl lock
        set the flag to true on block bind/unbind. Internally tcf_block is
        extended with additional lockeddevcnt counter that is used to count
        number of devices that require rtnl lock that block is bound to. When
        this counter is zero, tc_setup_cb_*() functions execute callbacks
        without obtaining rtnl lock.
      
      - Extend cls API single hardware rule update tc_setup_cb_call() function
        with tc_setup_cb_add(), tc_setup_cb_replace(), tc_setup_cb_destroy()
        and tc_setup_cb_reoffload() functions. These new APIs are needed to
        move management of block offload counter, filter in hardware counter
        and flag from classifier implementations to cls API, which is now
        responsible for managing them in concurrency-safe manner. Access to
        cb_list from callback execution code is synchronized by obtaining new
        'cb_lock' rw_semaphore in read mode, which allows executing callbacks
        in parallel, but excludes any modifications of data from
        register/unregister code. tcf_block offloads counter type is changed
        to atomic integer to allow updating the counter concurrently.
      
      - Extend classifier ops with new ops->hw_add() and ops->hw_del()
        callbacks which are used to notify unlocked classifiers when filter is
        successfully added or deleted to hardware without releasing cb_lock.
        This is necessary to update classifier state atomically with callback
        list traversal and updating of all relevant counters and allows
        unlocked classifiers to synchronize with concurrent reoffload without
        requiring any changes to driver callback API implementations.
      
      New tc flow_action infrastructure is also modified to allow its user to
      execute without rtnl lock protection. Function tc_setup_flow_action() is
      modified to conditionally obtain rtnl lock before accessing action
      state. Action data that is accessed by reference is either copied or
      reference counted to prevent concurrent action overwrite from
      deallocating it. New function tc_cleanup_flow_action() is introduced to
      cleanup/release all such data obtained by tc_setup_flow_action().
      
      Flower classifier (only unlocked classifier at the moment) is modified
      to use new cls hardware offloads API and no longer obtains rtnl lock
      before calling it.
      ====================
      Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: flower: don't take rtnl lock for cls hw offloads API · 918190f5
      Vlad Buslov authored
      Don't manually take rtnl lock in flower classifier before calling cls
      hardware offloads API. Instead, pass rtnl lock status via 'rtnl_held'
      parameter.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: copy tunnel info when setting flow_action entry->tunnel · 1444c175
      Vlad Buslov authored
      In order to remove dependency on rtnl lock, modify tc_setup_flow_action()
      to copy tunnel info, instead of just saving pointer to tunnel_key action
      tunnel info. This is necessary to prevent concurrent action overwrite from
      releasing tunnel info while it is being used by rtnl-unlocked driver.
      
      Implement helper tcf_tunnel_info_copy() that is used to copy tunnel info
      with all its options to dynamically allocated memory block. Modify
      tc_cleanup_flow_action() to free dynamically allocated tunnel info.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
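The copy-instead-of-pointer approach from commit 1444c175 can be sketched in userspace C as follows. The structure layout and the `sketch_` helpers are hypothetical; the real tunnel info structure has more fields and the kernel helper uses kernel allocators rather than `malloc`.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, reduced tunnel info with trailing variable-length options. */
struct sketch_tunnel_info {
	uint32_t tunnel_id;
	uint8_t options_len;
	uint8_t options[];	/* options_len bytes follow the header */
};

/*
 * Duplicate the tunnel info together with all of its options into a
 * freshly allocated block. The copy stays valid even if the original
 * is overwritten or released concurrently. Returns NULL on allocation
 * failure; the caller owns the result.
 */
static struct sketch_tunnel_info *
sketch_tunnel_info_copy(const struct sketch_tunnel_info *src)
{
	size_t size = sizeof(*src) + src->options_len;
	struct sketch_tunnel_info *dst = malloc(size);

	if (dst == NULL)
		return NULL;

	memcpy(dst, src, size);
	return dst;
}

/* Counterpart used by cleanup code to release a copy. */
static void sketch_tunnel_info_free(struct sketch_tunnel_info *info)
{
	free(info);
}
```

The cost is one allocation per offloaded action, in exchange for not needing the rtnl lock to keep the original alive while a driver reads it.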
    • net: sched: take reference to action dev before calling offloads · 5a6ff4b1
      Vlad Buslov authored
      In order to remove dependency on rtnl lock when calling hardware offload
      API, take reference to action mirred dev when initializing flow_action
      structure in tc_setup_flow_action(). Implement function
      tc_cleanup_flow_action(), use it to release the device after hardware
      offload API is done using it.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: take rtnl lock in tc_setup_flow_action() · 9838b20a
      Vlad Buslov authored
      In order to allow using new flow_action infrastructure from unlocked
      classifiers, modify tc_setup_flow_action() to accept new 'rtnl_held'
      argument. Take rtnl lock before accessing tc_action data. This is necessary
      to protect from concurrent action replace.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: conditionally obtain rtnl lock in cls hw offloads API · 11bd634d
      Vlad Buslov authored
      In order to remove dependency on rtnl lock from offloads code of
      classifiers, take rtnl lock conditionally before executing driver
      callbacks. Only obtain rtnl lock if block is bound to devices that require
      it.
      
      Block bind/unbind code is rtnl-locked and obtains block->cb_lock while
      holding rtnl lock. Obtain locks in same order in tc_setup_cb_*() functions
      to prevent deadlock.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
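The lock ordering described in commit 11bd634d (rtnl first, then cb_lock, and rtnl only when some bound device needs it) can be modelled with a single-threaded trace. This is a sketch under assumptions: the locks are replaced by trace characters, the `sketch_` names are hypothetical, and the recheck-and-retry step is an assumed way of coping with the counter changing before cb_lock is held.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_TRACE_MAX 16

/* Hypothetical model of a filter block with a recorded lock trace. */
struct sketch_block {
	unsigned int lockeddevcnt;	/* bound devices that need rtnl */
	char trace[SKETCH_TRACE_MAX];	/* lock events, NUL-terminated */
	size_t trace_len;
};

/* Record one lock event: R/r = rtnl lock/unlock, C/c = cb_lock read lock/unlock. */
static void sketch_trace(struct sketch_block *block, char event)
{
	if (block->trace_len + 1 < SKETCH_TRACE_MAX) {
		block->trace[block->trace_len++] = event;
		block->trace[block->trace_len] = '\0';
	}
}

/* Run the driver callbacks, taking rtnl only if it is needed and not held. */
static void sketch_setup_cb_call(struct sketch_block *block, bool rtnl_held)
{
	bool take_rtnl = block->lockeddevcnt != 0 && !rtnl_held;

retry:
	if (take_rtnl)
		sketch_trace(block, 'R');
	sketch_trace(block, 'C');

	/*
	 * The counter may have changed before cb_lock was taken. If rtnl
	 * turns out to be needed, drop cb_lock and start over so the
	 * locks are always acquired in the order rtnl, then cb_lock.
	 */
	if (!rtnl_held && !take_rtnl && block->lockeddevcnt != 0) {
		sketch_trace(block, 'c');
		take_rtnl = true;
		goto retry;
	}

	/* ... driver callbacks would run here ... */

	sketch_trace(block, 'c');
	if (take_rtnl)
		sketch_trace(block, 'r');
}
```

Because bind/unbind takes cb_lock while already holding rtnl, acquiring the two in the same order here is what rules out the deadlock the message mentions.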
    • net: sched: add API for registering unlocked offload block callbacks · c9f14470
      Vlad Buslov authored
      Extend struct flow_block_offload with "unlocked_driver_cb" flag to allow
      registering and unregistering block hardware offload callbacks that do not
      require caller to hold rtnl lock. Extend tcf_block with additional
      lockeddevcnt counter that is incremented for each non-unlocked driver
      callback attached to device. This counter is necessary to conditionally
      obtain rtnl lock before calling hardware callbacks in following patches.
      
      Register mlx5 tc block offload callbacks as "unlocked".
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c9f14470
    • Vlad Buslov's avatar
      net: sched: notify classifier on successful offload add/delete · a449a3e7
      Vlad Buslov authored
      To remove dependency on rtnl lock, extend classifier ops with new
      ops->hw_add() and ops->hw_del() callbacks. Call them from cls API while
      holding cb_lock every time a filter is successfully added to or deleted
      from hardware.
      
      Implement the new API in flower classifier. Use it to manage hw_filters
      list under cb_lock protection, instead of relying on rtnl lock to
      synchronize with concurrent fl_reoffload() call.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a449a3e7
    • Vlad Buslov's avatar
      net: sched: refactor block offloads counter usage · 40119211
      Vlad Buslov authored
      Without rtnl lock protection, filters can no longer safely manage block
      offloads counter themselves. Refactor cls API to protect block offloadcnt
      with tcf_block->cb_lock that is already used to protect driver callback
      list and nooffloaddevcnt counter. The counter can be modified by concurrent
      tasks by new functions that execute block callbacks (which is safe with
      previous patch that changed its type to atomic_t), however, block
      bind/unbind code that checks the counter value takes cb_lock in write mode
      to exclude any concurrent modifications. This approach prevents race
      conditions between bind/unbind and callback execution code but allows for
      concurrency for tc rule update path.
      
      Move block offload counter, filter in hardware counter and filter flags
      management from classifiers into cls hardware offloads API. Make the
      functions tcf_block_offload_{inc|dec}() and tc_cls_offload_cnt_update()
      private to the cls API. Implement the following new cls API functions
      to be used instead:
      
        tc_setup_cb_add() - non-destructive filter add. If filter that wasn't
        already in hardware is successfully offloaded, increment block offloads
        counter, set filter in hardware counter and flag. On failure, previously
        offloaded filter is considered to be intact and offloads counter is not
        decremented.
      
        tc_setup_cb_replace() - destructive filter replace. Release existing
        filter block offload counter and reset its in hardware counter and flag.
        Set new filter in hardware counter and flag. On failure, previously
        offloaded filter is considered to be destroyed and offload counter is
        decremented.
      
        tc_setup_cb_destroy() - filter destroy. Unconditionally decrement block
        offloads counter.
      
        tc_setup_cb_reoffload() - reoffload filter to single cb. Execute cb() and
        call tc_cls_offload_cnt_update() if cb() didn't return an error.
      
      Refactor all offload-capable classifiers to atomically offload filters to
      hardware, change block offload counter, and set filter in hardware counter
      and flag by means of the new cls API functions.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      40119211
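      The counter rules listed for tc_setup_cb_add(), tc_setup_cb_replace()
      and tc_setup_cb_destroy() in the commit above can be modelled in
      userspace C. This is a simplified sketch under stated assumptions, not
      the kernel code: struct block_model, struct filter_model and the
      model_cb_*() functions are hypothetical, the ok_count argument stands
      in for the result of running the driver callbacks (negative for an
      error, otherwise the number of callbacks that accepted the filter), and
      cb_lock is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the block and per-filter offload state. */
struct block_model {
    atomic_int offloadcnt;      /* number of filters currently in hardware */
};

struct filter_model {
    unsigned int in_hw_count;   /* callbacks that accepted this filter */
    bool in_hw;                 /* "in hardware" flag */
};

/* Mark a filter as offloaded; bump the block counter on first offload only. */
static void filter_set_in_hw(struct block_model *b, struct filter_model *f,
                             int ok_count)
{
    if (f->in_hw_count == 0) {
        f->in_hw = true;
        atomic_fetch_add(&b->offloadcnt, 1);
    }
    f->in_hw_count += (unsigned int)ok_count;
}

/* Drop a filter's offload state. The in_hw guard is this sketch's assumption. */
static void filter_reset_in_hw(struct block_model *b, struct filter_model *f)
{
    if (f->in_hw) {
        f->in_hw = false;
        int old = atomic_fetch_sub(&b->offloadcnt, 1);
        assert(old > 0);        /* the counter must never go negative */
    }
    f->in_hw_count = 0;
}

/* Non-destructive add: on failure the existing state is left untouched. */
int model_cb_add(struct block_model *b, struct filter_model *f, int ok_count)
{
    if (ok_count < 0)
        return ok_count;
    if (ok_count > 0)
        filter_set_in_hw(b, f, ok_count);
    return 0;
}

/* Destructive replace: the old filter is released even if the new one fails. */
int model_cb_replace(struct block_model *b, struct filter_model *old_f,
                     struct filter_model *new_f, int ok_count)
{
    if (old_f)
        filter_reset_in_hw(b, old_f);
    if (ok_count < 0)
        return ok_count;
    if (ok_count > 0)
        filter_set_in_hw(b, new_f, ok_count);
    return 0;
}

/* Destroy: release the filter's state regardless of any driver result. */
void model_cb_destroy(struct block_model *b, struct filter_model *f)
{
    filter_reset_in_hw(b, f);
}
```

      The difference between add and replace shows up on failure: a failed
      model_cb_add() leaves offloadcnt unchanged, while a failed
      model_cb_replace() has already released the old filter.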
    • Vlad Buslov's avatar
      net: sched: change tcf block offload counter type to atomic_t · 97394bef
      Vlad Buslov authored
      As a preparation for running proto ops functions without rtnl lock, change
      offload counter type to atomic. This is necessary to allow updating the
      counter by multiple concurrent users when offloading filters to hardware
      from unlocked classifiers.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      97394bef
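      Why an atomic type matters once several tasks update the counter at
      the same time can be illustrated with C11 atomics and pthreads. This is
      a userspace sketch only: the kernel uses atomic_t, and
      run_concurrent_updates(), worker() and struct worker_arg are
      hypothetical names introduced for the illustration.

```c
#include <pthread.h>
#include <stdatomic.h>

#define MAX_THREADS 16

/* Argument shared by all worker threads (hypothetical helper type). */
struct worker_arg {
    atomic_int *counter;
    int iterations;
};

/* Each worker increments the shared counter without taking any lock. */
static void *worker(void *p)
{
    struct worker_arg *a = p;

    for (int i = 0; i < a->iterations; i++)
        atomic_fetch_add(a->counter, 1);
    return NULL;
}

/*
 * Start up to MAX_THREADS workers, wait for them, and return the final
 * counter value. With an atomic counter no increment is lost.
 */
int run_concurrent_updates(int nthreads, int iterations)
{
    pthread_t tids[MAX_THREADS];
    atomic_int counter;
    struct worker_arg arg;
    int started = 0;

    if (nthreads > MAX_THREADS)
        nthreads = MAX_THREADS;

    atomic_init(&counter, 0);
    arg.counter = &counter;
    arg.iterations = iterations;

    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tids[started], NULL, worker, &arg) != 0)
            break;              /* stop on failure, join what was started */
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    return atomic_load(&counter);
}
```

      With a plain int and no lock, the same loop would be a data race and
      could lose updates.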
    • Vlad Buslov's avatar
      net: sched: protect block offload-related fields with rw_semaphore · 4f8116c8
      Vlad Buslov authored
      In order to remove dependency on rtnl lock, extend tcf_block with 'cb_lock'
      rwsem and use it to protect flow_block->cb_list and related counters from
      concurrent modification. The lock is taken in read mode for read-only
      traversal of cb_list in tc_setup_cb_call() and write mode in all other
      cases. This approach ensures that:
      
      - cb_list is not changed concurrently while filters are being offloaded
        on block.
      
      - block->nooffloaddevcnt is checked while holding the lock in read mode,
        but is only changed by bind/unbind code when holding the cb_lock in write
        mode to prevent concurrent modification.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4f8116c8
    • Trond Myklebust's avatar
      NFS: Fix writepage(s) error handling to not report errors twice · 96c41455
      Trond Myklebust authored
      If writepage()/writepages() saw an error, but handled it without
      reporting it, we should not be re-reporting that error on exit.
      Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
      96c41455
    • Trond Myklebust's avatar
      NFS: Fix spurious EIO read errors · 8f54c7a4
      Trond Myklebust authored
      If the client attempts to read a page, but the read fails due to some
      spurious error (e.g. an ACCESS error or a timeout, ...) then we need
      to allow other processes to retry.
      Also try to report errors correctly when doing a synchronous readpage.
      Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
      8f54c7a4
    • Trond Myklebust's avatar
      pNFS/flexfiles: Don't time out requests on hard mounts · 7af46292
      Trond Myklebust authored
      If the mount is hard, we should ignore the 'io_maxretrans' module
      parameter so that we always keep retrying.
      Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
      7af46292