1. 09 Mar, 2020 40 commits
    • soc: qcom: ipa: modem and microcontroller · a646d6ec
      Alex Elder authored
      This patch includes code implementing the modem functionality.
      There are several communication paths between the AP and modem,
      separate from the main data path provided by IPA.  SMP2P provides
      primitive messaging and interrupt capability, and QMI allows more
      complex out-of-band messaging to occur between entities on the AP
      and modem.  (SMP2P and QMI support are added by the next patch.)
      Management of these (plus the network device implementing the data
      path) is done by code within "ipa_modem.c".
      
      Sort of unrelated, this patch also includes the code supporting the
      microcontroller CPU present on the IPA.  The microcontroller can be
      used to implement special handling of packets, but at this time we
      don't support that.  Still, it is a component that needs to be
      initialized, and in the event of a crash we need to do some
      synchronization between the AP and the microcontroller.
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • soc: qcom: ipa: immediate commands · 731c46ed
      Alex Elder authored
      One TX endpoint (per EE) is used for issuing immediate commands to
      the IPA.  These commands request activities beyond simple data
      transfers to be done by the IPA hardware.  For example, the IPA is
      able to manage routing packets among endpoints, and immediate commands
      are used to configure tables used for that routing.
      
      Immediate commands are built on top of GSI transactions.  They are
      different from normal transfers (in that they use a special endpoint,
      and their "payload" is interpreted differently), so separate functions
      are used to issue immediate command transactions.
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • soc: qcom: ipa: filter and routing tables · 2b9feef2
      Alex Elder authored
      This patch contains code implementing filter and routing tables for
      the IPA.  A filter table allows rules to be used for filtering
      packets that depart the AP at an endpoint.  A filter table entry
      contains the address of a set of rules to apply for each endpoint
      that supports filtering.
      
      A routing table allows packets to be routed to an endpoint based
      on packet metadata.  It is also a table whose entries each contain
      the address of a set of routing rules to apply.
      
      Neither filtering nor routing is supported by the current driver.
      All table entries refer to rules that mean "no filtering" and "no
      routing."
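
The table layout described above can be sketched in plain C. This is only an illustrative model: the type names, the table size, and the idea of a single shared empty rule set are assumptions made for the example, not the driver's actual data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table size and names, for illustration only. */
#define TABLE_ENTRIES 8

struct rule_set {
	int rule_count;		/* zero rules means "do nothing" */
};

/* One shared empty rule set stands in for "no filtering" / "no routing". */
static const struct rule_set empty_rules = { .rule_count = 0 };

/* A table whose entries each hold the address of a set of rules. */
struct rule_table {
	const struct rule_set *entry[TABLE_ENTRIES];
};

/* Point every table entry at the shared empty rule set. */
static void rule_table_init(struct rule_table *table)
{
	size_t i;

	assert(table != NULL);
	for (i = 0; i < TABLE_ENTRIES; i++)
		table->entry[i] = &empty_rules;
}

/* Return nonzero if an entry would apply at least one rule. */
static int rule_table_entry_active(const struct rule_table *table, size_t i)
{
	assert(i < TABLE_ENTRIES);
	return table->entry[i]->rule_count > 0;
}
```

After rule_table_init(), no entry is active, which models the state in which every entry means "no filtering" or "no routing."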
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • soc: qcom: ipa: IPA endpoints · 84f9bd12
      Alex Elder authored
      This patch includes the code implementing an IPA endpoint.  This is
      the primary abstraction implemented by the IPA.  An endpoint is one
      end of a network connection between two entities physically
      connected to the IPA.  Specifically, the AP and the modem implement
      endpoints, and an (AP endpoint, modem endpoint) pair implements the
      transfer of network data in one direction between the AP and modem.
      
      Endpoints are built on top of GSI channels, but IPA endpoints
      represent the higher-level functionality that the IPA provides.
      Data can be sent through a GSI channel, but it is the IPA endpoint
      that represents what is on the "other end" to receive that data.
      Other functionality, including aggregation, checksum offload and
      (at some future date) IP routing and filtering are all associated
      with the IPA endpoint.
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • soc: qcom: ipa: GSI transactions · 9dd441e4
      Alex Elder authored
      This patch implements GSI transactions.  A GSI transaction is a
      structure that represents a single request (consisting of one or
      more TREs) sent to the GSI hardware.  The last TRE in a transaction
      includes a flag requesting that the GSI interrupt the AP to notify
      that it has completed.
      
      TREs are executed and completed strictly in order.  For this reason,
      the completion of a single TRE implies that all previous TREs (in
      particular all of those "earlier" in a transaction) have completed.
      
      Whenever there is a need to send a request (a set of TREs) to the
      IPA, a GSI transaction is allocated, specifying the number of TREs
      that will be required.  Details of the request (e.g. transfer offsets
      and length) are represented in a Linux scatterlist array that is
      incorporated in the transaction structure.
      
      Once all commands (TREs) are added to a transaction it is committed.
      When the hardware signals that the request has completed, a callback
      function allows for cleanup or followup activity to be performed
      before the transaction is freed.
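
The allocate / add / commit / complete life cycle described above could be modeled roughly as follows. All names here are hypothetical and the model runs in user space; it is a sketch of the sequence, not the driver's transaction code (which also carries a scatterlist and talks to hardware).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified model of a transaction. */
struct trans {
	unsigned int tre_max;	/* TREs reserved at allocation time */
	unsigned int tre_used;	/* TREs added so far */
	int committed;		/* set once the request has been submitted */
	int completed;		/* set when the last TRE has completed */
	void (*complete)(struct trans *trans);	/* optional completion hook */
};

/* Allocate a transaction, stating how many TREs it will need. */
static struct trans *trans_alloc(unsigned int tre_count)
{
	struct trans *trans = calloc(1, sizeof(*trans));

	if (trans)
		trans->tre_max = tre_count;
	return trans;
}

/* Add one TRE; fails after commit or once the reserved count is used up. */
static int trans_tre_add(struct trans *trans)
{
	if (trans->committed || trans->tre_used == trans->tre_max)
		return -1;
	trans->tre_used++;
	return 0;
}

/* Submit the transaction; no further TREs may be added. */
static void trans_commit(struct trans *trans)
{
	assert(trans->tre_used > 0);
	trans->committed = 1;
}

/*
 * Model the completion notification for the last TRE.  Because TREs
 * complete strictly in order, this implies that every earlier TRE in
 * the transaction has completed too.
 */
static void trans_complete(struct trans *trans)
{
	assert(trans->committed);
	trans->completed = 1;
	if (trans->complete)
		trans->complete(trans);
}
```

In this model the caller frees the transaction with free() after trans_complete() returns, which stands in for the "callback, then free" ordering described above.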
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • soc: qcom: ipa: IPA interface to GSI · c3f398b1
      Alex Elder authored
      This patch provides interface functions supplied by the IPA layer
      that are called from the GSI layer.  One function is called when a
      GSI transaction has completed.  The others allow the GSI layer to
      inform the IPA layer when the hardware has been told it has new TREs
      to execute, and when the hardware has indicated transactions have
      completed.
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • soc: qcom: ipa: the generic software interface · 650d1603
      Alex Elder authored
      This patch includes "gsi.c", which implements the generic software
      interface (GSI) for IPA.  The generic software interface abstracts
      channels, which provide a means of transferring data either from the
      AP to the IPA, or from the IPA to the AP.  A ring buffer of "transfer
      elements" (TREs) is used to describe data transfers to perform.  The
      AP writes a doorbell register associated with a channel to let it know
      it has added new entries (for an AP->IPA channel) or has finished
      processing entries (for an IPA->AP channel).
      
      Each channel also has an event ring buffer, used by the IPA to
      communicate information about events related to a channel (for
      example, the completion of TREs).  The IPA writes its own doorbell
      register, which triggers an interrupt on the AP, to signal that
      new event information has arrived.
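
A rough user-space model of the TRE ring and doorbell for an AP->IPA channel is sketched below. The ring size, field names, and the use of free-running counters are assumptions for illustration; the real hardware interface uses memory-mapped registers and DMA addresses.

```c
#include <assert.h>
#include <stdint.h>

#define RING_COUNT 4	/* hypothetical ring size */

/* Simplified transfer element: where the data is and how long it is. */
struct tre {
	uint64_t addr;
	uint32_t len;
};

struct ring {
	struct tre elem[RING_COUNT];
	uint32_t wr;		/* count of entries written by the AP */
	uint32_t rd;		/* count of entries consumed by "hardware" */
	uint32_t doorbell;	/* stands in for the channel doorbell register */
};

/* Number of free slots left in the ring. */
static uint32_t ring_avail(const struct ring *ring)
{
	return RING_COUNT - (ring->wr - ring->rd);
}

/* Add a TRE describing one transfer; fails if the ring is full. */
static int ring_add(struct ring *ring, uint64_t addr, uint32_t len)
{
	if (!ring_avail(ring))
		return -1;
	ring->elem[ring->wr % RING_COUNT] = (struct tre){ addr, len };
	ring->wr++;
	return 0;
}

/* "Ring the doorbell": tell the other side how far we have written. */
static void ring_doorbell(struct ring *ring)
{
	ring->doorbell = ring->wr;
}

/* Model the hardware consuming entries up to the doorbell value. */
static uint32_t ring_hw_consume(struct ring *ring)
{
	uint32_t done = ring->doorbell - ring->rd;

	assert(done <= RING_COUNT);
	ring->rd = ring->doorbell;
	return done;
}
```

Entries added with ring_add() are invisible to the modeled hardware until ring_doorbell() is called, which is the point of the doorbell write described above.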
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • soc: qcom: ipa: GSI headers · ca48b27b
      Alex Elder authored
      The Generic Software Interface is a layer of the IPA driver that
      abstracts the underlying hardware.  The next patch includes the
      main code for GSI (including some additional documentation).  This
      patch just includes three GSI header files.
      
        - "gsi.h" is the top-level GSI header file.  This structure is
          embedded within the IPA structure.  The main abstraction
          implemented by the GSI code is the channel, and this header
          exposes several operations that can be performed on a GSI channel.
      
        - "gsi_private.h" exposes some definitions that are intended to be
          private, used only by the main GSI code and the GSI transaction
          code (defined in an upcoming patch).
      
        - Like "ipa_reg.h", "gsi_reg.h" defines the offsets of the 32-bit
          registers used by the GSI layer, along with masks that define the
          position and width of fields less than 32 bits located within
          these registers.
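
To illustrate what "masks that define the position and width of fields" means, here is a small self-contained sketch. The register offset, the field names, and the FIELD_MASK() helper are all made up for the example; they are not taken from "gsi_reg.h".

```c
#include <assert.h>
#include <stdint.h>

/* Mask covering bits high..low inclusive (0 <= low <= high <= 31). */
#define FIELD_MASK(high, low) \
	((uint32_t)((~UINT32_C(0) >> (31 - (high))) & (~UINT32_C(0) << (low))))

/* Hypothetical register layout, for illustration only. */
#define EXAMPLE_REG_OFFSET	0x0010
#define EXAMPLE_TYPE_FMASK	FIELD_MASK(3, 0)	/* 4-bit field at bit 0 */
#define EXAMPLE_LENGTH_FMASK	FIELD_MASK(15, 8)	/* 8-bit field at bit 8 */

/* Position of a field: index of the lowest set bit in its mask. */
static unsigned int mask_shift(uint32_t mask)
{
	unsigned int shift = 0;

	assert(mask != 0);
	while (!(mask & 1)) {
		mask >>= 1;
		shift++;
	}
	return shift;
}

/* Place a value into the field described by mask. */
static uint32_t field_encode(uint32_t mask, uint32_t val)
{
	return (val << mask_shift(mask)) & mask;
}

/* Extract the field described by mask from a register value. */
static uint32_t field_decode(uint32_t mask, uint32_t reg)
{
	return (reg & mask) >> mask_shift(mask);
}
```

A single mask constant is enough to recover both the position (lowest set bit) and the width (number of set bits) of a field, which is why a header can describe a register with just an offset and a set of masks.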
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • soc: qcom: ipa: clocking, interrupts, and memory · ba764c4d
      Alex Elder authored
      This patch incorporates three source files (and their headers).  They're
      grouped into one patch mainly for the purpose of making the number and
      size of patches in this series somewhat reasonable.
      
        - "ipa_clock.c" and "ipa_clock.h" implement clocking for the IPA device.
          The IPA has a single core clock managed by the common clock framework.
          In addition, the IPA has three buses whose bandwidth is managed by the
          Linux interconnect framework.  At this time the core clock and all
          three buses are either on or off; we don't yet do any more fine-grained
          management than that.  The core clock and interconnects are enabled
          and disabled as a unit, using a unified clock-like abstraction,
          ipa_clock_get()/ipa_clock_put().
      
        - "ipa_interrupt.c" and "ipa_interrupt.h" implement IPA interrupts.
          There are two hardware IRQs used by the IPA driver (the other is
          the GSI interrupt, described in a separate patch).  Several types
          of interrupt are handled by the IPA IRQ handler; these are not part
          of data/fast path.
      
        - The IPA has a region of local memory that is accessible by the AP
          (and modem).  Within that region are areas with certain defined
          purposes.  "ipa_mem.c" and "ipa_mem.h" define those regions, and
          implement their initialization.
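
The "enabled and disabled as a unit" behavior described for the clocking code above could be modeled like this. The sketch assumes the get/put pair is reference counted (suggested by the names, but not stated above), and every identifier in it is hypothetical.

```c
#include <assert.h>

#define INTERCONNECT_COUNT 3	/* the three buses mentioned above */

/* Hypothetical model: one core clock plus three interconnects. */
struct clock_model {
	unsigned int count;			/* outstanding "get" calls */
	int core_on;				/* core clock state */
	int bus_on[INTERCONNECT_COUNT];		/* interconnect states */
};

/* Turn the core clock and all interconnects on or off together. */
static void clock_model_set(struct clock_model *clk, int on)
{
	int i;

	clk->core_on = on;
	for (i = 0; i < INTERCONNECT_COUNT; i++)
		clk->bus_on[i] = on;
}

/* First user switches everything on. */
static void clock_model_get(struct clock_model *clk)
{
	if (clk->count++ == 0)
		clock_model_set(clk, 1);
}

/* Last user switches everything off. */
static void clock_model_put(struct clock_model *clk)
{
	assert(clk->count > 0);
	if (--clk->count == 0)
		clock_model_set(clk, 0);
}
```

With this shape, callers never touch the core clock or an individual interconnect directly; they only bracket their hardware accesses with a get and a put.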
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • soc: qcom: ipa: configuration data · 1ed7d0c0
      Alex Elder authored
      This patch defines configuration data that is used to specify some
      of the details of IPA hardware supported by the driver.  It is built
      as Device Tree match data, discovered at boot time.  The driver
      supports the Qualcomm SDM845 SoC.  Data for the Qualcomm SC7180 is
      also defined here, but it is not yet completely supported.
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • soc: qcom: ipa: main code · cdf2e941
      Alex Elder authored
      This patch includes four source files that represent some basic "main
      program" code for the IPA driver.  They are:
        - "ipa.h" defines the top-level IPA structure which represents an IPA
          device throughout the code.
        - "ipa_main.c" contains the platform driver probe function, along with
          some general code used during initialization.
        - "ipa_reg.h" defines the offsets of the 32-bit registers used for the
          IPA device, along with masks that define the position and width of
          fields within these registers.
        - "version.h" defines some symbolic IPA version numbers.
      
      Each file includes some documentation that provides a little more
      overview of how the code is organized and used.
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dt-bindings: soc: qcom: add IPA bindings · fc39c40a
      Alex Elder authored
      Add the binding definitions for the "qcom,ipa" device tree node.
      Signed-off-by: Alex Elder <elder@linaro.org>
      Reviewed-by: Rob Herring <robh@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • remoteproc: add IPA notification to q6v5 driver · d7f5f3c8
      Alex Elder authored
      Set up a subdev in the q6v5 modem remoteproc driver that generates
      event notifications for the IPA driver to use for initialization and
      recovery following a modem shutdown or crash.
      
      A pair of new functions provides a way for the IPA driver to register
      and deregister a notification callback function that will be called
      whenever modem events (about to boot, running, about to shut down,
      etc.) occur.  A void pointer value (provided by the IPA driver at
      registration time) and an event type are supplied to the callback
      function.
      
      One event, MODEM_REMOVING, is signaled whenever the q6v5 driver is
      about to remove the notification subdevice.  It requires the IPA
      driver de-register its callback.
      
      This sub-device is only used by the modem subsystem (MSS) driver,
      so the code that adds the new subdev and allows registration and
      deregistration of the notifier is found in "qcom_q6v5_mss.c".
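
The register/deregister pattern described above, with a void pointer and an event type passed to the callback, might look like the following in outline. Only the MODEM_REMOVING name comes from the text above; the other event names, the function names, and the single-callback limit are assumptions for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Event names other than MODEM_REMOVING are hypothetical. */
enum example_event {
	EXAMPLE_EVENT_BOOTING,
	EXAMPLE_EVENT_RUNNING,
	EXAMPLE_EVENT_STOPPING,
	MODEM_REMOVING,
};

typedef void (*example_notify_t)(void *data, enum example_event event);

/* Holds at most one registered callback and its private data. */
struct example_notifier {
	example_notify_t notify;
	void *data;
};

/* Register a callback; "data" is handed back on every notification. */
static int example_register(struct example_notifier *n,
			    example_notify_t notify, void *data)
{
	assert(notify != NULL);
	if (n->notify)
		return -1;	/* already registered */
	n->notify = notify;
	n->data = data;
	return 0;
}

static void example_deregister(struct example_notifier *n)
{
	n->notify = NULL;
	n->data = NULL;
}

/* Deliver an event to the registered callback, if there is one. */
static void example_signal(struct example_notifier *n, enum example_event event)
{
	if (n->notify)
		n->notify(n->data, event);
}

/* Example consumer: counts events and deregisters on MODEM_REMOVING. */
struct example_consumer {
	struct example_notifier *notifier;
	int events_seen;
};

static void example_consumer_notify(void *data, enum example_event event)
{
	struct example_consumer *consumer = data;

	consumer->events_seen++;
	if (event == MODEM_REMOVING)
		example_deregister(consumer->notifier);
}
```

The void pointer lets the consumer recover its own state inside the callback without the notifier needing to know anything about the consumer's types.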
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'QorIQ-DPAA-Use-random-MAC-address-when-none-is-given' · e2f5cb72
      David S. Miller authored
      Sascha Hauer says:
      
      ====================
      QorIQ DPAA: Use random MAC address when none is given
      
      Use random MAC addresses when they are not provided in the device tree.
      Tested on LS1046ARDB.
      
      Changes in v3:
       addressed all MAC types, removed some redundant code in dtsec in
       the process
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dpaa_eth: Use random MAC address when none is given · cbb961ca
      Madalin Bucur authored
      If there is no valid MAC address in the device tree, use a random
      MAC address.
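
As a user-space illustration of the fallback described above: an address counts as valid here if it is neither all-zero nor multicast, and a random replacement is marked as locally administered unicast. In the kernel, helpers such as is_valid_ether_addr() and eth_hw_addr_random() cover this; the function names below are hypothetical, and rand() is used only to keep the sketch self-contained.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAC_LEN 6

/* Valid here means: not all-zero and not a multicast address. */
static int mac_is_valid(const uint8_t mac[MAC_LEN])
{
	static const uint8_t zero[MAC_LEN];

	return !(mac[0] & 0x01) && memcmp(mac, zero, MAC_LEN) != 0;
}

/* Fill in a random, locally administered, unicast address. */
static void mac_random(uint8_t mac[MAC_LEN])
{
	int i;

	for (i = 0; i < MAC_LEN; i++)
		mac[i] = (uint8_t)(rand() & 0xff);
	mac[0] &= 0xfe;		/* clear the multicast bit */
	mac[0] |= 0x02;		/* set the locally administered bit */
	assert(mac_is_valid(mac));
}

/* Keep a valid address as-is; otherwise fall back to a random one. */
static void mac_fixup(uint8_t mac[MAC_LEN])
{
	if (!mac_is_valid(mac))
		mac_random(mac);
}
```

Setting the locally administered bit keeps the random address out of the vendor-assigned address space.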
      Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
      Signed-off-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • fsl/fman: tolerate missing MAC address in device tree · f3353b99
      Madalin Bucur authored
      Allow the initialization of the MAC to be performed even if the
      device tree does not provide a valid MAC address. Later a random
      MAC address should be assigned by the Ethernet driver.
      Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
      Signed-off-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • fsl/fman: reuse set_mac_address() in dtsec init() · 6b995bde
      Madalin Bucur authored
      Reuse the set_mac_address() in the init() function.
      Signed-off-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'bnxt_en-Updates' · 896328fe
      David S. Miller authored
      Michael Chan says:
      
      ====================
      bnxt_en: Updates.
      
      This series includes simplification and improvement of NAPI polling
      logic in bnxt_poll_p5().  The improvements will prevent starving the
      async events from firmware if we are in continuous NAPI polling.
      The rest of the patches include cleanups, a better return code when
      firmware is busy, and proper clearing of the devlink port type.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnxt_en: Call devlink_port_type_clear() in remove() · 0fcfc7a1
      Vasundhara Volam authored
      Similar to other drivers, properly clear the devlink port type when
      removing the device before unregistration.
      
      Cc: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnxt_en: Return -EAGAIN if fw command returns BUSY · 3a707bed
      Vasundhara Volam authored
      If a firmware command returns the error code HWRM_ERR_CODE_BUSY, which
      means it cannot handle the command due to a conflicting command
      from another function, convert it to -EAGAIN.  If it is an ethtool
      operation, this error code will be returned to userspace.
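
The conversion described above amounts to a small mapping from firmware status to a standard errno value. The sketch below uses made-up firmware status names and values (the real HWRM_ERR_CODE_* definitions are not reproduced here), and the choice of -EIO for other failures is an assumption.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical firmware status codes, for illustration only. */
enum example_fw_status {
	EXAMPLE_FW_SUCCESS = 0,
	EXAMPLE_FW_BUSY,
	EXAMPLE_FW_FAIL,
};

static_assert(EXAMPLE_FW_SUCCESS == 0, "success must map to zero");

/* Translate a firmware status into a negative errno value. */
static int example_fw_status_to_errno(enum example_fw_status status)
{
	switch (status) {
	case EXAMPLE_FW_SUCCESS:
		return 0;
	case EXAMPLE_FW_BUSY:
		return -EAGAIN;	/* conflicting command; caller may retry */
	default:
		return -EIO;	/* assumed catch-all for other failures */
	}
}
```

Returning -EAGAIN rather than a generic failure tells the caller (and, for ethtool operations, userspace) that retrying later is reasonable.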
      Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnxt_en: Modify some bnxt_hwrm_*_free() functions to void. · 3d061591
      Vasundhara Volam authored
      Return code is not needed in some of these functions, as the return
      code from firmware message is ignored. Remove the unused rc variable
      and also convert functions to void.
      Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnxt_en: Remove unnecessary assignment of return code · 9f90445c
      Vasundhara Volam authored
      As part of converting error code in firmware message to standard
      code, checking for firmware return code is removed in most of the
      places. Remove the assignment of return code where the function
      can directly return.
      Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnxt_en: Clear DCB settings after firmware reset. · 843d699d
      Michael Chan authored
      The driver stores a copy of the DCB settings that have been applied to
      the firmware.  After firmware reset, the firmware settings are gone and
      will revert back to default.  Clear the driver's copy so that if there
      is a DCBNL request to get the settings, the driver will retrieve the
      current settings from the firmware.  lldpad keeps the DCB settings in
      userspace and will re-apply the settings if it is running.
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnxt_en: Process the NQ under NAPI continuous polling. · 389a877a
      Michael Chan authored
      When we are in continuous NAPI polling mode, the current code in
      bnxt_poll_p5() will only process the completion rings and will not
      process the NQ until interrupt is re-enabled.  This logic works and
      will not cause RX or TX starvation, but async events in the NQ may
      be delayed for the duration of continuous NAPI polling.  These
      async events may be firmware or VF events.
      
      Continue to handle the NQ after we are done polling the completion
      rings.  This actually simplifies the code in bnxt_poll_p5().
      
      Acknowledge the NQ so these async events will not overflow.
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnxt_en: Simplify __bnxt_poll_cqs_done(). · 340ac85e
      Michael Chan authored
      Simplify the function by removing the 'all' parameter.  In the current
      code, the caller has to specify whether to update/arm both completion
      rings with the 'all' parameter.
      
      Instead of this, we can just update/arm all the completion rings
      that have been polled.  By setting cpr->had_work_done earlier in
      __bnxt_poll_work(), we know which completion ring has been polled
      and can just update/arm all the completion rings with
      cpr->had_work_done set.
      
      This simplifies the function with one less parameter and works just
      as well.
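
The idea of arming exactly those rings that were polled, instead of passing an 'all' flag, can be sketched as below. The had_work_done flag name follows the text above; the structure, the function names, and the arm counter are hypothetical stand-ins for the driver's real ring state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a completion ring. */
struct cp_ring_model {
	int had_work_done;	/* set when the ring was polled this round */
	unsigned int arm_count;	/* how many times the ring was updated/armed */
};

/* Polling a ring marks it, so the "done" step knows to arm it. */
static void poll_work_model(struct cp_ring_model *ring)
{
	ring->had_work_done = 1;
}

/*
 * Arm every ring that was polled, and only those.  No caller-supplied
 * 'all' flag is needed: the per-ring flag carries that information.
 * Returns the number of rings armed.
 */
static size_t poll_rings_done_model(struct cp_ring_model *rings, size_t count)
{
	size_t i, armed = 0;

	assert(rings != NULL || count == 0);
	for (i = 0; i < count; i++) {
		if (!rings[i].had_work_done)
			continue;
		rings[i].had_work_done = 0;
		rings[i].arm_count++;
		armed++;
	}
	return armed;
}
```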
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnxt_en: Handle all NQ notifications in bnxt_poll_p5(). · 54a9062f
      Michael Chan authored
      In bnxt_poll_p5(), the logic polls for up to 2 completion rings (RX and
      TX) for work.  In the current code, if we reach budget polling the
      first completion ring, we will stop.  If the other completion ring
      has work to do, we will handle it when NAPI calls us back.
      
      This is not optimal.  We potentially leave an unprocessed entry in
      the NQ.  When we are finally done with NAPI polling and re-enable
      interrupt, the remaining entry in the NQ will cause interrupt to
      be triggered immediately for no reason.
      
      Modify the code in bnxt_poll_p5() to keep looping until all NQ
      entries are handled even if the first completion ring has reached
      budget.
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/sched: act_ct: fix lockdep splat in tcf_ct_flow_table_get · 138470a9
      Eric Dumazet authored
      Convert zones_lock spinlock to zones_mutex mutex,
      and struct (tcf_ct_flow_table)->ref to a refcount,
      so that control path can use regular GFP_KERNEL allocations
      from standard process context. This is more robust
      in case of memory pressure.
      
      The refcount is needed because tcf_ct_flow_table_put() can
      be called from RCU callback, thus in BH context.
      
      The issue was spotted by syzbot, as rhashtable_init()
      was called with a spinlock held, which is bad since GFP_KERNEL
      allocations can sleep.
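
The reference-count half of the change described above can be illustrated with C11 atomics in user space. This is a sketch of the general get/put pattern only: the kernel uses its own refcount_t type, and the names below are hypothetical. The point is that dropping a reference needs no sleeping lock, so it is safe from a context that cannot sleep.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for an object shared between contexts. */
struct table_model {
	atomic_uint ref;
};

/* A new object starts with one reference, held by its creator. */
static void table_model_init(struct table_model *table)
{
	atomic_init(&table->ref, 1);
}

/* Take an additional reference on an object that is still live. */
static void table_model_get(struct table_model *table)
{
	unsigned int old = atomic_fetch_add(&table->ref, 1);

	assert(old > 0);
	(void)old;
}

/*
 * Drop a reference without taking any lock.  Returns nonzero when the
 * last reference is gone and the caller should release the object.
 */
static int table_model_put(struct table_model *table)
{
	unsigned int old = atomic_fetch_sub(&table->ref, 1);

	assert(old > 0);
	return old == 1;
}
```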
      
      Note to developers: Please make sure your patches are tested
      with CONFIG_DEBUG_ATOMIC_SLEEP=y
      
      BUG: sleeping function called from invalid context at mm/slab.h:565
      in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 9582, name: syz-executor610
      2 locks held by syz-executor610/9582:
       #0: ffffffff8a34eb80 (rtnl_mutex){+.+.}, at: rtnl_lock net/core/rtnetlink.c:72 [inline]
       #0: ffffffff8a34eb80 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x3f9/0xad0 net/core/rtnetlink.c:5437
       #1: ffffffff8a3961b8 (zones_lock){+...}, at: spin_lock_bh include/linux/spinlock.h:343 [inline]
       #1: ffffffff8a3961b8 (zones_lock){+...}, at: tcf_ct_flow_table_get+0xa3/0x1700 net/sched/act_ct.c:67
      Preemption disabled at:
      [<0000000000000000>] 0x0
      CPU: 0 PID: 9582 Comm: syz-executor610 Not tainted 5.6.0-rc3-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x188/0x20d lib/dump_stack.c:118
       ___might_sleep.cold+0x1f4/0x23d kernel/sched/core.c:6798
       slab_pre_alloc_hook mm/slab.h:565 [inline]
       slab_alloc_node mm/slab.c:3227 [inline]
       kmem_cache_alloc_node_trace+0x272/0x790 mm/slab.c:3593
       __do_kmalloc_node mm/slab.c:3615 [inline]
       __kmalloc_node+0x38/0x60 mm/slab.c:3623
       kmalloc_node include/linux/slab.h:578 [inline]
       kvmalloc_node+0x61/0xf0 mm/util.c:574
       kvmalloc include/linux/mm.h:645 [inline]
       kvzalloc include/linux/mm.h:653 [inline]
       bucket_table_alloc+0x8b/0x480 lib/rhashtable.c:175
       rhashtable_init+0x3d2/0x750 lib/rhashtable.c:1054
       nf_flow_table_init+0x16d/0x310 net/netfilter/nf_flow_table_core.c:498
       tcf_ct_flow_table_get+0xe33/0x1700 net/sched/act_ct.c:82
       tcf_ct_init+0xba4/0x18a6 net/sched/act_ct.c:1050
       tcf_action_init_1+0x697/0xa20 net/sched/act_api.c:945
       tcf_action_init+0x1e9/0x2f0 net/sched/act_api.c:1001
       tcf_action_add+0xdb/0x370 net/sched/act_api.c:1411
       tc_ctl_action+0x366/0x456 net/sched/act_api.c:1466
       rtnetlink_rcv_msg+0x44e/0xad0 net/core/rtnetlink.c:5440
       netlink_rcv_skb+0x15a/0x410 net/netlink/af_netlink.c:2478
       netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline]
       netlink_unicast+0x537/0x740 net/netlink/af_netlink.c:1329
       netlink_sendmsg+0x882/0xe10 net/netlink/af_netlink.c:1918
       sock_sendmsg_nosec net/socket.c:652 [inline]
       sock_sendmsg+0xcf/0x120 net/socket.c:672
       ____sys_sendmsg+0x6b9/0x7d0 net/socket.c:2343
       ___sys_sendmsg+0x100/0x170 net/socket.c:2397
       __sys_sendmsg+0xec/0x1b0 net/socket.c:2430
       do_syscall_64+0xf6/0x790 arch/x86/entry/common.c:294
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x4403d9
      Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007ffd719af218 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 00000000004403d9
      RDX: 0000000000000000 RSI: 0000000020000300 RDI: 0000000000000003
      RBP: 00000000006ca018 R08: 0000000000000005 R09: 00000000004002c8
      R10: 0000000000000008 R11: 00000000000
      
      Fixes: c34b961a ("net/sched: act_ct: Create nf flow table per zone")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Paul Blakey <paulb@mellanox.com>
      Cc: Jiri Pirko <jiri@mellanox.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: rmnet: set NETIF_F_LLTX flag · 376d5307
      Taehee Yoo authored
      The rmnet_vnd_setup(), which is the callback of ->ndo_start_xmit(), may
      be called concurrently because it uses RCU-protected data.
      So, it doesn't need the tx lock.
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'bareudp-several-code-cleanup-for-bareudp-module' · 1033a215
      David S. Miller authored
      Taehee Yoo says:
      
      ====================
      bareudp: several code cleanup for bareudp module
      
      This patchset cleans up the bareudp module code.
      
      1. The first patch adds a module alias.
      The current bareudp code has no module alias,
      so RTNL cannot load the bareudp module automatically.
      
      2. The second patch adds extack messages.
      An extack error message helps the user identify the specific error
      when a command fails.
      
      3. The third patch removes an unnecessary udp_encap_enable() call.
      bareudp_socket_create() calls udp_encap_enable(),
      but setup_udp_tunnel_sock() already calls it,
      so the call can be removed.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bareudp: remove unnecessary udp_encap_enable() in bareudp_socket_create() · 2baecda3
      Taehee Yoo authored
      In the current code, udp_encap_enable() is called in
      bareudp_socket_create().
      But, setup_udp_tunnel_sock() internally calls udp_encap_enable().
      So, udp_encap_enable() is unnecessary.
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bareudp: print error message when command fails · c46a49a4
      Taehee Yoo authored
      When a bareudp netlink command fails, it doesn't print any error message,
      so users can't know the exact reason.
      This patch uses extack error messages to report the exact reason
      to the user.
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bareudp: add module alias · eea45da4
      Taehee Yoo authored
      In the current bareudp code, there is no module alias.
      So, RTNL couldn't load bareudp module automatically.
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'cxgb4-chcr-ktls-tx-ofld-support-on-T6-adapter' · 31de3f56
      David S. Miller authored
      Rohit Maheshwari says:
      
      ====================
      cxgb4/chcr: ktls tx ofld support on T6 adapter
      
      This series of patches adds support for kernel tls offload in Tx direction,
      over Chelsio T6 NICs. SKBs marked as decrypted will be treated as tls plain
      text packets and then offloaded to the network device (chelsio T6
      adapter) for encryption.
      
      This series is broken down as follows:
      
      Patch 1 defines a new macro and registers tls_dev_add and tls_dev_del
      callbacks. When tls_dev_add gets called we send a connection request to
      our hardware to make the HW aware of the tls offload. It's a partial
      connection setup and only the ipv4 part is done.
      
      Patch 2 handles the HW response of the connection request and then we
      request to update TCB and handle its HW response as well. Also we save
      crypto key locally. Only supporting TLS_CIPHER_AES_GCM_128_KEY_SIZE.
      
      Patch 3 handles tls marked skbs (decrypted bit set) and sends it to ULD for
      crypto handling. This code has a minimal portion of tx handler, to handle
      only one complete record per skb.
      
      Patch 4 handles the partial end part of records. It also adds logic to handle
      multiple records in a single skb, and support for sending out tcp
      option(s) if present in the skb. If a record is partial but has the end part of a
      record, we'll fetch the complete record and only then send it to HW to generate
      the HASH on the complete record.
      
      Patch 5 handles a partial first or middle part of a record; it uses AES_CTR to
      encrypt the partial record. If we are trying to send a middle part of a record, its
      start should be 16 byte aligned, so we'll fetch a few earlier bytes from the
      record and then send it to HW for encryption.
      
      Patch 6 enables ipv6 support and also includes ktls statistics.
      
      v1->v2:
      - mark tcb state to close in tls_dev_del.
      - u_ctx is now picked from adapter structure.
      - clear atid in case of failure.
      - corrected ULP_CRYPTO_KTLS_INLINE value.
      - optimized tcb update using control queue.
      - state machine handling when earlier states received.
      - chcr_write_cpl_set_tcb_ulp  function is shifted to patch3.
      - un-necessary updating left variable.
      
      v2->v3:
      - add empty line after variable declaration.
      - local variable declaration in reverse christmas tree ordering.
      
      v3->v4:
      - replaced kfree_skb with dev_kfree_skb_any.
      - corrected error message reported by kbuild test robot <lkp@intel.com>
      - mss calculation logic.
      - correct place for Alloc skb check.
      - Replaced atomic_t with atomic64_t
      - added few more statistics counters.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      31de3f56
    • Rohit Maheshwari's avatar
      cxgb4/chcr: Add ipv6 support and statistics · 62370a4f
      Rohit Maheshwari authored
      Adding ipv6 support and ktls related statistics.
      
      v1->v2:
      - added blank lines at 2 places.
      
      v3->v4:
      - Replaced atomic_t with atomic64_t
      - added few necessary stat counters.
      Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      62370a4f
    • Rohit Maheshwari's avatar
      chcr: Handle first or middle part of record · dc05f3df
      Rohit Maheshwari authored
      This patch handles the first part or middle part of a record.
      When we get a middle part, we fetch a few already sent bytes to
      make the packet start 16-byte aligned.
      If the packet has only the header part, we don't need to send it for
      encryption; that packet is sent as plaintext.
      
      v1->v2:
      - un-necessary updating left variable.
      
      v3->v4:
      - replaced kfree_skb with dev_kfree_skb_any.
      Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      dc05f3df
    • Rohit Maheshwari's avatar
      chcr: handle partial end part of a record · 429765a1
      Rohit Maheshwari authored
      TCP segmentation can split a record at any point. A record can either
      be complete or partial (the first part, which contains the header; the
      middle part, which has neither header nor TAG; or the end part, which
      contains the TAG). This patch handles the partial end part of a Tx
      record. For a partial end part, the driver sends the complete
      record to HW, so that HW calculates the GHASH (TAG) of the complete
      packet.
      Also added support to handle multiple records in a segment.
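
The four cases named above (complete, first, middle, end) can be told apart from where a segment's bytes fall inside the record. The sketch below is only an illustration of that classification under simple assumptions: `enum rec_part` and `classify_rec_part` are hypothetical names, offsets are relative to the start of the record, and the segment is assumed to lie entirely within one record.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical labels for the part of a record carried by a segment. */
enum rec_part {
	REC_COMPLETE,	/* whole record: header through TAG */
	REC_FIRST,	/* starts at the header, stops before the TAG */
	REC_MIDDLE,	/* neither the start nor the end of the record */
	REC_END,	/* does not start at the header, ends with the TAG */
};

/* rec_len: total record length in bytes
 * off:     offset of the segment's first byte within the record
 * len:     number of record bytes the segment carries */
static enum rec_part classify_rec_part(uint32_t rec_len, uint32_t off,
				       uint32_t len)
{
	int has_start, has_end;

	/* Assumption: the span is non-empty and stays inside the record. */
	assert(len > 0 && off < rec_len && len <= rec_len - off);

	has_start = (off == 0);
	has_end = (off + len == rec_len);

	if (has_start && has_end)
		return REC_COMPLETE;
	if (has_start)
		return REC_FIRST;
	if (has_end)
		return REC_END;
	return REC_MIDDLE;
}
```

In this model, a 100-byte record whose last 40 bytes arrive in a segment is classified as `REC_END`, which is the case where the message above says the complete record is sent to HW.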
      
      v1->v2:
      - minor change in calling chcr_write_cpl_set_tcb_ulp.
      - no need of checking return value of chcr_ktls_write_tcp_options.
      
      v3->v4:
      - replaced kfree_skb with dev_kfree_skb_any.
      Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      429765a1
    • Rohit Maheshwari's avatar
      cxgb4/chcr: complete record tx handling · 5a4b9fe7
      Rohit Maheshwari authored
      Added tx handling in this patch. This includes handling of segments
      containing a single complete record.
      
      v1->v2:
      - chcr_write_cpl_set_tcb_ulp is added in this patch.
      
      v3->v4:
      - mss calculation logic.
      - replaced kfree_skb with dev_kfree_skb_any.
      - corrected error message reported by kbuild test robot <lkp@intel.com>
      Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5a4b9fe7
    • Rohit Maheshwari's avatar
      cxgb4/chcr: Save tx keys and handle HW response · 8a30923e
      Rohit Maheshwari authored
      As part of this patch, crypto keys are generated and saved, and the HW
      responses of act_open_req and set_tcb_req are handled. The connection
      state update is also defined.
      
      v1->v2:
      - optimized tcb update using control queue.
      - state machine handling when earlier states received.
      
      v2->v3:
      - Added one empty line after function declaration.
      Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8a30923e
    • Rohit Maheshwari's avatar
      cxgb4/chcr : Register to tls add and del callback · 34aba2c4
      Rohit Maheshwari authored
      A new macro is defined to enable ktls tx offload support on the Chelsio
      T6 adapter. If this macro is enabled, cxgb4 will send a mailbox to
      enable or disable ktls settings on HW.
      In chcr, the tx offload flag is enabled in netdev, and tls_dev_add
      and tls_dev_del are registered.
      
      v1->v2:
      - mark tcb state to close in tls_dev_del.
      - u_ctx is now picked from adapter structure.
      - clear atid in case of failure.
      - corrected ULP_CRYPTO_KTLS_INLINE value.
      
      v2->v3:
      - add empty line after variable declaration.
      - local variable declaration in reverse christmas tree ordering.
      Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      34aba2c4
    • David S. Miller's avatar
      Merge branch 'net-allow-user-specify-TC-action-HW-stats-type' · 9d2e4e16
      David S. Miller authored
      Jiri Pirko says:
      
      ====================
      net: allow user specify TC action HW stats type
      
      Currently, when a user adds a TC action and the action gets offloaded,
      the user expects the HW stats to be counted and included in the stats
      dump. However, since drivers may implement different types of counting,
      there is no way to specify which one the user is interested in.
      
      For example, for mlx5, only delayed counters are available as the driver
      periodically polls for updated stats.
      
      In the case of mlxsw, the counters are queried at dump time. However, the
      HW resources for this type of counter are quite limited (a couple of
      thousand). This limits the number of supported offloaded filters
      significantly. Without a counter assigned, the HW is capable of carrying
      millions of them.
      
      On top of that, mlxsw HW is able to support delayed counters as well in
      greater numbers. That is going to be added in a follow-up patch.
      
      This patchset allows the user to specify one of the following types of
      HW stats for an added action:
      immediate - queried during dump time
      delayed - polled from HW periodically or sent by HW in async manner
      disabled - no stats needed
      
      Note that if the "hw_stats" option is not passed, the user does not care
      about the type and just expects any type of stats.
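
The three types, plus the "any" default, can be modelled as a small enum together with a check of what a driver is able to provide. This is a hedged sketch of the idea only: the names `hw_stats_type`, `drv_stats_caps` and `hw_stats_request_ok` are hypothetical and are not the identifiers introduced by this patchset. It also assumes that the default is acceptable whenever the driver can provide at least one kind of counter.

```c
#include <stdbool.h>

/* Hypothetical model of the user's request. */
enum hw_stats_type {
	HW_STATS_ANY,		/* "hw_stats" not passed: any type will do */
	HW_STATS_IMMEDIATE,	/* queried during dump time */
	HW_STATS_DELAYED,	/* polled periodically or sent asynchronously */
	HW_STATS_DISABLED,	/* no stats needed */
};

/* Hypothetical description of which counter kinds a driver offers. */
struct drv_stats_caps {
	bool immediate;
	bool delayed;
};

/* Returns true if the driver can honour the requested stats type. */
static bool hw_stats_request_ok(enum hw_stats_type req,
				const struct drv_stats_caps *caps)
{
	switch (req) {
	case HW_STATS_ANY:
		/* Assumption: any available counter kind satisfies this. */
		return caps->immediate || caps->delayed;
	case HW_STATS_IMMEDIATE:
		return caps->immediate;
	case HW_STATS_DELAYED:
		return caps->delayed;
	case HW_STATS_DISABLED:
		/* No counter is needed, so no HW resource is consumed. */
		return true;
	}
	return false;
}
```

With capabilities modelled on the mlx5 description above (delayed counters only), a request for immediate stats is rejected in this sketch, while delayed, disabled and the default are accepted.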
      
      Examples:
      $ tc filter add dev enp0s16np28 ingress proto ip handle 1 pref 1 flower skip_sw dst_ip 192.168.1.1 action drop hw_stats disabled
      $ tc -s filter show dev enp0s16np28 ingress
      filter protocol ip pref 1 flower chain 0
      filter protocol ip pref 1 flower chain 0 handle 0x1
        eth_type ipv4
        dst_ip 192.168.1.1
        skip_sw
        in_hw in_hw_count 2
              action order 1: gact action drop
               random type none pass val 0
               index 1 ref 1 bind 1 installed 7 sec used 2 sec
              Action statistics:
              Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
              backlog 0b 0p requeues 0
              hw_stats disabled
      
      $ tc filter add dev enp0s16np28 ingress proto ip handle 1 pref 1 flower skip_sw dst_ip 192.168.1.1 action drop hw_stats immediate
      $ tc -s filter show dev enp0s16np28 ingress
      filter protocol ip pref 1 flower chain 0
      filter protocol ip pref 1 flower chain 0 handle 0x1
        eth_type ipv4
        dst_ip 192.168.1.1
        skip_sw
        in_hw in_hw_count 2
              action order 1: gact action drop
               random type none pass val 0
               index 1 ref 1 bind 1 installed 11 sec used 4 sec
              Action statistics:
              Sent 102 bytes 1 pkt (dropped 1, overlimits 0 requeues 0)
              Sent software 0 bytes 0 pkt
              Sent hardware 102 bytes 1 pkt
              backlog 0b 0p requeues 0
              hw_stats immediate
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9d2e4e16