1. 16 Nov, 2019 11 commits
  2. 15 Nov, 2019 29 commits
    • tun: fix data-race in gro_normal_list() · c39e342a
      Petar Penkov authored
      There is a race in the TUN driver between napi_busy_loop and
      napi_gro_frags. This commit resolves the race by adding the NAPI struct
      via netif_tx_napi_add instead of netif_napi_add; the former disables busy polling
      for the NAPI struct.
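The effect described above can be pictured with a small userspace model. This is a sketch, not kernel code: every `model_` name is hypothetical, and it assumes only that the TX variant marks the NAPI instance as opted out of busy polling before registering it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state bits; not the real definitions. */
#define MODEL_NO_BUSY_POLL 0x1u
#define MODEL_HASHED       0x2u

struct model_napi {
	unsigned int state;
};

/* Models plain registration: the instance becomes visible to busy
 * polling unless it has opted out beforehand. */
static void model_napi_add(struct model_napi *n)
{
	if (!(n->state & MODEL_NO_BUSY_POLL))
		n->state |= MODEL_HASHED;
}

/* Models the TX variant: opt out of busy polling, then register. */
static void model_tx_napi_add(struct model_napi *n)
{
	n->state |= MODEL_NO_BUSY_POLL;
	model_napi_add(n);
}

/* A busy-poll loop would only ever find instances that are hashed. */
static bool model_busy_poll_allowed(const struct model_napi *n)
{
	return (n->state & MODEL_HASHED) != 0;
}
```

With the opt-out set, a busy-polling reader never reaches the instance, so it cannot touch the GRO list concurrently with the writer path.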
      
      KCSAN reported:
      BUG: KCSAN: data-race in gro_normal_list.part.0 / napi_busy_loop
      
      write to 0xffff8880b5d474b0 of 4 bytes by task 11205 on cpu 0:
       gro_normal_list.part.0+0x77/0xb0 net/core/dev.c:5682
       gro_normal_list net/core/dev.c:5678 [inline]
       gro_normal_one net/core/dev.c:5692 [inline]
       napi_frags_finish net/core/dev.c:5705 [inline]
       napi_gro_frags+0x625/0x770 net/core/dev.c:5778
       tun_get_user+0x2150/0x26a0 drivers/net/tun.c:1976
       tun_chr_write_iter+0x79/0xd0 drivers/net/tun.c:2022
       call_write_iter include/linux/fs.h:1895 [inline]
       do_iter_readv_writev+0x487/0x5b0 fs/read_write.c:693
       do_iter_write fs/read_write.c:970 [inline]
       do_iter_write+0x13b/0x3c0 fs/read_write.c:951
       vfs_writev+0x118/0x1c0 fs/read_write.c:1015
       do_writev+0xe3/0x250 fs/read_write.c:1058
       __do_sys_writev fs/read_write.c:1131 [inline]
       __se_sys_writev fs/read_write.c:1128 [inline]
       __x64_sys_writev+0x4e/0x60 fs/read_write.c:1128
       do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      read to 0xffff8880b5d474b0 of 4 bytes by task 11168 on cpu 1:
       gro_normal_list net/core/dev.c:5678 [inline]
       napi_busy_loop+0xda/0x4f0 net/core/dev.c:6126
       sk_busy_loop include/net/busy_poll.h:108 [inline]
       __skb_recv_udp+0x4ad/0x560 net/ipv4/udp.c:1689
       udpv6_recvmsg+0x29e/0xe90 net/ipv6/udp.c:288
       inet6_recvmsg+0xbb/0x240 net/ipv6/af_inet6.c:592
       sock_recvmsg_nosec net/socket.c:871 [inline]
       sock_recvmsg net/socket.c:889 [inline]
       sock_recvmsg+0x92/0xb0 net/socket.c:885
       sock_read_iter+0x15f/0x1e0 net/socket.c:967
       call_read_iter include/linux/fs.h:1889 [inline]
       new_sync_read+0x389/0x4f0 fs/read_write.c:414
       __vfs_read+0xb1/0xc0 fs/read_write.c:427
       vfs_read fs/read_write.c:461 [inline]
       vfs_read+0x143/0x2c0 fs/read_write.c:446
       ksys_read+0xd5/0x1b0 fs/read_write.c:587
       __do_sys_read fs/read_write.c:597 [inline]
       __se_sys_read fs/read_write.c:595 [inline]
       __x64_sys_read+0x4c/0x60 fs/read_write.c:595
       do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 1 PID: 11168 Comm: syz-executor.0 Not tainted 5.4.0-rc6+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      
      Fixes: 94317099 ("tun: enable NAPI for TUN/TAP driver")
      Signed-off-by: Petar Penkov <ppenkov@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests: net: tcp_mmap should create detached threads · 20021578
      Eric Dumazet authored
      Since we do not plan to use pthread_join() in the server do_accept()
      loop, we had better create detached threads, or risk increasing memory
      footprint over time.
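A minimal sketch of creating a detached thread with POSIX threads follows. The `worker` function and the `spawn_detached` helper are hypothetical names used for illustration; they are not taken from the selftest.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical worker; stands in for a per-connection handler. */
static void *worker(void *arg)
{
	(void)arg;
	return NULL;
}

/* Start fn in a detached thread. Returns 0 on success, otherwise an
 * error number as returned by the pthread calls. */
static int spawn_detached(void *(*fn)(void *), void *arg)
{
	pthread_attr_t attr;
	pthread_t tid;
	int err;

	err = pthread_attr_init(&attr);
	if (err)
		return err;

	/* A detached thread releases its resources when it exits, so
	 * nobody has to call pthread_join() on it. */
	err = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
	if (!err)
		err = pthread_create(&tid, &attr, fn, arg);

	pthread_attr_destroy(&attr);
	return err;
}
```

A joinable thread that is never joined keeps its stack and bookkeeping allocated after it exits, which is the memory growth the commit message refers to.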
      
      Fixes: 192dc405 ("selftests: net: add tcp_mmap program")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: openvswitch: don't call pad_packet if not necessary · 61ca533c
      Tonghao Zhang authored
      nla_put_u16/nla_put_u32 make sure that
      *attrlen is aligned. The call tree is:
      
      nla_put_u16/nla_put_u32
        -> nla_put		attrlen = sizeof(u16) or sizeof(u32)
        -> __nla_put		attrlen
        -> __nla_reserve	attrlen
        -> skb_put(skb, nla_total_size(attrlen))
      
      nla_total_size returns the total length of attribute
      including padding.
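The padding arithmetic can be checked in userspace. The sketch below redefines the alignment helpers locally with a `MODEL_` prefix, assuming the usual 4-byte netlink attribute alignment and a 4-byte attribute header; it does not include the kernel headers.

```c
#include <assert.h>

/* Local copies of the netlink alignment rules (assumed: 4-byte
 * alignment, 4-byte attribute header). */
#define MODEL_NLA_ALIGNTO 4
#define MODEL_NLA_ALIGN(len) \
	(((len) + MODEL_NLA_ALIGNTO - 1) & ~(MODEL_NLA_ALIGNTO - 1))
#define MODEL_NLA_HDRLEN MODEL_NLA_ALIGN(4)

/* Header plus payload, without trailing padding. */
static int model_nla_attr_size(int payload)
{
	return MODEL_NLA_HDRLEN + payload;
}

/* Header plus payload, rounded up to the alignment boundary. */
static int model_nla_total_size(int payload)
{
	return MODEL_NLA_ALIGN(model_nla_attr_size(payload));
}
```

For a u16 payload the attribute is 6 bytes and is rounded up to 8; for a u32 payload it is already 8. In both cases the space reserved in the skb is a multiple of 4, so no extra padding call is needed afterwards.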
      
      Cc: Joe Stringer <joe@ovn.org>
      Cc: William Tu <u9012063@gmail.com>
      Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
      Acked-by: Pravin B Shelar <pshelar@ovn.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'DSA-driver-for-Vitesse-Felix-switch' · 3bb884a4
      David S. Miller authored
      Vladimir Oltean says:
      
      ====================
      DSA driver for Vitesse Felix switch
      
      This series builds upon the previous "Accomodate DSA front-end into
      Ocelot" topic and does the following:
      
      - Reworks the Ocelot (VSC7514) driver to support one more switching core
        (VSC9959), used in NPI mode. Some code which was thought to be
        SoC-specific (ocelot_board.c) wasn't, and vice versa, so it is being
        accordingly moved.
      - Exports ocelot driver structures and functions to include/soc/mscc.
      - Adds a DSA ocelot front-end for VSC9959, which is a PCI device and
        uses the exported ocelot functionality for hardware configuration.
      - Adds a tagger driver for the Vitesse injection/extraction DSA headers.
        This is known to be compatible with at least Ocelot and Felix.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: ocelot: add driver for Felix switch family · 56051948
      Vladimir Oltean authored
      This supports an Ethernet switching core from Vitesse / Microsemi /
      Microchip (VSC9959) which is part of the Ocelot family (a brand name),
      and whose code name is Felix. The switch can be (and is) integrated on
      different SoCs as a PCIe endpoint device.
      
      The functionality is provided by the core of the Ocelot switch driver
      (drivers/net/ethernet/mscc). In this regard, the current driver is an
      instance of Microsemi's Ocelot core driver, with a DSA front-end. It
      inherits its name from VSC9959's code name, to distinguish itself from
      the switchdev ocelot driver.
      
      The patch adds the logic for probing a PCI device and defines the
      register map for the VSC9959 switch core, since it has some differences
      in register addresses and bitfield mappings compared to the other Ocelot
      switches (VSC7511, VSC7512, VSC7513, VSC7514).
      
      The Felix driver declares the register map as part of the "instance
      table". Currently the VSC9959 inside NXP LS1028A is the only instance,
      but presumably it can support other switches in the Ocelot family, when
      used in DSA mode (Linux running on the external CPU, and not on the
      embedded MIPS).
      
      In a few cases, some h/w operations have to be done differently on
      VSC9959 due to missing bitfields.  This is the case for the switch core
      reset and init.  Because for this operation Ocelot uses some bits that
      are not present on Felix, the latter has to use a register from the
      global registers block (GCB) instead.
      
      Although it is a PCI driver, it relies on DT bindings for compatibility
      with DSA (CPU port link, PHY library). It does not have any custom
      device tree bindings, though, since we would like to minimize its
      dependency on device tree.
      Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: ocelot: add tagger for Ocelot/Felix switches · 8dce89aa
      Vladimir Oltean authored
      While it is entirely possible that this tagger format is in fact more
      generic than just these 2 switch families, I don't have that knowledge.
      The Seville switch in NXP T1040 has a similar frame format, but there
      are enough differences (e.g. DEST field starts at bit 57 instead of 56)
      that calling this file tag_vitesse.c is a bit of a stretch at the
      moment. The frame format has been listed in a comment so that people who
      add support for further Vitesse switches can rework this tagger while
      keeping compatibility with Felix.
      
      The "ocelot" name was chosen instead of "felix" because even the Ocelot
      switch can act as a DSA device when it is used in NPI mode, and the Felix
      tagger format is almost identical. Currently it is only used for the
      Felix switch embedded in the NXP LS1028A chip.
      
      The ABI for this tagger should be considered "not stable" at the moment.
      The DSA tag is always placed before the Ethernet header and therefore,
      we are using the long prefix for RX tags to avoid putting the DSA master
      port in promiscuous mode. Once there is an API in DSA for drivers
      to request DSA masters to be in promiscuous mode unconditionally, we
      will switch to the "no prefix" extraction frame header, which will save
      16 padding bytes for each RX frame.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mscc: ocelot: publish ocelot_sys.h to include/soc/mscc · a030dfe1
      Vladimir Oltean authored
      The Felix DSA driver needs to write to SYS_RAM_INIT_RAM_INIT for its own
      chip initialization process.
      
      Also update the MAINTAINERS file such that the headers exported by the
      ocelot driver are under the same maintainers' umbrella as the driver
      itself.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mscc: ocelot: publish structure definitions to include/soc/mscc/ocelot.h · 5e256365
      Vladimir Oltean authored
      We will be registering another switch driver based on ocelot, which
      lives under drivers/net/dsa.
      
      Make sure the Felix DSA front-end has the necessary abstractions to
      implement a new Ocelot driver instantiation. This includes the function
      prototypes for implementing DSA callbacks.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mscc: ocelot: separate the implementation of switch reset · 3a77b593
      Vladimir Oltean authored
      The Felix switch has a different reset procedure, so a function pointer
      needs to be created and added to the ocelot_ops structure.
      
      The reset procedure has been moved into ocelot_init.
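The pattern described here, a per-chip reset hook called from common init code, can be sketched as follows. All `model_` names are hypothetical; the real ocelot_ops structure and reset routines are not reproduced.

```c
#include <assert.h>
#include <stddef.h>

struct model_switch;

/* Per-chip operations; only the reset hook is modelled here. */
struct model_switch_ops {
	int (*reset)(struct model_switch *sw);
};

struct model_switch {
	const struct model_switch_ops *ops;
	int reset_done; /* records which reset path ran, for illustration */
};

static int model_ocelot_reset(struct model_switch *sw)
{
	sw->reset_done = 1;
	return 0;
}

static int model_felix_reset(struct model_switch *sw)
{
	sw->reset_done = 2;
	return 0;
}

static const struct model_switch_ops model_ocelot_ops = {
	.reset = model_ocelot_reset,
};

static const struct model_switch_ops model_felix_ops = {
	.reset = model_felix_reset,
};

/* Common init path: it does not know which chip it is driving and
 * simply calls the hook when one is provided. */
static int model_switch_init(struct model_switch *sw)
{
	if (sw->ops && sw->ops->reset)
		return sw->ops->reset(sw);
	return 0;
}
```

Keeping the call site in the common init routine means each front-end only supplies its own reset function and shares the rest of the initialization.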
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mscc: ocelot: adjust MTU on the CPU port in NPI mode · ba551bc3
      Vladimir Oltean authored
      When using the NPI port, the DSA tag is passed through Ethernet, so the
      switch's MAC needs to accept it as it comes from the DSA master. Increase
      the MTU on the external CPU port to account for the length of the
      injection header.
      
      Without this patch, MTU-sized frames are dropped by the switch's CPU
      port on xmit, which is especially obvious in TCP sessions.
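The adjustment amounts to adding the tag length to the maximum frame length accepted on the NPI port. The sketch below is illustrative only: `MODEL_TAG_LEN` is an assumed placeholder value and `model_port_max_frame` is a hypothetical helper, not a driver function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed placeholder for the injection header length in bytes. */
#define MODEL_TAG_LEN 16

/* Maximum frame length to program on a port. The NPI (external CPU)
 * port must additionally accept the tag carried with each frame. */
static size_t model_port_max_frame(size_t base_len, bool is_npi)
{
	return is_npi ? base_len + MODEL_TAG_LEN : base_len;
}
```

Without the extra headroom, a frame that is exactly MTU-sized before tagging exceeds the limit once the tag is added, which matches the drops described above.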
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mscc: ocelot: export a constant for the tag length in bytes · f24711fd
      Vladimir Oltean authored
      This constant will be used in a future patch to increase the MTU on NPI
      ports, and will also be used in the tagger driver for Felix.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mscc: ocelot: create a helper for changing the port MTU · fa914e9c
      Vladimir Oltean authored
      Since in an NPI/DSA setup, not all ports will have the same MTU, we need
      to make sure the watermarks for pause frames and/or tail dropping logic
      that existed in the driver are still coherent for the new MTU values.
      
      We need to do this because the NPI (aka external CPU) port needs an
      increased MTU for the DSA tag. This will be done in a future patch.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mscc: ocelot: move invariant configs out of adjust_link · 5bc9d2e6
      Vladimir Oltean authored
      It doesn't make sense to rewrite all these registers every time the PHY
      library notifies us about a link state change.
      
      In a future patch we will customize the MTU for the CPU port, and since
      the MTU was previously configured from adjust_link, without this
      change its value would get overridden.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mscc: ocelot: filter out ocelot SoC specific PCS config from common path · dc3de2a2
      Claudiu Manoil authored
      The adjust_link routine should be generic enough to be (re)used by
      any SoC that integrates a switch core compatible with the Ocelot
      core switch driver.  Currently all configurations are generic except
      for the PCS settings that are SoC specific.  Move these out to the
      Ocelot SoC/board instance.
      Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mscc: ocelot: move resource ioremap and regmap init to common code · 259630e0
      Claudiu Manoil authored
      Let's make this ioremap and regmap init code common.  It should not
      be platform dependent as it should be usable by PCI devices too.
      Use better names where necessary to avoid clashes.
      Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'net-smc-improve-termination-handling-part-3' · e7be235f
      David S. Miller authored
      Karsten Graul says:
      
      ====================
      net/smc: improve termination handling (part 3)
      
      Part 3 of the SMC termination patches improves the link group
      termination processing and introduces the ability to immediately
      terminate a link group.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/smc: immediate termination for SMCR link groups · 0b29ec64
      Ursula Braun authored
      If the SMC module is unloaded or an IB device is thrown away, the
      immediate link group freeing introduced for SMCD is exploited for SMCR
      as well. That means SMCR-specifics are added to smc_conn_kill().
      Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
      Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/smc: wait for tx completions before link freeing · 6a37ad3d
      Ursula Braun authored
      Make sure all pending work requests are completed before freeing
      a link.
      Dismiss tx pending slots as soon as a link group is terminated, to
      exploit the termination shortcut in the tx completion queue handler.

      Also kill the completion queue tasklets after destroying the
      completion queues; otherwise there is a time window in which an
      already killed tasklet can be scheduled again.
      Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
      Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/smc: abnormal termination without orderly flag · 2c1d3e50
      Ursula Braun authored
      For abnormal termination issue an LLC DELETE_LINK without the
      orderly flag.
      Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
      Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/smc: no WR buffer wait for terminating link group · 15e1b99a
      Ursula Braun authored
      Avoid waiting for a free work request buffer, if the link group
      is already terminating.
      Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
      Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/smc: introduce bookkeeping of SMCD link groups · 5edd6b9c
      Ursula Braun authored
      If the ism module is unloaded, return control from the exit routine only
      if all link groups are freed.
      If an IB device is thrown away, return control from device removal only
      if all link groups belonging to this device are freed.
      A counter for the total number of SMCD link groups per ISM device is
      introduced. ism module unloading continues only if the total number of
      SMCD link groups for all ISM devices is zero. ISM device
      removal continues only if the total number of SMCD link groups per ISM
      device has decreased to zero.
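The bookkeeping can be sketched with two counters, one per device and one global. This is a userspace model using C11 atomics with hypothetical `model_` names; it shows only the conditions being checked, not how the kernel sleeps until they hold.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-device state: number of live link groups. */
struct model_ism_dev {
	atomic_int lgr_cnt;
};

/* Total number of live link groups across all devices. */
static atomic_int model_total_lgrs;

static void model_lgr_created(struct model_ism_dev *dev)
{
	atomic_fetch_add(&dev->lgr_cnt, 1);
	atomic_fetch_add(&model_total_lgrs, 1);
}

static void model_lgr_freed(struct model_ism_dev *dev)
{
	atomic_fetch_sub(&dev->lgr_cnt, 1);
	atomic_fetch_sub(&model_total_lgrs, 1);
}

/* Device removal may proceed once this device has no link groups. */
static bool model_dev_removal_may_continue(struct model_ism_dev *dev)
{
	return atomic_load(&dev->lgr_cnt) == 0;
}

/* Module unload may proceed once no device has any link group. */
static bool model_unload_may_continue(void)
{
	return atomic_load(&model_total_lgrs) == 0;
}
```

In a real driver the removal and unload paths would block until the respective condition becomes true rather than poll it.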
      Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
      Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/smc: abnormal termination of SMCD link groups · 5421ec28
      Ursula Braun authored
      A final cleanup due to SMCD device removal means immediate freeing
      of all link groups belonging to this device in interrupt context.
      
      This patch introduces a separate SMCD link group termination routine,
      which terminates all link groups of an SMCD device.
      
      This new routine smcd_terminate_all() is reused if the smc module is
      unloaded.
      Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
      Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/smc: immediate termination for SMCD link groups · 42bfba9e
      Ursula Braun authored
      SMCD link group termination is called when peer signals its shutdown
      of its corresponding link group. For regular shutdowns no connections
      exist anymore. For abnormal shutdowns connections must be killed and
      their DMBs must be unregistered immediately. That means the SMCR method
      to delay the link group freeing several seconds does not fit.
      
      This patch adds immediate termination of a link group and its SMCD
      connections and makes sure all SMCD link group related cleanup steps
      are finished.
      Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
      Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/smc: fix final cleanup sequence for SMCD devices · 50c6b20e
      Ursula Braun authored
      If peer announces shutdown, use the link group terminate worker for
      local cleanup of link groups and connections to terminate link group
      in proper context.
      
      Make sure link groups are cleaned up first before destroying the
      event queue of the SMCD device, because link group cleanup may
      raise events.
      
      Send signal shutdown only if peer has not done it already.
      
      Send socket abort or close only if peer has not already announced
      shutdown.
      Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
      Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'net-stmmac-CPU-Performance-Improvements' · 43da44c8
      David S. Miller authored
      Jose Abreu says:
      
      ====================
      net: stmmac: CPU Performance Improvements
      
      CPU Performance improvements for stmmac. Please check below for results
      before and after the series.
      
      Patch 1/7, allows RX Interrupt on Completion to be disabled and only use the
      RX HW Watchdog.
      
      Patch 2/7, sets up the default RX coalesce settings instead of using the
      minimum value.

      Patch 3/7 and 4/7, remove the unneeded computations for RX Flow Control
      activation/de-activation, in some cases.

      Patch 5/7, tunes up the default coalesce settings.

      Patch 6/7, reworks the TX coalesce timer activation logic.

      Patch 7/7, removes the now unneeded TBU interrupt.
      
      NetPerf UDP Results:
      --------------------
      
      Socket  Message  Elapsed      Messages                   CPU      Service
      Size    Size     Time         Okay Errors   Throughput   Util     Demand
      bytes   bytes    secs            #      #   10^6bits/sec % SS     us/KB
      --- XGMAC@2.5G: Before
      212992    1400   10.00     2100620      0     2351.7     36.69    5.112
      212992           10.00     2100539            2351.6     26.18    3.648
      --- XGMAC@2.5G: After
      212992    1400   10.00     2108972      0     2361.5     21.73    3.015
      212992           10.00     2097038            2348.1     19.21    2.666
      
      --- GMAC5@1G: Before
      212992    1400   10.00      786000      0      880.2     34.71    12.923
      212992           10.00      786000             880.2     23.42    8.719
      --- GMAC5@1G: After
      212992    1400   10.00      842648      0      943.7     14.12    4.903
      212992           10.00      842648             943.7     12.73    4.418
      
      Perf TCP Results on RX Path:
      ----------------------------
      --- XGMAC@2.5G: Before
      22.51%  swapper          [stmmac]           [k] dwxgmac2_dma_interrupt
      10.82%  swapper          [stmmac]           [k] dwxgmac2_host_mtl_irq_status
       5.21%  swapper          [stmmac]           [k] dwxgmac2_host_irq_status
       4.67%  swapper          [stmmac]           [k] dwxgmac3_safety_feat_irq_status
       3.63%  swapper          [kernel.kallsyms]  [k] stack_trace_consume_entry
       2.74%  iperf3           [kernel.kallsyms]  [k] copy_user_enhanced_fast_string
       2.52%  swapper          [kernel.kallsyms]  [k] update_stack_state
       1.94%  ksoftirqd/0      [stmmac]           [k] dwxgmac2_dma_interrupt
       1.45%  iperf3           [kernel.kallsyms]  [k] queued_spin_lock_slowpath
       1.26%  swapper          [kernel.kallsyms]  [k] create_object
      --- XGMAC@2.5G: After
       7.43%  swapper          [kernel.kallsyms]   [k] stack_trace_consume_entry
       5.86%  swapper          [stmmac]            [k] dwxgmac2_dma_interrupt
       5.68%  swapper          [kernel.kallsyms]   [k] update_stack_state
       4.71%  iperf3           [kernel.kallsyms]   [k] copy_user_enhanced_fast_string
       2.88%  swapper          [kernel.kallsyms]   [k] create_object
       2.69%  swapper          [stmmac]            [k] dwxgmac2_host_mtl_irq_status
       2.61%  swapper          [stmmac]            [k] stmmac_napi_poll_rx
       2.52%  swapper          [kernel.kallsyms]   [k] unwind_next_frame.part.4
       1.48%  swapper          [kernel.kallsyms]   [k] unwind_get_return_address
       1.38%  swapper          [kernel.kallsyms]   [k] arch_stack_walk
      
      --- GMAC5@1G: Before
      31.29%  swapper          [stmmac]           [k] dwmac4_dma_interrupt
      14.57%  swapper          [stmmac]           [k] dwmac4_irq_mtl_status
      10.66%  swapper          [stmmac]           [k] dwmac4_irq_status
       1.97%  swapper          [kernel.kallsyms]  [k] stack_trace_consume_entry
       1.73%  iperf3           [kernel.kallsyms]  [k] copy_user_enhanced_fast_string
       1.59%  swapper          [kernel.kallsyms]  [k] update_stack_state
       1.15%  iperf3           [kernel.kallsyms]  [k] do_syscall_64
       1.01%  ksoftirqd/0      [stmmac]           [k] dwmac4_dma_interrupt
       0.89%  swapper          [kernel.kallsyms]  [k] __default_send_IPI_dest_field
       0.75%  swapper          [stmmac]           [k] stmmac_napi_poll_rx
      --- GMAC5@1G: After
       6.70%  swapper          [kernel.kallsyms]   [k] stack_trace_consume_entry
       5.79%  swapper          [stmmac]            [k] dwmac4_dma_interrupt
       5.29%  swapper          [kernel.kallsyms]   [k] update_stack_state
       3.52%  iperf3           [kernel.kallsyms]   [k] copy_user_enhanced_fast_string
       2.83%  swapper          [stmmac]            [k] dwmac4_irq_mtl_status
       2.62%  swapper          [kernel.kallsyms]   [k] create_object
       2.46%  swapper          [stmmac]            [k] stmmac_napi_poll_rx
       2.32%  swapper          [kernel.kallsyms]   [k] unwind_next_frame.part.4
       2.19%  swapper          [stmmac]            [k] dwmac4_irq_status
       1.39%  swapper          [kernel.kallsyms]   [k] unwind_get_return_address
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: stmmac: xgmac: Do not enable TBU interrupt · 8d07a793
      Jose Abreu authored
      Now that TX Coalesce has been rewritten we no longer need this
      additional interrupt enabled. This reduces CPU usage.
      Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: stmmac: Rework TX Coalesce logic · c2837423
      Jose Abreu authored
      Coalesce logic currently increments the number of packets and sets the
      IC bit when the coalesced packets have passed a given limit. This does
      not reflect very well what coalesce was meant for as we can have a large
      number of packets that are coalesced and then a single one, sent later
      on that has the IC bit.
      
      Rework the logic so that it coalesces only upon a limit of packets and
      sets the IC bit for large number of packets.
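One way to express such a rule is sketched below. It is a simplified userspace model, not the driver's code: the names are hypothetical, and details a real driver would handle (for example a coalesce timer) are left out. The IC (interrupt on completion) decision is made per submission of `tx_packets` descriptors against a frame limit `coal_frames`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-queue state: running count of queued frames. */
struct model_txq {
	unsigned int count_frames;
};

/* Decide whether the current submission should request an interrupt. */
static bool model_set_ic(struct model_txq *q, unsigned int tx_packets,
			 unsigned int coal_frames)
{
	q->count_frames += tx_packets;

	if (coal_frames == 0)
		return false;		/* frame-based coalescing disabled */
	if (tx_packets > coal_frames)
		return true;		/* one large burst exceeds the limit */

	/* True exactly when this submission crossed a multiple of the
	 * limit, so the interrupt lands on the boundary and not on a
	 * later, unrelated packet. */
	return (q->count_frames % coal_frames) < tx_packets;
}
```

With a limit of 4 and single-packet submissions, the interrupt is requested on every fourth packet; a burst larger than the limit requests one immediately.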
      Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: stmmac: Tune-up default coalesce settings · da202451
      Jose Abreu authored
      Tune up the default coalesce settings to optimal values. This gives the
      best performance in most of the use-cases.
      Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: stmmac: xgmac: Remove uneeded computation for RFA/RFD · 52f96cd1
      Jose Abreu authored
      RFA and RFD should not be dependent on FIFO size. In fact, the more FIFO
      space we have, the later we can activate Flow Control. Let's use
      hard-coded values for RFA and RFD for all FIFO sizes with the exception
      of 4k, which is a special case.
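The selection described above reduces to a two-way branch. In this sketch the threshold numbers are illustrative placeholders, not the register values used by the driver, and the `model_` names are hypothetical.

```c
#include <assert.h>

/* Hypothetical pair of flow-control thresholds. */
struct model_fc {
	unsigned int rfa; /* threshold for activating flow control */
	unsigned int rfd; /* threshold for de-activating flow control */
};

/* Pick thresholds by FIFO size: one fixed pair for every size, with
 * 4 KiB as the only special case. Values are placeholders. */
static struct model_fc model_rx_flow_thresholds(unsigned int fifo_bytes)
{
	struct model_fc fc;

	if (fifo_bytes == 4096) {
		fc.rfa = 1;
		fc.rfd = 3;
	} else {
		fc.rfa = 4;
		fc.rfd = 7;
	}
	return fc;
}
```

Since the result no longer depends on the FIFO size beyond this single check, the per-size computation disappears from the configuration path.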
      Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>