1. 17 Mar, 2017 3 commits
    • Rick Farrington's avatar
      liquidio: use meaningful names for IRQs · 0c88a761
      Rick Farrington authored
      All IRQs owned by the PF and VF drivers share the same nondescript name
      "octeon"; this makes it difficult to setup interrupt affinity.
      
      Change the IRQ names to reflect their specific purpose:
      
          LiquidIO<id>-<func>-<type>-<queue pair num>
      
      Examples:
          LiquidIO0-pf0-rxtx-3
          LiquidIO1-vf1-rxtx-0
          LiquidIO0-pf0-aux
      
      We cannot use netdev->name for naming the IRQs because:
      
          1.  Early during init, the PF and VF drivers require interrupts to
              send/receive control data from the NIC firmware; so the PF and VF
              must request IRQs long before the netdev struct is registered.
      
          2.  The IRQ name can only be specified at the time it is requested.
              It cannot be changed after that.
      Signed-off-by: default avatarRick Farrington <ricardo.farrington@cavium.com>
      Signed-off-by: default avatarFelix Manlunas <felix.manlunas@cavium.com>
      Signed-off-by: default avatarSatanand Burla <satananda.burla@cavium.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0c88a761
    • Rick Farrington's avatar
      liquidio: remove/replace invalid code · b229487b
      Rick Farrington authored
      Remove invalid call to dma_sync_single_for_cpu() because previous DMA
      allocation was coherent--not streaming.  Remove code that references fields
      in struct list_head; replace it with calls to list_empty() and
      list_first_entry().  Also, add comment to clarify complicated if statement.
      Signed-off-by: default avatarRick Farrington <ricardo.farrington@cavium.com>
      Signed-off-by: default avatarFelix Manlunas <felix.manlunas@cavium.com>
      Signed-off-by: default avatarDerek Chickles <derek.chickles@cavium.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b229487b
    • Nik Unger's avatar
      netem: apply correct delay when rate throttling · 5080f39e
      Nik Unger authored
      I recently reported on the netem list that iperf network benchmarks
      show unexpected results when a bandwidth throttling rate has been
      configured for netem. Specifically:
      
      1) The measured link bandwidth *increases* when a higher delay is added
      2) The measured link bandwidth appears higher than the specified limit
      3) The measured link bandwidth for the same very slow settings varies significantly across
        machines
      
      The issue can be reproduced by using tc to configure netem with a
      512kbit rate and various (none, 1us, 50ms, 100ms, 200ms) delays on a
      veth pair between network namespaces, and then using iperf (or any
      other network benchmarking tool) to test throughput. Complete detailed
      instructions are in the original email chain here:
      https://lists.linuxfoundation.org/pipermail/netem/2017-February/001672.html
      
      There appear to be two underlying bugs causing these effects:
      
      - The first issue causes long delays when the rate is slow and no
        delay is configured (e.g., "rate 512kbit"). This is because SKBs are
        not orphaned when no delay is configured, so orphaning does not
        occur until *after* the rate-induced delay has been applied. For
        this reason, adding a tiny delay (e.g., "rate 512kbit delay 1us")
        dramatically increases the measured bandwidth.
      
      - The second issue is that rate-induced delays are not correctly
        applied, allowing SKB delays to occur in parallel. The indended
        approach is to compute the delay for an SKB and to add this delay to
        the end of the current queue. However, the code does not detect
        existing SKBs in the queue due to improperly testing sch->q.qlen,
        which is nonzero even when packets exist only in the
        rbtree. Consequently, new SKBs do not wait for the current queue to
        empty. When packet delays vary significantly (e.g., if packet sizes
        are different), then this also causes unintended reordering.
      
      I modified the code to expect a delay (and orphan the SKB) when a rate
      is configured. I also added some defensive tests that correctly find
      the latest scheduled delivery time, even if it is (unexpectedly) for a
      packet in sch->q. I have tested these changes on the latest kernel
      (4.11.0-rc1+) and the iperf / ping test results are as expected.
      Signed-off-by: default avatarNik Unger <njunger@uwaterloo.ca>
      Signed-off-by: default avatarStephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      5080f39e
  2. 16 Mar, 2017 26 commits
  3. 15 Mar, 2017 11 commits
    • David S. Miller's avatar
      Merge branch 'dsa-check-out-of-range-ageing-time' · 02cb24e9
      David S. Miller authored
      Vivien Didelot says:
      
      ====================
      net: dsa: check out-of-range ageing time
      
      The ageing time limits supported by DSA drivers vary depending on the
      switch model. If a driver returns -ERANGE for out-of-range values, the
      switchdev commit phase will fail with the following stacktrace:
      
          # brctl setageing br0 4
          [ 8530.082179] WARNING: CPU: 0 PID: 910 at net/switchdev/switchdev.c:291 switchdev_port_attr_set_now+0xbc/0xc0
          [ 8530.090679] br0: Commit of attribute (id=5) failed.
          [ 8530.094256] Modules linked in:
          [ 8530.096032] CPU: 0 PID: 910 Comm: kworker/0:4 Tainted: G        W       4.10.0 #361
          [ 8530.102412] Hardware name: Freescale Vybrid VF5xx/VF6xx (Device Tree)
          [ 8530.107571] Workqueue: events switchdev_deferred_process_work
          [ 8530.112039] Backtrace:
          [ 8530.113224] [<8010ca34>] (dump_backtrace) from [<8010cd3c>] (show_stack+0x20/0x24)
          [ 8530.119521]  r6:00000000 r5:80834da0 r4:80ca7e48 r3:8120ca3c
          [ 8530.123908] [<8010cd1c>] (show_stack) from [<8037ad40>] (dump_stack+0x24/0x28)
          [ 8530.129873] [<8037ad1c>] (dump_stack) from [<80118de4>] (__warn+0xf4/0x10c)
          [ 8530.135545] [<80118cf0>] (__warn) from [<80118e44>] (warn_slowpath_fmt+0x48/0x50)
          [ 8530.141760]  r9:00000000 r8:81252bec r7:80f19d90 r6:9dc3c000 r5:80ca7e7c r4:80834de8
          [ 8530.148235] [<80118e00>] (warn_slowpath_fmt) from [<80670b20>] (switchdev_port_attr_set_now+0xbc/0xc0)
          [ 8530.156240]  r3:9dc3c000 r2:80834de8
          [ 8530.158539]  r4:ffffffde
          [ 8530.159788] [<80670a64>] (switchdev_port_attr_set_now) from [<80670b44>] (switchdev_port_attr_set_deferred+0x20/0x6c)
          [ 8530.169118]  r7:806705a8 r6:9dc3c000 r5:80f19d90 r4:80f19d80
          [ 8530.173500] [<80670b24>] (switchdev_port_attr_set_deferred) from [<80670580>] (switchdev_deferred_process+0x50/0xe8)
          [ 8530.182742]  r6:80ca6000 r5:81252bec r4:80f19d80 r3:80670b24
          [ 8530.187115] [<80670530>] (switchdev_deferred_process) from [<80670930>] (switchdev_deferred_process_work+0x1c/0x24)
          [ 8530.196277]  r8:00000000 r7:9ffdc100 r6:8120ad6c r5:9ddefc00 r4:81252bf4 r3:9de343c0
          [ 8530.202756] [<80670914>] (switchdev_deferred_process_work) from [<8012f770>] (process_one_work+0x120/0x3b0)
          [ 8530.211231] [<8012f650>] (process_one_work) from [<8012fa70>] (worker_thread+0x70/0x534)
          [ 8530.218046]  r10:9ddefc00 r9:8120ad6c r8:80ca6038 r7:8120ad80 r6:81211f80 r5:9ddefc18
          [ 8530.224579]  r4:8120ad6c
          [ 8530.225830] [<8012fa00>] (worker_thread) from [<80135640>] (kthread+0x114/0x144)
          [ 8530.231955]  r10:9f4e9e94 r9:9de1fe58 r8:8012fa00 r7:9ddefc00 r6:9de1fdc0 r5:00000000
          [ 8530.238497]  r4:9de1fe40
          [ 8530.239750] [<8013552c>] (kthread) from [<80108cd8>] (ret_from_fork+0x14/0x3c)
          [ 8530.245679]  r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:8013552c
          [ 8530.252234]  r4:9de1fdc0 r3:80ca6000
          [ 8530.254512] ---[ end trace 87475cc71b80ef73 ]---
          [ 8530.257852] br0: failed (err=-34) to set attribute (id=5)
      
      This patchset fixes this by adding ageing_time_min and ageing_time_max
      fields to the dsa_switch structure, which can optionally be set by a DSA
      driver.
      
      If provided, the DSA core will check for out-of-range values in the
      SWITCHDEV_ATTR_ID_BRIDGE_AGEING_TIME prepare phase and return -ERANGE
      accordingly.
      
      Finally set these limits in the mv88e6xxx driver.
      ====================
      Reviewed-by: default avatarAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      02cb24e9
    • Vivien Didelot's avatar
      net: dsa: mv88e6xxx: specify ageing time limits · 9ff74f24
      Vivien Didelot authored
      Now that DSA has ageing time limits, specify them when registering a
      switch so that out-of-range values are handled correctly by the core.
      Signed-off-by: default avatarVivien Didelot <vivien.didelot@savoirfairelinux.com>
      Reported-by: default avatarJason Cobham <jcobham@questertangent.com>
      Reviewed-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9ff74f24
    • Vivien Didelot's avatar
      net: dsa: check out-of-range ageing time value · 0f3da6af
      Vivien Didelot authored
      If a DSA switch driver cannot program an ageing time value due to it
      being out-of-range, switchdev will raise a stack trace before failing.
      
      To fix this, add ageing_time_min and ageing_time_max members to the
      dsa_switch in order for the switch drivers to optionally specify their
      supported ageing time limits.
      
      The DSA core will now check for provided ageing time limits and return
      -ERANGE from the switchdev prepare phase if the value is out-of-range.
      Signed-off-by: default avatarVivien Didelot <vivien.didelot@savoirfairelinux.com>
      Reviewed-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0f3da6af
    • Vivien Didelot's avatar
      net: dsa: dsa_fastest_ageing_time return unsigned · e893de1b
      Vivien Didelot authored
      The ageing time is defined as unsigned int, so make
      dsa_fastest_ageing_time return an unsigned int instead of int.
      Signed-off-by: default avatarVivien Didelot <vivien.didelot@savoirfairelinux.com>
      Reviewed-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e893de1b
    • David S. Miller's avatar
      Merge branch 'mqprio-offload-more-info' · 1aed1814
      David S. Miller authored
      Alexander Duyck says:
      
      ====================
      Add support for passing more information in mqprio offload
      
      This patch series lays the groundwork for future work to allow us to make
      full use of the mqprio options when offloading them to hardware.
      
      Currently when we specify the hardware offload for mqprio the queue
      configuration is completely ignored and the hardware is only notified of
      the total number of traffic classes.  The problem is this leads to multiple
      issues, one specific issue being you can pass the queue configuration you
      want and it is totally ignored by the hardware.
      
      What I am planning to do is add support for "hw" values in the
      configuration greater than 1.  So for example we might have one mode of
      mqprio offload that uses 1 and only offloads the TC counts like we
      currently do.  Then we might look at adding an option 2 which would factor
      in the TCs and the queue count information. This way we can select between
      the type of offload we actually want and existing drivers that don't
      support this can just fall back to their legacy configuration.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      1aed1814
    • Amritha Nambiar's avatar
      mqprio: Modify mqprio to pass user parameters via ndo_setup_tc. · 56f36acd
      Amritha Nambiar authored
      The configurable priority to traffic class mapping and the user specified
      queue ranges are used to configure the traffic class, overriding the
      hardware defaults when the 'hw' option is set to 0. However, when the 'hw'
      option is non-zero, the hardware QOS defaults are used.
      
      This patch makes it so that we can pass the data the user provided to
      ndo_setup_tc. This allows us to pull in the queue configuration if the
      user requested it as well as any additional hardware offload type
      requested by using a value other than 1 for the hw value.
      
      Finally it also provides a means for the device driver to return the level
      supported for the offload type via the qopt->hw value. Previously we were
      just always assuming the value to be 1, in the future values beyond just 1
      may be supported.
      Signed-off-by: default avatarAmritha Nambiar <amritha.nambiar@intel.com>
      Signed-off-by: default avatarAlexander Duyck <alexander.h.duyck@intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      56f36acd
    • Alexander Duyck's avatar
      mqprio: Change handling of hw u8 to allow for multiple hardware offload modes · 2026fecf
      Alexander Duyck authored
      This patch is meant to allow for support of multiple hardware offload type
      for a single device. There is currently no bounds checking for the hw
      member of the mqprio_qopt structure.  This results in us being able to pass
      values from 1 to 255 with all being treated the same.  On retreiving the
      value it is returned as 1 for anything 1 or greater being set.
      
      With this change we are currently adding limited bounds checking by
      defining an enum and using those values to limit the reported hardware
      offloads.
      Signed-off-by: default avatarAlexander Duyck <alexander.h.duyck@intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      2026fecf
    • Colin Ian King's avatar
      netxen_nic: remove redundant check if retries is zero · 5b769649
      Colin Ian King authored
      At the end of the timeout loop, retries will always be zero so
      the check for zero is redundant so remove it.  Also replace
      printk with pr_err as recommended by checkpatch.
      Signed-off-by: default avatarColin Ian King <colin.king@canonical.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      5b769649
    • David S. Miller's avatar
      Merge branch 'stmmac-dma-ops-multiqueue' · a28453c0
      David S. Miller authored
      Joao Pinto says:
      
      ====================
      net: stmmac: prepare dma operations for multiple queues
      
      As agreed with David Miller, this patch-set is the second of 3 to enable
      multiple queues in stmmac.
      
      This second one concentrates on dma operations adding functionalities as:
      a) DMA Operation Mode configuration per channel and done in the multiple
      queues configuration function
      b) DMA IRQ enable and Disable by channel
      c) DMA start and stop by channel
      d) RX and TX ring length configuration by channel
      e) RX and TX set tail pointer by channel
      f) DMA Channel initialization broke into Channel comon, RX and TX
      initialization
      g) TSO being configured for all available channels
      h) DMA interrupt treatment by channel
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a28453c0
    • Joao Pinto's avatar
      net: stmmac: stmmac interrupt treatment prepared for multiple queues · 7bac4e1e
      Joao Pinto authored
      This patch prepares the main ISR for multiple queues.
      Signed-off-by: default avatarJoao Pinto <jpinto@synopsys.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7bac4e1e
    • Joao Pinto's avatar
      net: stmmac: tso init prepared for multiple queues · 146617b8
      Joao Pinto authored
      This patch configures TSO for all available tx queues.
      Signed-off-by: default avatarJoao Pinto <jpinto@synopsys.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      146617b8