1. 22 Sep, 2016 40 commits
    • Bluetooth: Fix NULL pointer dereference in mgmt context · dd7e39bb
      Arek Lichwa authored
      Add the missing cmd_complete callback assignment in the pending management
      command context. The crash path involves the security procedure performed on
      legacy (pre-SSP) devices with the service security requirement set to HIGH
      (16-digit PIN). It fails when the user supplies a shorter PIN.
      
      [    1.517950] Bluetooth: PIN code is not 16 bytes long
      [    1.518491] BUG: unable to handle kernel NULL pointer dereference at           (null)
      [    1.518584] IP: [<          (null)>]           (null)
      [    1.518584] PGD 9e08067 PUD 9fdf067 PMD 0
      [    1.518584] Oops: 0010 [#1] SMP
      [    1.518584] Modules linked in:
      [    1.518584] CPU: 0 PID: 1002 Comm: kworker/u3:2 Not tainted 4.8.0-rc6-354649-gaf4168c5 #16
      [    1.518584] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.9.3-20160701_074356-anatol 04/01/2014
      [    1.518584] Workqueue: hci0 hci_rx_work
      [    1.518584] task: ffff880009ce14c0 task.stack: ffff880009e10000
      [    1.518584] RIP: 0010:[<0000000000000000>]  [<          (null)>]           (null)
      [    1.518584] RSP: 0018:ffff880009e13bc8  EFLAGS: 00010293
      [    1.518584] RAX: 0000000000000000 RBX: ffff880009eed100 RCX: 0000000000000006
      [    1.518584] RDX: ffff880009ddc000 RSI: 0000000000000000 RDI: ffff880009eed100
      [    1.518584] RBP: ffff880009e13be0 R08: 0000000000000000 R09: 0000000000000001
      [    1.518584] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
      [    1.518584] R13: ffff880009e13ccd R14: ffff880009ddc000 R15: ffff880009ddc010
      [    1.518584] FS:  0000000000000000(0000) GS:ffff88000bc00000(0000) knlGS:0000000000000000
      [    1.518584] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [    1.518584] CR2: 0000000000000000 CR3: 0000000009fdd000 CR4: 00000000000006f0
      [    1.518584] Stack:
      [    1.518584]  ffffffff81909808 ffff880009e13cce ffff880009e0d40b ffff880009e13c68
      [    1.518584]  ffffffff818f428d 00000000024000c0 ffff880009e13c08 ffffffff810ca903
      [    1.518584]  ffff880009e13c48 ffffffff811ade34 ffffffff8178c31f ffff880009ee6200
      [    1.518584] Call Trace:
      [    1.518584]  [<ffffffff81909808>] ? mgmt_pin_code_neg_reply_complete+0x38/0x60
      [    1.518584]  [<ffffffff818f428d>] hci_cmd_complete_evt+0x69d/0x3200
      [    1.518584]  [<ffffffff810ca903>] ? rcu_read_lock_sched_held+0x53/0x60
      [    1.518584]  [<ffffffff811ade34>] ? kmem_cache_alloc+0x1a4/0x200
      [    1.518584]  [<ffffffff8178c31f>] ? skb_clone+0x4f/0xa0
      [    1.518584]  [<ffffffff818f9d81>] hci_event_packet+0x8e1/0x28e0
      [    1.518584]  [<ffffffff81a421f1>] ? _raw_spin_unlock_irqrestore+0x31/0x50
      [    1.518584]  [<ffffffff810aea3e>] ? trace_hardirqs_on_caller+0xee/0x1b0
      [    1.518584]  [<ffffffff818e6bd1>] hci_rx_work+0x1e1/0x5b0
      [    1.518584]  [<ffffffff8107e4bd>] ? process_one_work+0x1ed/0x6b0
      [    1.518584]  [<ffffffff8107e538>] process_one_work+0x268/0x6b0
      [    1.518584]  [<ffffffff8107e4bd>] ? process_one_work+0x1ed/0x6b0
      [    1.518584]  [<ffffffff8107e9c3>] worker_thread+0x43/0x4e0
      [    1.518584]  [<ffffffff8107e980>] ? process_one_work+0x6b0/0x6b0
      [    1.518584]  [<ffffffff8107e980>] ? process_one_work+0x6b0/0x6b0
      [    1.518584]  [<ffffffff8108505f>] kthread+0xdf/0x100
      [    1.518584]  [<ffffffff81a4297f>] ret_from_fork+0x1f/0x40
      [    1.518584]  [<ffffffff81084f80>] ? kthread_create_on_node+0x210/0x210
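      The failure mode in the trace above can be illustrated with a minimal
      user-space sketch. It assumes a simplified pending-command structure:
      struct pending_cmd, addr_cmd_complete() and complete_pending() are
      hypothetical stand-ins, not the kernel's mgmt API. The point is only that
      a completion callback left unassigned becomes a call to address 0.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a pending management command. */
struct pending_cmd {
    int opcode;
    int status;  /* last status delivered to the callback */
    int (*cmd_complete)(struct pending_cmd *cmd, int status);
};

/* Example completion handler: records the status and reports success. */
static int addr_cmd_complete(struct pending_cmd *cmd, int status)
{
    cmd->status = status;
    return 0;
}

/*
 * Invoke the completion callback. Calling through a NULL function pointer
 * is what yields "IP: (null)" in an oops; the check here is illustrative
 * only. The fix described in the commit is to assign the callback when the
 * pending command is set up.
 */
static int complete_pending(struct pending_cmd *cmd, int status)
{
    if (!cmd->cmd_complete)
        return -1; /* callback was never assigned */
    return cmd->cmd_complete(cmd, status);
}
```

      With the callback assigned, complete_pending() forwards the status to it;
      without it, the sketch returns an error instead of jumping to NULL.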
      Signed-off-by: Arek Lichwa <arek.lichwa@gmail.com>
      Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
    • Merge branch 'ftgmac100-ast2500-support' · cdd0766d
      David S. Miller authored
      Joel Stanley says:
      
      ====================
      ftgmac100 support for ast2500
      
      This series adds support to the ftgmac100 driver for the Aspeed ast2400 and
      ast2500 SoCs. In particular, they ensure the driver works correctly on the
      ast2500 where the MAC block has seen some changes in register layout.
      
      They have been tested on ast2400 and ast2500 systems with the NCSI stack and
      with a directly attached PHY.
      
      V2 reworks the two patches relating to PHYSTS_CHG into the one patch that
      disables the interrupt instead of playing with interrupt sensitivity. I kept
      patch 4 'net/faraday: Clear stale interrupts' which was first introduced to
      clear the stale PHYSTS_CHG interrupt, as it helps keep us safe from unhygienic
      (vendor) bootloaders.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/faraday: Mask out PHYSTS_CHG interrupt · edcd692f
      Joel Stanley authored
      The PHYSTS_CHG (the ftgmac100's PHY IRQ) is telling the system to go
      look at the PHY registers for a link status change.
      
      The interrupt was causing issues on Aspeed SoCs, where some board designs
      had an active-high configuration, some active-low, and in some cases it
      was repurposed for other functions. When misconfigured, Linux would chew
      100% of CPU cycles servicing interrupts:
      
       [   20.280000] ftgmac100 1e660000.ethernet eth0: [ISR] = 0x200: PHYSTS_CHG
       [   20.280000] ftgmac100 1e660000.ethernet eth0: [ISR] = 0x200: PHYSTS_CHG
       [   20.280000] ftgmac100 1e660000.ethernet eth0: [ISR] = 0x200: PHYSTS_CHG
       [   20.300000] ftgmac100 1e660000.ethernet eth0: [ISR] = 0x200: PHYSTS_CHG
      
      While the ftgmac100 IP can be configured for high, low and edge
      sensitivity, the current driver always polls the PHY, so we chose to mask
      out the interrupt.
      
      See https://patchwork.ozlabs.org/patch/672099/ for more discussion.
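      Masking out an interrupt source amounts to clearing its bit in the mask
      of enabled interrupts. A minimal sketch follows, assuming the 0x200 value
      shown for PHYSTS_CHG in the log above; the other bit names and
      build_int_enable_mask() are hypothetical and do not reflect the driver's
      actual definitions.

```c
#include <stdint.h>

/* PHYSTS_CHG as reported in the log above: [ISR] = 0x200. */
#define INT_PHYSTS_CHG 0x200u

/* Hypothetical placeholder bits for other interrupt sources. */
#define INT_RX_DONE    0x001u
#define INT_TX_DONE    0x010u

/* Build the interrupt-enable mask with the PHY status source masked out. */
static uint32_t build_int_enable_mask(uint32_t wanted)
{
    return wanted & ~INT_PHYSTS_CHG;
}
```

      Because the link state is polled anyway, dropping this bit loses no
      information in the sketch; it only stops the source from raising IRQs.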
      Signed-off-by: Joel Stanley <joel@jms.id.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/faraday: Configure old MDIO interface on Aspeed SoCs · e07dc63b
      Joel Stanley authored
      The Aspeed SoCs have a new MDIO interface as an option in the G4 and G5
      SoCs. The old one is still available, so select it in order to remain
      compatible with the ftgmac100 driver.
      Signed-off-by: Joel Stanley <joel@jms.id.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/faraday: Clear stale interrupts · 08c9c126
      Gavin Shan authored
      There is a stale interrupt (PHYSTS_CHG in ISR, bit#6 in 0x0) left by
      the bootloader (U-Boot) when enabling the MAC. The stale interrupts
      do not originate from the kernel and should be cleared.
      
      This clears the stale interrupts in ISR (0x0) when enabling the MAC.
      Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
      Signed-off-by: Joel Stanley <joel@jms.id.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/faraday: Adapt for Aspeed SoCs · 2a0ab8eb
      Joel Stanley authored
      The RXDES and TXDES register bits in the ftgmac100 place EDO{R,T}R
      at bit position 15 for the Faraday Tech IP. However, the version of this
      IP present in the Aspeed SoCs has these bits at position 30 in the
      registers.
      
      It appears that ast2400 SoCs support both positions, with the 15th bit
      marked as reserved but still functional. In the ast2500 this bit is
      reused for another function, so we need a workaround.
      
      Engineers from Aspeed confirmed that using bit 30 is correct for both
      the ast2400 and ast2500 SoCs.
      Signed-off-by: Joel Stanley <joel@jms.id.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/faraday: Make EDO{R,T}R bits configurable · 7906a4da
      Andrew Jeffery authored
      These bits are #defined at a fixed location. In order to support future
      hardware that has chosen to move these bits around move the bits into a
      member of the struct ftgmac100.
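      A minimal sketch of the idea, assuming the bit positions named in the
      'Adapt for Aspeed SoCs' commit (15 for the Faraday IP, 30 for Aspeed).
      struct mac_priv, its member names and the helper functions are
      hypothetical; the real struct ftgmac100 layout is not shown in this log.

```c
#include <stdint.h>
#include <stdbool.h>

/* Bit positions taken from the commit messages in this series. */
#define EDO_BIT_FARADAY (1u << 15)
#define EDO_BIT_ASPEED  (1u << 30)

/* Hypothetical, reduced driver-private structure. */
struct mac_priv {
    uint32_t rxdes_edorr_mask; /* end-of-ring bit for RX descriptors */
    uint32_t txdes_edotr_mask; /* end-of-ring bit for TX descriptors */
};

/* Choose the masks once, at init time, instead of using a fixed #define. */
static void mac_init_masks(struct mac_priv *priv, bool is_aspeed)
{
    uint32_t mask = is_aspeed ? EDO_BIT_ASPEED : EDO_BIT_FARADAY;

    priv->rxdes_edorr_mask = mask;
    priv->txdes_edotr_mask = mask;
}

/* Mark an RX descriptor word as the last one in the ring. */
static uint32_t rxdes_set_end_of_ring(const struct mac_priv *priv, uint32_t rxdes)
{
    return rxdes | priv->rxdes_edorr_mask;
}
```

      Descriptor helpers then read the mask from the private structure, so
      hardware with a different bit position needs no changes to them.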
      Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
      Signed-off-by: Joel Stanley <joel@jms.id.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/faraday: Separate rx page storage from rxdesc · ada66b54
      Andrew Jeffery authored
      The ftgmac100 hardware revision in e.g. the Aspeed AST2500 no longer
      reserves all bits in RXDES#2 but instead uses the bottom 16 bits to
      store MAC frame metadata. Avoid corruption by shifting struct page
      pointers out to their own member in struct ftgmac100.
      Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
      Signed-off-by: Joel Stanley <joel@jms.id.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • cxgb4: Convert to use simple_open() · 524605e5
      Wei Yongjun authored
      Remove an open coded simple_open() function and replace file
      operations references to the function with simple_open()
      instead.
      
      Generated by: scripts/coccinelle/api/simple_open.cocci
      Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: qca8k: use mdio_module_driver to simplify the code · a084ab33
      Wei Yongjun authored
      mdio_module_driver() makes the code simpler by eliminating
      boilerplate code.
      Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: qca8k: fix non static symbol warning · fcfbfd68
      Wei Yongjun authored
      Fixes the following sparse warning:
      
      drivers/net/dsa/qca8k.c:259:22: warning:
       symbol 'qca8k_regmap_config' was not declared. Should it be static?
      Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'sctp-align' · 9ba62f95
      David S. Miller authored
      Marcelo Ricardo Leitner says:
      
      ====================
      Rename WORD_TRUNC/ROUND macros and use them
      
      This patchset aims to rename these macros to a non-confusing name, as
      reported by David Laight and David Miller, and to update the one
      remaining place that did not yet use them.
      
      v3:
      - Name it SCTP_PAD4 instead of SCTP_ALIGN4, as suggested by David Laight
      v2:
      - fixed 2nd patch summary
      
      Details on the specific changelogs.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: make use of SCTP_TRUNC4 macro · 4a225ce3
      Marcelo Ricardo Leitner authored
      And avoid the usage of '&~3'. This is the last place still not using
      the macro.
      Also break the line to make it easier to read.
      Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: rename WORD_TRUNC/ROUND macros · e2f036a9
      Marcelo Ricardo Leitner authored
      To something more meaningful these days, especially because this is
      working on packet headers or lengths, which are not tied to any CPU
      arch but to the protocol itself.
      
      So, WORD_TRUNC becomes SCTP_TRUNC4 and WORD_ROUND becomes SCTP_PAD4.
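      For illustration, the two operations can be sketched as follows. This is
      a minimal user-space version written for this note; the exact kernel
      definitions may differ in detail (for example in the types involved).

```c
#include <stddef.h>

/* Round a length down to a multiple of 4 bytes. */
#define SCTP_TRUNC4(s) ((s) & ~(size_t)3)

/* Round a length up to a multiple of 4 bytes (i.e. add padding). */
#define SCTP_PAD4(s)   (((s) + 3) & ~(size_t)3)
```

      A 5-byte chunk therefore pads to 8 bytes, while truncation of a 7-byte
      length yields 4; lengths that are already multiples of 4 are unchanged.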
      Reported-by: David Laight <David.Laight@ACULAB.COM>
      Reported-by: David Miller <davem@davemloft.net>
      Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'mlx5e-xdp' · b80b8d7a
      David S. Miller authored
      Tariq Toukan says:
      
      ====================
      mlx5e XDP support
      
      This series adds XDP support in mlx5e driver.
      This includes the use cases: XDP_DROP, XDP_PASS, and XDP_TX.
      
      Single stream performance tests show 16.5 Mpps for XDP_DROP,
      and 12.4 Mpps for XDP_TX, with nice scalability for multiple streams/rings.
      
      This rate of XDP_DROP is lower than the 32 Mpps we got in previous
      implementation, when Striding RQ was used.
      
      We moved to non-Striding RQ, as some XDP_TX requirements (like headroom,
      packet-per-page) cannot be satisfied with the current Striding RQ HW,
      and we decided to fully support both DROP/TX.
      
      A few directions are being considered to enable the faster rate for XDP_DROP,
      e.g. letting users enable Striding RQ so that they get optimized
      XDP_DROP at the price of partial XDP_TX functionality, or some HW changes.
      
      Series generated against net-next commit:
      cf714ac1 'ipvlan: Fix dependency issue'
      
      Thanks,
      Tariq
      
      V2:
      * patch 8:
       - when XDP_TX fails, call mlx5e_page_release and drop the packet.
       - update xdp_tx counter within mlx5e_xmit_xdp_frame.
         (mlx5e_xmit_xdp_frame return value becomes obsolete, change it to void)
       - drop the packet for unknown XDP return code.
      * patch 9:
       - use a boolean for xdp_doorbell in SQ struct, instead of dragging it
         throughout the functions calls.
       - handle doorbell and counters within mlx5e_xmit_xdp_frame.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5e: XDP TX xmit more · 35b510e2
      Saeed Mahameed authored
      Previously we rang the XDP SQ doorbell on every forwarded XDP packet.
      
      Here we introduce an xmit-more-like mechanism that queues up more
      than one packet into the SQ (up to the RX napi budget) w/o notifying the hardware.
      
      Once the RX napi budget is consumed and we exit the napi RX loop, we
      flush (doorbell) all XDP looped packets, if there are any.
      
      XDP forward packet rate:
      
      Comparing XDP with and w/o xmit more (bulk transmit):
      
      RX Cores    XDP TX       XDP TX (xmit more)
      ---------------------------------------------------
      1           6.5Mpps      12.4Mpps
      2          13.2Mpps      24.2Mpps
      4          25.2Mpps      36.3Mpps*
      8          36.3Mpps*     36.3Mpps*
      
      *My xmitter was limited to 36.3Mpps, so it is the bottleneck.
      It seems that receive side can handle more.
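      The batching idea can be sketched in user space as below. This is a
      simplified model, not the driver code: struct xdp_sq, its counters and
      the three functions are hypothetical names chosen for illustration.

```c
/* Hypothetical, simplified send queue: counts queued packets and doorbells. */
struct xdp_sq {
    unsigned int pending;   /* packets queued since the last doorbell */
    unsigned int doorbells; /* number of times the hardware was notified */
    unsigned int sent;      /* packets handed over to the hardware */
};

/* Queue one forwarded packet without notifying the hardware. */
static void xdp_sq_queue(struct xdp_sq *sq)
{
    sq->pending++;
}

/* Ring the doorbell once for everything queued, if anything is queued. */
static void xdp_sq_flush(struct xdp_sq *sq)
{
    if (!sq->pending)
        return;
    sq->sent += sq->pending;
    sq->pending = 0;
    sq->doorbells++;
}

/* One RX poll: forward up to 'budget' packets, then flush once on exit. */
static unsigned int rx_poll(struct xdp_sq *sq, unsigned int available,
                            unsigned int budget)
{
    unsigned int done = 0;

    while (done < available && done < budget) {
        xdp_sq_queue(sq);
        done++;
    }
    xdp_sq_flush(sq);
    return done;
}
```

      In this model a poll that forwards 64 packets rings the doorbell once
      rather than 64 times, and an empty poll does not ring it at all.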
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5e: XDP TX forwarding support · b5503b99
      Saeed Mahameed authored
      Adding support for XDP_TX forwarding from xdp program.
      Using XDP, now user can loop packets out of the same port.
      
      We create a dedicated TX SQ for each channel that will serve
      XDP programs that return XDP_TX action to loop packets back to
      the wire directly from the channel RQ RX path.
      
      For that RX pages will now need to be mapped bi-directionally,
      and on XDP_TX action we will sync the page back to device then
      queue it into SQ for transmission.  The XDP xmit frame function will
      report back to the RX path if the page was consumed (transmitted), if so,
      RX path will forget about that page as if it were released to the stack.
      Later on, on XDP TX completion, the page will be released back to the
      page cache.
      
      For simplicity this patch will hit a doorbell on every XDP TX packet.
      
      The next patch will introduce an xmit-more-like mechanism that will
      queue up more than one packet into the SQ w/o notifying the hardware;
      once the RX napi loop is done we will hit the doorbell once for all XDP TX
      packets from the previous loop.  This should drastically improve
      XDP TX performance.
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5e: Have a clear separation between different SQ types · f10b7cc7
      Saeed Mahameed authored
      Make a clear separation between Regular SQ (TXQ) and ICO SQ creation and
      destruction, and union their mutual information structures.
      
      Don't allocate redundant TXQ skb/wqe_info/dma_fifo arrays for ICO SQ.
      And have a different SQ edge for ICO SQ than TXQ SQ, to be more
      accurate.
      
      In preparation for XDP TX support.
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5e: XDP fast RX drop bpf programs support · 86994156
      Rana Shahout authored
      Add support for the BPF_PROG_TYPE_PHYS_DEV hook in mlx5e driver.
      
      When XDP is on we make sure to change channels RQs type to
      MLX5_WQ_TYPE_LINKED_LIST rather than "striding RQ" type to
      ensure "page per packet".
      
      On XDP set, we fail if HW LRO is set and request from user to turn it
      off.  Since on ConnectX4-LX HW LRO is always on by default, this will be
      annoying, but we prefer not to enforce LRO off from XDP set function.
      
      Full channels reset (close/open) is required only when setting XDP
      on/off.
      
      When XDP set is called just to exchange programs, we will update
      each RQ xdp program on the fly. To synchronize with the current
      data path RX activity of that RQ, we temporarily disable that RQ and
      ensure the RX path is not running, quickly update and re-enable that RQ.
      For that we do:
      	- rq.state = disabled
      	- napi_synchronize
      	- xchg(rq->xdp_prg)
      	- rq.state = enabled
      	- napi_schedule // Just in case we've missed an IRQ
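      The steps above can be sketched in user-space C11 as follows. This is an
      illustration under stated assumptions, not the driver code: struct rq,
      struct xdp_prog, wait_for_rx_idle() and kick_rx() are hypothetical
      stand-ins, and there is no concurrent poller in the sketch, so the wait
      step is a no-op.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct xdp_prog { int id; }; /* hypothetical placeholder for a loaded program */

struct rq {
    atomic_bool enabled;                 /* RX path checks this before running */
    _Atomic(struct xdp_prog *) xdp_prog; /* program used by the RX path */
    unsigned int kicks;                  /* counts re-schedule requests */
};

/* Stand-ins for napi_synchronize() and napi_schedule(). */
static void wait_for_rx_idle(struct rq *rq) { (void)rq; }
static void kick_rx(struct rq *rq) { rq->kicks++; }

/* Swap the program following the five listed steps; returns the old one. */
static struct xdp_prog *rq_swap_prog(struct rq *rq, struct xdp_prog *new_prog)
{
    struct xdp_prog *old;

    atomic_store(&rq->enabled, false);              /* rq.state = disabled */
    wait_for_rx_idle(rq);                           /* wait for RX to stop */
    old = atomic_exchange(&rq->xdp_prog, new_prog); /* exchange the program */
    atomic_store(&rq->enabled, true);               /* rq.state = enabled */
    kick_rx(rq);                                    /* in case an IRQ was missed */
    return old;
}
```

      The caller would release the returned old program once it is certain
      that no RX path can still be using it.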
      
      Packet rate performance testing was done with pktgen 64B packets on the
      TX side, and TC drop action on the RX side compared to XDP fast drop.
      
      CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
      
      Comparison is done between:
      	1. Baseline, Before this patch with TC drop action
      	2. This patch with TC drop action
      	3. This patch with XDP RX fast drop
      
      RX Cores  Baseline(TC drop)    TC drop    XDP fast Drop
      --------------------------------------------------------------
      1            5.3Mpps           5.3Mpps     16.5Mpps
      2           10.2Mpps          10.2Mpps     31.3Mpps
      4           20.5Mpps          19.9Mpps     36.3Mpps*
      
      *My xmitter was limited to 36.3Mpps, so it is the bottleneck.
      It seems that receive side can handle more.
      Signed-off-by: Rana Shahout <ranas@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5e: Dynamic RQ type infrastructure · 2fc4bfb7
      Saeed Mahameed authored
      Add two helper functions to allow dynamic changes of RQ type.
      
      mlx5e_set_rq_priv_params and mlx5e_set_rq_type_params will be
      used on netdev creation to determine the default RQ type.
      
      This will be needed later for downstream patches of XDP support.
      When enabling XDP we will dynamically move from striding RQ to
      linked list RQ type.
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5e: Slightly reduce hardware LRO size · e4b85508
      Saeed Mahameed authored
      Before this patch the LRO size was 64K. Now that build_skb requires
      extra room, the headroom + sizeof(skb_shared_info) added to the data
      buffer makes the wqe size or page_frag_size slightly larger than
      64K, which demands an order 5 page instead of order 4 on 4K page systems.
      
      We take those extra bytes from hardware LRO data size in order to not
      increase the required page order for when hardware LRO is enabled.
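      The page-order arithmetic can be checked with a small sketch. It assumes
      4K pages as in the text; page_order() is a hypothetical helper written
      for this note and is not the kernel's own implementation.

```c
#include <stddef.h>

#define PAGE_SIZE_4K 4096u

/* Smallest order such that (PAGE_SIZE_4K << order) >= size. */
static unsigned int page_order(size_t size)
{
    unsigned int order = 0;
    size_t span = PAGE_SIZE_4K;

    while (span < size) {
        span <<= 1;
        order++;
    }
    return order;
}
```

      Exactly 64K fits in an order 4 allocation (16 pages), while even one
      extra byte pushes the request to order 5 (32 pages), which is why the
      extra room is taken out of the LRO data size instead.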
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5e: Union RQ RX info per RQ type · 21c59685
      Saeed Mahameed authored
      We have two types of RX RQs, and they use two separate sets of
      info arrays and structures in RX data path function.  Today those
      structures are mutually exclusive per RQ type, hence one kind is
      allocated on RQ creation according to the RQ type.
      
      For better cache locality and to minimize
      sizeof(struct mlx5e_rq), in this patch we define them as a union.
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5e: Build RX SKB on demand · 1bfecfca
      Saeed Mahameed authored
      For non-striding RQ configuration before this patch we had a ring
      with pre-allocated SKBs and mapped the SKB->data buffers for
      device.
      
      For robustness and better RX data buffers management, we allocate a
      page per packet and build_skb around it.
      
      This patch (which is a prerequisite for XDP) will actually reduce
      performance for normal stack usage, because we are now hitting a bottleneck
      in the page allocator. We use the page-cache to restore or even improve
      performance in comparison to the old RX scheme.
      
      Packet rate performance testing was done with pktgen 64B packets on xmit
      side and TC ingress dropping action on RX side.
      
      CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
      
      Comparison is done between:
       1.Baseline, before 'net/mlx5e: Build RX SKB on demand'
       2.Build SKB with RX page cache (This patch)
      
      RX Cores  Baseline    Build SKB+page-cache    Improvement
      -----------------------------------------------------------
      1          4.16Mpps       5.33Mpps                28%
      2          7.16Mpps      10.24Mpps                43%
      4         13.61Mpps      20.51Mpps                51%
      8         25.32Mpps      32.00Mpps                26%
      
      All respective cores were 100% utilized.
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: implement TSQ for retransmits · f9616c35
      Eric Dumazet authored
      We saw sch_fq drops caused by the per flow limit of 100 packets and TCP
      when dealing with large cwnd and bursts of retransmits.
      
      Even after increasing the limit to 1000, and even after commit
      10d3be56 ("tcp-tso: do not split TSO packets at retransmit time"),
      we can still have these drops.
      
      Under certain conditions, TCP can spend a considerable amount of
      time queuing thousands of skbs in a single tcp_xmit_retransmit_queue()
      invocation, incurring latency spikes and stalls of other softirq
      handlers.
      
      This patch implements TSQ for retransmits, limiting number of packets
      and giving more chance for scheduling packets in both ways.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'mv88e6390-prep' · 0f1100c1
      David S. Miller authored
      Andrew Lunn says:
      
      ====================
      Preparation for mv88e6390
      
      These two patches are a couple of preparation steps for supporting
      the MV88E6390 family of chips. This is a new generation from Marvell,
      and will need more feature flags than are currently available in an
      unsigned long. Expand to an unsigned long long. The MV88E6390 also
      places its port registers somewhere else, so add a wrapper around port
      register access.
      
      v2:
       Rework wrappers to use mv88e6xxx_{read|write}
       Simplify some (err < ) to (err)
       Add Reviewed-by tag.
      
      v3:
       reg = reg & foo -> reg &= foo
       Fix overzealous s/ret/err
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: mv88e6xxx: Convert flag bits to unsigned long long · d6b1023a
      Andrew Lunn authored
      We are soon going to run out of flag bits on 32bit systems. Convert to
      unsigned long long.
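      A minimal sketch of why the wider type matters, assuming hypothetical
      flag names and numbers (the driver's real flag list is not shown here).
      On 32-bit systems unsigned long holds only 32 bits, so a flag numbered 32
      or higher needs a 64-bit constant and a 64-bit flags word.

```c
#include <stdbool.h>

/* Hypothetical flag numbers; 40 would not fit in a 32-bit unsigned long. */
enum chip_flag {
    CHIP_FLAG_EXAMPLE_LOW  = 3,
    CHIP_FLAG_EXAMPLE_HIGH = 40,
};

/* Use a 64-bit constant so the shift is well defined past bit 31. */
#define CHIP_FLAG_BIT(n) (1ULL << (n))

/* Test whether a flag is set in a 64-bit flags word. */
static bool chip_has(unsigned long long flags, enum chip_flag f)
{
    return (flags & CHIP_FLAG_BIT(f)) != 0;
}
```

      Writing the constant as 1ULL rather than 1 or 1UL is the important part:
      shifting a 32-bit value by 40 positions is undefined behaviour in C.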
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: mv88e6xxx: Add helper for accessing port registers · 0e7b9925
      Andrew Lunn authored
      There is a device coming soon which places its port registers
      somewhere different to all other Marvell switches supported so far.
      Add helper functions for reading/writing port registers, making it
      easier to handle this new device.
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ptp_clock: future-proofing drivers against PTP subsystem becoming optional · efee95f4
      Nicolas Pitre authored
      Drivers must be ready to accept NULL from ptp_clock_register() if the
      PTP clock subsystem is configured out.
      
      This patch documents that and ensures that all drivers cope well
      with a NULL return.
      Signed-off-by: Nicolas Pitre <nico@linaro.org>
      Reviewed-by: Eugenia Emantayev <eugenia@mellanox.com>
      Acked-by: Richard Cochran <richardcochran@gmail.com>
      Acked-by: Edward Cree <ecree@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: hisilicon: hns: use new api ethtool_{get|set}_link_ksettings · d270f76c
      Philippe Reynes authored
      The ethtool api {get|set}_settings is deprecated.
      We move this driver to new api {get|set}_link_ksettings.
      Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: hisilicon: hns: use phydev from struct net_device · 262b38cd
      Philippe Reynes authored
      The private structure contains a pointer to phydev, but the structure
      net_device already contains such a pointer. So we can remove the pointer
      phydev in the private structure, and update the driver to use the
      one contained in struct net_device.
      Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: mediatek: fix missing changes merged for conflicts overlapping commits · e82f7148
      Sean Wang authored
      Add the changes from the following commits:
      1) Commit d3bd1ce4
         ("remove redundant free_irq for devm_request_irq allocated irq")
      2) Commit 7c6b0d76
         ("fix logic unbalance between probe and remove")
      
      which went missing while resolving conflicts between overlapping commits in
      Commit b20b378d
      ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
      Signed-off-by: Sean Wang <sean.wang@mediatek.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'cxgb4-tc-offload' · f9d1846f
      David S. Miller authored
      Rahul Lakkireddy says:
      
      ====================
      cxgb4: add support for offloading TC u32 filters
      
      This series of patches adds support for offloading TC u32 filters onto
      Chelsio NICs.
      
      Patch 1 moves current common filter code to separate files
      in order to provide a common api for performing packet classification
      and filtering in Chelsio NICs.
      
      Patch 2 enables filters for normal NIC configuration and implements
      common api for setting and deleting filters.
      
      Patches 3-5 add support for TC u32 offload via ndo_setup_tc.
      
      ---
      v3:
      
      Based on review and suggestion from David Miller <davem@davemloft.net>
      - Fixed all local variable declarations by placing them in longest line
        first and shortest line last order.
      
      v2:
      
      Based on review and suggestions from Jiri Pirko <jiri@resnulli.us>:
      - Replaced macros S and U with appropriate static helper functions.
      - Moved completion code for set and delete filters to respective
        functions cxgb4_set_filter() and cxgb4_del_filter().  Renamed the
        original functions to __cxgb4_set_filter() and __cxgb4_del_filter()
        in case synchronization is not required.
      - Dropped debugfs patch.
      - Merged code for inserting and deleting u32 filters into a single
        patch.
      - Reworked and fixed bugs with traversing the actions list.
      - Removed all unnecessary extra ().
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • cxgb4: add support for drop and redirect actions · b20ff726
      Rahul Lakkireddy authored
      Add support for dropping matched packets in hardware.  Also add support
      for re-directing matched packets to a specified port in hardware.
      Signed-off-by: default avatarRahul Lakkireddy <rahul.lakkireddy@chelsio.com>
      Signed-off-by: default avatarHariprasad Shenai <hariprasad@chelsio.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b20ff726
    • cxgb4: add support for offloading u32 filters · d8931847
      Rahul Lakkireddy authored
      Add support for offloading u32 filters to hardware.  Links are stored
      in a jump table to perform necessary jumps to match TCP/UDP header.
      When inserting rules in the linked bucket, the TCP/UDP match fields
      in the corresponding entry of the jump table are appended to the filter
      rule before insertion.  If a link is deleted, then all corresponding
      filters associated with the link are also deleted.  Also enable
      hardware tc offload as a supported feature.
      Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
      Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d8931847
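      The jump-table behaviour described in d8931847 can be illustrated with
      a small userspace sketch.  This is not the cxgb4 code: every name here
      (struct offload_state, add_link(), insert_linked(), delete_link()) and
      the table sizes are hypothetical, and the match is reduced to an IP
      protocol plus a destination port.  It only shows the two properties the
      commit message states: a rule inserted into a linked bucket picks up the
      match fields recorded for that link, and deleting a link removes every
      filter that was inserted through it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_LINKS   4  /* hypothetical sizes for this sketch */
#define MAX_FILTERS 8

struct match {
	uint8_t ip_proto;  /* e.g. 6 = TCP, 17 = UDP */
	uint16_t dport;    /* destination port, host order in this sketch */
};

struct link_entry {
	bool in_use;
	uint32_t handle;   /* handle of the linked bucket */
	struct match m;    /* match fields recorded for this link */
};

struct filter {
	bool in_use;
	uint32_t link_handle;  /* link this filter was inserted through */
	struct match m;
};

struct offload_state {
	struct link_entry links[MAX_LINKS];
	struct filter filters[MAX_FILTERS];
};

static struct link_entry *find_link(struct offload_state *s, uint32_t handle)
{
	size_t i;

	assert(s != NULL);
	for (i = 0; i < MAX_LINKS; i++)
		if (s->links[i].in_use && s->links[i].handle == handle)
			return &s->links[i];
	return NULL;
}

/* Record a link and the header match that leads to it. */
static bool add_link(struct offload_state *s, uint32_t handle, uint8_t ip_proto)
{
	size_t i;

	if (find_link(s, handle))
		return false;
	for (i = 0; i < MAX_LINKS; i++) {
		if (s->links[i].in_use)
			continue;
		s->links[i].in_use = true;
		s->links[i].handle = handle;
		s->links[i].m.ip_proto = ip_proto;
		s->links[i].m.dport = 0;
		return true;
	}
	return false;
}

/* Insert a rule into a linked bucket; returns the filter index or -1. */
static int insert_linked(struct offload_state *s, uint32_t handle, uint16_t dport)
{
	struct link_entry *l = find_link(s, handle);
	size_t i;

	if (!l)
		return -1;
	for (i = 0; i < MAX_FILTERS; i++) {
		if (s->filters[i].in_use)
			continue;
		s->filters[i].in_use = true;
		s->filters[i].link_handle = handle;
		s->filters[i].m = l->m;         /* fields carried over from the link */
		s->filters[i].m.dport = dport;  /* the rule's own match */
		return (int)i;
	}
	return -1;
}

/* Delete a link and every filter inserted through it; returns filters removed. */
static int delete_link(struct offload_state *s, uint32_t handle)
{
	struct link_entry *l = find_link(s, handle);
	int removed = 0;
	size_t i;

	if (!l)
		return -1;
	for (i = 0; i < MAX_FILTERS; i++) {
		if (s->filters[i].in_use && s->filters[i].link_handle == handle) {
			s->filters[i].in_use = false;
			removed++;
		}
	}
	l->in_use = false;
	return removed;
}
```

      A fixed-size array keeps the sketch short; a real driver would size the
      table from hardware limits and program the filters asynchronously.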
    • cxgb4: add parser to translate u32 filters to internal spec · 2e8aad7b
      Rahul Lakkireddy authored
      Parse information sent by u32 into internal filter specification.
      Add support for parsing several fields in IPv4, IPv6, TCP, and UDP.
      Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
      Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2e8aad7b
    • cxgb4: add common api support for configuring filters · 578b46b9
      Rahul Lakkireddy authored
      Enable filters for non-offload configuration and add common API support
      for setting and deleting filters in the LE-TCAM region of the hardware.
      
      IPv4 filters occupy one slot.  IPv6 filters occupy 4 slots and must
      start on a 4-slot boundary.  IPv4 filters cannot occupy a slot belonging
      to an IPv6 filter, and vice versa.
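      The slot rules above can be sketched in plain C.  This is a userspace
      illustration, not the driver's implementation: NSLOTS, struct
      filter_table, slot_available() and slot_claim() are hypothetical names,
      and the table is a simple ownership array rather than the LE-TCAM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NSLOTS     16  /* hypothetical table size for this sketch */
#define IPV6_SLOTS 4   /* from the commit text: an IPv6 filter needs 4 slots */

enum slot_owner { SLOT_FREE = 0, SLOT_IPV4, SLOT_IPV6 };

struct filter_table {
	enum slot_owner owner[NSLOTS];
};

/* Return true if a filter of the given family may be placed at idx. */
static bool slot_available(const struct filter_table *t, size_t idx, bool ipv6)
{
	size_t n = ipv6 ? IPV6_SLOTS : 1;
	size_t i;

	if (idx >= NSLOTS || idx + n > NSLOTS)
		return false;
	/* IPv6 filters must start on a 4-slot boundary. */
	if (ipv6 && (idx % IPV6_SLOTS) != 0)
		return false;
	/* Every slot the filter would cover must be free. */
	for (i = 0; i < n; i++)
		if (t->owner[idx + i] != SLOT_FREE)
			return false;
	return true;
}

/* Claim the slots for a filter; returns false if the placement is invalid. */
static bool slot_claim(struct filter_table *t, size_t idx, bool ipv6)
{
	size_t n = ipv6 ? IPV6_SLOTS : 1;
	size_t i;

	assert(t != NULL);
	if (!slot_available(t, idx, ipv6))
		return false;
	for (i = 0; i < n; i++)
		t->owner[idx + i] = ipv6 ? SLOT_IPV6 : SLOT_IPV4;
	return true;
}
```

      With an empty table, claiming an IPv6 filter at index 4 succeeds, after
      which an IPv4 filter at index 5 is rejected, as is an IPv6 filter at the
      unaligned index 2.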
      
      Filters are set and deleted asynchronously.  Use a completion to wait
      for the reply from firmware so that callers can synchronize if needed.
      Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
      Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      578b46b9
    • cxgb4: move common filter code to separate file · d57fd6ca
      Rahul Lakkireddy authored
      Move common filter code to separate files.  Also fix the following
      checkpatch checks.
      
      CHECK: Comparison to NULL could be written "!f->l2t"
      +               if (f->l2t == NULL) {
      
      CHECK: spaces preferred around that '/' (ctx:VxV)
      +       fwr->len16_pkd = htonl(FW_WR_LEN16_V(sizeof(*fwr)/16));
      Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
      Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d57fd6ca
    • net: skbuff: Coding: Use eth_type_vlan() instead of open coding it · ecf4ee41
      Shmulik Ladkani authored
      Fix 'skb_vlan_pop' to use eth_type_vlan instead of directly comparing
      skb->protocol to ETH_P_8021Q or ETH_P_8021AD.
      Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
      Reviewed-by: Pravin B Shelar <pshelar@ovn.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ecf4ee41
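      For readers without the kernel tree at hand, the change in ecf4ee41 is
      roughly the following, shown here as a userspace sketch.  The helper
      eth_type_vlan_sketch() stands in for the kernel's eth_type_vlan(), and
      struct pkt is a hypothetical, stripped-down stand-in for struct sk_buff;
      only the two EtherType values are taken from the commit message.

```c
#include <arpa/inet.h>  /* htons() */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#ifndef ETH_P_8021Q
#define ETH_P_8021Q  0x8100  /* 802.1Q VLAN tag */
#endif
#ifndef ETH_P_8021AD
#define ETH_P_8021AD 0x88A8  /* 802.1ad service VLAN tag */
#endif

/* Userspace stand-in for the kernel helper; proto is in network byte order. */
static bool eth_type_vlan_sketch(uint16_t proto)
{
	return proto == htons(ETH_P_8021Q) || proto == htons(ETH_P_8021AD);
}

/* Hypothetical packet descriptor for this sketch (not struct sk_buff). */
struct pkt {
	uint16_t protocol;  /* EtherType, network byte order */
};

/* Before: the open-coded comparison against both EtherTypes. */
static bool is_vlan_open_coded(const struct pkt *p)
{
	return p->protocol == htons(ETH_P_8021Q) ||
	       p->protocol == htons(ETH_P_8021AD);
}

/* After: the same test expressed through the helper. */
static bool is_vlan_helper(const struct pkt *p)
{
	assert(p != NULL);
	return eth_type_vlan_sketch(p->protocol);
}
```

      Both functions give the same answer for every EtherType; the helper
      simply keeps the list of VLAN EtherTypes in one place.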
    • net: skbuff: Remove errornous length validation in skb_vlan_pop() · 636c2628
      Shmulik Ladkani authored
      In commit 93515d53 ("net: move vlan pop/push functions into common
      code"), skb_vlan_pop was moved from its private location in openvswitch
      to skbuff common code.
      
      If the skb has a non hw-accel vlan tag, the original 'pop_vlan()' ensured
      that skb->len is sufficient (if skb->len < VLAN_ETH_HLEN then pop was
      considered a no-op).
      
      This validation was moved as is into the new common 'skb_vlan_pop'.
      
      Alas, in its original location (openvswitch), there was a guarantee that
      'data' points to the mac_header, therefore the 'skb->len < VLAN_ETH_HLEN'
      condition made sense.
      However there's no such guarantee in the generic 'skb_vlan_pop'.
      
      For short packets received in the rx path and going through
      'skb_vlan_pop', this causes 'skb_vlan_pop' to fail to pop a valid vlan
      hdr (in the non hw-accel case) or to fail to move the next tag into the
      hw-accel tag.
      
      Remove the 'skb->len < VLAN_ETH_HLEN' condition entirely:
      It is superfluous since inner '__skb_vlan_pop' already verifies there
      are VLAN_ETH_HLEN writable bytes at the mac_header.
      
      Note this presents a slight change to skb_vlan_pop() users:
      In case total length is smaller than VLAN_ETH_HLEN, skb_vlan_pop() now
      returns an error, as opposed to previous "no-op" behavior.
      Existing callers (e.g. tc act vlan, ovs) usually drop the packet if
      'skb_vlan_pop' fails.
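      The difference between the two length checks can be shown with a small
      userspace model.  This is a sketch under stated assumptions, not kernel
      code: struct pkt and its offset fields are hypothetical, and the "new"
      check only models the inner verification described above (that
      VLAN_ETH_HLEN bytes are present starting at the mac_header).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define ETH_HLEN      14
#define VLAN_HLEN     4
#define VLAN_ETH_HLEN (ETH_HLEN + VLAN_HLEN)

/* Hypothetical packet model: offsets into one linear buffer. */
struct pkt {
	size_t mac_off;   /* offset of the mac header */
	size_t data_off;  /* offset of 'data'; len is counted from here */
	size_t end;       /* number of valid bytes in the buffer */
};

/* Length as seen from 'data', analogous to skb->len in this model. */
static size_t pkt_len(const struct pkt *p)
{
	assert(p->data_off <= p->end);
	return p->end - p->data_off;
}

/* Removed check: only meaningful when 'data' points at the mac header. */
static bool old_check_allows_pop(const struct pkt *p)
{
	return pkt_len(p) >= VLAN_ETH_HLEN;
}

/* Remaining check: measured from the mac header, wherever 'data' points. */
static bool new_check_allows_pop(const struct pkt *p)
{
	assert(p->mac_off <= p->end);
	return p->end - p->mac_off >= VLAN_ETH_HLEN;
}
```

      Take a 28-byte frame (14-byte Ethernet header, 4-byte VLAN header, 10
      bytes of payload).  When 'data' sits after the Ethernet header, as in
      the rx path, the length from 'data' is 14 and the old check skips the
      pop even though the tag is fully present; the check from the mac header
      sees all 28 bytes and allows it.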
      
      Fixes: 93515d53 ("net: move vlan pop/push functions into common code")
      Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
      Cc: Pravin Shelar <pshelar@ovn.org>
      Reviewed-by: Pravin B Shelar <pshelar@ovn.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      636c2628
    • Merge branch 'vlan_act_modify' · 1fbafcb8
      David S. Miller authored
      Shmulik Ladkani says:
      
      ====================
      act_vlan: Introduce TCA_VLAN_ACT_MODIFY vlan action
      
      TCA_VLAN_ACT_MODIFY allows one to change an existing tag.
      
      It accepts the same attributes as TCA_VLAN_ACT_PUSH (protocol, id,
      priority).
      If the packet is vlan tagged, the tag is overwritten according to
      the user-specified attributes.
      
      For example, this allows user to replace a tag's vid while preserving
      its priority bits (as opposed to "action vlan pop pipe action vlan push").
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1fbafcb8
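      The vid-versus-priority example in the 'vlan_act_modify' cover letter
      comes down to bit manipulation on the 16-bit VLAN TCI (3 priority bits,
      1 DEI/CFI bit, 12 vid bits).  The following is a userspace sketch, not
      the act_vlan code: tci_modify() and tci_pop_then_push() are hypothetical
      names, and the set_prio flag is an assumption about how "priority not
      specified" might be represented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VLAN_PRIO_MASK  0xe000u  /* priority code point, top 3 bits */
#define VLAN_PRIO_SHIFT 13
#define VLAN_VID_MASK   0x0fffu  /* VLAN identifier, low 12 bits */

/* Modify: overwrite the vid, and the priority only if one was supplied. */
static uint16_t tci_modify(uint16_t tci, uint16_t vid, bool set_prio, uint8_t prio)
{
	assert(vid <= VLAN_VID_MASK && prio <= 7);
	tci = (uint16_t)((tci & ~VLAN_VID_MASK) | vid);
	if (set_prio)
		tci = (uint16_t)((tci & ~VLAN_PRIO_MASK) |
				 ((unsigned int)prio << VLAN_PRIO_SHIFT));
	return tci;
}

/* Pop then push: the old tag is discarded, so its priority bits are lost. */
static uint16_t tci_pop_then_push(uint16_t vid, uint8_t prio)
{
	assert(vid <= VLAN_VID_MASK && prio <= 7);
	return (uint16_t)(vid | ((unsigned int)prio << VLAN_PRIO_SHIFT));
}
```

      Starting from a tag with priority 5 and vid 100 (TCI 0xa064), changing
      the vid to 200 with tci_modify() gives 0xa0c8 and keeps priority 5,
      whereas tci_pop_then_push(200, 0) gives 0x00c8 with the priority reset.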