1. 11 May, 2016 20 commits
    • Merge branch 'mlx5-next' · 6a47a570
      David S. Miller authored
      Saeed Mahameed says:
      
      ====================
      Mellanox 100G mlx5 CQE compression
      
      Introduce the ConnectX-4 CQE (Completion Queue Entry) compression feature
      for the mlx5 Ethernet driver.
      
      CQE compression reduces PCI overhead by coalescing and compressing multiple CQEs into a
      single merged CQE.  Successful compression improves message rate, especially for small-packet
      traffic.
      
      CQE compression in detail:
      
      Instead of writing full CQEs to memory, multiple almost identical CQEs are merged and compressed.
      Information that is shared between the CQEs is written once, regardless of the number of
      compressed CQEs.  In addition, only the unique information (a small number of bytes compared
      to the full CQE size) is written per CQE.
      
      CQE Compression Block:
      
      This block contains multiple compressed CQEs.  A CQE Compression Block contains a single copy
      of the CQE properties that are shared by all the compressed CQEs (called the Title, see below)
      and multiple mini CQEs (CQEs in compressed form).
      
      Title:
      
      The Title holds information which is shared between all the compressed CQEs in the CQE Compression
      Block.  In each Compression Block there is only a single Title regardless of the number
      of compressed CQEs.
      
      Mini CQE:
      
      A CQE in compressed form that holds only the data needed to reconstruct a single full CQE,
      for example 8 bytes instead of 64 bytes.
      The information shared by all compressed CQEs in the same CQE Compression Block (the Title)
      is written once; only the information unique to each compressed CQE (the mini CQE, for
      example 8 bytes) is written per compressed CQE.
      
      Since CQE compression can add overhead to the software (CPU),
      it is enabled only on "weak/slow" PCI slots, where it can actually help.
      
      Applied on top: c047c3b1 ('netfilter: conntrack: remove uninitialized shadow variable')
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
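The Title and mini CQE relationship described in this cover letter can be sketched in C as follows. This is a minimal illustration, not the mlx5 implementation: the struct layouts, field names and the `decompress_block()` helper are all hypothetical, and the byte-count helpers assume that the Title occupies one full 64-byte CQE slot and that each mini CQE takes 8 bytes, using the example sizes from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layouts for illustration only; the real hardware
 * formats are not given in the text. */
struct title_cqe {		/* fields shared by every CQE in the block */
	uint32_t flow_tag;
	uint16_t wqe_id;
	uint8_t  opcode;
};

struct mini_cqe {		/* 8 bytes of per-completion data */
	uint32_t byte_cnt;
	uint32_t checksum;
};

struct full_cqe {		/* what software consumes per completion */
	uint32_t flow_tag;
	uint16_t wqe_id;
	uint8_t  opcode;
	uint32_t byte_cnt;
	uint32_t checksum;
};

/* Rebuild n full CQEs from one Title plus n mini CQEs. */
void decompress_block(const struct title_cqe *title,
		      const struct mini_cqe *minis, size_t n,
		      struct full_cqe *out)
{
	assert(title && minis && out);

	for (size_t i = 0; i < n; i++) {
		/* Shared part: copied from the single Title. */
		out[i].flow_tag = title->flow_tag;
		out[i].wqe_id   = title->wqe_id;
		out[i].opcode   = title->opcode;
		/* Unique part: taken from the i-th mini CQE. */
		out[i].byte_cnt = minis[i].byte_cnt;
		out[i].checksum = minis[i].checksum;
	}
}

/* Bytes written for n completions without compression (64 bytes each). */
size_t bytes_uncompressed(size_t n)
{
	return 64 * n;
}

/* Bytes written with compression: one 64-byte Title (assumed) plus
 * 8 bytes per mini CQE. */
size_t bytes_compressed(size_t n)
{
	return n ? 64 + 8 * n : 0;
}
```

Under these assumptions, eight completions cost 128 bytes instead of 512. The saving is paid for by the copy loop in `decompress_block()`, which is the software (CPU) overhead the cover letter mentions.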
    • net/mlx5e: Enable CQE compression when PCI is slower than link · b797a684
      Saeed Mahameed authored
      We turn the feature ON only for servers with PCI BW < MAX LINK BW, as it
      helps reduce PCI pressure on weak PCI slots but adds some software
      overhead.
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
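The enable condition stated in this commit message (PCI BW < MAX LINK BW) reduces to a single comparison. The sketch below is only an illustration: the function name, the choice of Mb/s as the unit and the way the two bandwidths are obtained are assumptions, not the driver's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return true when the PCI slot, not the link, is the bottleneck.
 * Both values must be in the same unit (Mb/s is assumed here). */
bool cqe_compress_wanted(uint32_t pci_bw_mbps, uint32_t max_link_bw_mbps)
{
	/* A zero link speed would mean "unknown" in this sketch. */
	assert(max_link_bw_mbps > 0);

	return pci_bw_mbps < max_link_bw_mbps;
}
```

When the slot can carry at least the full link rate, the function returns false and the driver would skip the extra decompression work.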
    • net/mlx5e: Expand WQE stride when CQE compression is enabled · d9d9f156
      Tariq Toukan authored
      Make the MPWQE/Striding RQ default configuration dynamic and not
      statically set at compile time.  Now at driver load we set the
      stride size and number of strides dynamically.
      
      By default we use the same values as before, but when CQE compression
      is enabled, we set a larger stride size to benefit from CQE
      compression for larger packets.
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5e: CQE compression · 7219ab34
      Tariq Toukan authored
      The CQE compression feature is meant to save PCIe bandwidth by
      compressing a few CQEs into a smaller number of bytes on PCIe.
      CQE compression can be selectively enabled per CQ.  It is disabled
      by default for now and will be enabled later on.
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'more-dsa-probing' · c1869d58
      David S. Miller authored
      Andrew Lunn says:
      
      ====================
      More enabler patches for DSA probing
      
      The complete set of patches for the reworked DSA probing is too big to
      post at once. This subset contains some enablers which are easy to
      review.
      
      Eventually, the Marvell driver will instantiate its own internal MDIO
      bus, rather than have the framework do it, which allows devices on the
      bus to be listed in the device tree. Initialize the main mutex as soon
      as it is created, to avoid lifetime issues with the MDIO bus.
      
      A previous patch renamed all the DSA probe functions to make room for
      a true device probe. However the recent merging of all the Marvell
      switch drivers resulted in mv88e6xxx going back to the old probe
      name. Rename it again, so we can have a driver probe function.
      
      Add minimum support for the Marvell switch driver to probe as an MDIO
      device, as well as a DSA driver. Later patches will then register
      this device with the new DSA core framework.
      
      Move the GPIO reset code out of the DSA code. Different drivers may
      need different reset mechanisms, e.g. via a reset controller for
      memory mapped devices. Don't clutter up the core with this. Let each
      driver implement what it needs.
      
      master_dev is no longer needed in the switch drivers, since they have
      access to a device pointer from the probe function. Remove it.
      
      Let the switch parse the EEPROM length from its own device tree
      node. This is required with the new binding when the central DSA
      platform device no longer exists.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dsa: mv88e6xxx: Handle eeprom-length property · f8cd8753
      Andrew Lunn authored
      A switch can export an attached EEPROM using the standard ethtool API.
      However the switch itself cannot determine the size of the EEPROM, and
      multiple sizes are allowed. Thus a device tree property is supported
      to indicate the length of the EEPROM. Parse this property during
      device probe, and implement a callback function to retrieve it.
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dsa: Rename switch chip data to cd · ff04955c
      Andrew Lunn authored
      The dsa_switch structure contains a dsa_chip_data member called pd.
      However in the rest of the code, pd is used for dsa_platform_data.
      This is confusing. Rename it cd, which is already often used in dsa.c
      and slave.c for this data type.
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dsa: Remove master_dev from switch structure · c33063d6
      Andrew Lunn authored
      The switch drivers only use the master_dev member for dev_info()
      messages.  Now that the device is passed to the old style probe, and
      new style drivers are probed as true linux drivers, this is no longer
      needed.
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dsa: Move gpio reset into switch driver · 52638f71
      Andrew Lunn authored
      Resetting the switch is something the driver does, not the framework.
      So move the parsing of this property into the driver.
      
      There are no in-kernel users of this property, so moving it does not
      break anything. There is, however, a board making its way into the
      kernel which will make use of this property.
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dsa: Add mdio device support to Marvell switches · 14c7b3c3
      Andrew Lunn authored
      Allow Marvell switches to be MDIO devices. Currently the driver just
      allocates the private structure and detects what device is on the
      bus. Later patches will make them register with the DSA framework.
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dsa: mv88e6xxx: Rename probe function to fit the normal pattern · fcdce7d0
      Andrew Lunn authored
      All other DSA drivers use _drv_ in their DSA probe function name, thus
      allowing for a true linux driver probe function to use the
      conventional name. Make mv88e6xxx fit this pattern.
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dsa: mv88e6xxx: Initialise the mutex as soon as it is created · b681957a
      Andrew Lunn authored
      By initialising it immediately, we don't run the danger of using it
      before it is initialised.
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: mv88e6xxx: add STU capability · cb9b9020
      Vivien Didelot authored
      Some switch models have an STU (per VLAN port state database). Add a new
      capability flag to the switch info, instead of checking the switch family.
      
      Also if the 6165 family has an STU, it must have a VTU, so add the
      MV88E6XXX_FLAG_VTU to its family flags.
      Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
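The switch from family checks to a capability flag can be sketched as a bitmask test. The flag names, the `switch_info` layout and the `switch_has()` helper below are hypothetical stand-ins chosen for this example; only the idea of a per-switch flag word comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability bits (not the driver's real values). */
enum {
	FLAG_VTU = 1u << 0,	/* VLAN table unit */
	FLAG_STU = 1u << 1,	/* per VLAN port state database */
};

/* Hypothetical per-model description carrying a capability mask. */
struct switch_info {
	const char *name;
	uint32_t flags;
};

/* Capability test that replaces "is this family X or Y?" checks. */
bool switch_has(const struct switch_info *info, uint32_t flag)
{
	assert(info != NULL);

	return (info->flags & flag) == flag;
}
```

A model with an STU would carry both bits, which mirrors the commit's point that a family with an STU must also have a VTU.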
    • net: dsa: mv88e6xxx: abstract VTU/STU data access · 15d7d7d4
      Vivien Didelot authored
      Both VTU and STU operations use the same routine to access their
      (common) data registers, with a different offset.
      
      Add VTU and STU specific read and write functions to the data registers
      to abstract the required offset.
      Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
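One way to picture "the same routine with a different offset" is a shared accessor over packed data registers, wrapped by VTU- and STU-specific helpers that hide the offset. Everything concrete below is an assumption made for the sketch: the register count, the 4-bit-per-port packing, the 2-bit field width and the offsets 0 and 2 are not taken from the commit message.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_PORTS     7	/* hypothetical port count */
#define NUM_DATA_REGS 2	/* 16-bit registers, 4 ports per register */

/* Shared routine: each port owns a 4-bit nibble, and `offset` selects
 * which 2-bit field inside that nibble is read. */
void data_read(const uint16_t *regs, unsigned int offset, uint8_t *out)
{
	assert(offset == 0 || offset == 2);

	for (unsigned int port = 0; port < NUM_PORTS; port++) {
		unsigned int shift = (port % 4) * 4 + offset;

		out[port] = (regs[port / 4] >> shift) & 0x3;
	}
}

/* Shared routine: write one 2-bit field per port, leaving the other
 * field in the same nibble untouched. */
void data_write(uint16_t *regs, unsigned int offset, const uint8_t *in)
{
	assert(offset == 0 || offset == 2);

	for (unsigned int port = 0; port < NUM_PORTS; port++) {
		unsigned int shift = (port % 4) * 4 + offset;

		regs[port / 4] &= (uint16_t)~(0x3u << shift);
		regs[port / 4] |= (uint16_t)((in[port] & 0x3u) << shift);
	}
}

/* Specific wrappers: callers no longer need to know the offset. */
void vtu_data_read(const uint16_t *regs, uint8_t *out) { data_read(regs, 0, out); }
void vtu_data_write(uint16_t *regs, const uint8_t *in) { data_write(regs, 0, in); }
void stu_data_read(const uint16_t *regs, uint8_t *out) { data_read(regs, 2, out); }
void stu_data_write(uint16_t *regs, const uint8_t *in) { data_write(regs, 2, in); }
```

With the wrappers in place, a wrong offset can only be introduced in one spot rather than at every call site.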
    • Merge branch 'vrf-pktinfo' · c3f1010b
      David S. Miller authored
      David Ahern says:
      
      ====================
      net: vrf: Fixup PKTINFO to return enslaved device index
      
      Applications such as OSPF and BFD need the original ingress device not
      the VRF device; the latter can be derived from the former. To that end
      move the packet intercept from an rx handler that is invoked by
      __netif_receive_skb_core to the ipv4 and ipv6 receive processing.
      
      IPv6 already saves the skb_iif to the control buffer in ipv6_rcv. Since
      the skb->dev has not been switched the cb has the enslaved device. Make
      the same happen for IPv4 by adding the skb_iif to inet_skb_parm and set
      it in ipv4 code after clearing the skb control buffer similar to IPv6.
      From there the pktinfo can just pull it from cb with the PKTINFO_SKB_CB
      cast.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: original ingress device index in PKTINFO · 0b922b7a
      David Ahern authored
      Applications such as OSPF and BFD need the original ingress device not
      the VRF device; the latter can be derived from the former. To that end
      add the skb_iif to inet_skb_parm and set it in ipv4 code after clearing
      the skb control buffer similar to IPv6. From there the pktinfo can just
      pull it from cb with the PKTINFO_SKB_CB cast.
      
      The previous patch moving the skb->dev change to L3 means nothing else
      is needed for IPv6; it just works.
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
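From the application side (the OSPF and BFD case mentioned in this commit), the ingress device index arrives in the `ipi_ifindex` field of an `IP_PKTINFO` control message. The sketch below shows only that userspace lookup. It assumes a Linux socket on which `IP_PKTINFO` was already enabled with `setsockopt()` and a `struct msghdr` filled in by `recvmsg()`; the helper name `pktinfo_ifindex()` is made up for this example, and nothing here is kernel code from the patch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* Walk the control messages of a received datagram and return the
 * ingress interface index from IP_PKTINFO, or -1 if none is present. */
int pktinfo_ifindex(struct msghdr *msg)
{
	struct cmsghdr *cmsg;

	assert(msg != NULL);

	for (cmsg = CMSG_FIRSTHDR(msg); cmsg; cmsg = CMSG_NXTHDR(msg, cmsg)) {
		if (cmsg->cmsg_level == IPPROTO_IP &&
		    cmsg->cmsg_type == IP_PKTINFO) {
			struct in_pktinfo pi;

			/* Copy out: CMSG_DATA() may not be suitably aligned. */
			memcpy(&pi, CMSG_DATA(cmsg), sizeof(pi));
			return pi.ipi_ifindex;
		}
	}
	return -1;
}
```

According to the commit message, the index returned this way is that of the enslaved device rather than the VRF device, and the VRF device can be derived from it.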
    • net: l3mdev: Add hook in ip and ipv6 · 74b20582
      David Ahern authored
      Currently the VRF driver uses the rx_handler to switch the skb device
      to the VRF device. Switching the dev prior to the ip / ipv6 layer
      means the VRF driver has to duplicate IP/IPv6 processing which adds
      overhead and makes features such as retaining the ingress device index
      more complicated than necessary.
      
      This patch moves the hook to the L3 layer just after the first NF_HOOK
      for PRE_ROUTING. This location makes exposing the original ingress device
      trivial (next patch) and allows adding other NF_HOOKs to the VRF driver
      in the future.
      
      dev_queue_xmit_nit is exported so that the VRF driver can cycle the skb
      with the switched device through the packet taps to maintain current
      behavior (tcpdump can be used on either the vrf device or the enslaved
      devices).
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: fix 4in6 tunnel receive path · ca4aa976
      Nicolas Dichtel authored
      Protocol for 4in6 tunnel is IPPROTO_IPIP. This was wrongly changed by
      the last cleanup.
      
      CC: Tom Herbert <tom@herbertland.com>
      Fixes: 0d3c703a ("ipv6: Cleanup IPv6 tunnel receive path")
      Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: replace cnt & rtt with struct in pkts_acked() · 756ee172
      Lawrence Brakmo authored
      Replace 2 arguments (cnt and rtt) in the congestion control modules'
      pkts_acked() function with a struct. This will allow adding more
      information without having to modify existing congestion control
      modules (tcp_nv in particular needs bytes in flight when packet
      was sent).
      
      As proposed by Neal Cardwell in his comments to the tcp_nv patch.
      Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
      Acked-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
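The benefit of passing a struct instead of two scalar arguments can be shown with a small userspace sketch. The type and field names below (`ack_sample`, `pkts_acked`, `rtt_us`, `conn`, `cong_ops`) are illustrative assumptions for this example and should not be read as the kernel's definitions; `struct conn` stands in for the socket state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative sample type: replaces the separate cnt and rtt arguments. */
struct ack_sample {
	uint32_t pkts_acked;	/* formerly "cnt" */
	int32_t  rtt_us;	/* formerly "rtt"; negative means no sample */
};

/* Hypothetical per-connection state kept by a congestion control module. */
struct conn {
	int32_t  min_rtt_us;	/* negative until the first valid sample */
	uint32_t acked;
};

/* Hypothetical ops table with the struct-based callback. */
struct cong_ops {
	void (*pkts_acked)(struct conn *c, const struct ack_sample *sample);
};

/* Example module: reads only the fields it knows about. */
void demo_pkts_acked(struct conn *c, const struct ack_sample *sample)
{
	assert(c != NULL && sample != NULL);

	c->acked += sample->pkts_acked;
	if (sample->rtt_us >= 0 &&
	    (c->min_rtt_us < 0 || sample->rtt_us < c->min_rtt_us))
		c->min_rtt_us = sample->rtt_us;
}

const struct cong_ops demo_ops = { .pkts_acked = demo_pkts_acked };
```

If a later change appends a field to `struct ack_sample` (for example bytes in flight, which the commit message says tcp_nv needs), `demo_pkts_acked()` keeps compiling unchanged because its signature does not change.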
    • Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge · cf88585b
      David S. Miller authored
      Antonio Quartulli says:
      
      ====================
      Included changes:
      - remove useless skb size check in batadv_interface_rx
      - basic netns support introduced by Andrew Lunn:
          - prevent virtual interface from changing netns by setting
            NETIF_F_NETNS_LOCAL
          - create virtual interface within the netns of the first
            hard-interface
      - introduce detection of complex bridge loops and report event
        to the user (via udev) when the Bridge Loop Avoidance mechanism
        can't prevent them
      - minor reference counting bugfixes for the hard_iface object that
        couldn't make it via the net tree
      - use kref_get() instead of kref_get_unless_zero() to make reference
        counting bugs more visible
      - use batadv_compare_eth() all over the code when possible instead of
        plain memcmp()
      - minor code cleanup and style adjustments
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 10 May, 2016 20 commits