Commits · 6fa01ccd883021105e9f8af7d04b9f156fa3494a · Kirill Smelkov / linux

25 Apr, 2016 40 commits

skbuff: Add pskb_extract() helper function · 6fa01ccd

Sowmini Varadhan authored Apr 22, 2016

A pattern of skb usage seen in modules such as RDS-TCP is to
extract `to_copy' bytes from the received TCP segment, starting
at some offset `off' into a new skb `clone'. This is done in
the ->data_ready callback, where the clone skb is queued up for rx on
the PF_RDS socket, while the parent TCP segment is returned unchanged
back to the TCP engine.

The existing code uses the sequence
	clone = skb_clone(..);
	pskb_pull(clone, off, ..);
	pskb_trim(clone, to_copy, ..);
with the intention of discarding the first `off' bytes. However,
skb_clone() + pskb_pull() implies pksb_expand_head(), which ends
up doing a redundant memcpy of bytes that will then get discarded
in __pskb_pull_tail().

To avoid this inefficiency, this commit adds pskb_extract() that
creates the clone, and memcpy's only the relevant header/frag/frag_list
to the start of `clone'. pskb_trim() is then invoked to trim clone
down to the requested to_copy bytes.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6fa01ccd

fq: add fair queuing framework · 557fc4a0

Michal Kazior authored Apr 22, 2016

This works on the same implementation principle as
codel*.h, i.e. there's a generic header with
structures and macros and a implementation header
carrying function definitions to include in given,
e.g. driver or module.

The fairness logic comes from
net/sched/sch_fq_codel.c but is generalized so it
is more flexible and easier to re-use.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

557fc4a0

Merge branch 'reusable-codel' · 05d82c42

David S. Miller authored Apr 25, 2016

Michal Kazior says:

====================
codel: make it reuseable beyond qdiscs

There's an ongoing effort in fixing wireless
bufferbloat. As part of that fq_codel is being
ported into mac80211. To prevent code duplication
codel.h needs to be slightly modified before it
can be used in mac80211 (or other drivers FWIW).

For more background please see:

  https://www.spinics.net/lists/linux-wireless/msg149976.html
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

05d82c42

codel: split into multiple files · d068ca2a

Michal Kazior authored Apr 22, 2016

It was impossible to include codel.h for the
purpose of having access to codel_params or
codel_vars structure definitions and using them
for embedding in other more complex structures.

This splits allows codel.h itself to be treated
like any other header file while codel_qdisc.h and
codel_impl.h contain function definitions with
logic that was previously in codel.h.

This copies over copyrights and doesn't involve
code changes other than adding a few additional
include directives to net/sched/sch*codel.c.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d068ca2a

codel: generalize the implementation · 79bdc4c8

Michal Kazior authored Apr 22, 2016

This strips out qdisc specific bits from the code
and makes it slightly more reusable. Codel will be
used by wireless/mac80211 in the future.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

79bdc4c8

macsec: Convert to using IFF_NO_QUEUE · e425974f

Phil Sutter authored Apr 22, 2016

Signed-off-by: Phil Sutter <phil@nwl.cc>
Acked-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

e425974f

route: move lwtunnel state to a single place · 0868e253

Jiri Benc authored Apr 22, 2016

Commit 751a587a ("route: fix breakage after moving lwtunnel state")
moved lwtstate to the end of dst_entry for 32bit archs. This makes it share
the cacheline with __refcnt which had an unkown effect on performance. For
this reason, the pointer was kept in place for 64bit archs.

However, later performance measurements showed this is of no concern. It
turns out that every performance sensitive path that accesses lwtstate
accesses also struct rtable or struct rt6_info which share the same cache
line.

Thus, to get rid of a few #ifdefs, move the field to the end of the struct
also for 64bit.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0868e253

Merge branch 'qed-next' · 6a025a50

David S. Miller authored Apr 25, 2016

Yuval Mintz says:

====================
qed*: driver updates

[Was previous termed 'eeprom access et al.', but seemed a bit
inappropriate given we've dropped the eeprom patch for now.
Still waiting for some inputs on that one, BTW]

This patch series contains some ethtool-related enhancements.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

6a025a50

qed: add support for link pause configuration. · a43f235f

Sudarsana Reddy Kalluru authored Apr 22, 2016

The APIs for making this sort of configuration [e.g., via ethtool] are
already present in qede, but the current configuration flow in qed doesn't
respect it.
Signed-off-by: Sudarsana Reddy Kalluru <sudarsana.kalluru@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a43f235f

qed*: Conditions for changing link · fe7cd2bf

Yuval Mintz authored Apr 22, 2016

There's some inconsistency in current logic determining whether the
link settings of a given interface can be changed; I.e., in all modes
other than the so-called `deault' mode the interfaces are forbidden from
changing the configuration - but even this rule is not applied to all
user APIs that may change the configuration.

Instead, let the core-module [qed] decide whether an interface can change
the configuration by supporting a new API function. We also revise the
current rule, allowing all interfaces to change their configurations while
laying the infrastructure for future modes where an interface would be
blocked from making such a configuration.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fe7cd2bf

qede: Add support for ethtool private flags · f3e72109

Yuval Mintz authored Apr 22, 2016

Adds a getter for the interfaces private flags.
The only parameter currently supported is whether the interface is a
coupled function [required for supporting 100g].
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f3e72109

qed*: Align statistics names · d4967cf3

Yuval Mintz authored Apr 22, 2016

There's a difference in statsitics' names starting at qed and
propagating to qede, where egress counters indicate ranges while ingress
counters indiciate high-end.
Align all statistcs to follow the same conventions - name indicates range.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d4967cf3

net: better drop monitoring in ip{6}_recv_error() · 960a2628

Eric Dumazet authored Apr 21, 2016

We should call consume_skb(skb) when skb is properly consumed,
or kfree_skb(skb) when skb must be dropped in error case.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

960a2628

tcp: SYN packets are now simply consumed · 0aea76d3

Eric Dumazet authored Apr 21, 2016

We now have proper per-listener but also per network namespace counters
for SYN packets that might be dropped.

We replace the kfree_skb() by consume_skb() to be drop monitor [1]
friendly, and remove an obsolete comment.
FastOpen SYN packets can carry payload in them just fine.

[1] perf record -a -g -e skb:kfree_skb sleep 1; perf report
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0aea76d3

Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue · 1bc7fe64

David S. Miller authored Apr 25, 2016

Jeff Kirsher says:

====================
10GbE Intel Wired LAN Driver Updates 2016-04-25

This series contains updates to ixgbe and ixgbevf.

Emil provides several patches, starting with the consolidation of the
logic behind configuring spoof checking.  Fixed an issue which was
causing link issues for backplane devices because x550em_a/x devices
did not have a default value for mac->ops.setup_link.  Refactored the
ethtool stats to bring the logic closer to how ixgbe handles stats and
sets up per-queue stats for ixgbevf.

Mark adds a new register to wait for previous register writes to complete
before issuing a register read, which is needed when slower links are
in use.  Fixed the flow control setup for x550em_a, the incorrect
fc_setup function was being used.

Don added a workaround for empty SFP+ cage crosstalk, since on some
systems the crosstalk could lead to link flap on empty SFP+ cages.

Jake converts ixgbe and ixgbevf to use the BIT() macro.

Alex Duyck adds support for partial GSO segmentation in the case of
tunnels for ixgbe and ixgbevf.  Then preps for HyperV by moving the API
negotiation into mac_ops.

Arnd Bergmann provides a fix for the ARM compile warnings in linux-next
by converting the use of a udelay() to msleep().
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

1bc7fe64

Merge branch 'nla_align-set-2' · e7157f28

David S. Miller authored Apr 25, 2016

Nicolas Dichtel says:

====================
netlink: align attributes when needed (patchset #2)

This is the continuation (series #2) of the work done to align netlink
attributes when these attributes contain some 64-bit fields.

In patch #3, I didn't modify the function ila_encap_nlsize(). I was waiting
feedback for this patch: http://patchwork.ozlabs.org/patch/613766/
If it's approved, there will be an update to switch nla_total_size() to
nla_total_size_64bit() after the merge of net in net-next.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

e7157f28

wireless: use nla_put_u64_64bit() · 2dad624e

Nicolas Dichtel authored Apr 25, 2016

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2dad624e

netfilter/ipvs: use nla_put_u64_64bit() · cbdeafd7

Nicolas Dichtel authored Apr 25, 2016

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cbdeafd7

ieee802154: use nla_put_u64_64bit() · a558da09

Nicolas Dichtel authored Apr 25, 2016

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a558da09

l2tp: use nla_put_u64_64bit() · 1c714a92

Nicolas Dichtel authored Apr 25, 2016

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1c714a92

bridge: use nla_put_u64_64bit() · 12a0faa3

Nicolas Dichtel authored Apr 25, 2016

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

12a0faa3

ovs: use nla_put_u64_64bit() · 0238b720

Nicolas Dichtel authored Apr 25, 2016

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0238b720

ipv6: use nla_put_u64_64bit() · f13a82d8

Nicolas Dichtel authored Apr 25, 2016

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f13a82d8

sched: use nla_put_u64_64bit() · 2a51c1e8

Nicolas Dichtel authored Apr 25, 2016

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2a51c1e8

rtnl: use nla_put_u64_64bit() · 343a6d8e

Nicolas Dichtel authored Apr 25, 2016

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

343a6d8e

soreuseport: Resolve merge conflict for v4/v6 ordering fix · d296ba60

Craig Gallek authored Apr 25, 2016

d894ba18 ("soreuseport: fix ordering for mixed v4/v6 sockets")
was merged as a bug fix to the net tree.  Two conflicting changes
were committed to net-next before the above fix was merged back to
net-next:
ca065d0c ("udp: no longer use SLAB_DESTROY_BY_RCU")
3b24d854 ("tcp/dccp: do not touch listener sk_refcnt under synflood")

These changes switched the datastructure used for TCP and UDP sockets
from hlist_nulls to hlist.  This patch applies the necessary parts
of the net tree fix to net-next which were not automatic as part of the
merge.

Fixes: 1602f49b ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
Signed-off-by: Craig Gallek <kraig@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d296ba60

sock: relax WARN_ON() in sock_owned_by_user() · 5e91f6ce

Eric Dumazet authored Apr 25, 2016

Valdis reported tons of stack dumps caused by WARN_ON() in
sock_owned_by_user()

This test needs to be relaxed if/when lockdep disables itself.

Note that other lockdep_sock_is_held() callers are all from
rcu_dereference_protected() sections which already are disabled
if/when lockdep has been disabled.

Fixes: fafc4e1e ("sock: tigthen lockdep checks for sock_owned_by_user")
Reported-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

5e91f6ce

ixgbe: use msleep for long delays · d4f90d9d

Arnd Bergmann authored Apr 16, 2016

The newly added x550em_a support causes a link failure on ARM because of
an overly long time passed into udelay():

ERROR: "__bad_udelay" [drivers/net/ethernet/intel/ixgbe/ixgbe.ko] undefined!

There are multiple variants of the ixgbe_acquire_swfw_sync_*() function,
and the other ones all use msleep(), so we can safely assume that all
callers are allowed to sleep, which makes msleep() a better replacement
than mdelay().
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 49425dfc ("ixgbe: Add support for x550em_a 10G MAC type")
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

d4f90d9d

ixgbevf: Move API negotiation function into mac_ops · 7921f4dc

Alexander Duyck authored Apr 14, 2016

This patch moves API negotiation into mac_ops. The general idea here is
that with HyperV on the way we need to make certain that anything that will
have different versions between HyperV and a standard VF needs to be
abstracted enough so that we can have a separate function between the two
so we can avoid changes in one breaking something in the other.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

7921f4dc

ixgbe/ixgbevf: Add support for GSO partial · b83e3010

Alexander Duyck authored Apr 14, 2016

This patch adds support for partial GSO segmentation in the case of
tunnels. Specifically with this change the driver an perform segmentation
as long as the frame either has IPv6 inner headers, or we are allowed to
mangle the IP IDs on the inner header. This is needed because we will not
be modifying any fields from the start of the start of the outer transport
header to the start of the inner transport header as we are treating them
like they are just a block of IP options.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

b83e3010

ixgbevf: make use of BIT() macro to avoid shift of signed values · 8d055cc0

Jacob Keller authored Apr 13, 2016

Also cleanup a case where we're bit shifting a value into place, and use
an unsigned constant. Make use of the unsigned postfix in places where
BIT() macro is not appropriate.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

8d055cc0

ixgbe: resolve shift of negative value warning · 3e973dc4

Jacob Keller authored Apr 13, 2016

Make use of GENMASK instead of open coding the equivalent operation
incorrectly.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

3e973dc4

ixgbe: use BIT() macro · b4f47a48

Jacob Keller authored Apr 13, 2016

Several areas of ixgbe were written before widespread usage of the
BIT(n) macro. With the impending release of GCC 6 and its associated new
warnings, some usages such as (1 << 31) have been noted within the ixgbe
driver source. Fix these wholesale and prevent future issues by simply
using BIT macro instead of hand coded bit shifts.

Also fix a few shifts that are shifting values into place by using the
'u' prefix to indicate unsigned. It doesn't strictly matter in these
cases because we're not shifting by too large a value, but these are all
unsigned values and should be indicated as such.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

b4f47a48

ixgbe: Add work around for empty SFP+ cage crosstalk · 4319a797

Don Skidmore authored Apr 12, 2016

It is possible on some systems that crosstalk could lead to link flap
on empty SFP+ cages.  A new NVM bit was defined to let SW know it
needs to implement the work around which consists of verifying that
there is a module in the cage before acting on the LSC.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

4319a797

ixgbe: Use correct FC setup function for x550em_a · a0254a70

Mark Rustad authored Apr 08, 2016

Somehow the wrong fc_setup function was used for x550em_a, so
correct that. Also set setup_link to NULL as its value is
determined later, just like it is with X550EM_x.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

a0254a70

ixgbevf: add support for per-queue ethtool stats · a02a5a53

Emil Tantilov authored Apr 07, 2016

Implement per-queue statistics for packets, bytes and busy poll
specific counters.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

a02a5a53

ixgbevf: refactor ethtool stats handling · d72d6c19

Emil Tantilov authored Apr 07, 2016

This brings the logic closer to how we handle the stats in ixgbe and it
sets us up for introducing per-queue stats.

Use IXGBEVF_STAT and IXGBEVF_NETDEV_STAT for accessing the driver and
netdev stats respectively. This way we don't have to calculate the
stats based on register values which could lead to the counters not
being initialized properly when the interface is down.

IXGBEVF_QUEUE_STATS_LEN is set to include the number of queues.

Also some defines were renamed to use the IXGBEVF prefix.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

d72d6c19

ixgbe: Add register wait for slow links · 2f2219be

Mark Rustad authored Apr 07, 2016

Use a new register to wait for previous register writes to complete
before issuing a register read. This is needed when slower links
are in use.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

2f2219be

hv_netvsc: Fix the list processing for network change event · 15cfd407

Haiyang Zhang authored Apr 21, 2016

RNDIS_STATUS_NETWORK_CHANGE event is handled as two "half events" --
media disconnect & connect. The second half should be added to the list
head, not to the tail. So all events are processed in normal order.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

15cfd407

ixgbe: make 'action' field in struct ixgbe_fdir_filter a u64 value · 2a9ed5d1

Sridhar Samudrala authored Apr 01, 2016

This field is used to record the RX queue index for a redirect action
passed via ring_cookie field in struct ethtool_rx_flow_spec which is
a u64 value.

For ex: after adding a filter rule to redirect to a VF using ethtool
  # echo 4 > /sys/class/net/p4p1/device/sriov_numvfs
  # ethtool -N p4p1 flow-type ip4 src-ip 192.168.0.1 action 0x100000000

querying for the rule shows the Action as 'Direct to queue 0'

  # ethtool -n p4p1
  4 RX rings available
  Total 1 rules

  Filter: 2045
 	Rule Type: Raw IPv4
	Src IP addr: 192.168.0.1 mask: 0.0.0.0
	Dest IP addr: 0.0.0.0 mask: 255.255.255.255
	TOS: 0x0 mask: 0xff
	Protocol: 0 mask: 0xff
	L4 bytes: 0x0 mask: 0xffffffff
	VLAN EtherType: 0x0 mask: 0xffff
	VLAN: 0x0 mask: 0xffff
	User-defined: 0x0 mask: 0xffffffffffffffff
	Action: Direct to queue 0

With this fix, ethtool will report the right queue index even for VFs.
	Action: Direct to queue 4294967296

Here 4294967296 corresponds to 0x100000000.
We need to update 'ethtool' to report the queue index as a Hex value so
that it is more  user friendly and matches with the 'action' value that
is passed when adding the rule.
Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

2a9ed5d1