Commits · 6ceed786648618f6398a54006365f688f5a32012 · Kirill Smelkov / linux

28 Jul, 2014 10 commits

Merge branch 'inet_frag_kill_lru_list' · 6ceed786

David S. Miller authored Jul 27, 2014

Nikolay Aleksandrov says:

====================
inet: frag: cleanup and update

The end goal of this patchset is to remove the LRU list and to move the
frag eviction to a work queue. It also does a couple of necessary cleanups
and fixes. Brief patch descriptions:
Patches 1 - 3 inclusive: necessary clean ups
Patch 4 moves the eviction from the softirqs to a workqueue.
Patch 5 removes the nqueues counter which was protected by the LRU lock
Patch 6 removes the, by now unused, lru list.
Patch 7 moves the rebuild timer to the workqueue and schedules the rebuilds
        only if we've hit the maximum queue length on some of the chains.
Patch 8 migrate the rwlock to a seqlock since the rehash is usually a rare
        operation.
Patch 9 introduces an artificial global memory limit based on the value of
        init_net's high_thresh which is used to cap the high_thresh of the
        other namespaces. Also introduces some sane limits on the other
        tunables, and makes it impossible to have low_thresh > high_thresh.

Here are some numbers from running netperf before and after the patchset:
Each test consists of the following setting: -I 95,5 -i 15,10

1. Bound test (-T 4,4)
1.1 Virtio before the patchset -
MIGRATED UDP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.122.177 () port 0 AF_INET : +/-2.500% @ 95% conf.  : cpu bind
Socket  Message  Elapsed      Messages                   CPU      Service
Size    Size     Time         Okay Errors   Throughput   Util     Demand
bytes   bytes    secs            #      #   10^6bits/sec % SS     us/KB

212992   64000   30.00      722177      0    12325.1     34.55    2.025
212992           30.00      368020            6280.9     34.05    0.752

1.2 Virtio after the patchset -
MIGRATED UDP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.122.177 () port 0 AF_INET : +/-2.500% @ 95% conf.  : cpu bind
Socket  Message  Elapsed      Messages                   CPU      Service
Size    Size     Time         Okay Errors   Throughput   Util     Demand
bytes   bytes    secs            #      #   10^6bits/sec % SS     us/KB

212992   64000   30.00      727030      0    12407.9     35.45    1.876
212992           30.00      505405            8625.5     34.92    0.693

2. Virtio unbound test
2.1 Before the patchset
MIGRATED UDP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.122.177 () port 0 AF_INET : +/-2.500% @ 95% conf.
Socket  Message  Elapsed      Messages
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

212992   64000   30.00      730008      0    12458.77
212992           30.00      416721           7112.02

2.2 After the patchset
MIGRATED UDP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.122.177 () port 0 AF_INET : +/-2.500% @ 95% conf.
Socket  Message  Elapsed      Messages
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

212992   64000   30.00      731129      0    12477.89
212992           30.00      487707           8323.50

3. 10 gig unbound tests
3.1 Before the patchset
MIGRATED UDP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.133.1 () port 0 AF_INET : +/-2.500% @ 95% conf.
Socket  Message  Elapsed      Messages
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

212992   64000   30.00      417209      0    7120.33
212992           30.00      416740           7112.33

3.2 After the patchset
MIGRATED UDP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.133.1 () port 0 AF_INET : +/-2.500% @ 95% conf.
Socket  Message  Elapsed      Messages
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

212992   64000   30.00      438009      0    7475.33
212992           30.00      437630           7468.87

Given the options each netperf ran between 10 and 15 times for 30 seconds
to get the necessary confidence, also the tests themselves ran 3 times and
were consistent.
Another set of tests that I ran were parallel stress tests which consisted
of flooding the machine with fragmented packets from different sources with
frag timeout set to 0 (so there're lots of timeouts) and low_thresh set to
1 byte (so evictions are happening all the time) and on top of that running
a namespace create/destroy endless loop with network interfaces and
addresses that got flooded (for the brief periods they were up) in parallel.
This test ran for an hour without any issues.
====================

6ceed786

inet: frag: set limits and make init_net's high_thresh limit global · 1bab4c75

Nikolay Aleksandrov authored Jul 24, 2014

This patch makes init_net's high_thresh limit to be the maximum for all
namespaces, thus introducing a global memory limit threshold equal to the
sum of the individual high_thresh limits which are capped.
It also introduces some sane minimums for low_thresh as it shouldn't be
able to drop below 0 (or > high_thresh in the unsigned case), and
overall low_thresh should not ever be above high_thresh, so we make the
following relations for a namespace:
init_net:
 high_thresh - max(not capped), min(init_net low_thresh)
 low_thresh - max(init_net high_thresh), min (0)

all other namespaces:
 high_thresh = max(init_net high_thresh), min(namespace's low_thresh)
 low_thresh = max(namespace's high_thresh), min(0)

The major issue with having low_thresh > high_thresh is that we'll
schedule eviction but never evict anything and thus rely only on the
timers.
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1bab4c75

inet: frag: use seqlock for hash rebuild · ab1c724f

Florian Westphal authored Jul 24, 2014

rehash is rare operation, don't force readers to take
the read-side rwlock.

Instead, we only have to detect the (rare) case where
the secret was altered while we are trying to insert
a new inetfrag queue into the table.

If it was changed, drop the bucket lock and recompute
the hash to get the 'new' chain bucket that we have to
insert into.

Joint work with Nikolay Aleksandrov.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ab1c724f

inet: frag: remove periodic secret rebuild timer · e3a57d18

Florian Westphal authored Jul 24, 2014

merge functionality into the eviction workqueue.

Instead of rebuilding every n seconds, take advantage of the upper
hash chain length limit.

If we hit it, mark table for rebuild and schedule workqueue.
To prevent frequent rebuilds when we're completely overloaded,
don't rebuild more than once every 5 seconds.

ipfrag_secret_interval sysctl is now obsolete and has been marked as
deprecated, it still can be changed so scripts won't be broken but it
won't have any effect. A comment is left above each unused secret_timer
variable to avoid confusion.

Joint work with Nikolay Aleksandrov.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e3a57d18

inet: frag: remove lru list · 3fd588eb

Florian Westphal authored Jul 24, 2014

no longer used.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

3fd588eb

inet: frag: don't account number of fragment queues · 434d3054

Florian Westphal authored Jul 24, 2014

The 'nqueues' counter is protected by the lru list lock,
once thats removed this needs to be converted to atomic
counter.  Given this isn't used for anything except for
reporting it to userspace via /proc, just remove it.

We still report the memory currently used by fragment
reassembly queues.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

434d3054

inet: frag: move eviction of queues to work queue · b13d3cbf

Florian Westphal authored Jul 24, 2014

When the high_thresh limit is reached we try to toss the 'oldest'
incomplete fragment queues until memory limits are below the low_thresh
value.  This happens in softirq/packet processing context.

This has two drawbacks:

1) processors might evict a queue that was about to be completed
by another cpu, because they will compete wrt. resource usage and
resource reclaim.

2) LRU list maintenance is expensive.

But when constantly overloaded, even the 'least recently used' element is
recent, so removing 'lru' queue first is not 'fairer' than removing any
other fragment queue.

This moves eviction out of the fast path:

When the low threshold is reached, a work queue is scheduled
which then iterates over the table and removes the queues that exceed
the memory limits of the namespace. It sets a new flag called
INET_FRAG_EVICTED on the evicted queues so the proper counters will get
incremented when the queue is forcefully expired.

When the high threshold is reached, no more fragment queues are
created until we're below the limit again.

The LRU list is now unused and will be removed in a followup patch.

Joint work with Nikolay Aleksandrov.
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b13d3cbf

inet: frag: move evictor calls into frag_find function · 86e93e47

Florian Westphal authored Jul 24, 2014

First step to move eviction handling into a work queue.

We lose two spots that accounted evicted fragments in MIB counters.

Accounting will be restored since the upcoming work-queue evictor
invokes the frag queue timer callbacks instead.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

86e93e47

inet: frag: remove hash size assumptions from callers · fb3cfe6e

Florian Westphal authored Jul 24, 2014

hide actual hash size from individual users: The _find
function will now fold the given hash value into the required range.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

fb3cfe6e

inet: frag: constify match, hashfn and constructor arguments · 36c77782
Florian Westphal authored Jul 24, 2014
```
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
```
36c77782

25 Jul, 2014 21 commits

ipv6: remove obsolete comment in ip6_append_data() · ac3d2e5a

Li RongQing authored Jul 25, 2014

After 11878b40[net-timestamp: SOCK_RAW and PING timestamping], this comment
becomes obsolete since the codes check not only UDP socket, but also RAW sock;
and the codes are clear, not need the comments
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ac3d2e5a

Merge branch 'macb-next' · 1e3550e4

David S. Miller authored Jul 24, 2014

Cyrille Pitchen says:

====================
net/macb: add HW features to macb driver

this series of patches adds new hardware features to macb driver. These
features can be enabled/disabled at runtime using ethtool. Depending on
hardware and design configuration, some are enabled by default whereas other
are disabled.

For instance, checksum offload features are enabled by default for gem designed
for packet buffer mode but disabled for fifo mode design or for old macb.

Besides, the scatter-gather feature is enabled and tested on macb but disabled
on sama5d3x gem. When testing this feature on sama5d3x gem, TX lockups occured
frequently.

Also, the RX checksum offload feature is enabled at GEM level only when both
IFF_PROMISC bit is clear in dev->flags and NETIF_F_RXCSUM bit is set in
dev->features.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

1e3550e4

net/macb: enable scatter-gather feature and set DMA burst length for sama5d4 gem · 4b7b0e4f
Cyrille Pitchen authored Jul 24, 2014
```
Signed-off-by: Cyrille Pitchen <cyrille.pitchen@atmel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
```
4b7b0e4f

ARM: at91: change compatibility string for sama5d3x gem · d58c86ec

Cyrille Pitchen authored Jul 24, 2014

this new compatibility string prevents macb/gem driver from using the
scatter-gather and gso features on sama5d3x boards.
Signed-off-by: Cyrille Pitchen <cyrille.pitchen@atmel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d58c86ec

net/macb: add RX checksum offload feature · 924ec53c

Cyrille Pitchen authored Jul 24, 2014

When RX checksum offload is enabled at GEM level (bit 24 set in the Network
Control Register), frames with invalid IP, TCP or UDP checksums are
discarted even if promiscuous mode is enabled (bit 4 set in the Network Control
Register).

This was verified with a simple userspace program, which corrupts UDP checksum
using libnetfilter_queue.

Then both IFF_PROMISC bit must be clear in dev->flags and NETIF_F_RXCSUM bit
must be set in dev->features to enable RX checksum offload at GEM level. This
way tcpdump is still able to capture corrupted frames.

Also skb->ip_summed is set to CHECKSUM_UNNECESSARY only when both TCP/IP or
UDP/IP checksums were verified by the GEM. Indeed the GEM may verify only IP
checksum but not the one for ICMP (or other protocol than TCP or UDP).
Signed-off-by: Cyrille Pitchen <cyrille.pitchen@atmel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

924ec53c

net/macb: add TX checksum offload feature · 85ff3d87

Cyrille Pitchen authored Jul 24, 2014

Signed-off-by: Cyrille Pitchen <cyrille.pitchen@atmel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

85ff3d87

net/macb: add scatter-gather hw feature · a4c35ed3

Cyrille Pitchen authored Jul 24, 2014

The scatter-gather feature will allow to enable the Generic Segmentation Offload.
Generic Segmentation Offload can be enabled/disabled using ethtool -K DEVNAME gso on|off.

e.g:
ethtool -K eth0 gso off

When enabled, the driver may be provided with socket buffers splitted into many fragments.
These fragments need to be queued into the TX ring in reverse order, starting from to the
last one down to the first one, to avoid a race condition with the MAC.
Especially the 'TX_USED' bit in word 1 of the transmit buffer descriptor of the
first fragment should be cleared at the very final step of the queueing algorithm.
This will tell the hardware that fragments are ready to be sent.

Also since the MAC only update the status word of the first buffer descriptor of the
ethernet frame, the queueing algorithm can no longer expect a 'TX_USED' bit to be set by
the MAC into the buffer descriptor following the one for last fragment of the skb.
This is why the driver sets the 'TX_USED' bit before queueing any fragment, so the end of
queue position is well defined for the MAC.
Signed-off-by: Cyrille Pitchen <cyrille.pitchen@atmel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a4c35ed3

net/macb: configure for FIFO mode and non-gigabit · e175587f

Nicolas Ferre authored Jul 24, 2014

This addition will also allow to configure DMA burst length.
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Acked-by: Cyrille Pitchen <cyrille.pitchen@atmel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e175587f

drivers: net: cpsw: cleanup: remove unused function · ef492001

Mugunthan V N authored Jul 24, 2014

removing unused function as part of driver cleanup.`
Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ef492001

net/mlx4_en: mlx4_en_[gs]et_priv_flags() can be static · 3f6148e7

Fengguang Wu authored Jul 24, 2014

Fixes sparse warning intrduced by commit 0fef9d03 ("net/mlx4_en: Disable
blueflame using ethtool private flags")
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3f6148e7

bfin_mac: convert bfin Ethernet driver to NAPI framework · 159945af

Sonic Zhang authored Jul 24, 2014

Ethernet RX DMA buffers are polled in NAPI work queue other than received
directly in DMA RX interrupt handler.
Signed-off-by: Sonic Zhang <sonic.zhang@analog.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

159945af

net: do not name the pointer to struct net_device net · 6b53dafe

WANG Cong authored Jul 23, 2014

"net" is normally for struct net*, pointer to struct net_device
should be named to either "dev" or "ndev" etc.

Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6b53dafe

isdn: use constants instead of magicnumbers · 7b18ef08

Himangi Saraogi authored Jul 24, 2014

This patch changes instances of magic numbers like 4 and 8 to equivalent
constants.

The Coccinelle semantic patch used for making the change is as follows:

// <smpl>
@r@
type T;
T E;
identifier fld;
identifier c;
@@

E->fld & c

@s@
constant C;
identifier r.c;
@@

#define c C

@@
r.T E;
identifier r.fld;
identifier r.c;
constant s.C;
@@

 E->fld &
- C
+ c
// </smpl>
Signed-off-by: Himangi Saraogi <himangi774@gmail.com>
Acked-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

7b18ef08

cxgb4: Fixed incorrect check for memory operation in t4_memory_rw · c81576c2

Hariprasad Shenai authored Jul 24, 2014

Fix incorrect check introduced in commit fc5ab020 ("cxgb4: Replaced the
backdoor mechanism to access the HW memory with PCIe Window method"). We where
checking for write operation and doing a read, changed it accordingly.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c81576c2

net: filter: rename 'struct sock_filter_int' into 'struct bpf_insn' · 2695fb55

Alexei Starovoitov authored Jul 24, 2014

eBPF is used by socket filtering, seccomp and soon by tracing and
exposed to userspace, therefore 'sock_filter_int' name is not accurate.
Rename it to 'bpf_insn'
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2695fb55

fec: Simplify the PM related hooks · dd66d386

Fabio Estevam authored Jul 24, 2014

Get rid of the CONFIG_PM_SLEEP ifdef by annotating the suspend/resume functions
with '__maybe_unused' in order to keep the code simpler and shorter.

While at it, declare the suspend/resume functions in a single line.
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dd66d386

net_sched: remove exceptional & on function name · e40f5c72

Himangi Saraogi authored Jul 25, 2014

In this file, function names are otherwise used as pointers without &.

A simplified version of the Coccinelle semantic patch that makes this
change is as follows:

// <smpl>
@r@
identifier f;
@@

f(...) { ... }

@@
identifier r.f;
@@

- &f
+ f
// </smpl>
Signed-off-by: Himangi Saraogi <himangi774@gmail.com>
Acked-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

e40f5c72

neigh: remove exceptional & on function name · 56ec0fb1

Himangi Saraogi authored Jul 25, 2014

In this file, function names are otherwise used as pointers without &.

A simplified version of the Coccinelle semantic patch that makes this
change is as follows:

// <smpl>
@r@
identifier f;
@@

f(...) { ... }

@@
identifier r.f;
@@

- &f
+ f
// </smpl>
Signed-off-by: Himangi Saraogi <himangi774@gmail.com>
Acked-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

56ec0fb1

igmp: remove exceptional & on function name · 179542a5

Himangi Saraogi authored Jul 25, 2014

In this file, function names are otherwise used as pointers without &.

A simplified version of the Coccinelle semantic patch that makes this
change is as follows:

// <smpl>
@r@
identifier f;
@@

f(...) { ... }

@@
identifier r.f;
@@

- &f
+ f
// </smpl>
Signed-off-by: Himangi Saraogi <himangi774@gmail.com>
Acked-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

179542a5

Merge branch 'net_next_ovs' of git://git.kernel.org/pub/scm/linux/kernel/git/pshelar/openvswitch · 92390790

David S. Miller authored Jul 24, 2014

Pravin B Shelar says:

====================
Open vSwitch

Following patches adds three features to OVS
1. Add fairness to upcall processing.
2. Hash action.
3. Enable Tunnel GSO features.
Rest of patches are bug fixes related to patches from same series.

v2 series changes first patch according to comment from Dave Miller.
v3 series changes first patch according to comment from Nikolay Aleksandrov.
v4 series update recirc patch commit msg.
v5 series resolve conflict with net-next, updated recic action patch.
v6 series sends all patches.
v7 series drop recirc patches.
v8 series checkpatch fix
v9 series drop HASH action patch. update sample action commit msg.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

92390790

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next · 10f29808

David S. Miller authored Jul 24, 2014

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2014-07-24

This series contains updates to igb, ixgbe, i40e and i40evf.

Mark fixes a possible attempt to dereference a NULL pointer in ixgbe_probe().
Also changes some uses of strncpy to strlcpy when clearing is not needed to
prevent information leakage.

Jacob fixes a bug in the misuse of the list_for_each macro to loop over
every entry in the bus_list.  Instead of attempting to loop over the list
from a random entry point, go up to the bus and use the real list_head
entry point.  This prevents the possible read or write of unallocated or
incorrectly addressed memory.  Then provides a patch to prevent the
display of the minimum link qualification check if we might be in a
virtual machine.  This check is incorrect and misleading in this case,
since we actually do not really know what the available bandwidth is.
To do so, we simply check whether each function on the bus matches our
device id.

Carolyn adds a check and prints the error cause register value when the
hardware detects a malformed packet to assist the user.

Toralf Förster fixes a format mismatch in i40e which was found using
cppcheck.

Shannon adds nvmupdate support by implementing a state machine intended
to support the userland tool for updating the device eeprom.

Jesse fixes the extension header checksum logic for IPv6 in i40e and
i40evf.

Mitch reduces a delay in the i40evf driver where we do not need to
delay an entire millisecond to get into our critical section.

Kamil fixes an issue where access to the NVM was being blocked until
a driver reset where a check for NVM related admin queue commands
would not recognize that such a command was received and would not clear
nvm_busy flag.

Catherine fixes a couple of firmware API version errors.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

10f29808

24 Jul, 2014 9 commits

openvswitch: Add skb_clone NULL check for the sampling action. · d9e0ecb8

Andy Zhou authored Jul 17, 2014

Fix a bug where skb_clone() NULL check is missing in sample action
implementation.
Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>

d9e0ecb8

openvswitch: Sample action without side effects · 651887b0

Simon Horman authored Jul 21, 2014

The sample action is rather generic, allowing arbitrary actions to be
executed based on a probability. However its use, within the Open
vSwitch
code-base is limited: only a single user-space action is ever nested.

A consequence of the current implementation of sample actions is that
depending on weather the sample action executed (due to its probability)
any side-effects of nested actions may or may not be present before
executing subsequent actions.  This has the potential to complicate
verification of valid actions by the (kernel) datapath. And indeed
adding support for push and pop MPLS actions inside sample actions
is one case where such case.

In order to allow all supported actions to be continue to be nested
inside sample actions without the potential need for complex
verification code this patch changes the implementation of the sample
action in the kernel datapath so that sample actions are more like
a function call and any side effects of nested actions are not
present when executing subsequent actions.

With the above in mind the motivation for this change is twofold:

* To contain side-effects the sample action in the hope of making it
  easier to deal with in the future and;
* To avoid some rather complex verification code introduced in the MPLS
  datapath patch.
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>

651887b0

openvswitch: Avoid memory corruption in queue_userspace_packet() · f53e3831

Andy Zhou authored Jul 17, 2014

In queue_userspace_packet(), the ovs_nla_put_flow return value is
not checked. This is fine as long as key_attr_size() returns the
correct value. In case it does not, the current code may corrupt buffer
memory. Add a run time assertion catch this case to avoid silent
failure.
Reported-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>

f53e3831

i40e: always print aqtx answer · e3effd73

Shannon Nelson authored Jul 09, 2014

Sometimes the AQTX answer comes back with no data, but we still want to print
the descriptor that got written back.

Change-ID: I5f734d99b4c95510987413893f0a34626571d474
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e3effd73

i40e: Give link more time after setting flow control · 7d62dac6

Catherine Sullivan authored Jul 09, 2014

Give link a little more time to come back up after setting flow control
before resetting. In the new NVMs it is taking longer for link to come back.
This causes the driver to attempt to reset the link, which then errors
because the firmware was already in the middle of a reset. Also, initialize
err to 0.

Change-ID: I1cc987a944e389d8909c262da5796f50722b4d6b
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Jim Young <jmyoungx@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

7d62dac6

i40e: Fix firmware API version errors · 7aa67613

Catherine Sullivan authored Jul 09, 2014

Reword the error messages. Also add a major version check because
We only want to warn on nvm_minor > expected_minor if
nvm_major == expected_major. Lastly, change an if to an else if
because the two statements will never evaluate to true at the same time.

Change-ID: I6ddf9986f26b35f6879cbeac4fcef04a8497a383
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

7aa67613

i40e/i40evf: ARQ copy desc data even for failed commands · 77813d0a

Kamil Krawczyk authored Jul 09, 2014

Copy desc and buffer data even for ARQ events which return error status.
Previously, a check for NVM related AQ commands which is done later in this
function would not recognize that such a command was received and would
not clear nvm_busy flag. This would block access to NVM until a driver reset.
This will fix that.

Change-ID: If69ad74e165b56081c0686b97402511d2e2880c0
Signed-off-by: Kamil Krawczyk <kamil.krawczyk@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

77813d0a

i40evf: don't wait so long · b65476cd

Mitch Williams authored Jul 09, 2014

We really don't need to delay an entire millisecond just to get into our
critical section. A microsecond will be sufficient, thank you.

Change-ID: I2d02ece6610007d98cabcb3f42df9a774bb54e59
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

b65476cd

i40e/i40evf: fix extension header csum logic · 22d2fa1d

Jesse Brandeburg authored Jul 09, 2014

The hardware design requires that the driver avoid indicating
checksum offload success on some ipv6 frames with extension
headers.

The code needs to just check for the IPV6EXADD bit and if
it is set punt the checksum to the stack.  I don't know why
the code was checking TCP on inner protocol, as that code
doesn't make any sense to me but seems wrong, so remove it.

Change-ID: I10d3aacdbb1819fb60b4b0eb80e6cc67ef2c9599
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-By: Jim Young <jamesx.m.young@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

22d2fa1d