Commits · ca8cf341903f90070e191cc8be8f705ab7af2d4a · Kirill Smelkov / linux

24 Mar, 2015 16 commits

net: bcmgenet: propagate errors from bcmgenet_power_down · ca8cf341

Florian Fainelli authored Mar 23, 2015

If bcmgenet_power_down() fails, we would want to propagate a return
value from bcmgenet_wol_power_down_cfg() to know about this.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ca8cf341

Merge branch 'rhashtable-next' · cc330b55

David S. Miller authored Mar 23, 2015

Herbert Xu says:

====================
rhashtable: Multiple rehashing

This series introduces multiple rehashing.

Recall that the original implementation in br_multicast used
two list pointers per hash node and therefore is limited to at
most one rehash at a time since you need one list pointer for
the old table and one for the new table.

Thanks to Josh Triplett's suggestion of using a single list pointer
we're no longer limited by that.  So it is perfectly OK to have
an arbitrary number of tables in existence at any one time.

The reader and removal simply has to walk from the oldest table
to the newest table in order not to miss anything.  Insertion
without lookup are just as easy as we simply go to the last table
that we can find and add the entry there.

However, insertion with uniqueness lookup is more complicated
because we need to ensure that two simultaneous insertions of the
same key do not both succeed.  To achieve this, all insertions
including those without lookups are required to obtain the bucket
lock from the oldest hash table that is still alive.  This is
determined by having the rehasher (there is only one rehashing
thread in the system) keep a pointer of where it is up to.  If
a bucket has already been rehashed then it is dead, i.e., there
cannot be any more insertions to it, otherwise it is considered
alive.  This guarantees that the same key cannot be inserted
in two different tables in parallel.

Patch 1 is actually a bug fix for the walker.

Patch 2-5 eliminates unnecessary out-of-line copies of jhash.

Patch 6 makes rhashtable_shrink shrink to fit.

Patch 7 introduces multiple rehashing.  This means that if we
decide to grow then we will grow regardless of whether the previous
one has finished.  However, this is still asynchronous meaning
that if insertions come fast enough we may still end up with a
table that is overutilised.

Patch 8 adds support for GFP_ATOMIC allocations of struct bucket_table.

Finally patch 9 enables immediate rehashing.  This is done either
when the table reaches 100% utilisation, or when the chain length
exceeds 16 (the latter can be disabled on request, e.g., for
nft_hash.

With these patches the system should no longer have any trouble
dealing with fast insertions on a small table.  In the worst
case you end up with a list of tables that's log N in length
while the rehasher catches up.

v3 restores rhashtable_shrink and fixes a number of bugs in the
multiple rehashing patches (7 and 9).
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

cc330b55

rhashtable: Add immediate rehash during insertion · ccd57b1b

Herbert Xu authored Mar 24, 2015

This patch reintroduces immediate rehash during insertion.  If
we find during insertion that the table is full or the chain
length exceeds a set limit (currently 16 but may be disabled
with insecure_elasticity) then we will force an immediate rehash.
The rehash will contain an expansion if the table utilisation
exceeds 75%.

If this rehash fails then the insertion will fail.  Otherwise the
insertion will be reattempted in the new hash table.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

ccd57b1b

rhashtable: Allow GFP_ATOMIC bucket table allocation · b9ecfdaa

Herbert Xu authored Mar 24, 2015

This patch adds the ability to allocate bucket table with GFP_ATOMIC
instead of GFP_KERNEL.  This is needed when we perform an immediate
rehash during insertion.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

b9ecfdaa

rhashtable: Add multiple rehash support · b824478b

Herbert Xu authored Mar 24, 2015

This patch adds the missing bits to allow multiple rehashes.  The
read-side as well as remove already handle this correctly.  So it's
only the rehasher and insertion that need modification to handle
this.

Note that this patch doesn't actually enable it so for now rehashing
is still only performed by the worker thread.

This patch also disables the explicit expand/shrink interface because
the table is meant to expand and shrink automatically, and continuing
to export these interfaces unnecessarily complicates the life of the
rehasher since the rehash process is now composed of two parts.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

b824478b

rhashtable: Shrink to fit · 18093d1c

Herbert Xu authored Mar 24, 2015

This patch changes rhashtable_shrink to shrink to the smallest
size possible rather than halving the table.  This is needed
because with multiple rehashing we will defer shrinking until
all other rehashing is done, meaning that when we do shrink
we may be able to shrink a lot.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

18093d1c

tipc: Use default rhashtable hashfn · 6d022949

Herbert Xu authored Mar 24, 2015

This patch removes the explicit jhash value for the hashfn parameter
of rhashtable.  The default is now jhash so removing the setting
makes no difference apart from making one less copy of jhash in
the kernel.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

6d022949

netlink: Use default rhashtable hashfn · 11b58ba1

Herbert Xu authored Mar 24, 2015

This patch removes the explicit jhash value for the hashfn parameter
of rhashtable.  As the key length is a multiple of 4, this means that
we will actually end up using jhash2.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

11b58ba1

rhashtable: Allow hashfn to be unset · 31ccde2d

Herbert Xu authored Mar 24, 2015

Since every current rhashtable user uses jhash as their hash
function, the fact that jhash is an inline function causes each
user to generate a copy of its code.

This function provides a solution to this problem by allowing
hashfn to be unset.  In which case rhashtable will automatically
set it to jhash.  Furthermore, if the key length is a multiple
of 4, we will switch over to jhash2.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

31ccde2d

rhashtable: Eliminate unnecessary branch in rht_key_hashfn · de91b25c

Herbert Xu authored Mar 24, 2015

When rht_key_hashfn is called from rhashtable itself and params
is equal to ht->p, there is no point in checking params.key_len
and falling back to ht->p.key_len.

For some reason gcc couldn't figure out that params is the same
as ht->p.  So let's help it by only checking params.key_len when
it's a constant.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

de91b25c

rhashtable: Add barrier to ensure we see new tables in walker · d88252f9

Herbert Xu authored Mar 24, 2015

The walker is a lockless reader so it too needs an smp_rmb before
reading the future_tbl field in order to see any new tables that
may contain elements that we should have walked over.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

d88252f9

Merge tag 'linux-can-next-for-4.1-20150323' of... · e167359b

David S. Miller authored Mar 23, 2015

Merge tag 'linux-can-next-for-4.1-20150323' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next

Marc Kleine-Budde says:

====================
pull-request: can-next 2015-03-23

this is a pull request of 6 patches for net-next/master.

A patch by Florian Westphal, converts the skb->destructor to use
sock_efree() instead of own destructor. Ahmed S. Darwish's patch
converts the kvaser_usb driver to use unregister_candev(). A patch by
me removes a return from a void function in the m_can driver. Yegor
Yefremov contributes a patch for combined rx/tx LED trigger support. A
sparse warning in the esd_usb2 driver was fixes by Thomas Körper. Ben
Dooks converts the at91_can driver to use endian agnostic IO accessors.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

e167359b

Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next · 40451fd0

David S. Miller authored Mar 23, 2015

Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for net-next.
Basically, more incremental updates for br_netfilter from Florian
Westphal, small nf_tables updates (including one fix for rb-tree
locking) and small two-liner to add extra validation for the REJECT6
target.

More specifically, they are:

1) Use the conntrack status flags from br_netfilter to know that DNAT is
   happening. Patch for Florian Westphal.

2) nf_bridge->physoutdev == NULL already indicates that the traffic is
   bridged, so let's get rid of the BRNF_BRIDGED flag. Also from Florian.

3) Another patch to prepare voidization of seq_printf/seq_puts/seq_putc,
   from Joe Perches.

4) Consolidation of nf_tables_newtable() error path.

5) Kill nf_bridge_pad used by br_netfilter from ip_fragment(),
   from Florian Westphal.

6) Access rb-tree root node inside the lock and remove unnecessary
   locking from the get path (we already hold nfnl_lock there), from
   Patrick McHardy.

7) You cannot use a NFT_SET_ELEM_INTERVAL_END when the set doesn't
   support interval, also from Patrick.

8) Enforce IP6T_F_PROTO from ip6t_REJECT to make sure the core is
   actually restricting matches to TCP.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

40451fd0

af_packet: pass checksum validation status to the user · 682f048b

Alexander Drozdov authored Mar 23, 2015

Introduce TP_STATUS_CSUM_VALID tp_status flag to tell the
af_packet user that at least the transport header checksum
has been already validated.

For now, the flag may be set for incoming packets only.
Signed-off-by: Alexander Drozdov <al.drozdov@gmail.com>
Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

682f048b

af_packet: make tpacket_rcv to not set status value before run_filter · 68c2e5de

Alexander Drozdov authored Mar 23, 2015

It is just an optimization. We don't need the value of status variable
if the packet is filtered.
Signed-off-by: Alexander Drozdov <al.drozdov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

68c2e5de

inet: fix double request socket freeing · c6973669

Fan Du authored Mar 23, 2015

Eric Hugne reported following error :

I'm hitting this warning on latest net-next when i try to SSH into a machine
with eth0 added to a bridge (but i think the problem is older than that)

Steps to reproduce:
node2 ~ # brctl addif br0 eth0
[  223.758785] device eth0 entered promiscuous mode
node2 ~ # ip link set br0 up
[  244.503614] br0: port 1(eth0) entered forwarding state
[  244.505108] br0: port 1(eth0) entered forwarding state
node2 ~ # [  251.160159] ------------[ cut here ]------------
[  251.160831] WARNING: CPU: 0 PID: 3 at include/net/request_sock.h:102 tcp_v4_err+0x6b1/0x720()
[  251.162077] Modules linked in:
[  251.162496] CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted 4.0.0-rc3+ #18
[  251.163334] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  251.164078]  ffffffff81a8365c ffff880038a6ba18 ffffffff8162ace4 0000000000009898
[  251.165084]  0000000000000000 ffff880038a6ba58 ffffffff8104da85 ffff88003fa437c0
[  251.166195]  ffff88003fa437c0 ffff88003fa74e00 ffff88003fa43bb8 ffff88003fad99a0
[  251.167203] Call Trace:
[  251.167533]  [<ffffffff8162ace4>] dump_stack+0x45/0x57
[  251.168206]  [<ffffffff8104da85>] warn_slowpath_common+0x85/0xc0
[  251.169239]  [<ffffffff8104db65>] warn_slowpath_null+0x15/0x20
[  251.170271]  [<ffffffff81559d51>] tcp_v4_err+0x6b1/0x720
[  251.171408]  [<ffffffff81630d03>] ? _raw_read_lock_irq+0x3/0x10
[  251.172589]  [<ffffffff81534e20>] ? inet_del_offload+0x40/0x40
[  251.173366]  [<ffffffff81569295>] icmp_socket_deliver+0x65/0xb0
[  251.174134]  [<ffffffff815693a2>] icmp_unreach+0xc2/0x280
[  251.174820]  [<ffffffff8156a82d>] icmp_rcv+0x2bd/0x3a0
[  251.175473]  [<ffffffff81534ea2>] ip_local_deliver_finish+0x82/0x1e0
[  251.176282]  [<ffffffff815354d8>] ip_local_deliver+0x88/0x90
[  251.177004]  [<ffffffff815350f0>] ip_rcv_finish+0xf0/0x310
[  251.177693]  [<ffffffff815357bc>] ip_rcv+0x2dc/0x390
[  251.178336]  [<ffffffff814f5da3>] __netif_receive_skb_core+0x713/0xa20
[  251.179170]  [<ffffffff814f7fca>] __netif_receive_skb+0x1a/0x80
[  251.179922]  [<ffffffff814f97d4>] process_backlog+0x94/0x120
[  251.180639]  [<ffffffff814f9612>] net_rx_action+0x1e2/0x310
[  251.181356]  [<ffffffff81051267>] __do_softirq+0xa7/0x290
[  251.182046]  [<ffffffff81051469>] run_ksoftirqd+0x19/0x30
[  251.182726]  [<ffffffff8106cc23>] smpboot_thread_fn+0x153/0x1d0
[  251.183485]  [<ffffffff8106cad0>] ? SyS_setgroups+0x130/0x130
[  251.184228]  [<ffffffff8106935e>] kthread+0xee/0x110
[  251.184871]  [<ffffffff81069270>] ? kthread_create_on_node+0x1b0/0x1b0
[  251.185690]  [<ffffffff81631108>] ret_from_fork+0x58/0x90
[  251.186385]  [<ffffffff81069270>] ? kthread_create_on_node+0x1b0/0x1b0
[  251.187216] ---[ end trace c947fc7b24e42ea1 ]---
[  259.542268] br0: port 1(eth0) entered forwarding state

Remove the double calls to reqsk_put()

[edumazet] :

I got confused because reqsk_timer_handler() _has_ to call
reqsk_put(req) after calling inet_csk_reqsk_queue_drop(), as
the timer handler holds a reference on req.
Signed-off-by: Fan Du <fan.du@intel.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Erik Hugne <erik.hugne@ericsson.com>
Fixes: fa76ce73 ("inet: get rid of central tcp/dccp listener timer")
Signed-off-by: David S. Miller <davem@davemloft.net>

c6973669

23 Mar, 2015 24 commits

vxlan: simplify if clause in dev_close · 24c0e683

Marcelo Ricardo Leitner authored Mar 23, 2015

Dan Carpenter's static checker warned that in vxlan_stop we are checking
if 'vs' can be NULL while later we simply derreference it.

As after commit 56ef9c90 ("vxlan: Move socket initialization to
within rtnl scope") 'vs' just cannot be NULL in vxlan_stop() anymore, as
the interface won't go up if the socket initialization fails. So we are
good to just remove the check and make it consistent.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

24c0e683

fib_trie: Fix regression in handling of inflate/halve failure · b6f15f82

Alexander Duyck authored Mar 23, 2015

When I updated the code to address a possible null pointer dereference in
resize I ended up reverting an exception handling fix for the suffix length
in the event that inflate or halve failed. This change is meant to correct
that by reverting the earlier fix and instead simply getting the parent
again after inflate has been completed to avoid the possible null pointer
issue.

Fixes: ddb4b9a1 ("fib_trie: Address possible NULL pointer dereference in resize")
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b6f15f82

bgmac: implement scatter/gather support · 9cde9450

Felix Fietkau authored Mar 23, 2015

Always use software checksumming, since the hardware does not have any
checksum offload support.
This significantly improves local TCP tx performance.
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

9cde9450

bgmac: implement GRO and use build_skb · 45c9b3c0

Felix Fietkau authored Mar 23, 2015

This improves performance for routing and local rx
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

45c9b3c0

bgmac: fix descriptor frame start/end definitions · 0addb83d

Felix Fietkau authored Mar 23, 2015

The start-of-frame and end-of-frame bits were accidentally swapped.
In the current code it does not make any difference, since they are
always used together.
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

0addb83d

net: Move the comment about unsettable socket-level options to default clause... · 443b5991

YOSHIFUJI Hideaki/吉藤英明 authored Mar 23, 2015

net: Move the comment about unsettable socket-level options to default clause and update its reference.

We implement the SO_SNDLOWAT etc not to be settable and return
ENOPROTOOPT per 1003.1g 7. Move the comment to appropriate
position and update the reference.
Signed-off-by: YOSHIFUJI Hideaki <hideaki.yoshifuji@miraclelinux.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

443b5991

Merge branch 'listener_refactor_part_15' · 86c1aee1

David S. Miller authored Mar 23, 2015

Eric Dumazet says:

====================
tcp listener refactoring part 15

I am trying to make the final patch pushing request socks into ehash
as small as possible. In this patch series, I made various adjustments
for the SYNACK generation, allowing me to reach 1 Mpps SYNACK in my
stress test (still hitting LISTENER spinlock of course, and the syn_wait
spinlock)

I also converted the ICMP handlers a bit ahead of time :

They no longer need to get the LISTENER socket, and can use
only a lookup in ehash table. No big deal if we ignore ICMP
for requests socks before the final steps.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

86c1aee1

ipv6: dccp: handle ICMP messages on DCCP_NEW_SYN_RECV request sockets · 52036a43

Eric Dumazet authored Mar 22, 2015

dccp_v6_err() can restrict lookups to ehash table, and not to listeners.

Note this patch creates the infrastructure, but this means that ICMP
messages for request sockets are ignored until complete conversion.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

52036a43

ipv4: dccp: handle ICMP messages on DCCP_NEW_SYN_RECV request sockets · 85645bab

Eric Dumazet authored Mar 22, 2015

dccp_v4_err() can restrict lookups to ehash table, and not to listeners.

Note this patch creates the infrastructure, but this means that ICMP
messages for request sockets are ignored until complete conversion.

New dccp_req_err() helper is exported so that we can use it in IPv6
in following patch.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

85645bab

ipv6: tcp: handle ICMP messages on TCP_NEW_SYN_RECV request sockets · 2215089b

Eric Dumazet authored Mar 22, 2015

tcp_v6_err() can restrict lookups to ehash table, and not to listeners.

2215089b

ipv4: tcp: handle ICMP messages on TCP_NEW_SYN_RECV request sockets · 26e37360

Eric Dumazet authored Mar 22, 2015

tcp_v4_err() can restrict lookups to ehash table, and not to listeners.

Note this patch creates the infrastructure, but this means that ICMP
messages for request sockets are ignored until complete conversion.

New tcp_req_err() helper is exported so that we can use it in IPv6
in following patch.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

26e37360

net: convert syn_wait_lock to a spinlock · b2827053

Eric Dumazet authored Mar 22, 2015

This is a low hanging fruit, as we'll get rid of syn_wait_lock eventually.

We hold syn_wait_lock for such small sections, that it makes no sense to use
a read/write lock. A spin lock is simply faster.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b2827053

inet: remove some sk_listener dependencies · 8b929ab1

Eric Dumazet authored Mar 22, 2015

listener can be source of false sharing. request sock has some
useful information like : ireq->ir_iif, ireq->ir_num, ireq->ireq_net

This patch does not solve the major problem of having to read
sk->sk_protocol which is sharing a cache line with sk->sk_wmem_alloc.
(This same field is read later in ip_build_and_send_pkt())

One idea would be to move sk_protocol close to sk_family
(using 8 bits instead of 16 for sk_family seems enough)
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8b929ab1

inet: remove sk_listener parameter from syn_ack_timeout() · 42cb80a2

Eric Dumazet authored Mar 22, 2015

It is not needed, and req->sk_listener points to the listener anyway.
request_sock argument can be const.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

42cb80a2

inet: cache listen_sock_qlen() and read rskq_defer_accept once · 2b41fab7

Eric Dumazet authored Mar 22, 2015

Cache listen_sock_qlen() to limit false sharing, and read
rskq_defer_accept once as it might change under us.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2b41fab7

Merge branch 'gigaset_modem_response' · c9231f82

David S. Miller authored Mar 23, 2015

Tilman Schmidt says:

====================
isdn/gigaset: restructure modem response parser

This series of patches restructures the Gigaset ISDN driver's
modem response parser to improve code readability and conform
better to the device's specification and actual behaviour.
Could you please merge these through net-next?
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

c9231f82

isdn/gigaset: restructure modem response parser (4) · 4656e0a3

Tilman Schmidt authored Mar 21, 2015

Restructure the control structure of the modem response parser
to improve readability and error handling.
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>

4656e0a3

isdn/gigaset: restructure modem response parser (3) · d93187eb

Tilman Schmidt authored Mar 21, 2015

Separate CID detection from main parser loop.
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>

d93187eb

isdn/gigaset: restructure modem response parser (2) · 4df1e525

Tilman Schmidt authored Mar 21, 2015

Separate literal string handling from main parser loop.
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>

4df1e525

isdn/gigaset: restructure modem response parser (1) · 6bf50114

Tilman Schmidt authored Mar 21, 2015

Factor out queueing of modem response events into helper function
add_cid_event().
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>

6bf50114

switchdev: fix stp update API to work with layered netdevices · 558d51fa

Roopa Prabhu authored Mar 21, 2015

make it same as the netdev_switch_port_bridge_setlink/dellink
api (ie traverse lowerdevs to get to the switch port).

removes "WARN_ON(!ops->ndo_switch_parent_id_get)" because
direct bridge ports can be stacked netdevices (like bonds
and team of switch ports) which may not implement this ndo.

v2 to v3:
	- remove changes to bond and team. Bring back the
	transparently following lowerdevs like i initially
	had for setlink/getlink
	(http://www.spinics.net/lists/netdev/msg313436.html)
	dave and scott feldman also seem to prefer it be that
	way and move to non-transparent way of doing things
	if we see a problem down the lane.

v3 to v4:
	- fix ret initialization

v4 to v5:
	- return err on first failure (scott feldman)

v5 to v6:
	- change variable name (err) and initialize to
	-EOPNOTSUPP (scott feldman).
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Acked-by: Scott Feldman <sfeldma@gmail.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>

558d51fa

net: clear skb->priority when forwarding to another netns · 08b4b8ea

WANG Cong authored Mar 20, 2015

skb->priority can be set for two purposes:

1) With respect to IP TOS field, which is computed by a mask.
Ususally used for priority qdisc's (pfifo, prio etc.), on TX
side (we only have ingress qdisc on RX side).

2) Used as a classid or flowid, works in the same way with tc
classid. What's more, this can even override the classid
of tc filters.

For case 1), it has been respected within its netns, I don't
see any point of keeping it for another netns, especially
when packets will be forwarded to Rx path (no matter from TX
path or RX path).

For case 2) we care, our applications run inside a netns,
and we classify the packets by our own filters outside,
If some application sets this priority, it could bypass
our filters, therefore clear it when moving out of a netns,
it makes no sense to bypass tc filters out of its netns.
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

08b4b8ea

Merge branch 'crypto_async' · a659f91a

David S. Miller authored Mar 23, 2015

Tadeusz Struk says:

====================
Add support for async socket operations

After the iocb parameter has been removed from sendmsg() and recvmsg() ops
the socket layer, and the network stack no longer support async operations.
This patch set adds support for asynchronous operations on sockets back.

Changes in v3:
* As sugested by Al Viro instead of adding new functions aio_sendmsg
  and aio_recvmsg, added a ptr to iocb into the kernel-side msghdr structure.
  This way no change to aio.c is required.

Changes in v2:
* removed redundant total_size param from aio_sendmsg and aio_recvmsg functions
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

a659f91a

crypto: algif - change algif_skcipher to be asynchronous · a596999b

Tadeusz Struk authored Mar 19, 2015

The way the algif_skcipher works currently is that on sendmsg/sendpage it
builds an sgl for the input data and then on read/recvmsg it sends the job
for encryption putting the user to sleep till the data is processed.
This way it can only handle one job at a given time.
This patch changes it to be asynchronous by adding AIO support.
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a596999b