Commits · 6f9a093b66ce7cacc110d8737c03686e80ecfda6 · Kirill Smelkov / linux

19 Jun, 2014 2 commits

net: filter: fix upper BPF instruction limit · 6f9a093b

Kees Cook authored Jun 18, 2014

The original checks (via sk_chk_filter) for instruction count uses ">",
not ">=", so changing this in sk_convert_filter has the potential to break
existing seccomp filters that used exactly BPF_MAXINSNS many instructions.

Fixes: bd4cf0ed ("net: filter: rework/optimize internal BPF interpreter's instruction set")
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org # v3.15+
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6f9a093b

net: sctp: propagate sysctl errors from proc_do* properly · ff5e92c1

Daniel Borkmann authored Jun 19, 2014

sysctl handler proc_sctp_do_hmac_alg(), proc_sctp_do_rto_min() and
proc_sctp_do_rto_max() do not properly reflect some error cases
when writing values via sysctl from internal proc functions such
as proc_dointvec() and proc_dostring().

In all these cases we pass the test for write != 0 and partially
do additional work just to notice that additional sanity checks
fail and we return with hard-coded -EINVAL while proc_do*
functions might also return different errors. So fix this up by
simply testing a successful return of proc_do* right after
calling it.

This also allows to propagate its return value onwards to the user.
While touching this, also fix up some minor style issues.

Fixes: 4f3fdf3b ("sctp: add check rto_min and rto_max in sysctl")
Fixes: 3c68198e ("sctp: Make hmac algorithm selection for cookie generation dynamic")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ff5e92c1

18 Jun, 2014 6 commits

net: return actual error on register_queue_kobjects · d36a4f4b

Jie Liu authored Jun 17, 2014

Return the actual error code if call kset_create_and_add() failed

Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d36a4f4b

bonding: Advertize vxlan offload features when supported · 5a7baa78

Or Gerlitz authored Jun 17, 2014

When the underlying device supports TCP offloads for VXLAN/UDP
encapulated traffic, we need to reflect that through the hw_enc_features
field of the bonding net-device. This will cause the xmit path
in the core networking stack to provide bonding with encapsulated
GSO frames to offload into the HW etc.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5a7baa78

skge: Added FS A8NE-FM to the list of 32bit DMA boards · ee14eb7b

Mirko Lindner authored Jun 17, 2014

Added FUJITSU SIEMENS A8NE-FM to the list of 32bit DMA boards

>From Tomi O.:
After I added an entry to this MB into the skge.c
driver in order to enable the mentioned 64bit dma disable quirk,
the network data corruptions ended and everything is fine again.
Signed-off-by: Mirko Lindner <mlindner@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ee14eb7b

Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · 3a3ec1b2

David S. Miller authored Jun 18, 2014

Pablo Neira Ayuso says:

====================
netfilter fixes for net

The following patchset contains netfilter updates for your net tree,
they are:

1) Fix refcount leak when dumping the dying/unconfirmed conntrack lists,
   from Florian Westphal.

2) Fix crash in NAT when removing a netnamespace, also from Florian.

3) Fix a crash in IPVS when trying to remove an estimator out of the
   sysctl scope, from Julian Anastasov.

4) Add zone attribute to the routing to calculate the message size in
   ctnetlink events, from Ken-ichirou MATSUZAWA.

5) Another fix for the dying/unconfirmed list which was preventing to
   dump more than one memory page of entries (~17 entries in x86_64).

6) Fix missing RCU-safe list insertion in the rule replacement code
   in nf_tables.

7) Since the new transaction infrastructure is in place, we have to
   upgrade the chain use counter from u16 to u32 to avoid overflow
   after more than 2^16 rules are added.

8) Fix refcount leak when replacing rule in nf_tables. This problem
   was also introduced in new transaction.

9) Call the ->destroy() callback when releasing nft-xt rules to fix
   module refcount leaks.

10) Set the family in the netlink messages that contain set elements
    in nf_tables to make it consistent with other object types.

11) Don't dump NAT port information if it is unset in nft_nat.

12) Update the MAINTAINERS file, I have merged the ebtables entry
    into netfilter. While at it, also removed the netfilter users
    mailing list, the development list should be enough.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

3a3ec1b2

MAINTAINERS: merge ebtables into netfilter entry · db9cf3a3

Pablo Neira Ayuso authored Jun 16, 2014

Moreover, remove reference to the netfilter users mailing list,
so they don't receive patches.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

db9cf3a3

net: fec: Don't clear IPV6 header checksum field when IP accelerator enable · 62a02c98

Fugang Duan authored Jun 18, 2014

The commit 96c50caa (net: fec: Enable IP header hardware checksum)
enable HW IP header checksum for IPV4 and IPV6, which causes IPV6 TCP/UDP
cannot work. (The issue is reported by Russell King)

For FEC IP header checksum function: Insert IP header checksum. This "IINS"
bit is written by the user. If set, IP accelerator calculates the IP header
checksum and overwrites the IINS corresponding header field with the calculated
value. The checksum field must be cleared by user, otherwise the checksum
always is 0xFFFF.

So the previous patch clear IP header checksum field regardless of IP frame
type.

In fact, IP HW detect the packet as IPV6 type, even if the "IINS" bit is set,
the IP accelerator is not triggered to calculates IPV6 header checksum because
IPV6 frame format don't have checksum.

So this results in the IPV6 frame being corrupted.

The patch just add software detect the current packet type, if it is IPV6
frame, it don't clear IP header checksum field.

Cc: Russell King <linux@arm.linux.org.uk>
Reported-and-tested-by: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

62a02c98

17 Jun, 2014 10 commits

ptp: ptp_pch depends on x86_32 · bc56151d

Jean Delvare authored Jun 17, 2014

The ptp_pch driver is for a companion chip to the Intel Atom E600
series processors. These are 32-bit x86 processors so the driver is
only needed on X86_32.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: Richard Cochran <richardcochran@gmail.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bc56151d

hyperv: fix apparent cut-n-paste error in send path teardown · 2f18423d

Dave Jones authored Jun 16, 2014

c25aaf81: "hyperv: Enable sendbuf mechanism on the send path" added
some teardown code that looks like it was copied from the recieve path
above, but missed a variable name replacement.
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2f18423d

tcp: remove unnecessary tcp_sk assignment. · 17846376

Dave Jones authored Jun 16, 2014

This variable is overwritten by the child socket assignment before
it ever gets used.
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

17846376

net: tile: fix unused variable warning · 9ebe2435

Chris Metcalf authored Jun 16, 2014

'i' is unused in tile_net_dev_init() after commit d581ebf5
("net: tile: Use helpers from linux/etherdevice.h to check/set MAC").
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9ebe2435

ptp: In the testptp utility, use clock_adjtime from glibc when available · 42e1358e

Christian Riesch authored Jun 16, 2014

clock_adjtime was included in glibc 2.14. _GNU_SOURCE must be defined
to make it available.
Signed-off-by: Christian Riesch <christian.riesch@omicron.at>
Cc: Richard Cochran <richardcochran@gmail.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

42e1358e

isdn: hisax: Drop duplicate Kconfig entry · ddc6fbd8

Jean Delvare authored Jun 16, 2014

There are 2 HISAX_AVM_A1_PCMCIA Kconfig entries. The kbuild system
ignores the second one, and apparently nobody noticed the problem so
far, so let's remove that second entry.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: Karsten Keil <isdn@linux-pingi.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

ddc6fbd8

isdn: hisax: Merge Kconfig ifs · a1c33346

Jean Delvare authored Jun 16, 2014

The first half of the HiSax config options is presented if
ISDN_DRV_HISAX!=n, while the second half of the options is presented
if ISDN_DRV_HISAX. That's the same, so merge both conditionals.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: Karsten Keil <isdn@linux-pingi.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

a1c33346

slcan: Port write_wakeup deadlock fix from slip · a8e83b17

Tyler Hall authored Jun 15, 2014

The commit "slip: Fix deadlock in write_wakeup" fixes a deadlock caused
by a change made in both slcan and slip. This is a direct port of that
fix.
Signed-off-by: Tyler Hall <tylerwhall@gmail.com>
Cc: Oliver Hartkopp <socketcan@hartkopp.net>
Cc: Andre Naujoks <nautsch2@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>

a8e83b17

slip: Fix deadlock in write_wakeup · 661f7fda

Tyler Hall authored Jun 15, 2014

Use schedule_work() to avoid potentially taking the spinlock in
interrupt context.

Commit cc9fa74e ("slip/slcan: added locking in wakeup function") added
necessary locking to the wakeup function and 367525c8/ddcde142 ("can:
slcan: Fix spinlock variant") converted it to spin_lock_bh() because the lock
is also taken in timers.

Disabling softirqs is not sufficient, however, as tty drivers may call
write_wakeup from interrupt context. This driver calls tty->ops->write() with
its spinlock held, which may immediately cause an interrupt on the same CPU and
subsequent spin_bug().

Simply converting to spin_lock_irq/irqsave() prevents this deadlock, but
causes lockdep to point out a possible circular locking dependency
between these locks:

(&(&sl->lock)->rlock){-.....}, at: slip_write_wakeup
(&port_lock_key){-.....}, at: serial8250_handle_irq.part.13

The slip transmit is holding the slip spinlock when calling the tty write.
This grabs the port lock. On an interrupt, the handler grabs the port
lock and calls write_wakeup which grabs the slip lock. This could be a
problem if a serial interrupt occurs on another CPU during the slip
transmit.

To deal with these issues, don't grab the lock in the wakeup function by
deferring the writeout to a workqueue. Also hold the lock during close
when de-assigning the tty pointer to safely disarm the worker and
timers.

This bug is easily reproducible on the first transmit when slip is
used with the standard 8250 serial driver.

[<c0410b7c>] (spin_bug+0x0/0x38) from [<c006109c>] (do_raw_spin_lock+0x60/0x1d0)
 r5:eab27000 r4:ec02754c
[<c006103c>] (do_raw_spin_lock+0x0/0x1d0) from [<c04185c0>] (_raw_spin_lock+0x28/0x2c)
 r10:0000001f r9:eabb814c r8:eabb8140 r7:40070193 r6:ec02754c r5:eab27000
 r4:ec02754c r3:00000000
[<c0418598>] (_raw_spin_lock+0x0/0x2c) from [<bf3a0220>] (slip_write_wakeup+0x50/0xe0 [slip])
 r4:ec027540 r3:00000003
[<bf3a01d0>] (slip_write_wakeup+0x0/0xe0 [slip]) from [<c026e420>] (tty_wakeup+0x48/0x68)
 r6:00000000 r5:ea80c480 r4:eab27000 r3:bf3a01d0
[<c026e3d8>] (tty_wakeup+0x0/0x68) from [<c028a8ec>] (uart_write_wakeup+0x2c/0x30)
 r5:ed68ea90 r4:c06790d8
[<c028a8c0>] (uart_write_wakeup+0x0/0x30) from [<c028dc44>] (serial8250_tx_chars+0x114/0x170)
[<c028db30>] (serial8250_tx_chars+0x0/0x170) from [<c028dffc>] (serial8250_handle_irq+0xa0/0xbc)
 r6:000000c2 r5:00000060 r4:c06790d8 r3:00000000
[<c028df5c>] (serial8250_handle_irq+0x0/0xbc) from [<c02933a4>] (dw8250_handle_irq+0x38/0x64)
 r7:00000000 r6:edd2f390 r5:000000c2 r4:c06790d8
[<c029336c>] (dw8250_handle_irq+0x0/0x64) from [<c028d2f4>] (serial8250_interrupt+0x44/0xc4)
 r6:00000000 r5:00000000 r4:c06791c4 r3:c029336c
[<c028d2b0>] (serial8250_interrupt+0x0/0xc4) from [<c0067fe4>] (handle_irq_event_percpu+0xb4/0x2b0)
 r10:c06790d8 r9:eab27000 r8:00000000 r7:00000000 r6:0000001f r5:edd52980
 r4:ec53b6c0 r3:c028d2b0
[<c0067f30>] (handle_irq_event_percpu+0x0/0x2b0) from [<c006822c>] (handle_irq_event+0x4c/0x6c)
 r10:c06790d8 r9:eab27000 r8:c0673ae0 r7:c05c2020 r6:ec53b6c0 r5:edd529d4
 r4:edd52980
[<c00681e0>] (handle_irq_event+0x0/0x6c) from [<c006b140>] (handle_level_irq+0xe8/0x100)
 r6:00000000 r5:edd529d4 r4:edd52980 r3:00022000
[<c006b058>] (handle_level_irq+0x0/0x100) from [<c00676f8>] (generic_handle_irq+0x30/0x40)
 r5:0000001f r4:0000001f
[<c00676c8>] (generic_handle_irq+0x0/0x40) from [<c000f57c>] (handle_IRQ+0xd0/0x13c)
 r4:ea997b18 r3:000000e0
[<c000f4ac>] (handle_IRQ+0x0/0x13c) from [<c00086c4>] (armada_370_xp_handle_irq+0x4c/0x118)
 r8:000003ff r7:ea997b18 r6:ffffffff r5:60070013 r4:c0674dc0
[<c0008678>] (armada_370_xp_handle_irq+0x0/0x118) from [<c0013840>] (__irq_svc+0x40/0x70)
Exception stack(0xea997b18 to 0xea997b60)
7b00:                                                       00000001 20070013
7b20: 00000000 0000000b 20070013 eab27000 20070013 00000000 ed10103e eab27000
7b40: c06790d8 ea997b74 ea997b60 ea997b60 c04186c0 c04186c8 60070013 ffffffff
 r9:eab27000 r8:ed10103e r7:ea997b4c r6:ffffffff r5:60070013 r4:c04186c8
[<c04186a4>] (_raw_spin_unlock_irqrestore+0x0/0x54) from [<c0288fc0>] (uart_start+0x40/0x44)
 r4:c06790d8 r3:c028ddd8
[<c0288f80>] (uart_start+0x0/0x44) from [<c028982c>] (uart_write+0xe4/0xf4)
 r6:0000003e r5:00000000 r4:ed68ea90 r3:0000003e
[<c0289748>] (uart_write+0x0/0xf4) from [<bf3a0d20>] (sl_xmit+0x1c4/0x228 [slip])
 r10:ed388e60 r9:0000003c r8:ffffffdd r7:0000003e r6:ec02754c r5:ea717eb8
 r4:ec027000
[<bf3a0b5c>] (sl_xmit+0x0/0x228 [slip]) from [<c0368d74>] (dev_hard_start_xmit+0x39c/0x6d0)
 r8:eaf163c0 r7:ec027000 r6:ea717eb8 r5:00000000 r4:00000000
Signed-off-by: Tyler Hall <tylerwhall@gmail.com>
Cc: Oliver Hartkopp <socketcan@hartkopp.net>
Cc: Andre Naujoks <nautsch2@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>

661f7fda

vmxnet3: adjust ring sizes when interface is down · f00e2b0a

Neil Horman authored Jun 13, 2014

If ethtool is used to update ring sizes on a vmxnet3 interface that isn't
running, the change isn't stored, meaning the ring update is effectively is
ignored and lost without any indication to the user.

Other network drivers store the ring size update so that ring allocation uses
the new sizes next time the interface is brought up.  This patch modifies
vmxnet3 to behave this way as well
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: Shreyas Bhatewara <sbhatewara@vmware.com>
CC: "VMware, Inc." <pv-drivers@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f00e2b0a

16 Jun, 2014 16 commits

netfilter: nf_nat: fix oops on netns removal · 945b2b2d

Florian Westphal authored Jun 07, 2014

Quoting Samu Kallio:

 Basically what's happening is, during netns cleanup,
 nf_nat_net_exit gets called before ipv4_net_exit. As I understand
 it, nf_nat_net_exit is supposed to kill any conntrack entries which
 have NAT context (through nf_ct_iterate_cleanup), but for some
 reason this doesn't happen (perhaps something else is still holding
 refs to those entries?).

 When ipv4_net_exit is called, conntrack entries (including those
 with NAT context) are cleaned up, but the
 nat_bysource hashtable is long gone - freed in nf_nat_net_exit. The
 bug happens when attempting to free a conntrack entry whose NAT hash
 'prev' field points to a slot in the freed hash table (head for that
 bin).

We ignore conntracks with null nat bindings.  But this is wrong,
as these are in bysource hash table as well.

Restore nat-cleaning for the netns-is-being-removed case.

bug:
https://bugzilla.kernel.org/show_bug.cgi?id=65191

Fixes: c2d421e1 ('netfilter: nf_nat: fix race when unloading protocol modules')
Reported-by: Samu Kallio <samu.kallio@aberdeencloud.com>
Debugged-by: Samu Kallio <samu.kallio@aberdeencloud.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Tested-by: Samu Kallio <samu.kallio@aberdeencloud.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

945b2b2d

netfilter: ctnetlink: add zone size to length · 4a001068

Ken-ichirou MATSUZAWA authored Jun 16, 2014

Signed-off-by: Ken-ichirou MATSUZAWA <chamas@h4.dion.ne.jp>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

4a001068

Merge branch 'ipvs' · 98ca74f4

Pablo Neira Ayuso authored Jun 16, 2014

Simon Horman says:

====================
Fix for panic due use of tot_stats estimator outside of CONFIG_SYSCTL

It has been present since v3.6.39.
====================
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

98ca74f4

netfilter: nft_nat: don't dump port information if unset · 91513606

Pablo Neira Ayuso authored Jun 13, 2014

Don't include port information attributes if they are unset.
Reported-by: Ana Rey <anarey@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

91513606

netfilter: nf_tables: indicate family when dumping set elements · 6403d962

Pablo Neira Ayuso authored Jun 11, 2014

Set the nfnetlink header that indicates the family of this element.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

6403d962

netfilter: nft_compat: call {target, match}->destroy() to cleanup entry · 3d9b1421

Pablo Neira Ayuso authored Jun 11, 2014

Otherwise, the reference to external objects (eg. modules) are not
released when the rules are removed.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

3d9b1421

netfilter: nf_tables: fix wrong type in transaction when replacing rules · ac904ac8

Pablo Neira Ayuso authored Jun 10, 2014

In b380e5c7 ("netfilter: nf_tables: add message type to transactions"),
I used the wrong message type in the rule replacement case. The rule
that is replaced needs to be handled as a deleted rule.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

ac904ac8

netfilter: nf_tables: decrement chain use counter when replacing rules · ac34b861

Pablo Neira Ayuso authored Jun 10, 2014

Thus, the chain use counter remains with the same value after the
rule replacement.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

ac34b861

netfilter: nf_tables: use u32 for chain use counter · a0a7379e

Pablo Neira Ayuso authored Jun 10, 2014

Since 4fefee57 ("netfilter: nf_tables: allow to delete several objects
from a batch"), every new rule bumps the chain use counter. However,
this is limited to 16 bits, which means that it will overrun after
2^16 rules.

Use a u32 chain counter and check for overflows (just like we do for
table objects).
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

a0a7379e

netfilter: nf_tables: use RCU-safe list insertion when replacing rules · 5bc5c307

Pablo Neira Ayuso authored Jun 10, 2014

The patch 5e948466 ("netfilter: nf_tables: add insert operation") did
not include RCU-safe list insertion when replacing rules.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

5bc5c307

netfilter: ctnetlink: fix refcnt leak in dying/unconfirmed list dumper · cd5f336f

Florian Westphal authored Jun 08, 2014

'last' keeps track of the ct that had its refcnt bumped during previous
dump cycle.  Thus it must not be overwritten until end-of-function.

Another (unrelated, theoretical) issue: Don't attempt to bump refcnt of a conntrack
whose reference count is already 0.  Such conntrack is being destroyed
right now, its memory is freed once we release the percpu dying spinlock.

Fixes: b7779d06 ('netfilter: conntrack: spinlock per cpu to protect special lists.')
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

cd5f336f

netfilter: ctnetlink: fix dumping of dying/unconfirmed conntracks · 266155b2

Pablo Neira Ayuso authored Jun 05, 2014

The dumping prematurely stops, it seems the callback argument that
indicates that all entries have been dumped is set after iterating
on the first cpu list. The dumping also may stop before the entire
per-cpu list content is also dumped.

With this patch, conntrack -L dying now shows the dying list content
again.

Fixes: b7779d06 ("netfilter: conntrack: spinlock per cpu to protect special lists.")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

266155b2

Linux 3.16-rc1 · 7171511e
Linus Torvalds authored Jun 15, 2014

7171511e

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · a9be2242

Linus Torvalds authored Jun 15, 2014

Pull networking fixes from David Miller:

 1) Fix checksumming regressions, from Tom Herbert.

 2) Undo unintentional permissions changes for SCTP rto_alpha and
    rto_beta sysfs knobs, from Denial Borkmann.

 3) VXLAN, like other IP tunnels, should advertize it's encapsulation
    size using dev->needed_headroom instead of dev->hard_header_len.
    From Cong Wang.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  net: sctp: fix permissions for rto_alpha and rto_beta knobs
  vxlan: Checksum fixes
  net: add skb_pop_rcv_encapsulation
  udp: call __skb_checksum_complete when doing full checksum
  net: Fix save software checksum complete
  net: Fix GSO constants to match NETIF flags
  udp: ipv4: do not waste time in __udp4_lib_mcast_demux_lookup
  vxlan: use dev->needed_headroom instead of dev->hard_header_len
  MAINTAINERS: update cxgb4 maintainer

a9be2242

Merge tag 'clk-for-linus-3.16-part2' of git://git.linaro.org/people/mike.turquette/linux · dd1845af

Linus Torvalds authored Jun 15, 2014

Pull more clock framework updates from Mike Turquette:
 "This contains the second half the of the clk changes for 3.16.

  They are simply fixes and code refactoring for the OMAP clock drivers.
  The sunxi clock driver changes include splitting out the one
  mega-driver into several smaller pieces and adding support for the A31
  SoC clocks"

* tag 'clk-for-linus-3.16-part2' of git://git.linaro.org/people/mike.turquette/linux: (25 commits)
  clk: sunxi: document PRCM clock compatible strings
  clk: sunxi: add PRCM (Power/Reset/Clock Management) clks support
  clk: sun6i: Protect SDRAM gating bit
  clk: sun6i: Protect CPU clock
  clk: sunxi: Rework clock protection code
  clk: sunxi: Move the GMAC clock to a file of its own
  clk: sunxi: Move the 24M oscillator to a file of its own
  clk: sunxi: Remove calls to clk_put
  clk: sunxi: document new A31 USB clock compatible
  clk: sunxi: Implement A31 USB clock
  ARM: dts: OMAP5/DRA7: use omap5-mpu-dpll-clock capable of dealing with higher frequencies
  CLK: TI: dpll: support OMAP5 MPU DPLL that need special handling for higher frequencies
  ARM: OMAP5+: dpll: support Duty Cycle Correction(DCC)
  CLK: TI: clk-54xx: Set the rate for dpll_abe_m2x2_ck
  CLK: TI: Driver for DRA7 ATL (Audio Tracking Logic)
  dt:/bindings: DRA7 ATL (Audio Tracking Logic) clock bindings
  ARM: dts: dra7xx-clocks: Correct name for atl clkin3 clock
  CLK: TI: gate: add composite interface clock to OMAP2 only build
  ARM: OMAP2: clock: add DT boot support for cpufreq_ck
  CLK: TI: OMAP2: add clock init support
  ...

dd1845af

Merge git://git.infradead.org/users/willy/linux-nvme · b55b3902

Linus Torvalds authored Jun 15, 2014

Pull NVMe update from Matthew Wilcox:
 "Mostly bugfixes again for the NVMe driver.  I'd like to call out the
  exported tracepoint in the block layer; I believe Keith has cleared
  this with Jens.

  We've had a few reports from people who're really pounding on NVMe
  devices at scale, hence the timeout changes (and new module
  parameters), hotplug cpu deadlock, tracepoints, and minor performance
  tweaks"

[ Jens hadn't seen that tracepoint thing, but is ok with it - it will
  end up going away when mq conversion happens ]

* git://git.infradead.org/users/willy/linux-nvme: (22 commits)
  NVMe: Fix START_STOP_UNIT Scsi->NVMe translation.
  NVMe: Use Log Page constants in SCSI emulation
  NVMe: Define Log Page constants
  NVMe: Fix hot cpu notification dead lock
  NVMe: Rename io_timeout to nvme_io_timeout
  NVMe: Use last bytes of f/w rev SCSI Inquiry
  NVMe: Adhere to request queue block accounting enable/disable
  NVMe: Fix nvme get/put queue semantics
  NVMe: Delete NVME_GET_FEAT_TEMP_THRESH
  NVMe: Make admin timeout a module parameter
  NVMe: Make iod bio timeout a parameter
  NVMe: Prevent possible NULL pointer dereference
  NVMe: Fix the buffer size passed in GetLogPage(CDW10.NUMD)
  NVMe: Update data structures for NVMe 1.2
  NVMe: Enable BUILD_BUG_ON checks
  NVMe: Update namespace and controller identify structures to the 1.1a spec
  NVMe: Flush with data support
  NVMe: Configure support for block flush
  NVMe: Add tracepoints
  NVMe: Protect against badly formatted CQEs
  ...

b55b3902

15 Jun, 2014 6 commits

net: sctp: fix permissions for rto_alpha and rto_beta knobs · b58537a1

Daniel Borkmann authored Jun 15, 2014

Commit 3fd091e7 ("[SCTP]: Remove multiple levels of msecs
to jiffies conversions.") has silently changed permissions for
rto_alpha and rto_beta knobs from 0644 to 0444. The purpose of
this was to discourage users from tweaking rto_alpha and
rto_beta knobs in production environments since they are key
to correctly compute rtt/srtt.

RFC4960 under section 6.3.1. RTO Calculation says regarding
rto_alpha and rto_beta under rule C3 and C4:

  [...]
  C3)  When a new RTT measurement R' is made, set

       RTTVAR <- (1 - RTO.Beta) * RTTVAR + RTO.Beta * |SRTT - R'|

       and

       SRTT <- (1 - RTO.Alpha) * SRTT + RTO.Alpha * R'

       Note: The value of SRTT used in the update to RTTVAR
       is its value before updating SRTT itself using the
       second assignment. After the computation, update
       RTO <- SRTT + 4 * RTTVAR.

  C4)  When data is in flight and when allowed by rule C5
       below, a new RTT measurement MUST be made each round
       trip. Furthermore, new RTT measurements SHOULD be
       made no more than once per round trip for a given
       destination transport address. There are two reasons
       for this recommendation: First, it appears that
       measuring more frequently often does not in practice
       yield any significant benefit [ALLMAN99]; second,
       if measurements are made more often, then the values
       of RTO.Alpha and RTO.Beta in rule C3 above should be
       adjusted so that SRTT and RTTVAR still adjust to
       changes at roughly the same rate (in terms of how many
       round trips it takes them to reflect new values) as
       they would if making only one measurement per
       round-trip and using RTO.Alpha and RTO.Beta as given
       in rule C3. However, the exact nature of these
       adjustments remains a research issue.
  [...]

While it is discouraged to adjust rto_alpha and rto_beta
and not further specified how to adjust them, the RFC also
doesn't explicitly forbid it, but rather gives a RECOMMENDED
default value (rto_alpha=3, rto_beta=2). We have a couple
of users relying on the old permissions before they got
changed. That said, if someone really has the urge to adjust
them, we could allow it with a warning in the log.

Fixes: 3fd091e7 ("[SCTP]: Remove multiple levels of msecs to jiffies conversions.")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b58537a1

Merge branch 'csum_fixes' · e4f7ae93

David S. Miller authored Jun 15, 2014

Tom Herbert says:

====================
Fixes related to some recent checksum modifications.

- Fix GSO constants to match NETIF flags
- Fix logic in saving checksum complete in __skb_checksum_complete
- Call __skb_checksum_complete from UDP if we are checksumming over
  whole packet in order to save checksum.
- Fixes to VXLAN to work correctly with checksum complete
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

e4f7ae93

vxlan: Checksum fixes · f79b064c

Tom Herbert authored Jun 14, 2014

Call skb_pop_rcv_encapsulation and postpull_rcsum for the Ethernet
header to work properly with checksum complete.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f79b064c

net: add skb_pop_rcv_encapsulation · e5eb4e30

Tom Herbert authored Jun 14, 2014

This function is used by UDP encapsulation protocols in RX when
crossing encapsulation boundary. If ip_summed is set to
CHECKSUM_UNNECESSARY and encapsulation is not set, change to
CHECKSUM_NONE since the checksum has not been validated within the
encapsulation. Clears csum_valid by the same rationale.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e5eb4e30

udp: call __skb_checksum_complete when doing full checksum · bbdff225

Tom Herbert authored Jun 14, 2014

In __udp_lib_checksum_complete check if checksum is being done over all
the data (len is equal to skb->len) and if it is call
__skb_checksum_complete instead of __skb_checksum_complete_head. This
allows checksum to be saved in checksum complete.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bbdff225

net: Fix save software checksum complete · 46fb51eb

Tom Herbert authored Jun 14, 2014

Geert reported issues regarding checksum complete and UDP.
The logic introduced in commit 7e3cead5
("net: Save software checksum complete") is not correct.

This patch:
1) Restores code in __skb_checksum_complete_header except for setting
   CHECKSUM_UNNECESSARY. This function may be calculating checksum on
   something less than skb->len.
2) Adds saving checksum to __skb_checksum_complete. The full packet
   checksum 0..skb->len is calculated without adding in pseudo header.
   This value is saved in skb->csum and then the pseudo header is added
   to that to derive the checksum for validation.
3) In both __skb_checksum_complete_header and __skb_checksum_complete,
   set skb->csum_valid to whether checksum of zero was computed. This
   allows skb_csum_unnecessary to return true without changing to
   CHECKSUM_UNNECESSARY which was done previously.
4) Copy new csum related bits in __copy_skb_header.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

46fb51eb