Commits · 610438b74496b2986a9025f8e23c134cb638e338 · nexedi / linux

11 Dec, 2013 25 commits

udp: ipv4: fix potential use after free in udp_v4_early_demux() · 610438b7

Eric Dumazet authored Dec 11, 2013

pskb_may_pull() can reallocate skb->head, we need to move the
initialization of iph and uh pointers after its call.

Fixes: 421b3885 ("udp: ipv4: Add udp early demux")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Shawn Bohrer <sbohrer@rgmadvisors.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

610438b7

macvtap: signal truncated packets · ce232ce0

Jason Wang authored Dec 11, 2013

macvtap_put_user() never return a value grater than iov length, this in fact
bypasses the truncated checking in macvtap_recvmsg(). Fix this by always
returning the size of packet plus the possible vlan header to let the trunca
checking work.

Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ce232ce0

tun: unbreak truncated packet signalling · e6fd07c8

Jason Wang authored Dec 11, 2013

Commit 6680ec68
(tuntap: hardware vlan tx support) breaks the truncated packet signal by nev
return a length greater than iov length in tun_put_user(). This patch fixes
by always return the length of packet plus possible vlan header. Caller can
detect the truncated packet by comparing the return value and the size of io
length.

Cc: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e6fd07c8

net: sched: htb: fix the calculation of quantum · 1598f7cb

Yang Yingliang authored Dec 10, 2013

Now, 32bit rates may be not the true rate.
So use rate_bytes_ps which is from
max(rate32, rate64) to calcualte quantum.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1598f7cb

net: sched: tbf: fix the calculation of max_size · cc106e44

Yang Yingliang authored Dec 10, 2013

Current max_size is caluated from rate table. Now, the rate table
has been replaced and it's wrong to caculate max_size based on this
rate table. It can lead wrong calculation of max_size.

The burst in kernel may be lower than user asked, because burst may gets
some loss when transform it to buffer(E.g. "burst 40kb rate 30mbit/s")
and it seems we cannot avoid this loss. Burst's value(max_size) based on
rate table may be equal user asked. If a packet's length is max_size, this
packet will be stalled in tbf_dequeue() because its length is above the
burst in kernel so that it cannot get enough tokens. The max_size guards
against enqueuing packet sizes above q->buffer "time" in tbf_enqueue().

To make consistent with the calculation of tokens, this patch add a helper
psched_ns_t2l() to calculate burst(max_size) directly to fix this problem.

After this fix, we can support to using 64bit rates to calculate burst as well.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cc106e44

micrel: add support for KSZ8041RNLI · 4bd7b512

Sergei Shtylyov authored Dec 10, 2013

Renesas R-Car development boards use KSZ8041RNLI PHY which for some reason has
ID of 0x00221537 that is not documented for KSZ8041-family PHYs and does not
match the documented ID of 0x0022151x (where 'x' is the revision). We have
to add the new #define PHY_ID_* and new ksphy_driver[] entry, almost the same
as KSZ8041 one, differing only in the 'phy_id' and 'name' fields.
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

4bd7b512

Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless · d0977e2b

David S. Miller authored Dec 11, 2013

John W. Linville says:

====================
Just one patch this time -- a fix from Felix Fietkau to fix the
duration calculation for non-aggregated packets in ath9k.  This is
a small change and it is obviously specific to ath9k.

Please let me know if there are problems!
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

d0977e2b

Merge branch 'master' of... · 33457ff7

John W. Linville authored Dec 11, 2013

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem

33457ff7

udp: ipv4: fix an use after free in __udp4_lib_rcv() · 8afdd99a

Eric Dumazet authored Dec 10, 2013

Dave Jones reported a use after free in UDP stack :

[ 5059.434216] =========================
[ 5059.434314] [ BUG: held lock freed! ]
[ 5059.434420] 3.13.0-rc3+ #9 Not tainted
[ 5059.434520] -------------------------
[ 5059.434620] named/863 is freeing memory ffff88005e960000-ffff88005e96061f, with a lock still held there!
[ 5059.434815]  (slock-AF_INET){+.-...}, at: [<ffffffff8149bd21>] udp_queue_rcv_skb+0xd1/0x4b0
[ 5059.435012] 3 locks held by named/863:
[ 5059.435086]  #0:  (rcu_read_lock){.+.+..}, at: [<ffffffff8143054d>] __netif_receive_skb_core+0x11d/0x940
[ 5059.435295]  #1:  (rcu_read_lock){.+.+..}, at: [<ffffffff81467a5e>] ip_local_deliver_finish+0x3e/0x410
[ 5059.435500]  #2:  (slock-AF_INET){+.-...}, at: [<ffffffff8149bd21>] udp_queue_rcv_skb+0xd1/0x4b0
[ 5059.435734]
stack backtrace:
[ 5059.435858] CPU: 0 PID: 863 Comm: named Not tainted 3.13.0-rc3+ #9 [loadavg: 0.21 0.06 0.06 1/115 1365]
[ 5059.436052] Hardware name:                  /D510MO, BIOS MOPNV10J.86A.0175.2010.0308.0620 03/08/2010
[ 5059.436223]  0000000000000002 ffff88007e203ad8 ffffffff8153a372 ffff8800677130e0
[ 5059.436390]  ffff88007e203b10 ffffffff8108cafa ffff88005e960000 ffff88007b00cfc0
[ 5059.436554]  ffffea00017a5800 ffffffff8141c490 0000000000000246 ffff88007e203b48
[ 5059.436718] Call Trace:
[ 5059.436769]  <IRQ>  [<ffffffff8153a372>] dump_stack+0x4d/0x66
[ 5059.436904]  [<ffffffff8108cafa>] debug_check_no_locks_freed+0x15a/0x160
[ 5059.437037]  [<ffffffff8141c490>] ? __sk_free+0x110/0x230
[ 5059.437147]  [<ffffffff8112da2a>] kmem_cache_free+0x6a/0x150
[ 5059.437260]  [<ffffffff8141c490>] __sk_free+0x110/0x230
[ 5059.437364]  [<ffffffff8141c5c9>] sk_free+0x19/0x20
[ 5059.437463]  [<ffffffff8141cb25>] sock_edemux+0x25/0x40
[ 5059.437567]  [<ffffffff8141c181>] sock_queue_rcv_skb+0x81/0x280
[ 5059.437685]  [<ffffffff8149bd21>] ? udp_queue_rcv_skb+0xd1/0x4b0
[ 5059.437805]  [<ffffffff81499c82>] __udp_queue_rcv_skb+0x42/0x240
[ 5059.437925]  [<ffffffff81541d25>] ? _raw_spin_lock+0x65/0x70
[ 5059.438038]  [<ffffffff8149bebb>] udp_queue_rcv_skb+0x26b/0x4b0
[ 5059.438155]  [<ffffffff8149c712>] __udp4_lib_rcv+0x152/0xb00
[ 5059.438269]  [<ffffffff8149d7f5>] udp_rcv+0x15/0x20
[ 5059.438367]  [<ffffffff81467b2f>] ip_local_deliver_finish+0x10f/0x410
[ 5059.438492]  [<ffffffff81467a5e>] ? ip_local_deliver_finish+0x3e/0x410
[ 5059.438621]  [<ffffffff81468653>] ip_local_deliver+0x43/0x80
[ 5059.438733]  [<ffffffff81467f70>] ip_rcv_finish+0x140/0x5a0
[ 5059.438843]  [<ffffffff81468926>] ip_rcv+0x296/0x3f0
[ 5059.438945]  [<ffffffff81430b72>] __netif_receive_skb_core+0x742/0x940
[ 5059.439074]  [<ffffffff8143054d>] ? __netif_receive_skb_core+0x11d/0x940
[ 5059.442231]  [<ffffffff8108c81d>] ? trace_hardirqs_on+0xd/0x10
[ 5059.442231]  [<ffffffff81430d83>] __netif_receive_skb+0x13/0x60
[ 5059.442231]  [<ffffffff81431c1e>] netif_receive_skb+0x1e/0x1f0
[ 5059.442231]  [<ffffffff814334e0>] napi_gro_receive+0x70/0xa0
[ 5059.442231]  [<ffffffffa01de426>] rtl8169_poll+0x166/0x700 [r8169]
[ 5059.442231]  [<ffffffff81432bc9>] net_rx_action+0x129/0x1e0
[ 5059.442231]  [<ffffffff810478cd>] __do_softirq+0xed/0x240
[ 5059.442231]  [<ffffffff81047e25>] irq_exit+0x125/0x140
[ 5059.442231]  [<ffffffff81004241>] do_IRQ+0x51/0xc0
[ 5059.442231]  [<ffffffff81542bef>] common_interrupt+0x6f/0x6f

We need to keep a reference on the socket, by using skb_steal_sock()
at the right place.

Note that another patch is needed to fix a race in
udp_sk_rx_dst_set(), as we hold no lock protecting the dst.

Fixes: 421b3885 ("udp: ipv4: Add udp early demux")
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Shawn Bohrer <sbohrer@rgmadvisors.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8afdd99a

Merge branch 'sctp' · 4585a79d

David S. Miller authored Dec 10, 2013

Wang Weidong says:

====================
sctp: check the rto_min and rto_max

v6 -> v7:
  -patch2: fix the whitespace issues which pointed out by Daniel

v5 -> v6:
  split the v5' first patch to patch1 and patch2, and remove the
  macro in constants.h

  -patch1: do rto_min/max socket option handling in its own patch, and
   fix the check of rto_min/max.
  -patch2: do rto_min/max sysctl handling in its own patch.
  -patch3: add Suggested-by Daniel.

v4 -> v5:
  - patch1: add marco in constants.h and fix up spacing as
    suggested by Daniel
  - patch2: add a patch for fix up do_hmac_alg for according
    to do_rto_min[max]

v3 -> v4:
  -patch1: fix use init_net directly which suggested by Vlad.

v2 -> v3:
  -patch1: add proc_handler for check rto_min and rto_max which suggested
   by Vlad

v1 -> v2:
  -patch1: fix the From Name which pointed out by David, and
   add the ACK by Neil
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

4585a79d

sctp: fix up a spacing · b486b228

wangweidong authored Dec 11, 2013

fix up spacing of proc_sctp_do_hmac_alg for according to the
proc_sctp_do_rto_min[max] in sysctl.c
Suggested-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b486b228

sctp: add check rto_min and rto_max in sysctl · 4f3fdf3b

wangweidong authored Dec 11, 2013

rto_min should be smaller than rto_max while rto_max should be larger
than rto_min. Add two proc_handler for the checking.
Suggested-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4f3fdf3b

sctp: check the rto_min and rto_max in setsockopt · 85f935d4

wangweidong authored Dec 11, 2013

When we set 0 to rto_min or rto_max, just not change the value. Also
we should check the rto_min > rto_max.
Suggested-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

85f935d4

ipv6: do not erase dst address with flow label destination · ce7a3bdf

Florent Fourcot authored Dec 10, 2013

This patch is following b579035f
	"ipv6: remove old conditions on flow label sharing"

Since there is no reason to restrict a label to a
destination, we should not erase the destination value of a
socket with the value contained in the flow label storage.

This patch allows to really have the same flow label to more
than one destination.
Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr>
Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

ce7a3bdf

sctp: properly latch and use autoclose value from sock to association · 9f70f46b

Neil Horman authored Dec 10, 2013

Currently, sctp associations latch a sockets autoclose value to an association
at association init time, subject to capping constraints from the max_autoclose
sysctl value.  This leads to an odd situation where an application may set a
socket level autoclose timeout, but sliently sctp will limit the autoclose
timeout to something less than that.

Fix this by modifying the autoclose setsockopt function to check the limit, cap
it and warn the user via syslog that the timeout is capped.  This will allow
getsockopt to return valid autoclose timeout values that reflect what subsequent
associations actually use.

While were at it, also elimintate the assoc->autoclose variable, it duplicates
whats in the timeout array, which leads to multiple sources for the same
information, that may differ (as the former isn't subject to any capping).  This
gives us the timeout information in a canonical place and saves some space in
the association structure as well.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
CC: Wang Weidong <wangweidong1@huawei.com>
CC: David Miller <davem@davemloft.net>
CC: Vlad Yasevich <vyasevich@gmail.com>
CC: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>

9f70f46b

Merge branch 'tipc' · 231df15f

David S. Miller authored Dec 10, 2013

Jon Maloy says:

====================
tipc: corrections related to tasklet job mechanism

These commits correct two bugs related to tipc' service for launching
functions for asynchronous execution in a separate tasklet.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

231df15f

tipc: protect handler_enabled variable with qitem_lock spin lock · 00ede977

Ying Xue authored Dec 09, 2013

'handler_enabled' is a global flag indicating whether the TIPC
signal handling service is enabled or not. The lack of lock
protection for this flag incurs a risk for contention, so that
a tipc_k_signal() call might queue a signal handler to a destroyed
signal queue, with unpredictable results. To correct this, we let
the already existing 'qitem_lock' protect the flag, as it already
does with the queue itself. This way, we ensure that the flag
always is consistent across all cores.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

00ede977

tipc: correct the order of stopping services at rmmod · 993b858e

Jon Paul Maloy authored Dec 09, 2013

The 'signal handler' service in TIPC is a mechanism that makes it
possible to postpone execution of functions, by launcing them into
a job queue for execution in a separate tasklet, independent of
the launching execution thread.

When we do rmmod on the tipc module, this service is stopped after
the network service. At the same time, the stopping of the network
service may itself launch jobs for execution, with the risk that these
functions may be scheduled for execution after the data structures
meant to be accessed by the job have already been deleted. We have
seen this happen, most often resulting in an oops.

This commit ensures that the signal handler is the very first to be
stopped when TIPC is shut down, so there are no surprises during
the cleanup of the other services.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

993b858e

tg3: Initialize REG_BASE_ADDR at PCI config offset 120 to 0 · 388d3335

Nat Gurumoorthy authored Dec 09, 2013

The new tg3 driver leaves REG_BASE_ADDR (PCI config offset 120)
uninitialized. From power on reset this register may have garbage in it. The
Register Base Address register defines the device local address of a
register. The data pointed to by this location is read or written using
the Register Data register (PCI config offset 128). When REG_BASE_ADDR has
garbage any read or write of Register Data Register (PCI 128) will cause the
PCI bus to lock up. The TCO watchdog will fire and bring down the system.
Signed-off-by: Nat Gurumoorthy <natg@google.com>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

388d3335

net: Revert macvtap/tun truncation signalling changes. · bbd37626

David S. Miller authored Dec 10, 2013

Jason Wang and Michael S. Tsirkin are still discussing how
to properly fix this.
Signed-off-by: David S. Miller <davem@davemloft.net>

bbd37626

macvtap: signal truncated packets · 730054da

Jason Wang authored Dec 09, 2013

macvtap_put_user() never return a value grater than iov length, this in fact
bypasses the truncated checking in macvtap_recvmsg(). Fix this by always
returning the size of packet plus the possible vlan header to let the truncated
checking work.

Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

730054da

tun: unbreak truncated packet signalling · 923347bb

Jason Wang authored Dec 09, 2013

Commit 6680ec68
(tuntap: hardware vlan tx support) breaks the truncated packet signal by never
return a length greater than iov length in tun_put_user(). This patch fixes this
by always return the length of packet plus possible vlan header. Caller can
detect the truncated packet by comparing the return value and the size of iov
length.
Reported-by: Vlad Yasevich <vyasevich@gmail.com>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

923347bb

vxlan: release rt when found circular route · fffc15a5

Fan Du authored Dec 09, 2013

Otherwise causing dst memory leakage.
Have Checked all other type tunnel device transmit implementation,
no such things happens anymore.
Signed-off-by: Fan Du <fan.du@windriver.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fffc15a5

net: unix: allow set_peek_off to fail · 12663bfc

Sasha Levin authored Dec 07, 2013

unix_dgram_recvmsg() will hold the readlock of the socket until recv
is complete.

In the same time, we may try to setsockopt(SO_PEEK_OFF) which will hang until
unix_dgram_recvmsg() will complete (which can take a while) without allowing
us to break out of it, triggering a hung task spew.

Instead, allow set_peek_off to fail, this way userspace will not hang.
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

12663bfc

Merge branch 'sfc-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc · 88b07b36

David S. Miller authored Dec 10, 2013

Ben Hutchings says:

====================
Several fixes for the PTP hardware support added in 3.7:
1. Fix filtering of PTP packets on the TX path to be robust against bad
header lengths.
2. Limit logging on the RX path in case of a PTP packet flood, partly
from Laurence Evans.
3. Disable PTP hardware when the interface is down so that we don't
receive RX timestamp events, from Alexandre Rames.
4. Maintain clock frequency adjustment when a time offset is applied.

Also fixes for the SFC9100 family support added in 3.12:
5. Take the RX prefix length into account when applying NET_IP_ALIGN,
from Andrew Rybchenko.
6. Work around a bug that breaks communication between the driver and
firmware, from Robert Stonehouse.

Please also queue these up for the appropriate stable branches.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

88b07b36

10 Dec, 2013 8 commits

net: allwinner: emac: Add missing free_irq · e9c56f8d

Maxime Ripard authored Dec 10, 2013

The sun4i-emac driver uses devm_request_irq at .ndo_open time, but relies on
the managed device mechanism to actually free it. This causes an issue whenever
someone wants to restart the interface, the interrupt still being held, and not
yet released.

Fall back to using the regular request_irq at .ndo_open time, and introduce a
free_irq during .ndo_stop.
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Cc: stable@vger.kernel.org # 3.11+
Signed-off-by: David S. Miller <davem@davemloft.net>

e9c56f8d

inet: fix NULL pointer Oops in fib(6)_rule_suppress · 673498b8

Stefan Tomanek authored Dec 10, 2013

This changes ensures that the routing entry investigated by the suppress
function actually does point to a device struct before following that pointer,
fixing a possible kernel oops situation when verifying the interface group
associated with a routing table entry.

According to Daniel Golle, this Oops can be triggered by a user process trying
to establish an outgoing IPv6 connection while having no real IPv6 connectivity
set up (only autoassigned link-local addresses).

Fixes: 6ef94cfa ("fib_rules: add route suppression based on ifgroup")
Reported-by: Daniel Golle <daniel.golle@gmail.com>
Tested-by: Daniel Golle <daniel.golle@gmail.com>
Signed-off-by: Stefan Tomanek <stefan.tomanek@wertarbyte.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

673498b8

net: drop_monitor: fix the value of maxattr · d323e92c

Changli Gao authored Dec 08, 2013

maxattr in genl_family should be used to save the max attribute
type, but not the max command type. Drop monitor doesn't support
any attributes, so we should leave it as zero.
Signed-off-by: David S. Miller <davem@davemloft.net>

d323e92c

net: emaclite: add barriers to support Xilinx Zynq platform · ec21b6b4

Srikanth Thokala authored Dec 07, 2013

This patch adds barriers at appropriate places to ensure the driver
works on Xilinx Zynq ARM-based SoC platform.
Signed-off-by: Srikanth Thokala <sthokal@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ec21b6b4

net: emaclite: Remove unnecessary code that enables/disables interrupts on PONG buffers · 243fedd5

Srikanth Thokala authored Dec 07, 2013

There are no specific interrupts for the PONG buffer on both
transmit and receive side, same interrupt is valid for both
buffers. So, this patch removes this code.
Signed-off-by: Srikanth Thokala <sthokal@xilinx.com>
Reviewed-by: Michal Simek <monstr@monstr.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>

243fedd5

ipv6: don't count addrconf generated routes against gc limit · a3300ef4

Hannes Frederic Sowa authored Dec 07, 2013

Brett Ciphery reported that new ipv6 addresses failed to get installed
because the addrconf generated dsts where counted against the dst gc
limit. We don't need to count those routes like we currently don't count
administratively added routes.

Because the max_addresses check enforces a limit on unbounded address
generation first in case someone plays with router advertisments, we
are still safe here.
Reported-by: Brett Ciphery <brett.ciphery@windriver.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

a3300ef4

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · 6a46ff87

David S. Miller authored Dec 09, 2013

Pablo Neira Ayuso says:

====================
The following patchset contains three Netfilter fixes for your net tree,
they are:

* fix incorrect comparison in the new netnet hash ipset type, from
  Dave Jones.

* fix splat in hashlimit due to missing removal of the content of its
  proc entry in netnamespaces, from Sergey Popovich.

* fix missing rule flushing operation by table in nf_tables. Table
  flushing was already discussed back in October but this got lost and
  no patch has hit the tree to address this issue so far, from me.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

6a46ff87

packet: fix send path when running with proto == 0 · 66e56cd4

Daniel Borkmann authored Dec 06, 2013

Commit e40526cb introduced a cached dev pointer, that gets
hooked into register_prot_hook(), __unregister_prot_hook() to
update the device used for the send path.

We need to fix this up, as otherwise this will not work with
sockets created with protocol = 0, plus with sll_protocol = 0
passed via sockaddr_ll when doing the bind.

So instead, assign the pointer directly. The compiler can inline
these helper functions automagically.

While at it, also assume the cached dev fast-path as likely(),
and document this variant of socket creation as it seems it is
not widely used (seems not even the author of TX_RING was aware
of that in his reference example [1]). Tested with reproducer
from e40526cb.

 [1] http://wiki.ipxwarzone.com/index.php5?title=Linux_packet_mmap#Example

Fixes: e40526cb ("packet: fix use after free race in send path when dev is released")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Tested-by: Salam Noureddine <noureddine@aristanetworks.com>
Tested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

66e56cd4

09 Dec, 2013 1 commit

ath9k: fix duration calculation for non-aggregated packets · bbf807bc

Felix Fietkau authored Dec 05, 2013

When not aggregating packets, fi->framelen should be passed in as length
to calculate the duration. Before the tx path rework, ath_tx_fill_desc
was called for either one aggregate, or one single frame, with the
length of the packet or the aggregate as a parameter.
After the rework, ath_tx_sched_aggr can pass a burst of single frames to
ath_tx_fill_desc and sets len=0.
Fix broken duration calculation by overriding the length in ath_tx_fill_desc
before passing it to ath_buf_set_rate.

Cc: stable@vger.kernel.org
Reported-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

bbf807bc

07 Dec, 2013 2 commits

netfilter: nf_tables: fix missing rules flushing per table · cf9dc09d

Pablo Neira Ayuso authored Nov 24, 2013

This patch allows you to atomically remove all rules stored in
a table via the NFT_MSG_DELRULE command. You only need to indicate
the specific table and no chain to flush all rules stored in that
table.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

cf9dc09d

netfilter: xt_hashlimit: fix proc entry leak in netns destroy path · b4ef4ce0

Sergey Popovich authored Dec 06, 2013

In (32263dd1 netfilter: xt_hashlimit: fix namespace destroy path)
the hashlimit_net_exit() function is always called right before
hashlimit_mt_destroy() to release netns data. If you use xt_hashlimit
with IPv4 and IPv6 together, this produces the following splat via
netconsole in the netns destroy path:

 Pid: 9499, comm: kworker/u:0 Tainted: G        WC O 3.2.0-5-netctl-amd64-core2
 Call Trace:
  [<ffffffff8104708d>] ? warn_slowpath_common+0x78/0x8c
  [<ffffffff81047139>] ? warn_slowpath_fmt+0x45/0x4a
  [<ffffffff81144a99>] ? remove_proc_entry+0xd8/0x22e
  [<ffffffff810ebbaa>] ? kfree+0x5b/0x6c
  [<ffffffffa043c501>] ? hashlimit_net_exit+0x45/0x8d [xt_hashlimit]
  [<ffffffff8128ab30>] ? ops_exit_list+0x1c/0x44
  [<ffffffff8128b28e>] ? cleanup_net+0xf1/0x180
  [<ffffffff810369fc>] ? should_resched+0x5/0x23
  [<ffffffff8105b8f9>] ? process_one_work+0x161/0x269
  [<ffffffff8105aea5>] ? cwq_activate_delayed_work+0x3c/0x48
  [<ffffffff8105c8c2>] ? worker_thread+0xc2/0x145
  [<ffffffff8105c800>] ? manage_workers.isra.25+0x15b/0x15b
  [<ffffffff8105fa01>] ? kthread+0x76/0x7e
  [<ffffffff813581f4>] ? kernel_thread_helper+0x4/0x10
  [<ffffffff8105f98b>] ? kthread_worker_fn+0x139/0x139
  [<ffffffff813581f0>] ? gs_change+0x13/0x13
 ---[ end trace d8c3cc0ad163ef79 ]---
 ------------[ cut here ]------------
 WARNING: at /usr/src/linux-3.2.52/debian/build/source_netctl/fs/proc/generic.c:849
 remove_proc_entry+0x217/0x22e()
 Hardware name:
 remove_proc_entry: removing non-empty directory 'net/ip6t_hashlimit', leaking at least 'IN-REJECT'

This is due to lack of removal net/ip6t_hashlimit/* entries in
hashlimit_proc_net_exit(), since only IPv4 entries are deleted. Fix
it by always removing the IPv4 and IPv6 entries and their parent
directories in the netns destroy path.
Signed-off-by: Sergey Popovich <popovich_sergei@mail.ru>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

b4ef4ce0

06 Dec, 2013 4 commits

sfc: Poll for MCDI completion once before timeout occurs · 6b294b8e

Robert Stonehouse authored Oct 09, 2013

There is an as-yet unexplained bug that sometimes prevents (or delays)
the driver seeing the completion event for a completed MCDI request on
the SFC9120. The requested configuration change will have happened
but the driver assumes it to have failed, and this can result in
further failures. We can mitigate this by polling for completion
after unsuccessfully waiting for an event.

Fixes: 8127d661 ('sfc: Add support for Solarflare SFC9100 family')
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>

6b294b8e

sfc: Refactor efx_mcdi_poll() by introducing efx_mcdi_poll_once() · 5731d7b3
Robert Stonehouse authored Oct 09, 2013
```
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
```
5731d7b3

sfc: RX buffer allocation takes prefix size into account in IP header alignment · 2ec03014

Andrew Rybchenko authored Nov 16, 2013

rx_prefix_size is 4-bytes aligned on Falcon/Siena (16 bytes), but it is equal
to 14 on EF10. So, it should be taken into account if arch requires IP header
to be 4-bytes aligned (via NET_IP_ALIGN).

Fixes: 8127d661 ('sfc: Add support for Solarflare SFC9100 family')
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>

2ec03014

sfc: Maintain current frequency adjustment when applying a time offset · cd6fe65e

Ben Hutchings authored Dec 05, 2013

There is a single MCDI PTP operation for setting the frequency
adjustment and applying a time offset to the hardware clock. When
applying a time offset we should not change the frequency adjustment.

These two operations can now be requested separately but this requires
a flash firmware update. Keep using the single operation, but
remember and repeat the previous frequency adjustment.

Fixes: 7c236c43 ('sfc: Add support for IEEE-1588 PTP')
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>

cd6fe65e