Commits · d9b2938aabf757da2d40153489b251d4fc3fdd18 · Kirill Smelkov / linux

30 Aug, 2014 8 commits

net: attempt a single high order allocation · d9b2938a

Eric Dumazet authored Aug 27, 2014

In commit ed98df33 ("net: use __GFP_NORETRY for high order
allocations") we tried to address one issue caused by order-3
allocations.

We still observe high latencies and system overhead in situations where
compaction is not successful.

Instead of trying order-3, order-2, and order-1, do a single order-3
best effort and immediately fallback to plain order-0.

This mimics slub strategy to fallback to slab min order if the high
order allocation used for performance failed.

Order-3 allocations give a performance boost only if they can be done
without recurring and expensive memory scan.

Quoting David :

The page allocator relies on synchronous (sync light) memory compaction
after direct reclaim for allocations that don't retry and deferred
compaction doesn't work with this strategy because the allocation order
is always decreasing from the previous failed attempt.

This means sync light compaction will always be encountered if memory
cannot be defragmented or reclaimed several times during the
skb_page_frag_refill() iteration.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d9b2938a

Merge branch 'mlx4-net' · bcc73547

David S. Miller authored Aug 29, 2014

Or Gerlitz says:

====================
Setup mlx4 user space Ethernet QPs to properly handle VXLAN

This short series fixes the mlx4 driver setting of user space Ethernet QPs
(e.g those opened by DPDK applications) such that they will properly handle
VXLAN traffic/offloads
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

bcc73547

mlx4: Set user-space raw Ethernet QPs to properly handle VXLAN traffic · d2fce8a9

Or Gerlitz authored Aug 27, 2014

Raw Ethernet QPs opened from user-space lack the proper setup to
recieve/handle VXLAN traffic when VXLAN offloads are enabled.

Fix that by adding a tunnel steering rule on top of the normal unicast
steering rule and set the tunnel_type field in the QP context.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d2fce8a9

net/mlx4: Move the tunnel steering helper function to mlx4_core · b95089d0

Or Gerlitz authored Aug 27, 2014

Move the function which we use to set VXLAN DMFS (flow-steering) rules
from mlx4_en to mlx4_core. This refactoring will allow the mlx4_ib driver
to call the helper for the use case of user-space RAW Ethernet QPs, such
that they can serve VXLAN traffic too.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b95089d0

stmmac: fix dma api misuse · 362b37be

Giuseppe CAVALLARO authored Aug 27, 2014

Enabling DMA_API_DEBUG, warnings are reported at runtime
because the device driver frees DMA memory with wrong functions
and it does not call dma_mapping_error after mapping dma memory.

The first problem is fixed by of introducing a flag that helps us
keeping track which mapping technique was used, so that we can use
the right API for unmap.
This approach was inspired by the e1000 driver, which uses a similar
technique.
Signed-off-by: Andre Draszik <andre.draszik@st.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Reviewed-by: Denis Kirjanov <kda@linux-powerpc.org>
Cc: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

362b37be

stmmac: ptp: fix the reference clock · 5566401f

Giuseppe CAVALLARO authored Aug 27, 2014

The PTP reference clock, used for setting the addend in the Timestamp Addend
Register, was erroneously hard-coded (as reported in the databook just as
example).

The patch removes the macro named: STMMAC_SYSCLOCK and allows to use a
reference clock (clk_ptp_ref_i) that can be passed from the platform.

If not passed, the main driver clock will be used as default; note that
this can be fine on some platforms.

Note that, prior this patch, using the old STMMAC_SYSCLOCK on some platforms,
as side effect, the ptp clock can move faster/slower than the system clock.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5566401f

stmmac: fix tipo on mmc crc error · 2b78d348

Giuseppe CAVALLARO authored Aug 27, 2014

This patch is to fix a typo on mmc rx crc error when reported by ethtool.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2b78d348

stmmac: prevent false carrier sense detection · b1dee479

Giuseppe CAVALLARO authored Aug 27, 2014

This patch is to w/a a problem that happens on some boxes when run at 10Mbps
Half duplex mode.

During the transmission the CSR signal is asserted for some time and the frames
aborted because of carrier sense error.
This is reported by MMC HW counter: txcarrier signal.
This actually is a false carrier so the frames are good and there is no reason
to ask for dropping them.

This patch so disables the Carrier Sense During Transmission
and this means that the MAC transmitter ignore the CRS signal
during frame transmission in Half-Duplex mode.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Acked-by: Vince Bridgers <vbridgers2013@gmail.com>
Acked-by: Ley Foon Tan <lftan@altera.com>
Acked-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

b1dee479

27 Aug, 2014 1 commit

phy: fix EEE checks inside the phy_init_eee. · 7a4cecf7

Giuseppe CAVALLARO authored Aug 26, 2014

According to the Std 802.3az if the EEE Adv (Reg 7.60), Link partner ability
(Reg 7.61) and EEE capability (Register 3.20) bits return 0 this  means no EEE
is supported. So this patch fixes the checks inside the phy_init_eee function.
Signed-off-by: Nandini Sharma <nandini.sharma@st.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7a4cecf7

26 Aug, 2014 16 commits

mvneta: Add missing if_vlan.h include. · 2d39d120

David S. Miller authored Aug 25, 2014

drivers/net/ethernet/marvell/mvneta.c: In function 'mvneta_skb_tx_csum':
drivers/net/ethernet/marvell/mvneta.c:1374:3: error: implicit declaration of function 'vlan_get_protocol' [-Werror=implicit-function-declaration]
   __be16 l3_proto = vlan_get_protocol(skb);
   ^
Reporeted-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

2d39d120

xen-netback: move netif_napi_add before binding interrupt · e24f8191

Wei Liu authored Aug 25, 2014

Interrupt is enabled when bind_interdomain_evtchn_to_irqhandler returns.
If there's interrupt pending interrupt handler is invoked.

NAPI needs to be initialised before binding interrupt otherwise the
interrupt handler will try to scheduling a NAPI instance that is not
initialised yet, resulting in kernel OOPS.

This fixes a regression introduced in ea2c5e13 ("xen-netback: move NAPI
add/remove calls").

Ideally function calls to create kthreads should also be moved before
binding but I intent to fix this regression with minimal changes and
refactor the code with another patch.
Reported-by: Thomas Leonard <talex5@gmail.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e24f8191

Merge branch 'tso_fix' · 8dbb200f

David S. Miller authored Aug 25, 2014

Vladislav Yasevich says:

====================
Fix TSO and checksum issues with non-accelerated vlan traffic.

I've recently ran across something rather interesting when testing vlans
from inside VMs.  In some scenarios I was getting awfull thruput.
Some debugging uncovered a very scary packet corruption.  I was
seeing packets that had original TSO length as IP total length
and their ip checksum was 0.  This was with e1000e driver.

A bit more debugging uncovered an assumption made by that driver
that skb->protocol will contain l3 protocol information.  This
was not the case in my setup since I manually turned off vlan
tx acceleration for the device.  This caused the driver to not
initialize the tso information correctly and resulted in
corrupt TSO frames on the wire.

I decided to do some auditing of the usage of skb->protocols
in the drivers.  Out of all the drivers, the included 8 appear
to be effected.  They all allow user to control vlan acceleration
settings, all support TSO on vlan devices, and all use
skb->protocol to decide how to encode TSO information.  Some
also have similar problems when initializing hw checksum information.
On such device, it is simple enough to reproduce the issue.
Simply turn off TX VLAN acceleration on the device, create a vlan,
and run you favorite network performance tool.

There is 1 driver I ran across that I belive will trigger a BUG
in the system when used with vlans.  That driver is tile/tilepro.c
I have not changed it in this patch set and would hope that
the maintainer has time to look at it.

V2: Fix i40ev using the wrong function name.  Full build.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

8dbb200f

qlge: Fix TSO for non-accelerated vlan traffic · 1ee1cfe7

Vlad Yasevich authored Aug 25, 2014

This device claims TSO support for vlans.  It also allows a user to
control vlan acceleration offloading.  As such, it is possible to turn
off vlan acceleration and configure a vlan which will continue to send
TSO traffic.

In such situation the packet passed down the the device will contain
a vlan header and skb->protocol will be set to ETH_P_8021Q.
The device assumes that skb->protocol contains network protocol
value and uses that value to set up TSO information.
This results in corrupted frames sent on the wire.

This patch extracts the protocol value correctly by using a
vlan_get_protocol() helper and corrects corrupt TSO frames.

CC: Shahed Shaikh <shahed.shaikh@qlogic.com>
CC: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
CC: Ron Mercer <ron.mercer@qlogic.com>
CC: linux-driver@qlogic.com
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Acked-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1ee1cfe7

mvneta: Fix TSO and checksum for non-acceleration vlan traffic · 817dbfa5

Vlad Yasevich authored Aug 25, 2014

This driver doesn't appear to support vlan acceleration at
all.  However, it does claim to support TSO and IP checksums
for vlan devices.  Thus any configured vlan device would
end up passing down partial checksums or TSO frames.

The driver also uses the value from skb->protocol to
determine TSO and checksum offload information, but assumes
that skb->protocol holds the l3 protocol information.
As a result, vlan traffic with partial checksums or TSO
will fail those checks and TSO will not happen.

Fix this by using vlan_get_protocol() helper.

CC: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

817dbfa5

i40evf: Fix TSO and hw checksums for non-accelerated vlan packets. · a12c4158

Vlad Yasevich authored Aug 25, 2014

This device claims TSO and checksum support for vlans.  It also
allows a user to control vlan acceleration offloading.  As such,
it is possible to turn off vlan acceleration and configure a vlan
which will continue to support TSO and hw checksums.

In such situation the packet passed down the the device will contain
a vlan header and skb->protocol will be set to ETH_P_8021Q.
The device assumes that skb->protocol contains network protocol
value and uses that value to set up TSO and checksum information.
This results in corrupted frames sent on the wire.

This patch extract the protocol value correctly and corrects TSO
and checksums for non-accelerated traffic.

Fix this by using vlan_get_protocol() helper.

CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Jesse Brandeburg <jesse.brandeburg@intel.com>
CC: Bruce Allan <bruce.w.allan@intel.com>
CC: Carolyn Wyborny <carolyn.wyborny@intel.com>
CC: Don Skidmore <donald.c.skidmore@intel.com>
CC: Greg Rose <gregory.v.rose@intel.com>
CC: Alex Duyck <alexander.h.duyck@intel.com>
CC: John Ronciak <john.ronciak@intel.com>
CC: Mitch Williams <mitch.a.williams@intel.com>
CC: Linux NICS <linux.nics@intel.com>
CC: e1000-devel@lists.sourceforge.net
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a12c4158

i40e: Fix TSO and hw checksums for non-accelerated vlan packets. · 3d34dd03

Vlad Yasevich authored Aug 25, 2014

This device claims TSO and checksum support for vlans.  It also
allows a user to control vlan acceleration offloading.  As such,
it is possible to turn off vlan acceleration and configure a vlan
which will continue to support TSO and hw checksums.

In such situation the packet passed down the the device will contain
a vlan header and skb->protocol will be set to ETH_P_8021Q.
The device assumes that skb->protocol contains network protocol
value and uses that value to set up TSO and checksum information.
This results in corrupted frames sent on the wire.

This patch extract the protocol value correctly and corrects TSO
and checksums for non-accelerated traffic.

Fix this by using vlan_get_protocol() helper.

CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Jesse Brandeburg <jesse.brandeburg@intel.com>
CC: Bruce Allan <bruce.w.allan@intel.com>
CC: Carolyn Wyborny <carolyn.wyborny@intel.com>
CC: Don Skidmore <donald.c.skidmore@intel.com>
CC: Greg Rose <gregory.v.rose@intel.com>
CC: Alex Duyck <alexander.h.duyck@intel.com>
CC: John Ronciak <john.ronciak@intel.com>
CC: Mitch Williams <mitch.a.williams@intel.com>
CC: Linux NICS <linux.nics@intel.com>
CC: e1000-devel@lists.sourceforge.net
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3d34dd03

ehea: Fix TSO and hw checksums with non-accelerated vlan packets. · be1d1486

Vlad Yasevich authored Aug 25, 2014

The driver claims that it can do TSO and IP checksums on vlan
devices and also allows user to control vlan acceleration offloading.
This makes it possible to push traffic to this driver that has TSO or
partial checksums set, but also have a non-accelearted vlan
header.  In this case, the driver will fail to correctly
identify such traffic and will not correctly perform
segmentation and checksum calculation.

Fix this by using vlan_get_protocol() helper instead of
assuming skb->protocol always has this information.

CC: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be1d1486

bna: Support TSO and partial checksum with non-accelerated vlans. · 1c53730a

Vlad Yasevich authored Aug 25, 2014

This device claims TSO and checksum support for vlans.  It also
allows a user to control vlan acceleration offloading.  As such,
it is possible to turn off vlan acceleration and configure a vlan
which will continue to support TSO.

In such situation the packet passed down the the device will contain
a vlan header and skb->protocol will be set to ETH_P_8021Q.
The device assumes that skb->protocol contains network protocol
value and uses that value to set up TSO information.  This results
in corrupted frames sent on the wire.

This patch extract the protocol value correctly and corrects TSO
and checksums for non-accelerated traffic.

CC: Rasesh Mody <rmody@brocade.com>
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1c53730a

e1000: Fix TSO for non-accelerated vlan traffic · 06f4d033

Vlad Yasevich authored Aug 25, 2014

This device claims TSO and checksum support for vlans.  It also
allows a user to control vlan acceleration offloading.  As such,
it is possible to turn off vlan acceleration and configure a vlan
which will continue to support TSO.

In such situation the packet passed down the the device will contain
a vlan header and skb->protocol will be set to ETH_P_8021Q.
The device assumes that skb->protocol contains network protocol
value and uses that value to set up TSO and checksum information.
This will results in corrupted frames sent on the wire.

This patch extract the protocol value correctly and corrects TSO
for non-accelerated traffic.

CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Jesse Brandeburg <jesse.brandeburg@intel.com>
CC: Bruce Allan <bruce.w.allan@intel.com>
CC: Carolyn Wyborny <carolyn.wyborny@intel.com>
CC: Don Skidmore <donald.c.skidmore@intel.com>
CC: Greg Rose <gregory.v.rose@intel.com>
CC: Alex Duyck <alexander.h.duyck@intel.com>
CC: John Ronciak <john.ronciak@intel.com>
CC: Mitch Williams <mitch.a.williams@intel.com>
CC: Linux NICS <linux.nics@intel.com>
CC: e1000-devel@lists.sourceforge.net
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

06f4d033

e1000e: Fix TSO with non-accelerated vlans · 47ccd1ed

Vlad Yasevich authored Aug 25, 2014

This device claims  TSO support for vlans.  It also allows a
user to control vlan acceleration offloading.  As such, it is
possible to turn off vlan acceleration and configure a vlan
which will continue to support TSO.

In such situation the packet passed down the the device will contain
a vlan header and skb->protocol will be set to ETH_P_8021Q.
The device assumes that skb->protocol contains network protocol
value and uses that value to set up TSO information.  This results
in corrupted frames sent on the wire.  Corruptions include
incorrect IP total length and invalid IP checksum.

This patch extract the protocol value correctly and corrects TSO
for non-accelerated traffic.

CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Jesse Brandeburg <jesse.brandeburg@intel.com>
CC: Bruce Allan <bruce.w.allan@intel.com>
CC: Carolyn Wyborny <carolyn.wyborny@intel.com>
CC: Don Skidmore <donald.c.skidmore@intel.com>
CC: Greg Rose <gregory.v.rose@intel.com>
CC: Alex Duyck <alexander.h.duyck@intel.com>
CC: John Ronciak <john.ronciak@intel.com>
CC: Mitch Williams <mitch.a.williams@intel.com>
CC: Linux NICS <linux.nics@intel.com>
CC: e1000-devel@lists.sourceforge.net
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

47ccd1ed

net: moxa: continue loop on skb allocation failure · 2b7890e7

Jonas Jensen authored Aug 25, 2014

If netdev_alloc_skb_ip_align() fails, subsequent code will
try to dereference an invalid pointer.

Continue to next descriptor on error.

While we're at it,

1. eliminate the chance of an endless loop, replace the main
   loop with while(rx < budget)

2. use napi_complete() and remove the explicit napi_gro_flush()
Signed-off-by: Jonas Jensen <jonas.jensen@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2b7890e7

net: moxa: synchronize DMA memory · 777fbc31

Jonas Jensen authored Aug 25, 2014

DMA memory should be synchronized before data is passed
to/from controller.

Add dma_sync_single_for_cpu(.., DMA_FROM_DEVICE) to RX path
and dma_sync_single_for_device(.., DMA_TO_DEVICE) to TX path.
Signed-off-by: Jonas Jensen <jonas.jensen@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

777fbc31

net: moxa: replace build_skb() with netdev_alloc_skb_ip_align() / memcpy() · 9fe1b3bc

Jonas Jensen authored Aug 25, 2014

build_skb() is used to make skbs out of existing RX ring memory
which is bad because the RX ring is allocated only once, on probe.
Memory corruption occur because said memory is reclaimed, i.e.
__kfree_skb() (and eventually put_page()).

Replace build_skb() with netdev_alloc_skb_ip_align() and use memcpy().

Remove SKB_DATA_ALIGN() from RX buffer size while we're at it.

Addresses https://bugzilla.kernel.org/show_bug.cgi?id=69041Signed-off-by: Jonas Jensen <jonas.jensen@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9fe1b3bc

net: moxa: clear DESC1 on ndo_start_xmit() · b853f319

Jonas Jensen authored Aug 25, 2014

TX buffer length is not cleared on ndo_start_xmit().
Failing to do so can bug/hang the controller and
cause TX interrupts to stop altogether.

Remove the readl() and compute a new value for DESC1.

Addresses https://bugzilla.kernel.org/show_bug.cgi?id=69031Signed-off-by: Jonas Jensen <jonas.jensen@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b853f319

net: fix checksum features handling in netif_skb_features() · db115037

Michal Kubeček authored Aug 25, 2014

This is follow-up to

  da08143b ("vlan: more careful checksum features handling")

which introduced more careful feature intersection in vlan code,
taking into account that HW_CSUM should be considered superset
of IP_CSUM/IPV6_CSUM. The same is needed in netif_skb_features()
in order to avoid offloading mismatch warning when vlan is
created on top of a bond consisting of slaves supporting IP/IPv6
checksumming but not vlan Tx offloading.
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>

db115037

25 Aug, 2014 3 commits

stmmac: set ptp_clock to NULL while unregister · f95f4045

Giuseppe CAVALLARO authored Aug 25, 2014

This is to properly put to NULL the ptp_clock while un-register the PTP support.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f95f4045

stmmac: fix rx checksum programming · 978aded4

Giuseppe CAVALLARO authored Aug 25, 2014

This patch is to fix the IPC bit into the GMAC control register
that must be done after the core initialization otherwise it will
not have any effect.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

978aded4

net: prevent of emerging cross-namespace symlinks · 4c75431a

Alexander Y. Fomichev authored Aug 25, 2014

Code manipulating sysfs symlinks on adjacent net_devices(s)
currently doesn't take into account that devices potentially
belong to different namespaces.

This patch trying to fix an issue as follows:
- check for net_ns before creating / deleting symlink.
  for now only netdev_adjacent_rename_links and
  __netdev_adjacent_dev_remove are affected, afaics
  __netdev_adjacent_dev_insert implies both net_devs
  belong to the same namespace.
- Drop all existing symlinks to / from all adj_devs before
  switching namespace and recreate them just after.
Signed-off-by: Alexander Y. Fomichev <git.user@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4c75431a

23 Aug, 2014 1 commit

vxlan: fix incorrect initializer in union vxlan_addr · a45e92a5

Gerhard Stenzel authored Aug 22, 2014

The first initializer in the following

        union vxlan_addr ipa = {
            .sin.sin_addr.s_addr = tip,
            .sa.sa_family = AF_INET,
        };

is optimised away by the compiler, due to the second initializer,
therefore initialising .sin.sin_addr.s_addr always to 0.
This results in netlink messages indicating a L3 miss never contain the
missed IP address. This was observed with GCC 4.8 and 4.9. I do not know about previous versions.
The problem affects user space programs relying on an IP address being
sent as part of a netlink message indicating a L3 miss.

Changing
            .sa.sa_family = AF_INET,
to
            .sin.sin_family = AF_INET,
fixes the problem.
Signed-off-by: Gerhard Stenzel <gerhard.stenzel@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a45e92a5

22 Aug, 2014 11 commits

Merge tag 'pwm/for-3.17-rc2' of... · 451fd722

Linus Torvalds authored Aug 22, 2014

Merge tag 'pwm/for-3.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm

Pull pwm fix from Thierry Reding:
 "Just one bugfix for the PWM lookup table code that would cause a PWM
  channel to be set to the wrong period and polarity for non-perfect
  matches"

* tag 'pwm/for-3.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm:
  pwm: Fix period and polarity in pwm_get() for non-perfect matches

451fd722

mac80211: fix channel switch for chanctx-based drivers · 47e4df94

Michal Kazior authored Aug 18, 2014

The new_ctx pointer is set only for non-chanctx drivers. This yielded a
crash for chanctx-based drivers during channel switch finalization:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
IP: ieee80211_vif_use_reserved_switch+0x71c/0xb00 [mac80211]

Use an adequate chanctx pointer to fix this.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

47e4df94

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 433ab34d

Linus Torvalds authored Aug 22, 2014

Pull networking fixes from David Miller:
 "Here are some bug fixes that have piled up during ksummit/linuxcon.

   1) Fix endian problems in ibmveth, from Anton Blanchard.

   2) IPV6 routing code does GFP_KERNEL allocation in atomic, fix from
      Benjamin Block.

   3) SCTP association fixes from Daniel Borkmann.

   4) When multiple VLAN headers are present we have to make sure the
      second and subsequent ones are pullable in the SKB otherwise we
      blindly dereference garbage.  From Jiri Benc.

   5) The argument adjustment of the signature of hlist_add_after*()
      introduced a regression in the batman-adv code, fix from Sven
      Eckelmann.

   6) Fix TX hang handling to avoid a panic in i40e, from Anjali Singhai
      Jain.

   7) PTP flag test is inverted in i40e driver, from Jesse Brandeburg.

   8) ATM LEC driver needs to hold RTNL mutex over MTU changes, from
      Chas Williams.

   9) Truncate packets larger then the TPACKET_V3 format configured
      buffers, otherwise we overwrite past the end of said buffers.
      From Eric Dumazet.

  10) Fix endianness bugs in qlcnic firmware handling, from Rajesh
      Borundia and Shahed Shaikh.

  11) CXGB4 sometimes doesn't get all of the TX completion events it
      should resulting in SKBs getting stuck in the TX queue, from
      Hariprasad Shenai.

  12) When the FEC chip's PTP clock is disabled, you can't access the
      register.  Add necessary checks to avoid the resulting hang, from
      Fugang Duan"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (37 commits)
  drivers: isdn: eicon: xdi_msg.h: Fix typo in #ifndef
  net: sctp: fix suboptimal edge-case on non-active active/retrans path selection
  net: sctp: spare unnecessary comparison in sctp_trans_elect_best
  net: ethernet: broadcom: bnx2x: Remove redundant #ifdef
  ibmveth: Fix endian issues with rx_no_buffer statistic
  net: xgene: fix possible NULL dereference in xgene_enet_free_desc_rings()
  openvswitch: fix panic with multiple vlan headers
  net: ipv6: fib: don't sleep inside atomic lock
  net: fec: ptp: avoid register access when ipg clock is disabled
  cxgb4: Free completed tx skbs promptly
  cxgb4: Fix race condition in cleanup
  sctp: not send SCTP_PEER_ADDR_CHANGE notifications with failed probe
  bnx2x: Revert UNDI flushing mechanism
  qlcnic: Fix endianess issue in firmware load from file operation
  qlcnic: Fix endianess issue in FW dump template header
  qlcnic: Fix flash access interface to application
  MAINTAINERS: Add section for MRF24J40 IEEE 802.15.4 radio driver
  macvlan: Allow setting multicast filter on all macvlan types
  packet: handle too big packets for PACKET_V3
  MAINTAINERS: add entry for ec_bhf driver
  ...

433ab34d

drivers: isdn: eicon: xdi_msg.h: Fix typo in #ifndef · faaa5524

Rasmus Villemoes authored Aug 22, 2014

Test for definedness of the macro which is actually defined (the
change is hard to see: it is s/SSS/SSA/).
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>

faaa5524

net: sctp: fix suboptimal edge-case on non-active active/retrans path selection · aa4a83ee

Daniel Borkmann authored Aug 22, 2014

In SCTP, selection of active (T.ACT) and retransmission (T.RET)
transports is being done whenever transport control operations
(UP, DOWN, PF, ...) are engaged through sctp_assoc_control_transport().

Commits 4c47af4d ("net: sctp: rework multihoming retransmission
path selection to rfc4960") and a7288c4d ("net: sctp: improve
sctp_select_active_and_retran_path selection") have both improved
it towards a more fine-grained and optimal path selection.

Currently, the selection algorithm for T.ACT and T.RET is as follows:

1) Elect the two most recently used ACTIVE transports T1, T2 for
   T.ACT, T.RET, where T.ACT<-T1 and T1 is most recently used
2) In case primary path T.PRI not in {T1, T2} but ACTIVE, set
   T.ACT<-T.PRI and T.RET<-T1
3) If only T1 is ACTIVE from the set, set T.ACT<-T1 and T.RET<-T1
4) If none is ACTIVE, set T.ACT<-best(T.PRI, T.RET, T3) where
   T3 is the most recently used (if avail) in PF, set T.RET<-T.PRI

Prior to above commits, 4) was simply a camp on T.ACT<-T.PRI and
T.RET<-T.PRI, ignoring possible paths in PF. Camping on T.PRI is
still slightly suboptimal as it can lead to the following scenario:

Setup:
        <A>                                <B>
    T1: p1p1 (10.0.10.10) <==>  .'`)  <==> p1p1 (10.0.10.12)  <= T.PRI
    T2: p1p2 (10.0.10.20) <==> (_ . ) <==> p1p2 (10.0.10.22)

    net.sctp.rto_min = 1000
    net.sctp.path_max_retrans = 2
    net.sctp.pf_retrans = 0
    net.sctp.hb_interval = 1000

T.PRI is permanently down, T2 is put briefly into PF state (e.g. due to
link flapping). Here, the first time transmission is sent over PF path
T2 as it's the only non-INACTIVE path, but the retransmitted data-chunks
are sent over the INACTIVE path T1 (T.PRI), which is not good.

After the patch, it's choosing better transports in both cases by
modifying step 4):

4) If none is ACTIVE, set T.ACT_new<-best(T.ACT_old, T3) where T3 is
   the most recently used (if avail) in PF, set T.RET<-T.ACT_new

This will still select a best possible path in PF if available (which
can also include T.PRI/T.RET), and set both T.ACT/T.RET to it.

In case sctp_assoc_control_transport() *just* put T.ACT_old into INACTIVE
as it transitioned from ACTIVE->PF->INACTIVE and stays in INACTIVE just
for a very short while before going back ACTIVE, it will guarantee that
this path will be reselected for T.ACT/T.RET since T3 (PF) is not
available.

Previously, this was not possible, as we would only select between T.PRI
and T.RET, and a possible T3 would be NULL due to the fact that we have
just transitioned T3 in sctp_assoc_control_transport() from PF->INACTIVE
and would select a suboptimal path when T.PRI/T.RET have worse properties.

In the case that T.ACT_old permanently went to INACTIVE during this
transition and there's no PF path available, plus T.PRI and T.RET are
INACTIVE as well, we would now camp on T.ACT_old, but if everything is
being INACTIVE there's really not much we can do except hoping for a
successful HB to bring one of the transports back up again and, thus
cause a new selection through sctp_assoc_control_transport().

Now both tests work fine:

Case 1:

 1. T1 S(ACTIVE) T.ACT
    T2 S(ACTIVE) T.RET

 2. T1 S(ACTIVE) T.ACT, T.RET
    T2 S(PF)

 3. T1 S(ACTIVE) T.ACT, T.RET
    T2 S(INACTIVE)

 5. T1 S(PF) T.ACT, T.RET
    T2 S(INACTIVE)

[ 5.1 T1 S(INACTIVE) T.ACT, T.RET
      T2 S(INACTIVE) ]

 6. T1 S(ACTIVE) T.ACT, T.RET
    T2 S(INACTIVE)

 7. T1 S(ACTIVE) T.ACT
    T2 S(ACTIVE) T.RET

Case 2:

 1. T1 S(ACTIVE) T.ACT
    T2 S(ACTIVE) T.RET

 2. T1 S(PF)
    T2 S(ACTIVE) T.ACT, T.RET

 3. T1 S(INACTIVE)
    T2 S(ACTIVE) T.ACT, T.RET

 5. T1 S(INACTIVE)
    T2 S(PF) T.ACT, T.RET

[ 5.1 T1 S(INACTIVE)
      T2 S(INACTIVE) T.ACT, T.RET ]

 6. T1 S(INACTIVE)
    T2 S(ACTIVE) T.ACT, T.RET

 7. T1 S(ACTIVE) T.ACT
    T2 S(ACTIVE) T.RET
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

aa4a83ee

net: sctp: spare unnecessary comparison in sctp_trans_elect_best · ea4f19c1

Daniel Borkmann authored Aug 22, 2014

When both transports are the same, we don't have to go down that
road only to realize that we will return the very same transport.
We are guaranteed that curr is always non-NULL. Therefore, just
short-circuit this special case.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ea4f19c1

net: ethernet: broadcom: bnx2x: Remove redundant #ifdef · 7d149c52

Rasmus Villemoes authored Aug 20, 2014

Nothing defines _ASM_GENERIC_INT_L64_H, it is a weird way to check for
64 bit longs, and u64 should be printed using %llx anyway.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>

7d149c52

ibmveth: Fix endian issues with rx_no_buffer statistic · cbd52281

Anton Blanchard authored Aug 22, 2014

Hidden away in the last 8 bytes of the buffer_list page is a solitary
statistic. It needs to be byte swapped or else ethtool -S will
produce numbers that terrify the user.

Since we do this in multiple places, create a helper function with a
comment explaining what is going on.
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: stable@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>

cbd52281

net: xgene: fix possible NULL dereference in xgene_enet_free_desc_rings() · c10e4caf

Iyappan Subramanian authored Aug 21, 2014

A NULL pointer dereference is possible for the argument ring->buf_pool
which is passed to xgene_enet_free_desc_ring(), as ring could be NULL.

And now since NULL pointers are being checked for before the calls to
xgene_enet_free_desc_ring(), might as well take advantage of them and
not call the function if the argument would be NULL.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c10e4caf

openvswitch: fix panic with multiple vlan headers · 2ba5af42

Jiri Benc authored Aug 21, 2014

When there are multiple vlan headers present in a received frame, the first
one is put into vlan_tci and protocol is set to ETH_P_8021Q. Anything in the
skb beyond the VLAN TPID may be still non-linear, including the inner TCI
and ethertype. While ovs_flow_extract takes care of IP and IPv6 headers, it
does nothing with ETH_P_8021Q. Later, if OVS_ACTION_ATTR_POP_VLAN is
executed, __pop_vlan_tci pulls the next vlan header into vlan_tci.

This leads to two things:

1. Part of the resulting ethernet header is in the non-linear part of the
skb. When eth_type_trans is called later as the result of
OVS_ACTION_ATTR_OUTPUT, kernel BUGs in __skb_pull. Also, __pop_vlan_tci
is in fact accessing random data when it reads past the TPID.

2. network_header points into the ethernet header instead of behind it.
mac_len is set to a wrong value (10), too.
Reported-by: Yulong Pei <ypei@redhat.com>
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2ba5af42

net: ipv6: fib: don't sleep inside atomic lock · 793c3b40

Benjamin Block authored Aug 21, 2014

The function fib6_commit_metrics() allocates a piece of memory in mode
GFP_KERNEL while holding an atomic lock from higher up in the stack, in
the function __ip6_ins_rt(). This produces the following BUG:

> BUG: sleeping function called from invalid context at mm/slub.c:1250
> in_atomic(): 1, irqs_disabled(): 0, pid: 2909, name: dhcpcd
> 2 locks held by dhcpcd/2909:
>  #0:  (rtnl_mutex){+.+.+.}, at: [<ffffffff81978e67>] rtnl_lock+0x17/0x20
>  #1:  (&tb->tb6_lock){++--+.}, at: [<ffffffff81a6951a>] ip6_route_add+0x65a/0x800
> CPU: 1 PID: 2909 Comm: dhcpcd Not tainted 3.17.0-rc1 #1
> Hardware name: ASUS All Series/Q87T, BIOS 0216 10/16/2013
>  0000000000000008 ffff8800c8f13858 ffffffff81af135a 0000000000000000
>  ffff880212202430 ffff8800c8f13878 ffffffff810f8d3a ffff880212202c98
>  0000000000000010 ffff8800c8f138c8 ffffffff8121ad0e 0000000000000001
> Call Trace:
>  [<ffffffff81af135a>] dump_stack+0x4e/0x68
>  [<ffffffff810f8d3a>] __might_sleep+0x10a/0x120
>  [<ffffffff8121ad0e>] kmem_cache_alloc_trace+0x4e/0x190
>  [<ffffffff81a6bcd6>] ? fib6_commit_metrics+0x66/0x110
>  [<ffffffff81a6bcd6>] fib6_commit_metrics+0x66/0x110
>  [<ffffffff81a6cbf3>] fib6_add+0x883/0xa80
>  [<ffffffff81a6951a>] ? ip6_route_add+0x65a/0x800
>  [<ffffffff81a69535>] ip6_route_add+0x675/0x800
>  [<ffffffff81a68f2a>] ? ip6_route_add+0x6a/0x800
>  [<ffffffff81a6990c>] inet6_rtm_newroute+0x5c/0x80
>  [<ffffffff8197cf01>] rtnetlink_rcv_msg+0x211/0x260
>  [<ffffffff81978e67>] ? rtnl_lock+0x17/0x20
>  [<ffffffff81119708>] ? lock_release_holdtime+0x28/0x180
>  [<ffffffff81978e67>] ? rtnl_lock+0x17/0x20
>  [<ffffffff8197ccf0>] ? __rtnl_unlock+0x20/0x20
>  [<ffffffff819a989e>] netlink_rcv_skb+0x6e/0xd0
>  [<ffffffff81978ee5>] rtnetlink_rcv+0x25/0x40
>  [<ffffffff819a8e59>] netlink_unicast+0xd9/0x180
>  [<ffffffff819a9600>] netlink_sendmsg+0x700/0x770
>  [<ffffffff81103735>] ? local_clock+0x25/0x30
>  [<ffffffff8194e83c>] sock_sendmsg+0x6c/0x90
>  [<ffffffff811f98e3>] ? might_fault+0xa3/0xb0
>  [<ffffffff8195ca6d>] ? verify_iovec+0x7d/0xf0
>  [<ffffffff8194ec3e>] ___sys_sendmsg+0x37e/0x3b0
>  [<ffffffff8111ef15>] ? trace_hardirqs_on_caller+0x185/0x220
>  [<ffffffff81af979e>] ? mutex_unlock+0xe/0x10
>  [<ffffffff819a55ec>] ? netlink_insert+0xbc/0xe0
>  [<ffffffff819a65e5>] ? netlink_autobind.isra.30+0x125/0x150
>  [<ffffffff819a6520>] ? netlink_autobind.isra.30+0x60/0x150
>  [<ffffffff819a84f9>] ? netlink_bind+0x159/0x230
>  [<ffffffff811f989a>] ? might_fault+0x5a/0xb0
>  [<ffffffff8194f25e>] ? SYSC_bind+0x7e/0xd0
>  [<ffffffff8194f8cd>] __sys_sendmsg+0x4d/0x80
>  [<ffffffff8194f912>] SyS_sendmsg+0x12/0x20
>  [<ffffffff81afc692>] system_call_fastpath+0x16/0x1b

Fixing this by replacing the mode GFP_KERNEL with GFP_ATOMIC.
Signed-off-by: Benjamin Block <bebl@mageta.org>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

793c3b40