Commits · e368d27ff007f7405cf668679f48ab3174a53922 · Kirill Smelkov / linux

02 Jun, 2014 33 commits

net: cdc_ncm: drop ethtool coalesce support · e368d27f

Bjørn Mork authored May 30, 2014

The ethtool coalesce API is not applicable for this driver. Forcing
it to fit the NCM aggregation redefined the API in a driver specific
way, which is much worse than defining a clean new API. These ethtool
coalesce functions have therefore been replaced by a new sysfs API.
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>

e368d27f

net: cdc_ncm: use sysfs for rx/tx aggregation tuning · 289507d3

Bjørn Mork authored May 30, 2014

Attach a driver specific sysfs group to the netdev, and use it
for the rx/tx aggregation variables.

The datagram aggregation defined by the CDC NCM specification is
specific to this device class (including CDC MBIM). Using the
ethtool interrupt coalesce API as an interface to the aggregation
parameters redefined that API in a driver specific and confusing
way.  A sysfs group
 - makes it clear that this is a driver specific userspace API, and
 - allows us to export the real values instead of some translated
   version, and
 - lets us include more aggregation variables which were impossible
   to force into the ethtool API.

Additionally, using sysfs allows tuning the driver on space
constrained hosts where userspace tools like ethtool are undesired.
Suggested-by: Peter Stuge <peter@stuge.se>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>

289507d3

net: cdc_ncm: inform usbnet when rx buffers are reduced · f42763db

Bjørn Mork authored May 30, 2014

It doesn't matter whether the buffer size goes up or down.  We have to
keep usbnet and device syncronized to be able to split transfers at the
correct boundaries. The spec allow skipping short packets when using
max sized transfers.  If we don't tell usbnet about our new expected rx
buffer size, then it will merge and/or split NTBs.  The driver does not
support this, and the result will be lots of framing errors.

Fix by always reallocating usbnet rx buffers when the rx_max value
changes.

Fixes: 68864abf ("net: cdc_ncm: support rx_max/tx_max updates when running")
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>

f42763db

net: cdc_ncm: always reallocate tx_curr_skb when tx_max increases · 1ba5d0ff

Bjørn Mork authored May 30, 2014

We are calling usbnet_start_xmit() to flush any remaining data,
depending on the side effect that tx_curr_skb is set to NULL,
ensuring a new allocation using the updated tx_max.  But this
side effect will only happen if there were any cached data ready
to transmit. If not, then an empty tx_curr_skb is still allocated
using the old tx_max size. Free it to avoid a buffer overrun.

Fixes: 68864abf ("net: cdc_ncm: support rx_max/tx_max updates when running")
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>

1ba5d0ff

net: cdc_ncm: reduce skb truesize in rx path · 1e2c6117

Bjørn Mork authored May 30, 2014

Cloning the big skbs we use for USB buffering chokes up TCP and
SCTP because the socket memory limits are hitting earlier than
they should. It is better to unconditionally copy the unwrapped
packets to freshly allocated skbs.
Reported-by: Jim Baxter <jim_baxter@mentor.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>

1e2c6117

macvlan: fix the problem when mac address changes for passthru mode · e289fd28

dingtianhong authored May 30, 2014

The macvlan dev should always have the same mac address like lowerdev
when in the passthru mode, change the mac address alone will break the
work mechanism, so when the lowerdev or macvlan mac address changes,
we should propagate the changes to another dev.

v1->v2: Allow macvlan dev to change mac address for passthru mode and propagate to
	lowerdev.

v2->v3: Don't set the mac address to the lower dev's unicast address for
	passthru mode when mac address changes.
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e289fd28

net: stmmac: Handle different error codes from platform_get_irq_byname · d7ec8584

Chen-Yu Tsai authored May 29, 2014

The following patch moved device tree interrupt resolution into
platform_get_irq_byname:

  ad69674e of/irq: do irq resolution in platform_get_irq_byname()

As a result, the function no longer only return -ENXIO on error.
This breaks DT based probing of stmmac, as seen in test runs of
linux-next next-20140526 cubie2-sunxi_defconfig:

  http://lists.linaro.org/pipermail/kernel-build-reports/2014-May/003659.html

This patch makes the stmmac_platform probe function properly handle
error codes, such as returning for deferred probing, and other codes
returned by of_irq_get_by_name.
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

d7ec8584

Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next · 31595de2

David S. Miller authored Jun 02, 2014

John W. Linville says:

====================
pull request: wireless-next 2014-06-02

Please pull this remaining batch of updates intended for the 3.16 stream...

For the mac80211 bits, Johannes says:

"The remainder for -next right now is mostly fixes, and a handful of
small new things like some CSA infrastructure, the regdb script mW/dBm
conversion change and sending wiphy notifications."

For the bluetooth bits, Gustavo says:

"Some more patches for 3.16. There is nothing really special here, just a
bunch of clean ups, fixes plus some small improvements. Please pull."

For the nfc bits, Samuel says:

"We have:

- Felica (Type3) tags support for trf7970a
- Type 4b tags support for port100
- st21nfca DTS typo fix
- A few sparse warning fixes"

For the atheros bits, Kalle says:

"Ben added support for setting antenna configurations. Michal improved
warm reset so that we would not need to fall back to cold reset that
often, an issue where ath10k stripped protected flag while in monitor
mode and made module initialisation asynchronous to fix the problems
with firmware loading when the driver is linked to the kernel.

Luca removed unused channel_switch_beacon callbacks both from ath9k and
ath10k. Marek fixed Protected Management Frames (PMF) when using Action
Frames. Also we had other small fixes everywhere in the driver."

Along with that, there are a handful of updates to a variety
of drivers.  This includes updates to at76c50x-usb, ath9k, b43,
brcmfmac, mwifiex, rsi, rtlwifi, and wil6210.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

31595de2

inetpeer: get rid of ip_id_count · 73f156a6

Eric Dumazet authored Jun 02, 2014

Ideally, we would need to generate IP ID using a per destination IP
generator.

linux kernels used inet_peer cache for this purpose, but this had a huge
cost on servers disabling MTU discovery.

1) each inet_peer struct consumes 192 bytes

2) inetpeer cache uses a binary tree of inet_peer structs,
   with a nominal size of ~66000 elements under load.

3) lookups in this tree are hitting a lot of cache lines, as tree depth
   is about 20.

4) If server deals with many tcp flows, we have a high probability of
   not finding the inet_peer, allocating a fresh one, inserting it in
   the tree with same initial ip_id_count, (cf secure_ip_id())

5) We garbage collect inet_peer aggressively.

IP ID generation do not have to be 'perfect'

Goal is trying to avoid duplicates in a short period of time,
so that reassembly units have a chance to complete reassembly of
fragments belonging to one message before receiving other fragments
with a recycled ID.

We simply use an array of generators, and a Jenkin hash using the dst IP
as a key.

ipv6_select_ident() is put back into net/ipv6/ip6_output.c where it
belongs (it is only used from this file)

secure_ip_id() and secure_ipv6_id() no longer are needed.

Rename ip_select_ident_more() to ip_select_ident_segs() to avoid
unnecessary decrement/increment of the number of segments.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

73f156a6

of: of_mdio: export symbol of_mdiobus_link_phydev · e067ee33

Daniel Mack authored Jun 02, 2014

Make of_mdiobus_link_phydev externally available.
This fixes CONFIG_OF_MDIO=m.
Signed-off-by: Daniel Mack <zonque@gmail.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: 86f6cf41 ("net: of_mdio: add of_mdiobus_link_phydev()")
Signed-off-by: David S. Miller <davem@davemloft.net>

e067ee33

net: of_mdio: use int type for address variable · 4cd984b0

Daniel Mack authored Jun 02, 2014

Use int rather than u32 to fix the following warning:

drivers/of/of_mdio.c:147 of_mdiobus_register() warn: unsigned 'addr' is
never less than zero.
Signed-off-by: Daniel Mack <zonque@gmail.com>
Fixes: 8f838288 ("net: of_mdio: factor out code to parse a phy's 'reg' property")
Signed-off-by: David S. Miller <davem@davemloft.net>

4cd984b0

Merge branch 'netdevsync' · c7bfbe51

David S. Miller authored Jun 02, 2014

Alexander Duyck says:

====================
Provide common means for device address sync

The following series implements a means for synchronizing both unicast and
multicast addresses on a device interface.  The code is based on the original
implementation of dev_uc_sync that was available for syncing a VLAN to the
lower dev.

The original reason for coming up for this patch is a driver that is still in
the early stages of development.  The nearest driver I could find that
appeared to have the same limitations as the driver I was working on was the
Cisco enic driver.  For this reason I chose it as the first driver to make use
of this interface publicly.

However, I do not have a Cisco enic interface so I have only been able to
compile test any changes made to the driver.  I tried to keep this change as
simple as possible to avoid any issues.  Any help with testing would be
greatly appreciated.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

c7bfbe51

enic: Update driver to use __dev_uc/mc_sync/unsync calls · f009618a

Alexander Duyck authored May 28, 2014

This change updates the enic driver to make use of __dev_uc_sync and
__dev_mc_sync calls.  Previously the driver was doing its own list
management by storing the mc_addr and uc_addr list in a 32 address array.
With this change the sync data is stored in the netdev_addr_list structures
and instead we just track how many addresses we have written to the device.
When we encounter 32 we stop and print a message as occurred previously with
the old approach.

Other than the core change the only other bit needed was to propagate the
constant attribute with the MAC address as there were several spots where
is twas only passed as a u8 * instead of a const u8 *.

This patch is meant to maintain the original functionality without the use
of the mc_addr and uc_addr arrays.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f009618a

net: Add support for device specific address syncing · 670e5b8e

Alexander Duyck authored May 28, 2014

This change provides a function to be used in order to break the
ndo_set_rx_mode call into a set of address add and remove calls.  The code
is based on the implementation of dev_uc_sync/dev_mc_sync.  Since they
essentially do the same thing but with only one dev I simply named my
functions __dev_uc_sync/__dev_mc_sync.

I also implemented an unsync version of the functions as well to allow for
cleanup on close.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

670e5b8e

Merge branch '6lowpan-next' · 3e820811

David S. Miller authored Jun 02, 2014

Alexander Aring says:

====================
6lowpan: fragmentation fixes

This patch series fix the 6LoWPAN fragmentation which are in two cases broken.

The first case is if we have exactly two 6LoWPAN fragments only. This is fixed
by patch "6lowpan_rtnl: fix fragmentation with two fragments".
The second case is a off by one issue if we have payload which hits the fragment
boundary.

Both issues are introduced by commit d4b2816d
("6lowpan: fix fragmentation").
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

3e820811

6lowpan_rtnl: fix off by one while fragmentation · eb06481d

Alexander Aring authored Jun 02, 2014

This patch fix a off by one error while fragmentation. If the frag_cap
value is equal to skb_unprocessed value we need to stop the
fragmentation loop because the last fragment which has a size of
skb_unprocessed fits into the frag capability size.

This issue was introduced by commit d4b2816d
("6lowpan: fix fragmentation").
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

eb06481d

6lowpan_rtnl: fix fragmentation with two fragments · 51263fff

Alexander Aring authored Jun 02, 2014

This patch fix the 6LoWPAN fragmentation for the case if we have exactly
two fragments. The problem is that the (skb_unprocessed >= frag_cap)
condition is always false on the second fragment after sending the first
fragment. A fragmentation with only one fragment doesn't make any sense.
The solution is that we use a do while loop here, that ensures we sending
always a minimum of two fragments if we need a fragmentation.

This issue was introduced by commit d4b2816d
("6lowpan: fix fragmentation").
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

51263fff

stmmac: Remove spin_lock call in stmmac_get_pauseparam() · 86c92ee3

Emil Goode authored Jun 02, 2014

The following patch removed unnecessary spin_lock/unlock calls
in ethtool_ops callback functions. In the second and final version
of the patch one spin_lock call was left behind.

commit cab6715c
Author: Yang Wei <Wei.Yang@windriver.com>
Date:   Sun May 25 09:53:44 2014 +0800

    net: driver: stmicro: Remove some useless the lock protection

This introduced the following sparse warning:

drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c:424:1: warning:
	context imbalance in 'stmmac_get_pauseparam' -
	different lock contexts for basic block
Signed-off-by: Emil Goode <emilgoode@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

86c92ee3

genetlink: remove superfluous assignment · 2f91abd4

Denis ChengRq authored Jun 02, 2014

the local variable ops and n_ops were just read out from family,
and not changed, hence no need to assign back.

Validation functions should operate on const parameters and not
change anything.
Signed-off-by: Cheng Renquan <crquan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2f91abd4

Merge branch 'master' of... · fcb2c0d6

John W. Linville authored Jun 02, 2014

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem

fcb2c0d6

Revert "net/mlx4_en: Use affinity hint" · 96b2e73c
David S. Miller authored Jun 02, 2014
```
This reverts commit 70a640d0.
Signed-off-by: David S. Miller <davem@davemloft.net>
```
96b2e73c

net: ks8851: Don't use regulator_get_optional() · d64eed1d

Stephen Boyd authored May 28, 2014

We shouldn't be using regulator_get_optional() here. These
regulators are always present as part of the physical design and
there isn't any way to use an internal regulator or change the
source of the reference voltage via software. Given that the only
users of this driver in the kernel are DT based, this change
should be transparent to them even if they don't specify any
supplies because the regulator framework will insert dummy
supplies as needed.

Cc: Nishanth Menon <nm@ti.com>
Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Reviewed-by: Mark Brown <broonie@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

d64eed1d

Merge branch 'filter-next' · c532cea9

David S. Miller authored Jun 01, 2014

Daniel Borkmann says:

====================
BPF + test suite updates

These are the last bigger BPF changes that I had in my todo
queue for now. As the first two patches from this series
contain additional test cases for the test suite, I have
rebased them on top of current net-next with the set from [1]
applied to avoid introducing any unnecessary merge conflicts.

For details, please refer to the individual patches. Test
suite runs fine with the set applied.

 [1] http://patchwork.ozlabs.org/patch/352599/
     http://patchwork.ozlabs.org/patch/352600/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

c532cea9

net: filter: improve filter block macros · f8f6d679

Daniel Borkmann authored May 29, 2014

Commit 9739eef1 ("net: filter: make BPF conversion more readable")
started to introduce helper macros similar to BPF_STMT()/BPF_JUMP()
macros from classic BPF.

However, quite some statements in the filter conversion functions
remained in the old style which gives a mixture of block macros and
non block macros in the code. This patch makes the block macros itself
more readable by using explicit member initialization, and converts
the remaining ones where possible to remain in a more consistent state.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f8f6d679

net: filter: get rid of BPF_S_* enum · 34805931

Daniel Borkmann authored May 29, 2014

This patch finally allows us to get rid of the BPF_S_* enum.
Currently, the code performs unnecessary encode and decode
workarounds in seccomp and filter migration itself when a filter
is being attached in order to overcome BPF_S_* encoding which
is not used anymore by the new interpreter resp. JIT compilers.

Keeping it around would mean that also in future we would need
to extend and maintain this enum and related encoders/decoders.
We can get rid of all that and save us these operations during
filter attaching. Naturally, also JIT compilers need to be updated
by this.

Before JIT conversion is being done, each compiler checks if A
is being loaded at startup to obtain information if it needs to
emit instructions to clear A first. Since BPF extensions are a
subset of BPF_LD | BPF_{W,H,B} | BPF_ABS variants, case statements
for extensions can be removed at that point. To ease and minimalize
code changes in the classic JITs, we have introduced bpf_anc_helper().

Tested with test_bpf on x86_64 (JIT, int), s390x (JIT, int),
arm (JIT, int), i368 (int), ppc64 (JIT, int); for sparc we
unfortunately didn't have access, but changes are analogous to
the rest.

Joint work with Alexei Starovoitov.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Mircea Gherzan <mgherzan@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
Acked-by: Chema Gonzalez <chemag@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

34805931

net: filter: add test for loading SKF_AD_OFF limits · d50bc157

Daniel Borkmann authored May 29, 2014

This check tests that overloading BPF_LD | BPF_ABS with an
always invalid BPF extension, that is SKF_AD_MAX, fails to
make sure classic BPF behaviour is correct in filter checker.

Also, we add a test for loading at packet offset SKF_AD_OFF-1
which should pass the filter, but later on fail during runtime.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d50bc157

net: filter: add slot overlapping test with fully filled M[] · 9fe13baa

Daniel Borkmann authored May 29, 2014

Also add a test for the scratch memory store that first fills
all slots and then sucessively reads all of them back adding
up to A, and eventually returning A. This and the previous
M[] test with alternating fill/spill will detect possible JIT
errors on M[].
Suggested-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9fe13baa

bridge: fix the unbalanced promiscuous count when add_if failed · 019ee792

wangweidong authored May 29, 2014

As commit 2796d0c6 ("bridge: Automatically manage port
promiscuous mode."), make the add_if use dev_set_allmulti
instead of dev_set_promiscuous, so when add_if failed, we
should do dev_set_allmulti(dev, -1).
Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Acked-by: Vlad Yasevich <vyasevic@redhat.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

019ee792

net: Revert mlx4 cpumask changes. · ee39facb

David S. Miller authored Jun 01, 2014

This reverts commit 70a640d0
("net/mlx4_en: Use affinity hint") and commit
c8865b64 ("cpumask: Utility function
to set n'th cpu - local cpu first") because these changes break
the build when SMP is disabled amongst other things.
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ee39facb

net: ks8851: Don't use regulator_get_optional() · 2a82e40d

Stephen Boyd authored May 28, 2014

We shouldn't be using regulator_get_optional() here. These
regulators are always present as part of the physical design and
there isn't any way to use an internal regulator or change the
source of the reference voltage via software. Given that the only
users of this driver in the kernel are DT based, this change
should be transparent to them even if they don't specify any
supplies because the regulator framework will insert dummy
supplies as needed.

Cc: Nishanth Menon <nm@ti.com>
Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Reviewed-by: Mark Brown <broonie@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

2a82e40d

Merge branch 'mlx4-next' · b07166b2

David S. Miller authored Jun 01, 2014

Amir Vadai says:

====================
cpumask,net: Affinity hint helper function

This patchset will set affinity hint to influence IRQs to be allocated on the
same NUMA node as the one where the card resides. As discussed in
http://www.spinics.net/lists/netdev/msg271497.html

If the number of IRQs allocated is greater than the number of local NUMA cores, all
local cores will be used first, and the rest of the IRQs will be on a remote
NUMA node.
If no NUMA support - IRQ's and cores will be mapped 1:1

Since the utility function to calculate the mapping could be useful in other mq
drivers in the kernel, it was added to cpumask.[ch]

This patchset was tested and applied on top of net-next since the first
consumer is a network device (mlx4_en).  Over commit 506724c4: "tg3: Override
clock, link aware and link idle mode during NVRAM dump"

I couldn't find a maintainer for cpumask.c, so only added the kernel mailing
list

Amir

Changes from V5:
- Moved the utility function from kernel/irq/manage.c to lib/cpumask.c, and
  renamed it's name accordingly to cpumask_set_cpu_local_first()
- Added some comments as Thomas Gleixner suggested
- Changed -EINVAL to -EAGAIN, that describes the error situtation better.

Changes from V4:
- Patch 1/2: irq: Utility function to get affinity_hint by policy
  Thank you Ben for the great review:
  - Moved the function it kernel/irq/manage.c since it could be useful for
    block mq devices
  - Fixed Typo's
  - Use cpumask_t * instead of cpumask_var_t in function header
  - Restructured the function to remove NULL assignment in a cpumask_var_t
  - Fix for offline local CPU's

Changes from V3:
- Patch 2/2: net/mlx4_en: Use affinity hint
  - somehow patch file was corrupted

Changes from V2:
- Patch 1/2: net: Utility function to get affinity_hint by policy
  - Fixed style issues

Changes from V1:
- Patch 1/2: net: Utility function to get affinity_hint by policy
  - Fixed error flow to return -EINVAL on error (thanks govind)
- Patch 2/2: net/mlx4_en: Use affinity hint
  - Set ring->affinity_hint to NULL on error

Changes from V0:
- Fixed small style issues
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

b07166b2

net/mlx4_en: Use affinity hint · 70a640d0

Yuval Atias authored May 25, 2014

The “affinity hint” mechanism is used by the user space
daemon, irqbalancer, to indicate a preferred CPU mask for irqs.
Irqbalancer can use this hint to balance the irqs between the
cpus indicated by the mask.

We wish the HCA to preferentially map the IRQs it uses to numa cores
close to it.  To accomplish this, we use cpumask_set_cpu_local_first(), that
sets the affinity hint according the following policy:
First it maps IRQs to “close” numa cores.  If these are exhausted, the
remaining IRQs are mapped to “far” numa cores.
Signed-off-by: Yuval Atias <yuvala@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

70a640d0

cpumask: Utility function to set n'th cpu - local cpu first · c8865b64

Amir Vadai authored May 25, 2014

This function sets the n'th cpu - local cpu's first.
For example: in a 16 cores server with even cpu's local, will get the
following values:
cpumask_set_cpu_local_first(0, numa, cpumask) => cpu 0 is set
cpumask_set_cpu_local_first(1, numa, cpumask) => cpu 2 is set
...
cpumask_set_cpu_local_first(7, numa, cpumask) => cpu 14 is set
cpumask_set_cpu_local_first(8, numa, cpumask) => cpu 1 is set
cpumask_set_cpu_local_first(9, numa, cpumask) => cpu 3 is set
...
cpumask_set_cpu_local_first(15, numa, cpumask) => cpu 15 is set

Curently this function will be used by multi queue networking devices to
calculate the irq affinity mask, such that as many local cpu's as
possible will be utilized to handle the mq device irq's.
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c8865b64

31 May, 2014 7 commits

Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next · 90d0e08e

David S. Miller authored May 30, 2014

Pablo Neira Ayuso says:

====================
Netfilter/IPVS updates for net-next

This small patchset contains three accumulated Netfilter/IPVS updates,
they are:

1) Refactorize common NAT code by encapsulating it into a helper
   function, similarly to what we do in other conntrack extensions,
   from Florian Westphal.

2) A minor format string mismatch fix for IPVS, from Masanari Iida.

3) Add quota support to the netfilter accounting infrastructure, now
   you can add quotas to accounting objects via the nfnetlink interface
   and use them from iptables. You can also listen to quota
   notifications from userspace. This enhancement from Mathieu Poirier.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

90d0e08e

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next · 648d4feb

David S. Miller authored May 30, 2014

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates

This series contains updates to i40e and i40evf.

Kevin updates the i40e and i40evf driver i40e_check_asq_alive() to ensure
the length register offset is non-zero which indicates that the software
has initialized the admin queue.  Also removes PCTYPE definitions which are
now reserved.

Mitch enables descriptor prefetch for rings belonging to the virtual function.
Also configures the VF minimum transmit rate to 50 Mbps rather than 0 which was
be interpreted as no limit at all.  Mitch found in order for the VF to achieve
its programmed transmit rate, we need to set the max credit value to 4.
Lastly fixes a Tx hang and firmware crash that happens after setting the MTU
on a VF by not using the RESETTING state during reinit, this is because
the RESETTING state means that a catastrophic hardware bad thing is happening
and the driver needs to tiptoe around and not use the admin queue or registers.
A reinit is no big deal and we can use the admin queue (and we should) so
do not set the state to RESETTING during reinit to resolve the bug.

Akeem changes the declaration of the transmit and receive rings inside
several loops to eliminate declaring the same ring every time for the
duration of the loop and declares them just once before the loop.  Also fixes
the driver to clear the recovery pending bit if pf_reset fails instead of
falling through the setup process.

Anjali makes a change based on feedback from Ben Hutchings that cmd->data
needs to be reported in ETHTOOL_GRXCLSRLCNT and use a helper function to
calculate the total filter count.

Jesse removes storm control since the storm control features are not apart
of the hardware and were mistakenly left in the code.

Greg changes tx_lpi_status and rx_lpi_status from bool to u32 to avoid
sparse errors.

Shannon adds the clear_pxe AdminQ API call to tell the firmware that the
driver is taking over from PXE.  In addition, relaxes the firmware API
check to allow more flexibility in handling newer NICs and NVMs in the field.

Vasu ensures that FCoE is disabled for MFP modes since it is not supported
by overriding the hardware FCoE capability.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

648d4feb

af_key: Replace comma with semicolon · 47162c0b

Himangi Saraogi authored May 30, 2014

This patch replaces a comma between expression statements by a semicolon.

A simplified version of the semantic patch that performs this
transformation is as follows:

// <smpl>
@r@
expression e1,e2,e;
type T;
identifier i;
@@

 e1
-,
+;
 e2;
// </smpl>
Signed-off-by: Himangi Saraogi <himangi774@gmail.com>
Acked-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

47162c0b

rds/tcp_listen: Replace comma with semicolon · 01728371

Himangi Saraogi authored May 30, 2014

This patch replaces a comma between expression statements by a semicolon.

A simplified version of the semantic patch that performs this
transformation is as follows:

// <smpl>
@r@
expression e1,e2,e;
type T;
identifier i;
@@

 e1
-,
+;
 e2;
// </smpl>
Signed-off-by: Himangi Saraogi <himangi774@gmail.com>
Acked-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

01728371

RDS/RDMA: Replace comma with semicolon · cc2afe9f

Himangi Saraogi authored May 30, 2014

This patch replaces a comma between expression statements by a semicolon.

A simplified version of the semantic patch that performs this
transformation is as follows:

// <smpl>
@r@
expression e1,e2,e;
type T;
identifier i;
@@

 e1
-,
+;
 e2;
// </smpl>
Signed-off-by: Himangi Saraogi <himangi774@gmail.com>
Acked-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

cc2afe9f

ipmr: Replace comma with semicolon · 70cb4a45

Himangi Saraogi authored May 30, 2014

This patch replaces a comma between expression statements by a semicolon.

A simplified version of the semantic patch that performs this
transformation is as follows:

// <smpl>
@r@
expression e1,e2,e;
type T;
identifier i;
@@

 e1
-,
+;
 e2;
// </smpl>
Signed-off-by: Himangi Saraogi <himangi774@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

70cb4a45

Merge branch 's390-next' · e9bcbc97

David S. Miller authored May 30, 2014

Frank Blaschka says:

====================
s390: network patches for net-next V1

here are some s390 related patches for net-next
Added some style fixing reported by David Laight.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

e9bcbc97