Commits · 7943c0f329d33f531607d66f5781f2210e1e278c · Kirill Smelkov / linux

18 Nov, 2014 20 commits

bpf: remove test map scaffolding and user proper types · 7943c0f3

Alexei Starovoitov authored Nov 13, 2014

proper types and function helpers are ready. Use them in verifier testsuite.
Remove temporary stubs
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7943c0f3

bpf: allow eBPF programs to use maps · d0003ec0

Alexei Starovoitov authored Nov 13, 2014

expose bpf_map_lookup_elem(), bpf_map_update_elem(), bpf_map_delete_elem()
map accessors to eBPF programs
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d0003ec0

bpf: add a testsuite for eBPF maps · ffb65f27

Alexei Starovoitov authored Nov 13, 2014

. check error conditions and sanity of hash and array map APIs
. check large maps (that kernel gracefully switches to vmalloc from kmalloc)
. check multi-process parallel access and stress test
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ffb65f27

bpf: fix BPF_MAP_LOOKUP_ELEM command return code · a1854d6a

Alexei Starovoitov authored Nov 13, 2014

fix errno of BPF_MAP_LOOKUP_ELEM command as bpf manpage
described it in commit b4fc1a46("Merge branch 'bpf-next'"):
-----
BPF_MAP_LOOKUP_ELEM
    int bpf_lookup_elem(int fd, void *key, void *value)
    {
        union bpf_attr attr = {
            .map_fd = fd,
            .key = ptr_to_u64(key),
            .value = ptr_to_u64(value),
        };

        return bpf(BPF_MAP_LOOKUP_ELEM, &attr, sizeof(attr));
    }
    bpf() syscall looks up an element with given key in  a  map  fd.
    If  element  is found it returns zero and stores element's value
    into value.  If element is not found  it  returns  -1  and  sets
    errno to ENOENT.

and further down in manpage:

   ENOENT For BPF_MAP_LOOKUP_ELEM or BPF_MAP_DELETE_ELEM,  indicates  that
          element with given key was not found.
-----

In general all BPF commands return ENOENT when map element is not found
(including BPF_MAP_GET_NEXT_KEY and BPF_MAP_UPDATE_ELEM with
 flags == BPF_MAP_UPDATE_ONLY)

Subsequent patch adds a testsuite to check return values for all of
these combinations.
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a1854d6a

bpf: add array type of eBPF maps · 28fbcfa0

Alexei Starovoitov authored Nov 13, 2014

add new map type BPF_MAP_TYPE_ARRAY and its implementation

- optimized for fastest possible lookup()
  . in the future verifier/JIT may recognize lookup() with constant key
    and optimize it into constant pointer. Can optimize non-constant
    key into direct pointer arithmetic as well, since pointers and
    value_size are constant for the life of the eBPF program.
    In other words array_map_lookup_elem() may be 'inlined' by verifier/JIT
    while preserving concurrent access to this map from user space

- two main use cases for array type:
  . 'global' eBPF variables: array of 1 element with key=0 and value is a
    collection of 'global' variables which programs can use to keep the state
    between events
  . aggregation of tracing events into fixed set of buckets

- all array elements pre-allocated and zero initialized at init time

- key as an index in array and can only be 4 byte

- map_delete_elem() returns EINVAL, since elements cannot be deleted

- map_update_elem() replaces elements in an non-atomic way
  (for atomic updates hashtable type should be used instead)
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

28fbcfa0

bpf: add hashtable type of eBPF maps · 0f8e4bd8

Alexei Starovoitov authored Nov 13, 2014

add new map type BPF_MAP_TYPE_HASH and its implementation

- maps are created/destroyed by userspace. Both userspace and eBPF programs
  can lookup/update/delete elements from the map

- eBPF programs can be called in_irq(), so use spin_lock_irqsave() mechanism
  for concurrent updates

- key/value are opaque range of bytes (aligned to 8 bytes)

- user space provides 3 configuration attributes via BPF syscall:
  key_size, value_size, max_entries

- map takes care of allocating/freeing key/value pairs

- map_update_elem() must fail to insert new element when max_entries
  limit is reached to make sure that eBPF programs cannot exhaust memory

- map_update_elem() replaces elements in an atomic way

- optimized for speed of lookup() which can be called multiple times from
  eBPF program which itself is triggered by high volume of events
  . in the future JIT compiler may recognize lookup() call and optimize it
    further, since key_size is constant for life of eBPF program
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0f8e4bd8

bpf: add 'flags' attribute to BPF_MAP_UPDATE_ELEM command · 3274f520

Alexei Starovoitov authored Nov 13, 2014

the current meaning of BPF_MAP_UPDATE_ELEM syscall command is:
either update existing map element or create a new one.
Initially the plan was to add a new command to handle the case of
'create new element if it didn't exist', but 'flags' style looks
cleaner and overall diff is much smaller (more code reused), so add 'flags'
attribute to BPF_MAP_UPDATE_ELEM command with the following meaning:
 #define BPF_ANY	0 /* create new element or update existing */
 #define BPF_NOEXIST	1 /* create new element if it didn't exist */
 #define BPF_EXIST	2 /* update existing element */

bpf_update_elem(fd, key, value, BPF_NOEXIST) call can fail with EEXIST
if element already exists.

bpf_update_elem(fd, key, value, BPF_EXIST) can fail with ENOENT
if element doesn't exist.

Userspace will call it as:
int bpf_update_elem(int fd, void *key, void *value, __u64 flags)
{
    union bpf_attr attr = {
        .map_fd = fd,
        .key = ptr_to_u64(key),
        .value = ptr_to_u64(value),
        .flags = flags;
    };

    return bpf(BPF_MAP_UPDATE_ELEM, &attr, sizeof(attr));
}

First two bits of 'flags' are used to encode style of bpf_update_elem() command.
Bits 2-63 are reserved for future use.
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3274f520

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next · 1bbf148d

David S. Miller authored Nov 18, 2014

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2014-11-18

This series contains updates to i40e only.

Shannon provides a patch to clean up the driver to only warn once that
PTP is not supported when linked at 100Mbps.

Mitch provides a fix for i40e where the VF interrupt processing takes
a long time and it is possible that we could lose a VFLR event if it
happens while processing a VFLR on another VF.  To correct this situation,
we enable the VFLR interrupt cause before we begin processing any pending
resets.

Neerav provides several patches to update DCB support in i40e.  When
there are DCB configuration changes based on DCBx, the firmware suspends
the port's Tx and generates an event to the PF.  The PF is then
responsible to reconfigure the PF VSIs and switching topology as per the
updated DCB configuration and then resume the port's Tx by calling the
"Resume Port Tx" AQ command, so add this call to the flow that handles
DCB re-configuration in the PF.  Allow the driver to query and use DCB
configuration from firmware when firmware DCBx agent is in CEE mode.
Add a check whether LLDP Agent's default AdminStatus is enabled or
disabled on a given port, and sets DCBx status to disabled if the
status is disabled.  Fix an issue when the port TC configuration
changes as a result of DCBx and the driver modifies the enabled TCs for
the VEBs it manages but does not update the enabled_tc value that
was cached on a per VEB basis.  Add a new PF state so that if a port's
Tx is in suspended state the Tx queue disable flow would just put the
request for the queue to be disabled and return without waiting for the
queue to be actually disabled.  Allows the driver to enable/disable
the XPS based on the number of TCs being enabled for the given VSI.

v2: Dropped patch "i40e: Handle a single mss packet with more than 8 frags"
    while we rework the patch after we test a bit more based on feedback from
    Eric Dumazet.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

1bbf148d

PPC: bpf_jit_comp: Unify BPF_MOD | BPF_X and BPF_DIV | BPF_X · cadaecd2

Denis Kirjanov authored Nov 17, 2014

Reduce duplicated code by unifying
BPF_ALU | BPF_MOD | BPF_X and BPF_ALU | BPF_DIV | BPF_X

CC: Alexei Starovoitov<alexei.starovoitov@gmail.com>
CC: Daniel Borkmann<dborkman@redhat.com>
CC: Philippe Bergheaud<felix@linux.vnet.ibm.com>
Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cadaecd2

i40e: Set XPS bit mask to zero in DCB mode · 3ffa037d

Neerav Parikh authored Nov 12, 2014

Due to DCBX configuration change if the VSI needs to use more than 1 TC;
it needs to disable the XPS maps that were set when operating in 1 TC mode.
Without disabling XPS the netdev layer will select queues based on those
settings and not use the TC queue mapping to make the queue selection.

This patch allows the driver to enable/disable the XPS based on the number
of TCs being enabled for the given VSI.

Change-ID: Idc4dec47a672d2a509f6d7fe11ed1ee65b4f0e08
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Tested-By: Jack Morgan <jack.morgan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

3ffa037d

i40e: Prevent link flow control settings when PFC is enabled · 4b7698cb

Neerav Parikh authored Nov 12, 2014

When PFC is enabled we should not proceed with setting the link flow control
parameters.  Also, always report the link flow Tx/Rx settings as off when
PFC is enabled.

Change-ID: Ib09ec58afdf0b2e587ac9d8851a5c80ad58206c4
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Tested-By: Jack Morgan <jack.morgan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

4b7698cb

i40e: Do not disable/enable FCoE VSI with DCB reconfig · d341b7a5

Neerav Parikh authored Nov 12, 2014

FCoE VSI Tx queue disable times out when reconfiguring as a result of
DCB TC configuration change event.

The hardware allows us to skip disabling and enabling of Tx queues for
VSIs with single TC enabled. As FCoE VSI is configured to have only
single TC we skip it from disable/enable flow.

Change-ID: Ia73ff3df8785ba2aa3db91e6f2c9005e61ebaec2
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Tested-By: Jack Morgan <jack.morgan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

d341b7a5

i40e: Modify Tx disable wait flow in case of DCB reconfiguration · 69129dc3

Neerav Parikh authored Nov 12, 2014

When DCB TC configuration changes the firmware suspends the port's Tx.
Now, as DCB TCs may have changed the PF driver tries to reconfigure the
TC configuration of the VSIs it manages. As part of this process it disables
the VSI queues but the Tx queue disable will not complete as the port's
Tx has been suspended. So, waiting for Tx queues to go to disable state
in this flow may lead to detection of Tx queue disable timeout errors.

Hence, this patch adds a new PF state so that if a port's Tx is in
suspended state the Tx queue disable flow would just put the request for
the queue to be disabled and return without waiting for the queue to be
actually disabled.
Once the VSI(s) TC reconfiguration has been done and driver has called
firmware AQC "Resume PF Traffic" the driver checks the Tx queues requested
to be disabled are actually disabled before re-enabling them again.

Change-ID: If3e03ce4813a4e342dbd5a1eb1d2861e952b7544
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Tested-By: Jack Morgan <jack.morgan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

69129dc3

i40e: Update VEB's enabled_tc after reconfiguration · 23cd1f09

Neerav Parikh authored Nov 12, 2014

When the port TC configuration changes as a result of DCBx the driver
modifies the enabled TCs for the VEBs it manages. But, in the process
it did not update the enabled_tc value that it caches on a per VEB basis.

So, when the next reconfiguration event occurs where the number of TC
value is same as the value cached in enabled_tc for a given VEB; driver
does not modify it's TC configuration by calling appropriate AQ command
believing it is running with the same configuration as requested.
Now, as the VEB is not actually enabled for the TCs that are there any
TC configuration command for VSI attached to that VEB with TCs that are
not enabled for the VEB fails.

This patch fixes this issue.

Change-ID: Ife5694469b05494228e0d850429ea1734738cf29
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Tested-By: Jack Morgan <jack.morgan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

23cd1f09

i40e: Check for LLDP AdminStatus before querying DCBX · e1c4751e

Neerav Parikh authored Nov 12, 2014

This patch adds a check whether LLDP Agent's default AdminStatus is
enabled or disabled on a given port. If it is disabled then it sets
the DCBX status to disabled as well; and would not query firmware for
any DCBX configuration data.

Change-ID: I73c0b9f0adbf4cae177d14914b20a48c9a8f50fd
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Tested-By: Jack Morgan <jack.morgan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1c4751e

i40e: Add support to firmware CEE DCBX mode · 9fa61dd2

Neerav Parikh authored Nov 12, 2014

This patch allows i40e driver to query and use DCB configuration from
firmware when firmware DCBX agent is in CEE mode.

Change-ID: I30f92a67eb890f0f024f35339696e6e83d49a274
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Tested-By: Jack Morgan <jack.morgan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

9fa61dd2

i40e: Resume Port Tx after DCB event · 2fd75f31

Neerav Parikh authored Nov 12, 2014

When there are DCB configuration changes based on DCBX the firmware suspends
the port's Tx and generates an event to the PF. The PF is then responsible
to reconfigure the PF VSIs and switching topology as per the updated DCB
configuration and then resume the port's Tx by calling the "Resume Port Tx"
AQ command.

This patch adds this call to the flow that handles DCB re-configuration in
the PF.

Change-ID: I5b860ad48abfbf379b003143c4d3453e2ed5cc1c
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Tested-By: Jack Morgan <jack.morgan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

2fd75f31

i40e: Bump version to 1.1.23 · 7bda87c7

Catherine Sullivan authored Nov 11, 2014

Bumping minor version as this will be the second SW release and it
should be 1.

Change-ID: If0bd102095d2f059ae0c9b7f4ad625535ffbbdee
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

7bda87c7

i40e: re-enable VFLR interrupt sooner · c5c2f7c3

Mitch Williams authored Nov 11, 2014

VF interrupt processing takes a looooong time, and it's possible that we
could lose a VFLR event if it happens while we're processing a VFLR on
another VF. This would leave the VF in a semi-permanent reset state,
which would not be cleared until yet another VF experiences a VFLR.

To correct this situation, we enable the VFLR interrupt cause before we
begin processing any pending resets. This means that any VFLR that
occurs during reset processing will generate another interrupt and this
routine will get called again.

This change may cause a spurious interrupt when multiple VFLRs occur
very close together in time. If this happens, then this routine will be
called again and it will detect no outstanding VFLR events and do
nothing. No harm, no foul.

Change-ID: Id0451f3e6e73a2cf6db1668296c71e129b59dc19
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

c5c2f7c3

i40e: only warn once of PTP nonsupport in 100Mbit speed · e684fa34

Shannon Nelson authored Nov 11, 2014

Only warn once that PTP is not supported when linked at 100Mbit.

Yes, using a static this way means that this once-only message is not
port specific, but once only for the life of the driver, regardless of
the number of ports.  That should be plenty.

Change-ID: Ie6476530056df408452e195ef06afd4f57caa4b2
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e684fa34

16 Nov, 2014 20 commits

Merge branch 'rss_key_fill' · 65622ed4

David S. Miller authored Nov 16, 2014

Eric Dumazet says:

====================
net: provide common RSS key infrastructure

RSS (Receive Side Scaling) uses a 40 bytes key to provide hash for incoming
packets to select appropriate incoming queue on NIC.

Hash algo (Toeplitz) is also well known and documented by Microsoft
(search for "Verifying the RSS Hash Calculation")

Problem is that some drivers use a well known key.
It makes very easy for attackers to target one particular RX queue,
knowing that number of RX queues is a power of two, or at least some
small number.

Other drivers use a random value per port, making difficult
tuning on bonding setups.

Lets add a common infrastructure, so that host gets an unique
RSS key, and drivers do not have to worry about this.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

65622ed4

vmxnet3: use netdev_rss_key_fill() helper · 6bf79cdd

Eric Dumazet authored Nov 16, 2014

Use of well known RSS key increases attack surface.
Switch to a random one, using generic helper so that all
ports share a common key.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Shreyas Bhatewara <sbhatewara@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6bf79cdd

sfc: use netdev_rss_key_fill() helper · 7a20db37

Eric Dumazet authored Nov 16, 2014

Use netdev_rss_key_fill() helper, as it provides better support for some
bonding setups.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7a20db37

mlx4: use netdev_rss_key_fill() helper · b9d1ab7e

Eric Dumazet authored Nov 16, 2014

Use of well known RSS key increases attack surface.
Switch to a random one, using generic helper so that all
ports share a common key.

Also provide ethtool -x support to fetch RSS key
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b9d1ab7e

ixgbe: use netdev_rss_key_fill() helper · 9913c61c

Eric Dumazet authored Nov 16, 2014

Use of well known RSS key increases attack surface.
Switch to a random one, using generic helper so that all
ports share a common key.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9913c61c

igb: use netdev_rss_key_fill() helper · eb31f849

Eric Dumazet authored Nov 16, 2014

Use of well known RSS key increases attack surface.
Switch to a random one, using generic helper so that all
ports share a common key.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

eb31f849

i40e: use netdev_rss_key_fill() helper · 22f258a1

Eric Dumazet authored Nov 16, 2014

Use of well known RSS key increases attack surface.
Switch to a random one, using generic helper so that all
ports share a common key.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

22f258a1

fm10k: use netdev_rss_key_fill() helper · c41a4fba

Eric Dumazet authored Nov 16, 2014

Use of well known RSS key increases attack surface.
Switch to a random one, using generic helper so that all
ports share a common key.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c41a4fba

e100e: use netdev_rss_key_fill() helper · 5c8d19da

Eric Dumazet authored Nov 16, 2014

Use of well known RSS key increases attack surface.
Switch to a random one, using generic helper so that all
ports share a common key.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5c8d19da

be2net:use netdev_rss_key_fill() helper · 1dcf7b1c

Eric Dumazet authored Nov 16, 2014

Use netdev_rss_key_fill() helper, as it provides better support for some
bonding setups.
Rename rss_hkey local variable to rss_key to have consistent name among
drivers.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Sathya Perla <sathya.perla@emulex.com>
Cc: Subbu Seetharaman <subbu.seetharaman@emulex.com>
Cc: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1dcf7b1c

bna: use netdev_rss_key_fill() helper · 0fa6aa4a

Eric Dumazet authored Nov 16, 2014

Use netdev_rss_key_fill() helper, as it provides better support for some
bonding setups.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Rasesh Mody <rasesh.mody@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0fa6aa4a

tg3: use netdev_rss_key_fill() helper · 39648356

Eric Dumazet authored Nov 16, 2014

Use of well known RSS key increases attack surface.
Switch to a random one, using generic helper so that all
ports share a common key.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Prashant Sreedharan <prashant@broadcom.com>
Cc: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

39648356

bnx2x: use netdev_rss_key_fill() helper · e3ec69ca

Eric Dumazet authored Nov 16, 2014

Use netdev_rss_key_fill() helper, as it provides better support for some
bonding setups.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ariel Elior <ariel.elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e3ec69ca

amd-xgbe: use netdev_rss_key_fill() helper · b2306303

Eric Dumazet authored Nov 16, 2014

Use netdev_rss_key_fill() helper, as it provides better support for some
bonding setups.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Lendacky, Thomas <Thomas.Lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b2306303

net: provide a per host RSS key generic infrastructure · 960fb622

Eric Dumazet authored Nov 16, 2014

RSS (Receive Side Scaling) typically uses Toeplitz hash and a 40 or 52 bytes
RSS key.

Some drivers use a constant (and well known key), some drivers use a random
key per port, making bonding setups hard to tune. Well known keys increase
attack surface, considering that number of queues is usually a power of two.

This patch provides infrastructure to help drivers doing the right thing.

netdev_rss_key_fill() should be used by drivers to initialize their RSS key,
even if they provide ethtool -X support to let user redefine the key later.

A new /proc/sys/net/core/netdev_rss_key file can be used to get the host
RSS key even for drivers not providing ethtool -x support, in case some
applications want to precisely setup flows to match some RX queues.

Tested:

myhost:~# cat /proc/sys/net/core/netdev_rss_key
11:63:99:bb:79:fb:a5:a7:07:45:b2:20:bf:02:42:2d:08:1a:dd:19:2b:6b:23:ac:56:28:9d:70:c3:ac:e8:16:4b:b7:c1:10:53:a4:78:41:36:40:74:b6:15:ca:27:44:aa:b3:4d:72

myhost:~# ethtool -x eth0
RX flow hash indirection table for eth0 with 8 RX ring(s):
0: 0 1 2 3 4 5 6 7
RSS hash key:
11:63:99:bb:79:fb:a5:a7:07:45:b2:20:bf:02:42:2d:08:1a:dd:19:2b:6b:23:ac:56:28:9d:70:c3:ac:e8:16:4b:b7:c1:10:53:a4:78:41
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

960fb622

Merge branch 'mv88e6171_temps' · ca245024

David S. Miller authored Nov 16, 2014

Andrew Lunn says:

====================
Add temperature reading and registers dump to mv88e6171

These patches centralize the temperature sensor reading code, and then
make use of it with the mv88e6171 which has a compatible
sensor. Additionally, support is added for reading the mv88e6171 via
ethtool.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

ca245024

net: dsa: mv88e6171: Add support for reading switch registers · 03d6faa9

Andrew Lunn authored Nov 15, 2014

The infrastructure can now report switch registers to ethtool.
Add support for it to the mv88e6171 driver.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

03d6faa9

net: dsa: mv88e6171: Add support for reading the temperature · 4dd38cdb

Andrew Lunn authored Nov 15, 2014

This chip also has a temperature sensor which can be read using the
common code. In order to use it, add the needed mutex protection for
accessing registers via the shared code.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

4dd38cdb

net: dsa: Centralise code for reading the temperature sensor · eaa23765

Andrew Lunn authored Nov 15, 2014

The method to read the temperature used in the mve6123_61_65 driver
can also be used for other chips. Move the code into the shared code
base of mv88e6xxx.c.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

eaa23765

net: dsa: replace count*size kzalloc by kcalloc · 6f2aed6a

Fabian Frederick authored Nov 14, 2014

kcalloc manages count*sizeof overflow.
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>

6f2aed6a