- 24 Jan, 2017 8 commits
-
-
Or Gerlitz authored
Add the missing parts for offloading IPv6 tunnels. This includes route and neigh lookups and construction of the IPv6 tunnel headers. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Or Gerlitz authored
Use more fields out of the tunnel key (e.g. the tunnel source IP address) provided by upper layers for the route lookup done on the encap offload path. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Or Gerlitz authored
Currently we use a subset of the input tunnel key fields (id, ip daddr, dst port), which are provided by upper layers, to identify flows that should go through the same encapsulation and to maintain the HW encapsulation table. This is redundant and can lead to wrong results. Instead, keep a copy of the ip tunnel info provided by the user through TC and use the tunnel key part as the key to our internal hash. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
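The risk of keying on only part of the tunnel key can be illustrated with a small sketch. This is not the driver code: `TunnelKey`, its field names, and `subset_key()` are hypothetical stand-ins, and the table is a plain Python dict rather than the driver's internal hash. It only shows that two flows differing in a field outside the subset (here the source IP) collapse into one entry when the subset is used, but stay distinct when the whole key is used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TunnelKey:
    """Hypothetical stand-in for the tunnel key handed down through TC."""
    tun_id: int
    src_ip: str
    dst_ip: str
    dst_port: int


def subset_key(key: TunnelKey) -> tuple:
    """Old approach (illustrative): only id, destination IP and destination port."""
    return (key.tun_id, key.dst_ip, key.dst_port)


flow_a = TunnelKey(tun_id=42, src_ip="10.0.0.1", dst_ip="10.0.0.9", dst_port=4789)
flow_b = TunnelKey(tun_id=42, src_ip="10.0.0.2", dst_ip="10.0.0.9", dst_port=4789)

# Keyed on the subset, both flows would share one encapsulation entry.
by_subset = {}
by_subset.setdefault(subset_key(flow_a), "encap-0")
by_subset.setdefault(subset_key(flow_b), "encap-1")

# Keyed on the full tunnel key, each flow gets its own entry.
by_full_key = {}
by_full_key.setdefault(flow_a, "encap-0")
by_full_key.setdefault(flow_b, "encap-1")
```

A frozen dataclass is hashable on all of its fields, which is what makes it usable directly as the dictionary key here.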
-
Or Gerlitz authored
Move around some settings of variables as a pre-step to make things more robust and clear for the IPv6 case in a downstream patch. This patch doesn't change any functionality. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Or Gerlitz authored
Enhance the parsing of offloaded TC rules to set HW matching on outer IPv6 encapsulation headers. This effectively adds support for TC tunnel key release action (decapsulation) of SRIOV offloads over IPv6 tunnels. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Or Gerlitz authored
The current code is allocating the max encap size supported by the firmware and not the size requested by the caller; fix that. Also, spare a warning when the size of the encapsulation headers is bigger than what is supported by the firmware. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Krister Johansen authored
Add net.ipv4.ip_unprivileged_port_start, which is a per namespace sysctl that denotes the first unprivileged inet port in the namespace. To disable all privileged ports set this to zero. It also checks for overlap with the local port range. The privileged and local range may not overlap. The use case for this change is to allow containerized processes to bind to privileged ports, but prevent them from ever being allowed to modify their container's network configuration. The latter is accomplished by ensuring that the network namespace is not a child of the user namespace. This modification was needed to allow the container manager to disable a namespace's privileged port restrictions without exposing control of the network namespace to processes in the user namespace. Signed-off-by: Krister Johansen <kjlx@templeofstupid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
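The semantics described above can be modelled in a few lines. This is a userspace sketch, not the kernel implementation: the `NetNamespace` class, its method names, and the default values (1024 for the first unprivileged port, 32768 to 60999 for the local port range) are assumptions chosen for illustration. It shows the two rules from the message: a port is privileged when it lies below the per-namespace threshold, and a new threshold is rejected if the privileged range would overlap the local port range.

```python
from dataclasses import dataclass


@dataclass
class NetNamespace:
    """Hypothetical model of the per-namespace port settings."""
    unprivileged_port_start: int = 1024            # assumed default
    local_port_range: tuple = (32768, 60999)       # assumed default

    def is_privileged(self, port: int) -> bool:
        # Ports below the threshold need elevated privileges to bind.
        return port < self.unprivileged_port_start

    def set_unprivileged_port_start(self, value: int) -> None:
        low, _high = self.local_port_range
        if not 0 <= value <= 65535:
            raise ValueError("port out of range")
        # The privileged range [0, value) must not reach into the local range.
        if value > low:
            raise ValueError("privileged range overlaps local port range")
        self.unprivileged_port_start = value


ns = NetNamespace()
privileged_before = ns.is_privileged(80)
ns.set_unprivileged_port_start(0)   # zero disables all privileged ports
privileged_after = ns.is_privileged(80)
```

On a system with this sysctl, the equivalent administrative step would be writing the value to net.ipv4.ip_unprivileged_port_start inside the target network namespace.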
-
Daniel Borkmann authored
We need to initialize im_node to NULL, otherwise in case of error path it gets passed to kfree() as uninitialized pointer. Fixes: b95a5c4d ("bpf: add a longest prefix match trie map implementation") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 23 Jan, 2017 9 commits
-
-
David S. Miller authored
Daniel Mack says:
====================
bpf: add longest prefix match map

This patch set adds a longest prefix match algorithm that can be used to match IP addresses to a stored set of ranges. It is exposed as a bpf map type.

Internally, data is stored in an unbalanced tree of nodes that has a maximum height of n, where n is the prefixlen the trie was created with.

Note that this has nothing to do with fib or fib6 and is in no way meant to replace or share code with it. It's rather a much simpler implementation that is specifically written with bpf maps in mind.

Patch 1/3 adds the implementation, 2/3 an extensive test suite and 3/3 has benchmarking code for the new trie type.

Feedback is much appreciated.

Changelog:

v3 -> v4:
* David added a 3rd patch that augments map_perf_test for LPM trie benchmarks
* Limit allocation of maps of this new type to CAP_SYS_ADMIN for now, as requested by Alexei
* Add a stub .map_delete_elem so the core does not stumble over a NULL pointer when the syscall is invoked
* Tests for non-power-of-2 prefix lengths were added
* More comment style fixes

v2 -> v3:
* Store both the key match data and the caller provided value in the same byte array attached to a node. This avoids double allocations
* Bring back node->flags to distinguish between 'real' and intermediate nodes
* Fix comment style and some typos

v1 -> v2:
* Turn spin lock into raw spinlock
* Lock with irqsave options during trie_update_elem()
* Return -ENOMEM properly from trie_alloc()
* Force attr->flags == BPF_F_NO_PREALLOC during creation
* Set trie->map.pages after creation to account for map memory
* Allow arbitrary value sizes
* Removed node->flags and denote intermediate nodes through node->value == NULL instead

rfc -> v1:
* Add __rcu pointer annotations to make sparse happy
* Fold _lpm_trie_find_target_node() into its only caller
* Fix some minor documentation issues
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Herrmann authored
Extend the map_perf_test_{user,kern}.c infrastructure to stress test lpm-trie lookups. We hook into the kprobe on sys_gettid() and measure the latency depending on trie size and lookup count. On my Intel Haswell i7-6400U, a single gettid() syscall with an empty bpf program takes roughly 6.5us on my system. Lookups in empty tries take ~1.8us on first try, ~0.9us on retries. Lookups in tries with 8192 entries take ~7.1us (on the first _and_ any subsequent try). Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Reviewed-by: Daniel Mack <daniel@zonque.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Herrmann authored
The first part of this program runs randomized tests against the lpm-bpf-map. It implements a "Trivial Longest Prefix Match" (tlpm) based on simple, linear, singly linked lists. The implementation should be pretty straightforward. Based on tlpm, this inserts randomized data into bpf-lpm-maps and verifies the trie-based bpf-map implementation behaves the same way as tlpm. The second part uses 'real world' IPv4 and IPv6 addresses and tests the trie with those. Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Signed-off-by: Daniel Mack <daniel@zonque.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
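The idea behind a linear reference implementation can be sketched as follows. This is not the test program itself: `TrivialLpm`, `bits_match()` and their signatures are hypothetical names, a Python list stands in for the linked list, and all keys are assumed to have the same byte length. A lookup simply scans every stored prefix and keeps the longest one whose leading bits agree with the key, which is slow but easy to trust as an oracle for a faster structure.

```python
def bits_match(a: bytes, b: bytes, n_bits: int) -> bool:
    """Return True if the first n_bits of a and b are identical."""
    full, rem = divmod(n_bits, 8)
    if a[:full] != b[:full]:
        return False
    if rem == 0:
        return True
    mask = (0xFF << (8 - rem)) & 0xFF   # top `rem` bits of the next byte
    return (a[full] & mask) == (b[full] & mask)


class TrivialLpm:
    """Linear-scan longest prefix match (illustrative reference only)."""

    def __init__(self):
        self.entries = []               # list of (prefix_bytes, n_bits)

    def add(self, prefix: bytes, n_bits: int) -> None:
        self.entries.append((prefix, n_bits))

    def match(self, key: bytes):
        best = None
        for prefix, n_bits in self.entries:
            if bits_match(prefix, key, n_bits):
                if best is None or n_bits > best[1]:
                    best = (prefix, n_bits)
        return best


tlpm = TrivialLpm()
tlpm.add(bytes([10, 0, 0, 0]), 8)       # 10.0.0.0/8
tlpm.add(bytes([10, 1, 0, 0]), 16)      # 10.1.0.0/16
tlpm.add(bytes([172, 16, 0, 0]), 12)    # 172.16.0.0/12
```

A randomized test in this style would insert the same prefixes into both structures and compare the result of every lookup.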
-
Daniel Mack authored
This trie implements a longest prefix match algorithm that can be used to match IP addresses to a stored set of ranges. Internally, data is stored in an unbalanced trie of nodes that has a maximum height of n, where n is the prefixlen the trie was created with. Tries may be created with prefix lengths that are multiples of 8, in the range from 8 to 2048. The key used for lookup and update operations is a struct bpf_lpm_trie_key, and the value is a uint64_t. The code carries more information about the internal implementation. Signed-off-by: Daniel Mack <daniel@zonque.org> Reviewed-by: David Herrmann <dh.herrmann@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
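A much simplified model of such a trie is sketched below. It is an assumption-laden illustration, not the kernel data structure: it uses one node per bit with no intermediate-node handling, no locking, and no RCU, and the names `LpmTrie`, `Node` and `bit_at()` are hypothetical. It does keep two properties named in the message: the height is bounded by the prefix length the trie was created with, and that length must be a multiple of 8 between 8 and 2048.

```python
class Node:
    def __init__(self):
        self.children = [None, None]    # child for bit 0 and bit 1
        self.value = None
        self.has_value = False          # distinguishes real from pass-through nodes


def bit_at(data: bytes, i: int) -> int:
    """Return bit i of data, counting from the most significant bit of byte 0."""
    return (data[i // 8] >> (7 - i % 8)) & 1


class LpmTrie:
    def __init__(self, max_prefixlen: int):
        if max_prefixlen % 8 or not 8 <= max_prefixlen <= 2048:
            raise ValueError("prefix length must be a multiple of 8 in [8, 2048]")
        self.max_prefixlen = max_prefixlen
        self.root = Node()

    def update(self, prefixlen: int, data: bytes, value) -> None:
        if prefixlen > self.max_prefixlen or len(data) * 8 < prefixlen:
            raise ValueError("invalid prefix")
        node = self.root
        for i in range(prefixlen):
            b = bit_at(data, i)
            if node.children[b] is None:
                node.children[b] = Node()
            node = node.children[b]
        node.value, node.has_value = value, True

    def lookup(self, data: bytes):
        if len(data) * 8 < self.max_prefixlen:
            raise ValueError("key too short")
        node = self.root
        best = node.value if node.has_value else None
        for i in range(self.max_prefixlen):
            node = node.children[bit_at(data, i)]
            if node is None:
                break
            if node.has_value:
                best = node.value       # remember the longest match so far
        return best


trie = LpmTrie(32)
trie.update(8, bytes([10, 0, 0, 0]), 1)
trie.update(16, bytes([10, 1, 0, 0]), 2)
```

The lookup walks at most max_prefixlen levels, which is where the bound on the height comes from.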
-
Bhumika Goyal authored
Declare net_device_ops structure as const as it is only stored in the netdev_ops field of a net_device structure. This field is of type const, so net_device_ops structures having same properties can be made const too.

Done using Coccinelle:

@r1 disable optional_qualifier@
identifier i;
position p;
@@
static struct net_device_ops i@p={...};

@ok1@
identifier r1.i;
position p;
struct net_device ndev;
@@
ndev.netdev_ops=&i@p

@bad@
position p!={r1.p,ok1.p};
identifier r1.i;
@@
i@p

@depends on !bad disable optional_qualifier@
identifier r1.i;
@@
+const struct net_device_ops i;

File size before:
   text    data     bss     dec     hex filename
   6201     744       0    6945    1b21 ethernet/xilinx/xilinx_emaclite.o

File size after:
   text    data     bss     dec     hex filename
   6745     192       0    6937    1b19 ethernet/xilinx/xilinx_emaclite.o

Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Bhumika Goyal authored
Declare net_device_ops structure as const as it is only stored in the netdev_ops field of a net_device structure. This field is of type const, so net_device_ops structures having same properties can be made const too.

Done using Coccinelle:

@r1 disable optional_qualifier@
identifier i;
position p;
@@
static struct net_device_ops i@p={...};

@ok1@
identifier r1.i;
position p;
struct net_device ndev;
@@
ndev.netdev_ops=&i@p

@bad@
position p!={r1.p,ok1.p};
identifier r1.i;
@@
i@p

@depends on !bad disable optional_qualifier@
identifier r1.i;
@@
+const struct net_device_ops i;

File size before:
   text    data     bss     dec     hex filename
   4821     744       0    5565    15bd ethernet/moxa/moxart_ether.o

File size after:
   text    data     bss     dec     hex filename
   5373     192       0    5565    15bd ethernet/moxa/moxart_ether.o

Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Timur Tabi authored
During reset, functions emac_mac_down() and emac_mac_up() are called, so we don't want to free and claim the IRQ unnecessarily. Move those operations to open/close. Signed-off-by: Timur Tabi <timur@codeaurora.org> Reviewed-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Timur Tabi authored
The EMAC has an internal PHY that is often called the "SGMII". This SGMII is also connected to an external PHY, which is managed by phylib. These dual PHYs often cause confusion. In this case, the data structure for managing the SGMII was mis-named and located in the wrong header file. Structure emac_phy is renamed to emac_sgmii to clearly indicate it applies to the internal PHY only. It is also moved from emac_phy.h (which supports the external PHY) to emac_sgmii.h (where it belongs). To keep the changes minimal, only the structure name is changed, not the names of any variables of that type. Signed-off-by: Timur Tabi <timur@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
Commit 4cace675 ("bnx2x: Alloc 4k fragment for each rx ring buffer element") added extra put_page() and get_page() calls on arches where PAGE_SIZE=4K, like x86. Reorder things to avoid this overhead. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com> Cc: Yuval Mintz <Yuval.Mintz@cavium.com> Cc: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 22 Jan, 2017 15 commits
-
-
David S. Miller authored
Florian Fainelli says: ==================== net: dsa: bcm_sf2: Add support for BCM7278 This patch series adds support for the Broadcom BCM7278 integrated switch which is a successor of the BCM7445 switch. We have a little bit of register shuffling going on, which is why most of the functional changes are to deal with that. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
Implement the HW design team recommended workaround for 7278. Since the GPHY now returns its revision information in MII_PHYS_ID[23] we need to check whether the revision provided in flags is 0 or not. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
Add support for the BCM7278 28nm process Gigabit Ethernet PHY. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
Parse the "brcm,use-bcm-hdr" boolean property during ports identification to fill a bitmask of ports that should have Broadcom tags enabled. This is needed in some configurations where per-packet metadata can be exchanged using Broadcom tags between the switch and an on-chip acceleration device. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
In preparation for enabling Broadcom tags on different ports based on configuration information, dedicate a function that is responsible for enabling Broadcom tags for a given port and update the IMP port setup to call it. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
Add support for the integrated switch found on BCM7278:
- core_reg_align is set to 1, to force a translation into the target address space which is 8 bytes aligned
- an alternate SWITCH_REG layout is provided since registers are largely bit/masks compatible but have different offsets
- conditional for all CORE_STS_OVERRIDE_{IMP,GMII_P} since those got moved way out of the traditional register space
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
In preparation for supporting a new device with a slightly different register layout, affecting the SWITCH_REG and SWITCH_CORE address spaces, perform a few preparatory steps:
- allow matching the compatible string against a data description
- convert the SWITCH_REG register accesses into an indirection table
- prepare for supporting a SWITCH_CORE register alignment requirement
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
There is no point inlining the 32-bit direct register read/write part, just infer it from the existing macro. This will make it easier to centralize the address rewriting that we are going to introduce later on. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Florian Fainelli says: ==================== net: systemport: Add support for SYSTEMPORT lite This patch series adds support for SYSTEMPORT Lite which is an evolution of the existing SYSTEMPORT adapter. The two generations are largely identical as far as the transmit/receive path are concerned, and there were just a few control path changes here and there. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
Add support for the SYSTEMPORT Lite Ethernet controller. This piece of hardware is largely based on the full-blown SYSTEMPORT and differs in the following:
- no full-blown UniMAC, instead we have the MagicPacket matching from UniMAC at same offset, and a GMII Interface Block (GIB) for the MAC-level stuff, since we are always interfaced to an Ethernet switch which is fully Ethernet compliant shortcuts could be made
- 16 transmit queues, whose interrupts are moved into the first Level-2 interrupt controller bank
- slight TDMA offset change (a register was inserted after TDMA_STATUS, *sigh*)
- 256 RX descriptors (512 words) and 256 TX descriptors (not visible)
As a consequence of these two things, update the code paths accordingly to differentiate the full-blown from the light version.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
In preparation for adding SYSTEMPORT Lite, which has half as many transmit queues as SYSTEMPORT, make sure we do allocate TX rings based on the systemport,txq property to get an appropriate memory footprint. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
Since we allocate per cpu storage, let's also use NUMA hints. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: David Lebrun <david.lebrun@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>
-
jpinto authored
This patch fixes the LS mask when setting EEE timer. LS field is 10 bits long and not 11 as currently. Signed-off-by: Joao Pinto <jpinto@synopsys.com> Reported-By: Rayagond Kokatanur <rayagond@vayavyalabs.com> Signed-off-by: David S. Miller <davem@davemloft.net>
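The arithmetic behind a field-width mistake like this can be shown in a short sketch. The register layout here is an assumption: `LS_SHIFT` is a hypothetical bit position and `pack_ls()` is an illustrative helper, not the driver's code. Only the field widths (10 bits versus 11 bits) come from the message. The point is that an 11-bit mask lets one extra bit through, which then lands in whatever field sits directly above LS.

```python
def field_mask(width: int) -> int:
    """Mask with the low `width` bits set."""
    return (1 << width) - 1


LS_WIDTH = 10      # correct width of the LS field, per the message
LS_SHIFT = 16      # hypothetical position of the field in the register


def pack_ls(ls: int, width: int = LS_WIDTH) -> int:
    """Place an LS value into its register field, truncating to `width` bits."""
    return (ls & field_mask(width)) << LS_SHIFT


correct = pack_ls(0x7FF)             # truncated to 10 bits
too_wide = pack_ls(0x7FF, width=11)  # 11-bit mask keeps one bit too many
spill = too_wide & ~(field_mask(LS_WIDTH) << LS_SHIFT)
```

Here `spill` is non-zero only for the over-wide mask, which is the bit that would corrupt the neighbouring field.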
-
Geliang Tang authored
To make the code clearer, use rb_entry() instead of container_of() to deal with rbtree. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Geliang Tang authored
To make the code clearer, use rb_entry() instead of container_of() to deal with rbtree. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 20 Jan, 2017 8 commits
-
-
David S. Miller authored
Andrew Lunn says: ==================== net: dsa: Move temperature sensor code into PHY. Marvell Ethernet switches contain a temperature sensor. There appears to be one sensor, which is shared by each of the internal PHYs. Each PHY has independent registers to read this sensor, and to set a limit for when an alarm should be raised. Some Marvell discrete PHYs also have the same sensor and registers. Moving the HWMON code from DSA into the PHY makes the sensor available in discrete PHYs, and removes the layering violation of the switch driver poking around in PHY registers. While moving the code into the PHY driver, it has been re-written to use the new HWMON APIs. v2: Better cover note explaining one sensor, but multiple independent registers. Simplify error checking. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Andrew Lunn authored
Only the Marvell mv88e6xxx DSA driver made use of the HWMON support in DSA. The temperature sensor registers are actually in the embedded PHYs, and the PHY driver now supports it. So remove all HWMON support from DSA and drivers. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Andrew Lunn authored
Some Marvell PHYs have an inbuilt temperature sensor. Add hwmon support for this sensor. There are two different variants. The simpler, older chips have a 5 degree accuracy. The newer devices have 1 degree accuracy. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Josef Bacik authored
When comparing two sockets we need to use inet6_rcv_saddr so we get a NULL sk_v6_rcv_saddr if the socket isn't AF_INET6, otherwise our comparison function can be wrong. Fixes: 637bc8bb ("inet: reset tb->fastreuseport when adding a reuseport sk") Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
David S. Miller authored
Saeed Mahameed says:
====================
mlx5 and mlx5e updates 2017-01-19

This series includes some updates for mlx5 core and mlx5e netdevice driver.

From Leon, a small fix that removes an unnecessary print.

From Eli Cohen, a fix to the FW version printout in case of internal error.

From Eugenia Emantayev, two patches, the 1st adds mlx5 1pps (pulse per second) infrastructure support and the 2nd adds the necessary bits for mlx5e ptp logic and structures.

From Mohamad, add support for s-tagged packet receive when in promiscuous mode.

From Gal Pressman, MCAM (Management capabilities mask register) and PCAM (Ports capabilities mask register) registers infrastructure. Those registers are needed in order to query the different statistics registers support in FW, in order for the driver to enable/disable query and reporting them back to user. On top of this infrastructure we've exposed new set of statistics groups:
- MPCNT: Physical layer statistical counters (For symbol errors)
- PPCNT: PCIe performance counters

In addition to the statistics capabilities series we've moved the mlx5 HCA capabilities fields to a dedicated struct under the driver private data.

At the end a small patch to update & query statistics in the most desired order.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Ivan Khoronzhuk says:
====================
net: ethernet: ti: cpsw: correct common res usage

This series is intended to remove unneeded redundancies connected with common resource usage function.

Since v1:
- changed name to cpsw_get_usage_count()
- added comments to open/close for cpsw_get_usage_count()
- added patch: net: ethernet: ti: cpsw: clarify ethtool ops changing num of descs

Based on net-next/master
====================
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ivan Khoronzhuk authored
After adding the cpsw_set_ringparam ethtool op, it is better to factor out the common parts of similar ops that split descriptors at runtime. This allows reusing these parts and shows what the ops actually do. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ivan Khoronzhuk authored
No need to duplicate the same function in rx handler to get info if any interface is running. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-