- 19 Dec, 2017 7 commits
-
-
Michael Chan authored
Advertise NETIF_F_GRO_HW and turn on TPA_MODE_GRO when NETIF_F_GRO_HW is set. Disable NETIF_F_GRO_HW in bnx2x_fix_features() if the MTU does not support TPA_MODE_GRO or GRO is not set. bnx2x_change_mtu() also needs to disable NETIF_F_GRO_HW if the MTU does not support it. The original disable_tpa parameter will continue to disable both LRO and GRO_HW. Preserve the original behavior of enabling LRO by default; the user has to run ethtool -K to explicitly enable GRO_HW. Cc: Ariel Elior <Ariel.Elior@cavium.com> Cc: everest-linux-l2@cavium.com Signed-off-by: Michael Chan <michael.chan@broadcom.com> Acked-by: Manish Chopra <manish.chopra@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
Advertise NETIF_F_GRO_HW in hw_features if hardware GRO is supported. In bnxt_fix_features(), disable GRO_HW and LRO if the current hardware configuration does not allow it. GRO_HW depends on GRO. GRO_HW is also mutually exclusive with LRO. XDP setup will now rely on bnxt_fix_features() to turn off aggregation. During chip init, turn hardware GRO on or off based on NETIF_F_GRO_HW in the features flags. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
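A rough sketch of the chip-init part of that scheme; the helper and mode names are invented for illustration and are not real bnxt_en symbols:

    #include <linux/netdevice.h>

    /* Hypothetical sketch: pick the hardware aggregation mode at init
     * time from the negotiated feature bits.  example_hw_set_agg_mode()
     * and the EXAMPLE_AGG_* constants are placeholders.
     */
    static void example_init_rx_agg(struct net_device *dev)
    {
        if (dev->features & NETIF_F_GRO_HW)
            example_hw_set_agg_mode(dev, EXAMPLE_AGG_MODE_GRO);
        else if (dev->features & NETIF_F_LRO)
            example_hw_set_agg_mode(dev, EXAMPLE_AGG_MODE_LRO);
        else
            example_hw_set_agg_mode(dev, EXAMPLE_AGG_MODE_NONE);
    }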
-
Michael Chan authored
Hardware should not aggregate any packets when generic XDP is installed. Cc: Ariel Elior <Ariel.Elior@cavium.com> Cc: everest-linux-l2@cavium.com Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
Introduce the NETIF_F_GRO_HW feature flag for NICs that support hardware GRO. With this flag, hardware GRO can now be turned on or off independently while GRO is on. Previously, drivers were using NETIF_F_GRO to control hardware GRO, so it could not be turned on or off independently without affecting GRO. Hardware GRO (just like GRO) guarantees that packets can be re-segmented by TSO/GSO to reconstruct the original packet stream. Logically, GRO_HW should depend on GRO since it is a subset, but we will let individual drivers enforce this dependency as they see fit. Since NETIF_F_GRO is not propagated between upper and lower devices, NETIF_F_GRO_HW should follow suit since it is a subset of GRO. In other words, a lower device can independently have GRO/GRO_HW enabled or disabled and no feature propagation is required. This will preserve the current GRO behavior. This can be changed later if we decide to propagate GRO/GRO_HW/RXCSUM from upper to lower devices. Cc: Ariel Elior <Ariel.Elior@cavium.com> Cc: everest-linux-l2@cavium.com Signed-off-by: Michael Chan <michael.chan@broadcom.com> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
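For illustration, the kind of ndo_fix_features handler a driver might use to enforce the GRO_HW-depends-on-GRO rule (and the GRO_HW/LRO exclusion described for bnxt above) could look roughly like this; it is a sketch, not code from either driver:

    #include <linux/netdevice.h>

    /* Rough sketch only; real drivers (bnxt_en, bnx2x) add their own
     * hardware-specific checks such as MTU limits on top of this.
     */
    static netdev_features_t example_fix_features(struct net_device *dev,
                                                  netdev_features_t features)
    {
        /* Hardware GRO is a subset of GRO: no GRO, no GRO_HW. */
        if (!(features & NETIF_F_GRO))
            features &= ~NETIF_F_GRO_HW;

        /* Hardware GRO and LRO are mutually exclusive; prefer GRO_HW. */
        if (features & NETIF_F_GRO_HW)
            features &= ~NETIF_F_LRO;

        return features;
    }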
-
Tonghao Zhang authored
When CONFIG_PROC_FS is disabled, we will not use the prot_inuse counter. This adds an #ifdef to hide the variable definition in that case. This is not a bugfix, but it saves some bytes when there are many network namespaces. Cc: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Martin Zhang <zhangjunweimartin@didichuxing.com> Signed-off-by: Tonghao Zhang <zhangtonghao@didichuxing.com> Signed-off-by: David S. Miller <davem@davemloft.net>
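The shape of the change, as a simplified sketch rather than the exact upstream diff:

    #include <linux/percpu.h>

    struct prot_inuse;  /* per-cpu per-protocol counters */

    /* Simplified sketch of the idea: only define the per-cpu
     * protocol-inuse counter when procfs, its only consumer, is built
     * in.  This mirrors struct netns_core but is not the exact
     * definition.
     */
    struct example_netns_core {
        int sysctl_somaxconn;
    #ifdef CONFIG_PROC_FS
        struct prot_inuse __percpu *prot_inuse;
    #endif
    };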
-
Tonghao Zhang authored
In some cases, we want to know how many sockets are in use in different net namespaces. It's a key resource metric. This patch adds a member to struct netns_core: a counter for sockets in use in that net namespace. The patch increments/decrements the counter in sk_alloc, sk_clone_lock and __sk_free. It does not count sockets created in the kernel, since it is not very useful for userspace to know how many kernel sockets we created. The main reasons for doing it this way are: 1. When Linux calls 'do_exit' for a process, 'exit_task_namespaces' and 'exit_task_work' are called sequentially. 'exit_task_namespaces' may have destroyed the net namespace, but 'sock_release', called in 'exit_task_work', would still use the net namespace if we counted socket-inuse in sock_release. 2. socket and sock come in pairs and, more importantly, sock holds the net namespace. We count socket-inuse in sock to avoid holding the net namespace again in socket; it is an easier way to maintain the code. Signed-off-by: Martin Zhang <zhangjunweimartin@didichuxing.com> Signed-off-by: Tonghao Zhang <zhangtonghao@didichuxing.com> Signed-off-by: David S. Miller <davem@davemloft.net>
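Conceptually, the accounting amounts to bumping a per-netns counter when a userspace sock is created and dropping it when the sock is freed; a hedged sketch (the helper name is invented, not the symbol the patch adds):

    #include <net/net_namespace.h>
    #include <net/sock.h>

    /* Hedged sketch of the counting rule only.  example_sock_inuse_add()
     * stands in for the per-netns counter helper added to struct
     * netns_core; it is not the real symbol name.
     */
    static void example_count_new_sock(struct net *net, struct sock *sk)
    {
        /* kernel sockets (sk_net_refcnt == 0) are deliberately not counted */
        if (sk->sk_net_refcnt)
            example_sock_inuse_add(net, 1);  /* and -1 on __sk_free() */
    }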
-
Tonghao Zhang authored
Changing the member name makes the code more readable. The renamed member will be used in the next patch. Signed-off-by: Martin Zhang <zhangjunweimartin@didichuxing.com> Signed-off-by: Tonghao Zhang <zhangtonghao@didichuxing.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 18 Dec, 2017 25 commits
-
-
Bjorn Helgaas authored
Simplify PCIe Completion Timeout setting by using the pcie_capability_clear_and_set_word() interface. No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
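The helper collapses the usual read-modify-write into one call; an illustrative use (the capability register and mask are the standard Completion Timeout bits in Device Control 2, while the wrapper function and the caller-supplied value are made up for the example):

    #include <linux/pci.h>

    /* Sketch: set the Completion Timeout field in a single call instead
     * of an open-coded config-space read/modify/write.
     */
    static int example_set_completion_timeout(struct pci_dev *pdev, u16 range)
    {
        return pcie_capability_clear_and_set_word(pdev, PCI_EXP_DEVCTL2,
                                                  PCI_EXP_DEVCTL2_COMP_TIMEOUT,
                                                  range);
    }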
-
David S. Miller authored
William Tu says: ==================== net: erspan: a couple fixes Haishuang Yan reports a couple of issues (wrong return value, pskb_may_pull) on erspan v1. Since erspan v2 is in net-next, this series fixes the similar issues on v2. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
William Tu authored
pskb_may_pull() can change skb->data, so we need to re-load pkt_md and ershdr at the right place. Fixes: 94d7d8f2 ("ip6_gre: add erspan v2 support") Fixes: f551c91d ("net: erspan: introduce erspan v2 for ip_gre") Signed-off-by: William Tu <u9012063@gmail.com> Cc: Haishuang Yan <yanhaishuang@cmss.chinamobile.com> Signed-off-by: David S. Miller <davem@davemloft.net>
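The pattern being fixed looks roughly like this; a sketch of the ordering, not the exact erspan receive path:

    #include <linux/skbuff.h>
    #include <net/erspan.h>

    /* Sketch of the ordering fix only, not the real receive handler. */
    static int example_parse_erspan(struct sk_buff *skb, int gre_hdr_len)
    {
        struct erspan_base_hdr *ershdr;
        struct erspan_metadata *pkt_md;

        if (unlikely(!pskb_may_pull(skb, gre_hdr_len + sizeof(*ershdr))))
            return -ENOMEM;

        /* pskb_may_pull() may have reallocated the header, so compute
         * both pointers only after it succeeds, from the new skb->data.
         */
        ershdr = (struct erspan_base_hdr *)(skb->data + gre_hdr_len);
        pkt_md = (struct erspan_metadata *)(ershdr + 1);
        (void)pkt_md;  /* used further down in the real code */

        return 0;
    }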
-
William Tu authored
If pskb_may_pull() fails, return PACKET_REJECT instead of -ENOMEM. Fixes: 94d7d8f2 ("ip6_gre: add erspan v2 support") Fixes: f551c91d ("net: erspan: introduce erspan v2 for ip_gre") Signed-off-by: William Tu <u9012063@gmail.com> Cc: Haishuang Yan <yanhaishuang@cmss.chinamobile.com> Acked-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Russell King says: ==================== More SFP/phylink fixes This series fixes a few more bits with sfp/phylink, particularly confusion with the right way to test for the RTNL mutex being held, a change in 2016 to the mdiobus_scan() behaviour that wasn't noticed, and a fix for reading module EEPROMs. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Russell King authored
Use ASSERT_RTNL() rather than WARN_ON(!lockdep_rtnl_is_held()) which stops working when lockdep fires, and we end up with lots of warnings. Fixes: 9525ae83 ("phylink: add phylink infrastructure") Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
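The substitution itself is a one-liner; as a sketch:

    #include <linux/rtnetlink.h>

    static void example_must_hold_rtnl(void)
    {
        /* before: depends on lockdep state, which is no longer
         * trustworthy once lockdep has fired elsewhere
         */
        WARN_ON(!lockdep_rtnl_is_held());

        /* after: the standard assertion helper */
        ASSERT_RTNL();
    }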
-
Russell King authored
The EEPROM reading was trying to read from the second EEPROM address if we requested the last byte from the SFF8079 EEPROM, which caused a failure when the second EEPROM is not present. Discovered with an S-RJ01 SFP module. Fix this. Fixes: 73970055 ("sfp: add SFP module support") Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Russell King authored
The detection of a PHY changed in commit e98a3aab ("mdio_bus: don't return NULL from mdiobus_scan()"), which now causes sfp to print an error message. Update for this change. Fixes: 73970055 ("sfp: add SFP module support") Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
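In caller terms, the behaviour change means a missing PHY now shows up as an error pointer rather than NULL; a hedged sketch of the adjusted check, not the actual sfp.c hunk:

    #include <linux/err.h>
    #include <linux/errno.h>
    #include <linux/phy.h>

    /* Sketch: with the newer mdiobus_scan() behaviour, "no PHY at this
     * address" is reported as an error pointer (e.g. -ENODEV) instead
     * of NULL, so don't treat that case as a hard failure.
     */
    static struct phy_device *example_probe_phy(struct mii_bus *bus, int addr)
    {
        struct phy_device *phy = mdiobus_scan(bus, addr);

        if (IS_ERR(phy) && PTR_ERR(phy) == -ENODEV)
            return NULL;  /* nothing there: not an error */
        return phy;
    }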
-
Samuel Mendoza-Jonas authored
The current HNCDSC handler takes the status flag from the AEN packet and will update or change the current channel based on this flag and the current channel status. However, the flag from the HNCDSC packet merely represents the host link state. While the state of the host interface is potentially interesting information, it should not affect the state of the NCSI link. Indeed, the NCSI specification makes no mention of any recommended action related to the host network controller driver state. Update the HNCDSC handler to record the host network driver status but take no other action. Signed-off-by: Samuel Mendoza-Jonas <sam@mendozajonas.com> Acked-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: David S. Miller <davem@davemloft.net>
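The resulting handler is essentially "record and return"; a heavily hedged sketch with invented type and field names (the real definitions live under net/ncsi):

    #include <linux/types.h>
    #include <asm/byteorder.h>

    /* Illustrative types only, not the real net/ncsi definitions. */
    struct example_hncdsc_pkt { __be32 status; };
    struct example_ncsi_channel { bool host_carrier_up; };

    static int example_handle_hncdsc(struct example_ncsi_channel *nc,
                                     const struct example_hncdsc_pkt *pkt)
    {
        /* Record what the host network driver reported... */
        nc->host_carrier_up = ntohl(pkt->status) & 0x1;

        /* ...but do not reconfigure or fail over the NCSI channel. */
        return 0;
    }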
-
David S. Miller authored
Jerome Brunet says: ==================== net: phy: meson-gxl: clean-up and improvements This patchset adds defines for the control registers and helpers to access the banked registers, the goal being to make it easier to understand what the driver actually does. Then the CONFIG_A6 setting is removed, since this statement was without effect. Finally, interrupt support is added, speeding things up a little. This series has been tested on the libretech-cc and khadas VIM. Changes since v2 [0]: Drop the LPA corruption fix, which has been merged through net. Apart from this, the series remains the same. [0]: https://lkml.kernel.org/r/20171207142715.32578-1-jbrunet@baylibre.com ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jerome Brunet authored
Following the previous changes, join the other authors of this driver and take the blame with them. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jerome Brunet <jbrunet@baylibre.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jerome Brunet authored
Enable interrupt support in the meson-gxl PHY driver. Signed-off-by: Jerome Brunet <jbrunet@baylibre.com> Signed-off-by: David S. Miller <davem@davemloft.net>
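For a PHY driver of this vintage, interrupt support comes down to filling in the config_intr and ack_interrupt callbacks; a hedged sketch with placeholder register names, not the real meson-gxl registers:

    #include <linux/phy.h>

    /* Register/bit names below are placeholders. */
    #define EXAMPLE_INTR_MASK_REG   0x12
    #define EXAMPLE_INTR_SRC_REG    0x13
    #define EXAMPLE_INTR_ALL        0xffff

    static int example_gxl_config_intr(struct phy_device *phydev)
    {
        u16 val = (phydev->interrupts == PHY_INTERRUPT_ENABLED) ?
                  EXAMPLE_INTR_ALL : 0;

        return phy_write(phydev, EXAMPLE_INTR_MASK_REG, val);
    }

    static int example_gxl_ack_interrupt(struct phy_device *phydev)
    {
        /* reading the source register clears pending interrupts */
        int ret = phy_read(phydev, EXAMPLE_INTR_SRC_REG);

        return ret < 0 ? ret : 0;
    }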
-
Jerome Brunet authored
The PHY performs just as well when left in its default configuration, and it makes sense because this poke gets reset just after init. According to the documentation, all registers in the Analog/DSP bank are reset when there is a mode switch from 10BT to 100BT. The bank is also reset on power down and soft reset, so we will never see the value which may have been set by the bootloader. In the end, we have used the default configuration so far and there is no reason to change now. Remove the CONFIG_A6 poke to make this clear. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jerome Brunet <jbrunet@baylibre.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jerome Brunet authored
Use the generic init function to populate some of the phydev structure fields. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jerome Brunet <jbrunet@baylibre.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jerome Brunet authored
Add read and write helpers to manipulate banked registers on this PHY. This helps clarify the settings applied to these registers and what the driver actually does. Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Signed-off-by: Jerome Brunet <jbrunet@baylibre.com> Signed-off-by: David S. Miller <davem@davemloft.net>
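Such helpers typically follow a "select bank, access register, restore" shape; a hedged sketch with placeholder register numbers (the real bank-switch sequence on this PHY is more involved):

    #include <linux/phy.h>

    /* Placeholder register numbers, not the real meson-gxl ones. */
    #define EXAMPLE_BANK_SEL_REG    0x1f
    #define EXAMPLE_BANK_STD        0x0

    static int example_read_banked(struct phy_device *phydev, u16 bank,
                                   u32 regnum)
    {
        int ret;

        /* switch to the requested bank */
        ret = phy_write(phydev, EXAMPLE_BANK_SEL_REG, bank);
        if (ret)
            return ret;

        ret = phy_read(phydev, regnum);

        /* switch back to the standard bank regardless of the result */
        phy_write(phydev, EXAMPLE_BANK_SEL_REG, EXAMPLE_BANK_STD);
        return ret;
    }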
-
Jerome Brunet authored
Define registers and bits in the meson-gxl PHY driver to make it a bit more human-friendly. No functional change. Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Signed-off-by: Jerome Brunet <jbrunet@baylibre.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jerome Brunet authored
Always check phy_write return values; better to be safe than sorry. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jerome Brunet <jbrunet@baylibre.com> Signed-off-by: David S. Miller <davem@davemloft.net>
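That is, the usual early-return pattern (sketch with placeholder registers):

    #include <linux/phy.h>

    static int example_apply_settings(struct phy_device *phydev)
    {
        int ret;

        /* propagate phy_write() errors instead of discarding them */
        ret = phy_write(phydev, 0x17 /* placeholder register */, 0x0);
        if (ret)
            return ret;

        return phy_write(phydev, 0x18 /* placeholder register */, 0x1);
    }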
-
David S. Miller authored
Edward Cree says: ==================== sfc: Initial X2000-series (Medford2) support Basic PCI-level changes to support X2000-series NICs. Also fix unexpected-PTP-event log messages, since the timestamp format has been changed in these NICs and that causes us to fail to probe PTP (but we still get the PPS events). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Bert Kenward authored
The timer mode register now has a separate field for the reload value. Since we always use this timer with the reload (for interrupt moderation), we set it to the same as the initial value. Previous hardware ignores this field, so we can safely set these bits on all hardware that uses this register. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Bert Kenward authored
The RX_L4_CLASS field has shrunk from 3 bits to 2 bits. The upper bit was never used in previous hardware, so we can use the new definition throughout. The TSO OUTER_IPID field was previously spelt differently from the external definitions. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Edward Cree authored
Log a message if PTP probing fails; if we then, unexpectedly, get PTP events, only log a message for the first one on each device. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
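The "only log the first one on each device" behaviour is the usual one-shot flag pattern; a sketch with an invented per-device flag, not a real struct efx_nic member:

    #include <linux/netdevice.h>

    struct example_nic {
        struct net_device *net_dev;
        bool ptp_warned;    /* invented field for this sketch */
    };

    static void example_unexpected_ptp_event(struct example_nic *nic)
    {
        if (nic->ptp_warned)
            return;

        nic->ptp_warned = true;
        netdev_warn(nic->net_dev,
                    "PTP event received but PTP was not probed\n");
    }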
-
Edward Cree authored
Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Edward Cree authored
Medford2 can also have 16k or 64k VI stride. This is reported by MCDI in GET_CAPABILITIES, which fortunately is called before the driver does anything sensitive to the VI stride (such as accessing or even allocating VIs past the zeroth). Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Edward Cree authored
Support using BAR 0 on SFC9250, even though the driver doesn't bind to such devices yet. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
David S. Miller authored
Daniel Borkmann says: ==================== pull-request: bpf-next 2017-12-18 The following pull-request contains BPF updates for your *net-next* tree. The main changes are:

1) Allow arbitrary function calls from one BPF function to another BPF function. As of today, when writing BPF programs, __always_inline had to be used in the BPF C programs for all functions, unnecessarily causing LLVM to inflate code size. Handle this more naturally with support for BPF to BPF calls such that this __always_inline restriction can be overcome. As a result, it allows for better optimized code and finally makes it possible to introduce core BPF libraries in the future that can be reused out of different projects. x86 and arm64 JIT support was added as well, from Alexei.

2) Add infrastructure for tagging functions as error injectable and allow BPF to return arbitrary error values when BPF is attached via kprobes on those. This way of injecting errors generically eases testing and debugging without having to recompile or restart the kernel. Tags for opting in to this facility are added with BPF_ALLOW_ERROR_INJECTION(), from Josef.

3) For BPF offload via the nfp JIT, add support for the bpf_xdp_adjust_head() helper call for XDP programs. The first part of this work adds handling of BPF capabilities included in the firmware, and the later patches add support to the nfp verifier part and JIT as well as some small optimizations, from Jakub.

4) The bpftool now also gets support for basic cgroup BPF operations such as attaching, detaching and listing current BPF programs. As a requirement for the attach part, bpftool can now also load object files through 'bpftool prog load'. This reuses libbpf, which we have in the kernel tree as well. A bpftool-cgroup man page is added along with it, from Roman.

5) Back then, commit e87c6bc3 ("bpf: permit multiple bpf attachments for a single perf event") added support for attaching multiple BPF programs to a single perf event. Given they are configured through perf's ioctl() interface, the interface has been extended with a PERF_EVENT_IOC_QUERY_BPF command in this work in order to return an array of one or multiple BPF prog ids that are currently attached, from Yonghong.

6) Various minor fixes and cleanups to the bpftool's Makefile as well as new 'uninstall' and 'doc-uninstall' targets for removing bpftool itself or previously installed documentation related to it, from Quentin.

7) Add CONFIG_CGROUP_BPF=y to the BPF kernel selftest config file, which is required for the test_dev_cgroup test case to run, from Naresh.

8) Fix reporting of XDP prog_flags for the nfp driver, from Jakub.

9) Fix libbpf's exit code from the Makefile when libelf was not found on the system, also from Jakub.
==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 17 Dec, 2017 8 commits
-
-
Josef Bacik authored
Things got moved around between the original bpf_override_return patches and the final version, and now the ftrace kprobe dispatcher assumes that if you modified the ip you also enabled preemption. Add a comment about this and enable preemption; this fixes the lockdep splat that happened when using this feature. Fixes: 9802d865 ("bpf: add a bpf_override_function helper") Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
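The gist of the fix, heavily paraphrased rather than the exact trace_kprobe.c hunk:

    #include <linux/ptrace.h>
    #include <linux/preempt.h>

    /* Heavily paraphrased sketch of the idea, not the upstream diff. */
    static void example_after_bpf_dispatch(struct pt_regs *regs,
                                           unsigned long orig_ip)
    {
        /*
         * If a BPF program used bpf_override_return(), it changed
         * regs->ip and left preemption disabled; re-enable it here
         * instead of falling through to the normal post-handling.
         */
        if (orig_ip != instruction_pointer(regs))
            preempt_enable();
    }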
-
Jakub Kicinski authored
netdev_bpf.flags is the input member for installing the program. netdev_bpf.prog_flags is the output member for querying. Set the correct one on query. Fixes: 92f0292b ("net: xdp: report flags program was installed with on query") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
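The fix amounts to writing the query result into the output member; a minimal sketch of the flag handling in an XDP query path, not the actual nfp code:

    #include <linux/netdevice.h>

    /* Sketch: on query, report via prog_flags (output); xdp->flags is
     * only meaningful as input when installing a program.
     */
    static void example_fill_xdp_query(struct netdev_bpf *xdp,
                                       u32 installed_flags)
    {
        xdp->prog_flags = installed_flags;
    }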
-
Jakub Kicinski authored
/bin/sh's exit does not recognize -1 as a number, leading to the following error message:

    /bin/sh: 1: exit: Illegal number: -1

Use 1 as the exit code. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Daniel Borkmann authored
Alexei Starovoitov says: ==================== First of all, a huge thank you to Daniel, John, Jakub, Edward and others who reviewed multiple iterations of this patch set over the last many months, and to Dave and others who gave critical feedback during netconf/netdev. The patch set is solid enough and we thought through numerous corner cases, but it's not the end. More followups with code reorg and features to follow.

TLDR: Allow arbitrary function calls from one bpf function to another bpf function. Since the beginning of bpf, all bpf programs were represented as a single function and program authors were forced to use always_inline for all functions in their C code. That was causing llvm to unnecessarily inflate the code size and forcing developers to move code to header files with little code reuse. With a bit of additional complexity, teach the verifier to recognize arbitrary function calls from one bpf function to another as long as all of the functions are presented to the verifier as a single bpf program.

Extended program layout:

    ..
    r1 = ..    // arg1
    r2 = ..    // arg2
    call pc+1  // function call pc-relative
    exit
    .. = r1    // access arg1
    .. = r2    // access arg2
    ..
    call pc+20 // second level of function call
    ...

It allows for better optimized code and finally allows introducing core bpf libraries that can be reused in different projects, since programs are no longer limited by a single elf file. With function calls, bpf can be compiled into multiple .o files.

This patch set is the first step. It detects programs that contain multiple functions and checks that calls between them are valid. It splits the sequence of bpf instructions (one program) into a set of bpf functions that call each other. Only calls to known functions are allowed. Since all functions are presented to the verifier at once, conceptually it is 'static linking'.

Future plans:
- introduce BPF_PROG_TYPE_LIBRARY and allow a set of bpf functions to be loaded into the kernel that can later be linked to other programs with concrete program types. Aka 'dynamic linking'.
- introduce function pointer types and indirect calls to allow bpf functions to call other dynamically loaded bpf functions while the caller bpf function is already executing. Aka 'runtime linking'. This will be a more generic and more flexible alternative to bpf_tail_calls.

FAQ:
Q: Do the interpreter and JIT changes mean that a new instruction is introduced?
A: No. The call instruction technically stays the same. Now it can call both kernel helpers and other bpf functions. The calling convention stays the same as well. From the uapi point of view, the call insn got a new 'relocation' BPF_PSEUDO_CALL, similar to the BPF_PSEUDO_MAP_FD 'relocation' of the bpf_ldimm64 insn.

Q: What had to change on the LLVM side?
A: A trivial LLVM patch to allow calls was applied to the upcoming 6.0 release: https://reviews.llvm.org/rL318614 with a few bugfixes as well. Make sure to build the latest llvm to have bpf_call support.

More details in the patches. ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
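From the program author's point of view, this means helper functions in BPF C programs no longer have to be __always_inline; a minimal sketch of what that looks like (assumes an LLVM new enough to emit bpf-to-bpf calls, as noted above; the program logic itself is purely illustrative):

    #include <linux/bpf.h>

    /* A real (non-inlined) helper: with bpf-to-bpf call support it is
     * compiled as a separate BPF function and invoked via 'call'.
     */
    static __attribute__((noinline)) int big_enough(void *data, void *data_end)
    {
        return data_end - data > 64;    /* stand-in for shared logic */
    }

    __attribute__((section("xdp"), used))
    int xdp_example(struct xdp_md *ctx)
    {
        void *data = (void *)(long)ctx->data;
        void *data_end = (void *)(long)ctx->data_end;

        return big_enough(data, data_end) ? XDP_PASS : XDP_DROP;
    }

    char _license[] __attribute__((section("license"), used)) = "GPL";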
-
Daniel Borkmann authored
Add some additional checks for a few more corner cases. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Alexei Starovoitov authored
Similar to x64, add support for bpf-to-bpf calls. When a program has calls to in-kernel helpers, the target call offset is known at JIT time and the arm64 architecture needs 2 passes. With bpf-to-bpf calls the dynamically allocated function start is unknown until all functions of the program are JITed. Therefore (just like x64) the arm64 JIT needs one extra pass over the program to emit correct call offsets. Implementation detail: avoid being too clever in 64-bit immediate moves and always use 4 instructions (instead of 3-4 depending on the address) to make sure only one extra pass is needed. If some future optimization would make it worthwhile to optimize 'call 64-bit imm' further, the JIT would need to do 4 passes over the program instead of 3 as in this patch. For a typical bpf program address the mov needs 3 or 4 insns, so using 4 insns unconditionally to save an extra pass is a worthy trade-off at this state of the JIT. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Alexei Starovoitov authored
A typical JIT does several passes over the bpf instructions to compute total size and relative offsets of jumps and calls. With multiple bpf functions calling each other, all relative calls will have invalid offsets initially, therefore we need an additional last pass over the program to emit calls with correct offsets. For example, in the case of three bpf functions:

    main:
        call foo
        call bpf_map_lookup
        exit
    foo:
        call bar
        exit
    bar:
        exit

We will call bpf_int_jit_compile() independently for main(), foo() and bar(). The x64 JIT typically does 4-5 passes to converge. After these initial passes the image for these 3 functions will be good except for the call targets, since the start addresses of foo() and bar() are unknown when we were JITing main() (note that the call to bpf_map_lookup will be resolved properly during the initial passes). Once the start addresses of the 3 functions are known, we patch call_insn->imm to point to the right functions and call bpf_int_jit_compile() again, which needs only one pass. Additional safety checks are done to make sure this last pass doesn't produce an image that is larger or smaller than the previous pass. When constant blinding is on, it's applied to all functions at the first pass, since doing it once again at the last pass can change the size of the JITed code. Tested on x64 and arm64 hw with JIT on/off, blinding on/off. x64 jits bpf-to-bpf calls correctly while arm64 falls back to the interpreter. All other JITs that support normal BPF_CALL will behave the same way, since a bpf-to-bpf call is equivalent to a bpf-to-kernel call from the JITs' point of view. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Alexei Starovoitov authored
The global bpf_jit_enable variable is tested multiple times in the JITs, blinding and verifier core. A malicious root user can try to toggle it while programs are being loaded. This race condition was accounted for and there should be no issues, but it's safer to avoid the race condition altogether. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
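The defensive pattern is to sample the knob once and use that snapshot throughout; a sketch of the idea, not the exact upstream hunk:

    #include <linux/compiler.h>
    #include <linux/filter.h>

    /* Sketch: read the sysctl-controlled knob once so a concurrent
     * toggle cannot change the decision halfway through JITing or
     * blinding a program.
     */
    static bool example_jit_requested(void)
    {
        return READ_ONCE(bpf_jit_enable) > 0;
    }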
-