Commits · 01c15e93a78cfcf45cc32d07aa38bdc84250f569 · Kirill Smelkov / linux

21 Jan, 2018 4 commits

nfp: flower: prioritize stats updates · 01c15e93

Pieter Jansen van Vuuren authored Jan 19, 2018

Previously it was possible to interrupt processing stats updates because
they were handled in a work queue. Interrupting the stats updates could
lead to a situation where we back up the control message queue. This patch
moves the stats update processing out of the work queue to be processed as
soon as hardware sends a request.
Reported-by: Louis Peens <louis.peens@netronome.com>
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

01c15e93

net: gemini: Depend on HAS_IOMEM · d83bb0be

Linus Walleij authored Jan 21, 2018

The zeroday builder notices that since Usermode Linux does not
have IO memory, the build fails for them when selecting everything
it can enable.

As the driver is clearly using memory-mapped registers to access
the network adapter, we add depends on HAS_IOMEM to solve this
problem.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

d83bb0be

Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next · cbcbeedb

David S. Miller authored Jan 21, 2018

Pablo Neira Ayuso says:

====================
Netfilter/IPVS updates for net-next

The following patchset contains Netfilter/IPVS updates for your net-next
tree. Basically, a new extension for ip6tables, simplification work of
nf_tables that saves us 500 LoC, allow raw table registration before
defragmentation, conversion of the SNMP helper to use the ASN.1 code
generator, unique 64-bit handle for all nf_tables objects and fixes to
address fallout from previous nf-next batch.  More specifically, they
are:

1) Seven patches to remove family abstraction layer (struct nft_af_info)
   in nf_tables, this simplifies our codebase and it saves us 64 bytes per
   net namespace.

2) Add IPv6 segment routing header matching for ip6tables, from Ahmed
   Abdelsalam.

3) Allow to register iptable_raw table before defragmentation, some
   people do not want to waste cycles on defragmenting traffic that is
   going to be dropped, hence add a new module parameter to enable this
   behaviour in iptables and ip6tables. From Subash Abhinov
   Kasiviswanathan. This patch needed a couple of follow up patches to
   get things tidy from Arnd Bergmann.

4) SNMP helper uses the ASN.1 code generator, from Taehee Yoo. Several
   patches for this helper to prepare this change are also part of this
   patch series.

5) Add 64-bit handles to uniquely identify objects in nf_tables, from Harsha
   Sharma.

6) Remove log message that several netfilter subsystems print at
   boot/load time.

7) Restore x_tables module autoloading, that got broken in a previous
   patch to allow singleton NAT hook callback registration per hook
   spot, from Florian Westphal. Moreover, return EBUSY to report that
   the singleton NAT hook slot is already in use instead.

8) Several fixes for the new nf_tables flowtable representation,
   including incorrect error check after nf_tables_flowtable_lookup(),
   missing Kconfig dependencies that lead to build breakage and missing
   initialization of priority and hooknum in flowtable object.

9) Missing NETFILTER_FAMILY_ARP dependency in Kconfig for the clusterip
   target. This is due to recent updates in the core to shrink the hook
   array size and compile it out if no specific family is enabled via
   .config file. Patch from Florian Westphal.

10) Remove duplicated include header files, from Wei Yongjun.

11) Sparse warning fix for the NFPROTO_INET handling from the core
    due to missing static function definition, also from Wei Yongjun.

12) Restore ICMPv6 Parameter Problem error reporting when
    defragmentation fails, from Subash Abhinov Kasiviswanathan.

13) Remove obsolete owner field initialization from struct
    file_operations, patch from Alexey Dobriyan.

14) Use boolean datatype where needed in the Netfilter codebase, from
    Gustavo A. R. Silva.

15) Remove double semicolon in dynset nf_tables expression, from
    Luis de Bethencourt.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

cbcbeedb

Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next · ea9722e2

David S. Miller authored Jan 20, 2018

Alexei Starovoitov says:

====================
pull-request: bpf-next 2018-01-19

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) bpf array map HW offload, from Jakub.

2) support for bpf_get_next_key() for LPM map, from Yonghong.

3) test_verifier now runs loaded programs, from Alexei.

4) xdp cpumap monitoring, from Jesper.

5) variety of tests, cleanups and small x64 JIT optimization, from Daniel.

6) user space can now retrieve HW JITed program, from Jiong.

Note there is a minor conflict between Russell's arm32 JIT fixes
and removal of bpf_jit_enable variable by Daniel which should
be resolved by keeping Russell's comment and removing that variable.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

ea9722e2

20 Jan, 2018 12 commits

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 8565d26b

David S. Miller authored Jan 19, 2018

The BPF verifier conflict was some minor contextual issue.

The TUN conflict was less trivial.  Cong Wang fixed a memory leak of
tfile->tx_array in 'net'.  This is an skb_array.  But meanwhile in
net-next tun changed tfile->tx_array into tfile->tx_ring which is a
ptr_ring.
Signed-off-by: David S. Miller <davem@davemloft.net>

8565d26b

Merge branch 'bpf-misc-improvements' · 1391040b

Alexei Starovoitov authored Jan 19, 2018

Daniel Borkmann says:

====================
This series adds various misc improvements to BPF: detection
of BPF helper definition misconfiguration for mem/size argument
pairs, csum_diff helper also for XDP, various test cases,
removal of the recently added pure_initcall(), restriction
of the jit sysctls to cap_sys_admin for initns, a minor size
improvement for x86 jit in alu ops, output of complexity limit
to verifier log and last but not least having the event output
more flexible with moving to const_size_or_zero type.

Thanks!
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

1391040b

bpf: move event_output to const_size_or_zero for xdp/skb as well · 1728a4f2

Daniel Borkmann authored Jan 20, 2018

Similar rationale as in a60dd35d ("bpf: change bpf_perf_event_output
arg5 type to ARG_CONST_SIZE_OR_ZERO"), change the type to CONST_SIZE_OR_ZERO
such that we can better deal with optimized code. No changes needed in
bpf_event_output() as it can also deal with 0 size entirely (e.g. as only
wake-up signal with empty frame in perf RB, or packet dumps w/o meta data
as another such possibility).
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

1728a4f2

bpf: add upper complexity limit to verifier log · 4bd95f4b

Daniel Borkmann authored Jan 20, 2018

Given the limit could potentially get further adjustments in the
future, add it to the log so it becomes obvious what the current
limit is w/o having to check the source first. This may also be
helpful for debugging complexity related issues on kernels that
backport from upstream.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

4bd95f4b

bpf, x86: small optimization in alu ops with imm · de0a444d

Daniel Borkmann authored Jan 20, 2018

For the BPF_REG_0 (BPF_REG_A in cBPF, respectively), we can use
the short form of the opcode as dst mapping is on eax/rax and
thus save a byte per such operation. Added to add/sub/and/or/xor
for 32/64 bit when K immediate is used. There may be more such
low-hanging fruit to add in future as well.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

de0a444d
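The byte saving described in the commit above can be illustrated with a small encoder. This is only a sketch of the x86 instruction encodings involved, not the kernel JIT code: the function name `encode_alu_imm` and the restriction to registers 0..7 are assumptions made for illustration. It picks the sign-extended imm8 form when the immediate fits in one byte, otherwise the one-byte accumulator opcode for eax/rax, and only falls back to the generic `0x81 /digit` form for other registers.

```python
import struct

# /digit opcode extensions used by the generic 0x81 (imm32) and 0x83 (imm8) forms.
ALU_EXT = {"add": 0, "or": 1, "and": 4, "sub": 5, "xor": 6}
# Accumulator-only opcodes: "op eax, imm32" (or rax with a REX.W prefix).
ALU_ACC_OPCODE = {"add": 0x05, "or": 0x0D, "and": 0x25, "sub": 0x2D, "xor": 0x35}

REX_W = 0x48  # selects 64-bit operand size
EAX = 0       # register number of eax/rax


def encode_alu_imm(op, reg, imm, is64=False):
    """Encode `op reg, imm` for a register numbered 0..7 (hypothetical helper)."""
    if not 0 <= reg <= 7:
        raise ValueError("this sketch does not handle r8..r15 (no REX.B)")
    out = bytearray([REX_W] if is64 else [])
    modrm = 0xC0 | (ALU_EXT[op] << 3) | reg  # mod=11: register-direct operand
    if -128 <= imm <= 127:
        # Sign-extended 8-bit immediate: shortest form for any register.
        out += bytes([0x83, modrm]) + struct.pack("<b", imm)
    elif reg == EAX:
        # Short accumulator form: no ModRM byte, saving one byte.
        out += bytes([ALU_ACC_OPCODE[op]]) + struct.pack("<i", imm)
    else:
        # Generic form: opcode 0x81, ModRM, 32-bit immediate.
        out += bytes([0x81, modrm]) + struct.pack("<i", imm)
    return bytes(out)


if __name__ == "__main__":
    print(encode_alu_imm("add", EAX, 0x12345678).hex())  # accumulator form
    print(encode_alu_imm("add", 1, 0x12345678).hex())    # generic form on ecx
```

With a 32-bit immediate, the accumulator form is one byte shorter than the generic form in both the 32-bit and the 64-bit case, which matches the "save a byte per such operation" claim.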

bpf: restrict access to core bpf sysctls · 2e4a3098

Daniel Borkmann authored Jan 20, 2018

Given BPF reaches far beyond just networking these days, it was
never intended to allow setting and in some cases reading those
knobs out of a user namespace root running without CAP_SYS_ADMIN,
thus tighten such access.

Also the bpf_jit_enable = 2 debugging mode should only be allowed
if kptr_restrict is not set since it otherwise can leak addresses
to the kernel log. Dump a note to the kernel log that this is for
debugging JITs only when enabled.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

2e4a3098

bpf: get rid of pure_initcall dependency to enable jits · fa9dd599

Daniel Borkmann authored Jan 20, 2018

Having a pure_initcall() callback just to permanently enable BPF
JITs under CONFIG_BPF_JIT_ALWAYS_ON is unnecessary and could leave
a small race window in future where JIT is still disabled on boot.
Since we know about the setting at compilation time anyway, just
initialize it properly there. Also consolidate all the individual
bpf_jit_enable variables into a single one and move them under one
location. Moreover, don't allow for setting unspecified garbage
values on them.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

fa9dd599

bpf: add couple of test cases for div/mod by zero · 87c1793b

Daniel Borkmann authored Jan 20, 2018

Add couple of missing test cases for eBPF div/mod by zero to the
new test_verifier prog runtime feature. Also one for an empty prog
and only exit.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

87c1793b

bpf: add couple of test cases for signed extended imms · fcd1c917

Daniel Borkmann authored Jan 20, 2018

Add a couple of test cases for interpreter and JIT that are
related to an issue we faced some time ago in Cilium [1],
which is fixed in LLVM with commit e53750e1e086 ("bpf: fix
bug on silently truncating 64-bit immediate").

Test cases were run-time checking kernel to behave as intended
which should also provide some guidance for current or new
JITs in case they should trip over this. Added for cBPF and
eBPF.

  [1] https://github.com/cilium/cilium/pull/2162
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

fcd1c917

bpf: add csum_diff helper to xdp as well · 205c3807

Daniel Borkmann authored Jan 20, 2018

Useful for porting cls_bpf programs w/o increasing program
complexity limits much at the same time, so add the helper
to XDP as well.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

205c3807

bpf, verifier: detect misconfigured mem, size argument pair · 90133415

Daniel Borkmann authored Jan 20, 2018

I've seen two patch proposals now for helper additions that used
ARG_PTR_TO_MEM or similar in reg_X but no corresponding ARG_CONST_SIZE
in reg_X+1. Verifier won't complain in such case, but it will omit
verifying the memory passed to the helper thus ending up badly.
Detect such buggy helper function signature and bail out during
verification rather than finding them through review.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

90133415
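The signature check described in the commit above can be sketched as follows. This is an illustrative model, not the verifier code: the enum mirrors a few kernel argument type names, while `arg_pairs_ok` and the grouping into `MEM_ARGS` and `SIZE_ARGS` are hypothetical. The commit message only describes the "memory pointer without a following size" case; rejecting a size argument that has no preceding memory pointer is an extra assumption added here for symmetry.

```python
from enum import Enum, auto


class ArgType(Enum):
    ARG_ANYTHING = auto()
    ARG_PTR_TO_CTX = auto()
    ARG_PTR_TO_MEM = auto()
    ARG_PTR_TO_MEM_OR_NULL = auto()
    ARG_PTR_TO_UNINIT_MEM = auto()
    ARG_CONST_SIZE = auto()
    ARG_CONST_SIZE_OR_ZERO = auto()


MEM_ARGS = {
    ArgType.ARG_PTR_TO_MEM,
    ArgType.ARG_PTR_TO_MEM_OR_NULL,
    ArgType.ARG_PTR_TO_UNINIT_MEM,
}
SIZE_ARGS = {ArgType.ARG_CONST_SIZE, ArgType.ARG_CONST_SIZE_OR_ZERO}


def arg_pairs_ok(args):
    """Return True if every memory pointer in reg_X has its size in reg_X+1."""
    for i, arg in enumerate(args):
        nxt = args[i + 1] if i + 1 < len(args) else None
        prev = args[i - 1] if i > 0 else None
        if arg in MEM_ARGS and nxt not in SIZE_ARGS:
            return False  # memory would go unverified: the buggy signature
        if arg in SIZE_ARGS and prev not in MEM_ARGS:
            return False  # assumption: a size with no memory is also rejected
    return True


if __name__ == "__main__":
    good = [ArgType.ARG_PTR_TO_CTX, ArgType.ARG_PTR_TO_MEM, ArgType.ARG_CONST_SIZE]
    bad = [ArgType.ARG_PTR_TO_CTX, ArgType.ARG_PTR_TO_MEM, ArgType.ARG_ANYTHING]
    print(arg_pairs_ok(good), arg_pairs_ok(bad))
```

Running such a check once per helper signature turns a review-time mistake into an immediate verification failure.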

samples/bpf: xdp_monitor include cpumap tracepoints in monitoring · 417f1d9f

Jesper Dangaard Brouer authored Jan 19, 2018

The xdp_redirect_cpu sample has some "builtin" monitoring of the
tracepoints for xdp_cpumap_*, but it is practical to have an external
tool that can monitor these tracepoints as an easy way to troubleshoot
an application using XDP + cpumap.

Specifically I need such an external tool when working on Suricata and
XDP cpumap redirect. Extend the xdp_monitor tool sample with
monitoring of these xdp_cpumap_* tracepoints.  Model the output format
like xdp_redirect_cpu.

Given I needed to handle per CPU decoding for cpumap, this patch also
adds per CPU info on the existing monitor events.  This resembles part
of the builtin monitoring output from sample xdp_rxq_info.  Thus, also
covering part of that sample in an external monitoring tool.

Performance wise, the cpumap tracepoints use bulking, which causes
them to have very little overhead.  Thus, they are enabled by default.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

417f1d9f

19 Jan, 2018 24 commits

Merge branch 'bpf-lpm-get-next-key' · 05526361

Daniel Borkmann authored Jan 19, 2018

Yonghong Song says:

====================
This patch set implements MAP_GET_NEXT_KEY command for LPM_TRIE map.
This command is really useful for key enumeration, and for key deletion
when the keys in the trie are unknown.

Patch #1 implements the functionality in the kernel and patch #2
adds a test case in tools/testing/selftests/bpf.
====================
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

05526361

tools/bpf: add a testcase for MAP_GET_NEXT_KEY command of LPM_TRIE map · 8c417dc1

Yonghong Song authored Jan 18, 2018

A test case is added in tools/testing/selftests/bpf/test_lpm_map.c
for MAP_GET_NEXT_KEY command. A four node trie, which
is described in kernel/bpf/lpm_trie.c, is built and the
MAP_GET_NEXT_KEY results are checked.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

8c417dc1

bpf: implement MAP_GET_NEXT_KEY command for LPM_TRIE map · b471f2f1

Yonghong Song authored Jan 18, 2018

Current LPM_TRIE map type does not implement MAP_GET_NEXT_KEY
command. This command is handy when users want to enumerate
keys. Otherwise, a different map which supports key
enumeration may be required to store the keys. If the
map data is sparse and all map data are to be deleted without
closing file descriptor, using MAP_GET_NEXT_KEY to find
all keys is much faster than enumerating all key space.

This patch implements MAP_GET_NEXT_KEY command for LPM_TRIE map.
If user provided key pointer is NULL or the key does not have
an exact match in the trie, the first key will be returned.
Otherwise, the next key will be returned.

In this implementation, key enumeration follows a postorder
traversal of the internal trie. More specific keys
will be returned before less specific ones, given
a sequence of MAP_GET_NEXT_KEY syscalls.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

b471f2f1
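The enumeration semantics described in the commit above can be modeled with a toy binary prefix trie. This is a simplified sketch, not the kernel's lpm_trie.c: the `PrefixTrie` class, its bit-string keys, and returning `None` after the last key are assumptions made for illustration. It shows the two rules from the commit message: a NULL or non-matching key yields the first key, and postorder traversal returns more specific prefixes before less specific ones.

```python
class Node:
    def __init__(self):
        self.children = [None, None]  # child for bit 0 and bit 1
        self.is_key = False           # True if a prefix ends at this node


class PrefixTrie:
    """Toy trie keyed by bit strings such as "10" (hypothetical model)."""

    def __init__(self):
        self.root = Node()

    def insert(self, prefix):
        node = self.root
        for bit in prefix:
            idx = int(bit)
            if node.children[idx] is None:
                node.children[idx] = Node()
            node = node.children[idx]
        node.is_key = True

    def _postorder(self, node, path):
        # Visit both subtrees before the node itself, so longer (more
        # specific) prefixes are produced before the prefix covering them.
        for idx, child in enumerate(node.children):
            if child is not None:
                yield from self._postorder(child, path + str(idx))
        if node.is_key:
            yield path

    def get_next_key(self, key=None):
        keys = list(self._postorder(self.root, ""))
        if not keys:
            return None
        if key is None or key not in keys:
            return keys[0]  # NULL key or no exact match: first key
        pos = keys.index(key)
        return keys[pos + 1] if pos + 1 < len(keys) else None


if __name__ == "__main__":
    trie = PrefixTrie()
    for prefix in ("1", "10", "11", "0"):
        trie.insert(prefix)
    key = trie.get_next_key()
    while key is not None:  # enumerate every key without knowing them upfront
        print(key)
        key = trie.get_next_key(key)
```

The loop at the bottom is the enumeration pattern the commit message has in mind: start with no key, then feed each returned key back in until the walk ends.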

selftests: bpf: update .gitignore with missing generated files · b7bcc0bb

Shuah Khan authored Jan 18, 2018

Update .gitignore with missing generated files.
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

b7bcc0bb

bpftool: recognize BPF_MAP_TYPE_CPUMAP maps · a55aaf6d

Roman Gushchin authored Jan 19, 2018

Add BPF_MAP_TYPE_CPUMAP map type to the list
of map types recognized by bpftool and define
corresponding text representation.
Signed-off-by: Roman Gushchin <guro@fb.com>
Cc: Quentin Monnet <quentin.monnet@netronome.com>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Acked-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

a55aaf6d

Merge branch 'dsa-mv88e6xxx-ATU-VTU-irq-fixes' · 85831e56

David S. Miller authored Jan 19, 2018

Andrew Lunn says:

====================
ATU and VTU irq fixes

Further testing and code review found two sets of bugs.

Code review found a cut/paste error in the irq setup code.

A board which does not have an interrupt line from the switch to the
SoC, and which experiences an EPROBE_DEFER, throws a splat when the ATU irq
was freed but never registered.

v2: Fix typ0 chip->chip->vtu_prob_irq to chip->vtu_prob_irq
    0-day compile testing.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

85831e56

net: dsa: mv88e6xxx: Free ATU/VTU irq only when there is chip irq · ae14cafc

Andrew Lunn authored Jan 18, 2018

We only register the ATU and VTU irq when we have a chip level IRQ.
In the error path, we should only attempt to remove the ATU and VTU
irq if we also have a chip level IRQ.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

ae14cafc

net: dsa: mv88e6xxx: Return error from irq_find_mapping() · 9b662a3e

Andrew Lunn authored Jan 18, 2018

Fix a cut/paste error. When irq_find_mapping() returns an error for
the ATU or VTU interrupt, return that error, not the value of
chip->device_irq.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

9b662a3e

Merge branch 'net-sched-cls-add-extack-support' · 7677fd01

David S. Miller authored Jan 19, 2018

Alexander Aring says:

====================
net: sched: cls: add extack support

This patch series adds extack support for the TC classifier subsystem. The
first patch fixes some code style issues for this patch series pointed out
by checkpatch. The other patches, up to the last one, prepare extack
handling for the TC classifier subsystem and handle generic extack
errors.

The last patch is an example for the u32 classifier that adds extack support
inside the delete and change callbacks. There is an init callback as
well, but most classifier implementations only run a kalloc() once to allocate
something, so it is not necessary _yet_ to add extack support there.

- Alex

Cc: David Ahern <dsahern@gmail.com>

changes since v3:
 - fix accidentally move of config option mismatch message in PATCH 2/8
   correct one is 4/8, detected by kbuildbot (Thank you)
 - Removed patch "net: sched: cls: add extack support for tc_setup_cb_call"
   PATCH 7/8 in version v2 as suggested by Jakub Kicinski (Thank you)
 - changed NL_SET_ERR_MSG to NL_SET_ERR_MSG_MOD as suggested by Jakub Kicinski
   in u32 cls (Thank You)
 - Removed text from cover letter that I was waiting for Jiri's Patches as
   detected by Jamal Hadi Salim (Thank you).

changes since v2:
 - rebased on Jiri's patches (Thank you)
 - several spelling fixes pointed out by Cong Wang (Thank you)
 - several spelling fixes pointed out by David Ahern (Thank you)
 - use David Ahern recommendation if config option is mismatch, but
   combine it with Cong Wang recommendation to put config name into it
   (Thank you)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

7677fd01

net: sched: cls_u32: add extack support · 4b981dbc

Alexander Aring authored Jan 18, 2018

This patch adds extack support for the u32 classifier as an example for
the delete and init callbacks.

Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Alexander Aring <aring@mojatatu.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4b981dbc

net: sched: cls: add extack support for tcf_change_indev · 1057c55f

Alexander Aring authored Jan 18, 2018

This patch adds extack handling for the tcf_change_indev function which
is commonly used by TC classifier implementations.

Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Alexander Aring <aring@mojatatu.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1057c55f

net: sched: cls: add extack support for delete callback · 571acf21

Alexander Aring authored Jan 18, 2018

This patch adds extack support for classifier delete callback api. This
prepares to handle extack support inside each specific classifier
implementation.

Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Alexander Aring <aring@mojatatu.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

571acf21

net: sched: cls: add extack support for tcf_exts_validate · 50a56190

Alexander Aring authored Jan 18, 2018

The tcf_exts_validate function calls the act api change callback. To
prepare extack support for the act api, this patch adds the extack as a
parameter for this function which is commonly used in cls implementations.

Furthermore the tcf_exts_validate will call action init callback which
prepares the TC action subsystem for extack support.

Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Alexander Aring <aring@mojatatu.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

50a56190

net: sched: cls: add extack support for change callback · 7306db38

Alexander Aring authored Jan 18, 2018

This patch adds extack support for classifier change callback api. This
prepares to handle extack support inside each specific classifier
implementation.

Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Alexander Aring <aring@mojatatu.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7306db38

net: sched: cls_api: handle generic cls errors · c35a4acc

Alexander Aring authored Jan 18, 2018

This patch adds extack support for generic cls handling. The extack
will be set deeper to each called function which is not part of netdev
core api.

Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Alexander Aring <aring@mojatatu.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c35a4acc

net: sched: cls: fix code style issues · 8865fdd4

Alexander Aring authored Jan 18, 2018

This patch changes some code style issues pointed out by checkpatch
inside the TC cls subsystem.
Signed-off-by: Alexander Aring <aring@mojatatu.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8865fdd4

mlxsw: spectrum: Upper-bound supported FW version · fd5204cd

Yuval Mintz authored Jan 18, 2018

During initialization the driver checks whether the flashed FW image
suits its requirements by checking that it's sufficiently new.
However, there's only a weak backward compatibility scheme that is
actually guaranteed by the FW, so driver must also upper bound the
version to prevent compatibility issues between current driver and some
possible future fw.
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fd5204cd

Merge branch 'nfp-devlink-capabilities-extensions-and-updates' · ea8de471

David S. Miller authored Jan 19, 2018

Jakub Kicinski says:

====================
nfp: devlink, capabilities extensions and updates

This series starts with an improvement to the usability of the device
memory accessors (CPP transactions).  Next few patches are devoted to
fixing the devlink locking.  After recent patches for mlxsw the locking
scheme of devlink ops has to be reworked.  Following patches improve
NFP code dealing with "representors", and expand the error message
printed when the driver has no support for loaded FW.

Second part of the series is focused on vNIC capabilities read from
vNIC control memory (often referred to as "BAR0" for historical reasons).
TLV capability format is established and immediately made use of.  The
next patches rework parsing of features for control vNIC which allows
apps to mask out features they don't want enabled.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

ea8de471

nfp: bpf: disable all ctrl vNIC capabilities · 81bd5ded

Jakub Kicinski authored Jan 17, 2018

BPF firmware currently exposes IRQ moderation capability.
The driver will make use of it by default, inserting 50 usec
delay to every control message exchange.  This cuts the number
of messages per second we can exchange by almost half.

None of the other capabilities make much sense for BPF control
vNIC, either.  Disable them all.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

81bd5ded

nfp: allow apps to disable ctrl vNIC capabilities · 78a0a65f

Jakub Kicinski authored Jan 17, 2018

Most vNIC capabilities are netdev related.  It makes no sense
to initialize them and waste FW resources.  Some are even
counter-productive, like IRQ moderation, which will slow
down exchange of control messages.

Add to nfp_app a mask of enabled control vNIC capabilities
for apps to use.  Make flower and BPF enable all capabilities
for now.  No functional changes.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

78a0a65f

nfp: split reading capabilities out of nfp_net_init() · 545bfa7a

Jakub Kicinski authored Jan 17, 2018

nfp_net_init() is a little long and we are about to add more
code to reading capabilities.  Move the capability reading,
parsing and validating out.  Only actual initialization
will stay in nfp_net_init().

No functional changes.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

545bfa7a

nfp: read mailbox address from TLV caps · 527d7d1b

Jakub Kicinski authored Jan 17, 2018

Allow specifying alternative vNIC mailbox location in TLV caps.
This way we can size the mailbox to the needs and not necessarily
waste 512B of ctrl memory space.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

527d7d1b

nfp: read ME frequency from vNIC ctrl memory · ce991ab6

Jakub Kicinski authored Jan 17, 2018

PCIe island clock frequency is used when converting coalescing
parameters from usecs to NFP timestamps.  Most chips don't run
at 1200MHz, allow FW to provide us with the real frequency.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ce991ab6
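The effect of assuming the wrong clock frequency can be shown with a short worked conversion. This is a sketch under stated assumptions rather than the driver code: the helper names are hypothetical, and the ratio of 16 clock cycles per timestamp tick is an assumption that the commit message does not state.

```python
ME_CYCLES_PER_TICK = 16      # assumption: clock cycles per NFP timestamp tick
DEFAULT_ME_FREQ_MHZ = 1200   # the fixed value the commit moves away from


def usecs_to_ticks(usecs, me_freq_mhz=DEFAULT_ME_FREQ_MHZ):
    """Convert a coalescing delay in microseconds to timestamp ticks."""
    return usecs * (me_freq_mhz // ME_CYCLES_PER_TICK)


def ticks_to_usecs(ticks, real_freq_mhz):
    """Delay actually observed when the chip runs at real_freq_mhz."""
    return ticks * ME_CYCLES_PER_TICK / real_freq_mhz


if __name__ == "__main__":
    real_mhz = 800  # hypothetical chip that does not run at 1200MHz
    wrong = usecs_to_ticks(50)            # driver assumes 1200MHz
    right = usecs_to_ticks(50, real_mhz)  # FW reports the real frequency
    print(ticks_to_usecs(wrong, real_mhz))
    print(ticks_to_usecs(right, real_mhz))
```

Under these assumptions, a 50 usec setting programmed with the 1200MHz default on a chip running at 800MHz produces a delay of 75 usec, while using the frequency reported by FW gives the requested 50 usec.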

nfp: add TLV capabilities to the BAR · 73a0329b

Jakub Kicinski authored Jan 17, 2018

NFP is entirely programmable, including the PCI data interface.
Using a fixed control BAR layout certainly makes implementations
easier, but requires careful consideration when space is allocated.
Once BAR area is allocated to one feature nothing else can use it.
Allocating space statically also requires it to be sized upfront,
which leads to either unnecessary limitation or wastage.

We currently have a 32bit capability word defined which tells drivers
which application FW features are supported.   Most of the bits
are exhausted.  The same bits are also reused for enabling specific
features.  Bulk of capabilities don't have a need for an enable bit,
however, leading to confusion and wastage.

TLVs seem like a better fit for expressing capabilities of applications
running on programmable hardware.

This patch leaves the front of the BAR as is, and declares a TLV
capability start at offset 0x58.  Most of the space up to 0x0d90
is already allocated, but the used space can be wrapped with RESERVED
TLVs.  E.g.:

Address    Type         Length
 0x0058    RESERVED      0xe00  /* Wrap basic structures */
 0x0e5c    FEATURE_A     0x004
 0x0e64    FEATURE_B     0x004
 0x0e6c    RESERVED      0x990  /* Wrap queue stats */
 0x1800    FEATURE_C     0x100
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

73a0329b
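The address column in the example table of the commit above follows from a simple rule: each TLV starts where the previous one ends. The sketch below reproduces those addresses by packing and walking a TLV chain. It is an illustration only: the 4-byte header is inferred from the table (0x0058 + 4 + 0xe00 = 0x0e5c), while the split into a 16-bit type and a 16-bit length, the numeric type codes, and the END terminator are assumptions, not the layout defined by the patch.

```python
import struct

TLV_HDR_LEN = 4  # inferred from the address arithmetic in the table

# Hypothetical type codes, chosen only for this example.
TLV_RESERVED = 1
TLV_END = 2
TLV_FEATURE_A = 0x10
TLV_FEATURE_B = 0x11
TLV_FEATURE_C = 0x12


def pack_tlv(tlv_type, length):
    """Header (type in the high 16 bits, length in the low 16) plus zeroed payload."""
    return struct.pack("<I", (tlv_type << 16) | length) + bytes(length)


def walk_tlvs(bar, start):
    """Yield (offset, type, length) for each TLV until END or the buffer ends."""
    offset = start
    while offset + TLV_HDR_LEN <= len(bar):
        (hdr,) = struct.unpack_from("<I", bar, offset)
        tlv_type, length = hdr >> 16, hdr & 0xFFFF
        yield offset, tlv_type, length
        if tlv_type == TLV_END:
            return
        offset += TLV_HDR_LEN + length  # next TLV follows the payload


def build_example_bar():
    """Build a BAR image matching the table: fixed front, then the TLV chain."""
    bar = bytes(0x58)                       # front of the BAR left as is
    bar += pack_tlv(TLV_RESERVED, 0xE00)    # wrap basic structures
    bar += pack_tlv(TLV_FEATURE_A, 0x004)
    bar += pack_tlv(TLV_FEATURE_B, 0x004)
    bar += pack_tlv(TLV_RESERVED, 0x990)    # wrap queue stats
    bar += pack_tlv(TLV_FEATURE_C, 0x100)
    bar += pack_tlv(TLV_END, 0)             # hypothetical terminator
    return bar


if __name__ == "__main__":
    for offset, tlv_type, length in walk_tlvs(build_example_bar(), 0x58):
        print(f"0x{offset:04x}  type={tlv_type:#x}  length={length:#x}")
```

A driver that does not recognize a type can still skip it using the length field, which is what lets RESERVED TLVs wrap the already allocated regions.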