Commits · 7cf374182391d67f08c0ef0519a57fb594e0f543 · Kirill Smelkov / linux

15 Oct, 2015 2 commits

cfg80211: reg: remove useless non-NULL check · 7cf37418

Johannes Berg authored Oct 15, 2015

There's no way that the alpha2 pointer can be NULL, so
no point in checking that it isn't.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

7cf37418

cfg80211: fix gHz to GHz · 8047d261

Johannes Berg authored Oct 15, 2015

There's no "g" prefix, only "G" (1e9) that was clearly intended here.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

8047d261

14 Oct, 2015 5 commits

mac80211: remove event.c · 6f1f5d5f

Johannes Berg authored Oct 14, 2015

That file contains just a single function, which itself is just a
single statement to call a different function. Remove it.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

6f1f5d5f

mac80211: remove cfg.h · 3beea351

Johannes Berg authored Oct 14, 2015

The file contains just a single declaration that can easily
move to another file - remove it.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

3beea351

mac80211: move sta_set_rate_info_rx() and make it static · fbd6ff5c

Johannes Berg authored Oct 14, 2015

There's only a single caller of this function, so it can
be moved to the same file and made static.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

fbd6ff5c

mac80211: clean up ieee80211_rx_h_check_dup code · a732fa70

Johannes Berg authored Oct 14, 2015

Reduce indentation a bit to make the condition more readable.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

a732fa70

mac80211: remove PM-QoS listener · 4a733ef1

Johannes Berg authored Oct 14, 2015

As this API has never really seen any use and most drivers don't
ever use the value derived from it, remove it.

Change the only driver using it (rt2x00) to simply use the DTIM
period instead of the "max sleep" time.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

4a733ef1

13 Oct, 2015 33 commits

mac80211: use new cfg80211_inform_bss_frame_data() API · 61f6bba0

Johannes Berg authored Oct 13, 2015

The new API is more easily extensible with a metadata struct
passed to it, use it in mac80211.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

61f6bba0

mac80211: Do not restart scheduled scan if multiple scan plans are set · e2845c45

Avraham Stern authored Oct 12, 2015

If multiple scan plans were set for scheduled scan, do not restart
scheduled scan on reconfig because it is possible that some scan
plans were already completed and there is no need to run them all
over again. Instead, notify userspace that scheduled scan stopped
so it can configure new scan plans for scheduled scan.
Signed-off-by: Avraham Stern <avraham.stern@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

e2845c45

cfg80211: Add multiple scan plans for scheduled scan · 3b06d277

Avraham Stern authored Oct 12, 2015

Add the option to configure multiple 'scan plans' for scheduled scan.
Each 'scan plan' defines the number of scan cycles and the interval
between scans. The scan plans are executed in the order they were
configured. The last scan plan will always run infinitely and thus
defines only the interval between scans.
The maximum number of scan plans supported by the device and the
maximum number of iterations in a single scan plan are advertised
to userspace so it can configure the scan plans appropriately.

When scheduled scan results are received there is no way to know which
scan plan is being currently executed, so there is no way to know when
the next scan iteration will start. This is not a problem, however.
The scan start timestamp is only used for flushing old scan results,
and there is no difference between flushing all results received until
the end of the previous iteration or the start of the current one,
since no results will be received in between.
Signed-off-by: Avraham Stern <avraham.stern@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

3b06d277

wireless: add WNM action frame categories · af614261

Johannes Berg authored Oct 07, 2015

Add the WNM and unprotected WNM categories and mark the latter
as not robust.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

af614261

wireless: update robust action frame list · a4288289

Johannes Berg authored Oct 07, 2015

Unprotected DMG and VHT action frames are not protected, reflect
that in the list.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

a4288289

nl80211: allow BSS data to include CLOCK_BOOTTIME timestamp · 6e19bc4b

Dmitry Shmidt authored Oct 07, 2015

For location and connectivity services, userspace would often like
to know the time when the BSS was last seen. The current "last seen"
value is calculated in a way that makes it less useful, especially
if the system suspended in the meantime.

Add the ability for the driver to report a real CLOCK_BOOTTIME stamp
that can then be reported to userspace (if present).

Drivers wishing to use this must be converted to the new API to call
cfg80211_inform_bss_data() or cfg80211_inform_bss_frame_data(). They
need to ensure the reported value is accurate enough even when the
frame might have been buffered in the device (e.g. firmware.)
Signed-off-by: Dmitry Shmidt <dimitrysh@google.com>
[modified to use struct, inlines]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

6e19bc4b

Revert "mac80211: remove exposing 'mfp' to drivers" · 93f0490e

Tamizh chelvam authored Oct 07, 2015

This reverts commit 5c48f120.

Some device drivers (ath10k) offload part of aggregation including AddBA/DelBA
negotiations to firmware. In such scenario, the PMF configuration of
the station needs to be provided to driver to enable encryption of
AddBA/DelBA action frames.
Signed-off-by: Tamizh chelvam <c_traja@qti.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

93f0490e

Merge remote-tracking branch 'net-next/master' into mac80211-next · 985f2c87

Johannes Berg authored Oct 13, 2015

Merge net-next to get some driver changes that patches depend
on (in order to avoid conflicts).
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

985f2c87

bridge: vlan: enforce no pvid flag in vlan ranges · 6623c60d

Nikolay Aleksandrov authored Oct 11, 2015

Currently it's possible for someone to send a vlan range to the kernel
with the pvid flag set which will result in the pvid bouncing from a
vlan to vlan and isn't correct, it also introduces problems for hardware
where it doesn't make sense having more than 1 pvid. iproute2 already
enforces this, so let's enforce it on kernel-side as well.
Reported-by: Elad Raz <eladr@mellanox.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6623c60d

atm: iphase: fix misleading indention · cbb41b91

Tillmann Heidsieck authored Oct 10, 2015

Fix a smatch warning:
drivers/atm/iphase.c:1178 rx_pkt() warn: curly braces intended?

The code is correct, the indention is misleading. In case the allocation
of skb fails, we want to skip to the end.
Signed-off-by: Tillmann Heidsieck <theidsieck@leenox.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

cbb41b91

atm: iphase: return -ENOMEM instead of -1 in case of failed kmalloc() · 21e26ff9

Tillmann Heidsieck authored Oct 10, 2015

Smatch complains about returning hard coded error codes, silence this
warning.

drivers/atm/iphase.c:115 ia_enque_rtn_q() warn: returning -1 instead of -ENOMEM is sloppy
Signed-off-by: Tillmann Heidsieck <theidsieck@leenox.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

21e26ff9

ipv6 route: use err pointers instead of returning pointer by reference · 8c5b83f0

Roopa Prabhu authored Oct 10, 2015

This patch makes ip6_route_info_create return err pointer instead of
returning the rt pointer by reference as suggested  by Dave
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8c5b83f0

net: hns: fix the unknown phy_nterface_t type error · 99dcc7df

huangdaode authored Oct 10, 2015

This patch fix the building error reported by Jiri Pirko <jiri@resnulli.us>

drivers/net/ethernet/hisilicon/hns/hnae.h:465:2: error: unknown type
name 'phy_interface_t'
        phy_interface_t phy_if;
	^
the full build log is on https://lists.01.org/pipermail/kbuild-all.
Signed-off-by: huangdaode <huangdaode@hisilicon.com>
Signed-off-by: yankejian <yankejian@huawei.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

99dcc7df

tun: use sk_fullsock() before reading sk->sk_tsflags · 5fcd2d8b

Eric Dumazet authored Oct 09, 2015

timewait or request sockets are small and do not contain sk->sk_tsflags

Without this fix, we might read garbage, and crash later in

__skb_complete_tx_timestamp()
 -> sock_queue_err_skb()

(These pseudo sockets do not have an error queue either)

Fixes: ca6fb065 ("tcp: attach SYNACK messages to request sockets instead of listener")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5fcd2d8b

Merge branch 'netns-defrag' · b7a46095

David S. Miller authored Oct 12, 2015

Eric W. Biederman says:

====================
net: Pass net into defragmentation

This is the next installment of my work to pass struct net through the
output path so the code does not need to guess how to figure out which
network namespace it is in, and ultimately routes can have output
devices in another network namespace.

In netfilter and af_packet we defragment packets in the output path,
and there is the usual amount of confusion about how to compute which
net we are processing the packets in.  This patchset clears that
confusion up by explicitly passing in struct net in ip_defrag,
ip_check_defrag, and nf_ct_frag6_gather.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

b7a46095

ipv6: Pass struct net into nf_ct_frag6_gather · b7277597

Eric W. Biederman authored Oct 09, 2015

The function nf_ct_frag6_gather is called on both the input and the
output paths of the networking stack.  In particular ipv6_defrag which
calls nf_ct_frag6_gather is called from both the the PRE_ROUTING chain
on input and the LOCAL_OUT chain on output.

The addition of a net parameter makes it explicit which network
namespace the packets are being reassembled in, and removes the need
for nf_ct_frag6_gather to guess.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

b7277597

ipv4: Pass struct net into ip_defrag and ip_check_defrag · 19bcf9f2

Eric W. Biederman authored Oct 09, 2015

The function ip_defrag is called on both the input and the output
paths of the networking stack.  In particular conntrack when it is
tracking outbound packets from the local machine calls ip_defrag.

So add a struct net parameter and stop making ip_defrag guess which
network namespace it needs to defragment packets in.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

19bcf9f2

ipv4: Only compute net once in ip_call_ra_chain · 37fcbab6

Eric W. Biederman authored Oct 09, 2015

ip_call_ra_chain is called early in the forwarding chain from
ip_forward and ip_mr_input, which makes skb->dev the correct
expression to get the input network device and dev_net(skb->dev) a
correct expression for the network namespace the packet is being
processed in.

Compute the network namespace and store it in a variable to make the
code clearer.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

37fcbab6

packet: fix match_fanout_group() · 161642e2

Eric Dumazet authored Oct 09, 2015

Recent TCP listener patches exposed a prior af_packet bug :
match_fanout_group() blindly assumes it is always safe
to cast sk to a packet socket to compare fanout with af_packet_priv

But SYNACK packets can be sent while attached to request_sock, which
are smaller than a "struct sock".

We can read non existent memory and crash.

Fixes: c0de08d0 ("af_packet: don't emit packet on orig fanout group")
Fixes: ca6fb065 ("tcp: attach SYNACK messages to request sockets instead of listener")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Eric Leblond <eric@regit.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

161642e2

Merge tag 'wireless-drivers-next-for-davem-2015-10-09' of... · 99165967

David S. Miller authored Oct 12, 2015

Merge tag 'wireless-drivers-next-for-davem-2015-10-09' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next

Kalle Valo says:

====================
Major changes:

iwlwifi

* some debugfs improvements
* fix signedness in beacon statistics
* deinline some functions to reduce size when device tracing is enabled
* filter beacons out in AP mode when no stations are associated
* deprecate firmwares version -12
* fix a runtime PM vs. legacy suspend race
* one-liner fix for a ToF bug
* clean-ups in the rx code
* small debugging improvement
* fix WoWLAN with new firmware versions
* more clean-ups towards multiple RX queues;
* some rate scaling fixes and improvements;
* some time-of-flight fixes;
* other generic improvements and clean-ups;

brcmfmac

* rework code dealing with multiple interfaces
* allow logging firmware console using debug level
* support for BCM4350, BCM4365, and BCM4366 PCIE devices
* fixed for legacy P2P and P2P device handling
* correct set and get tx-power

ath9k

* add support for Outside Context of a BSS (OCB) mode

mwifiex

* add USB multichannel feature
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

99165967

ipv4/icmp: redirect messages can use the ingress daddr as source · e2ca690b

Paolo Abeni authored Oct 09, 2015

This patch allows configuring how the source address of ICMP
redirect messages is selected; by default the old behaviour is
retained, while setting icmp_redirects_use_orig_daddr force the
usage of the destination address of the packet that caused the
redirect.

The new behaviour fits closely the RFC 5798 section 8.1.1, and fix the
following scenario:

Two machines are set up with VRRP to act as routers out of a subnet,
they have IPs x.x.x.1/24 and x.x.x.2/24, with VRRP holding on to
x.x.x.254/24.

If a host in said subnet needs to get an ICMP redirect from the VRRP
router, i.e. to reach a destination behind a different gateway, the
source IP in the ICMP redirect is chosen as the primary IP on the
interface that the packet arrived at, i.e. x.x.x.1 or x.x.x.2.

The host will then ignore said redirect, due to RFC 1122 section 3.2.2.2,
and will continue to use the wrong next-op.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e2ca690b

bridge: try switchdev op first in __vlan_vid_add/del · 0944d6b5

Jiri Pirko authored Oct 09, 2015

Some drivers need to implement both switchdev vlan ops and
vid_add/kill ndos. For that to work in bridge code, we need to try
switchdev op first when adding/deleting vlan id.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0944d6b5

BNX2: free temp_stats_blk on error path · 3703ebe4

wangweidong authored Oct 13, 2015

In bnx2_init_board, missing free temp_stats_blk on error path when
some operations do failed. Just add the 'kfree' operation.
Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3703ebe4

Merge branch 'setsockopt_incoming_cpu' · 76973dd7

David S. Miller authored Oct 12, 2015

Eric Dumazet says:

====================
tcp: better smp listener behavior

As promised in last patch series, we implement a better SO_REUSEPORT
strategy, based on cpu hints if given by the application.

We also moved sk_refcnt out of the cache line containing the lookup
keys, as it was considerably slowing down smp operations because
of false sharing. This was simpler than converting listen sockets
to conventional RCU (to avoid sk_refcnt dirtying)

Could process 6.0 Mpps SYN instead of 4.2 Mpps on my test server.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

76973dd7

tcp: shrink tcp_timewait_sock by 8 bytes · d475f090

Eric Dumazet authored Oct 08, 2015

Reducing tcp_timewait_sock from 280 bytes to 272 bytes
allows SLAB to pack 15 objects per page instead of 14 (on x86)
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d475f090

net: shrink struct sock and request_sock by 8 bytes · ed53d0ab

Eric Dumazet authored Oct 08, 2015

One 32bit hole is following skc_refcnt, use it.
skc_incoming_cpu can also be an union for request_sock rcv_wnd.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ed53d0ab

net: align sk_refcnt on 128 bytes boundary · 8e5eb54d

Eric Dumazet authored Oct 08, 2015

sk->sk_refcnt is dirtied for every TCP/UDP incoming packet.
This is a performance issue if multiple cpus hit a common socket,
or multiple sockets are chained due to SO_REUSEPORT.

By moving sk_refcnt 8 bytes further, first 128 bytes of sockets
are mostly read. As they contain the lookup keys, this has
a considerable performance impact, as cpus can cache them.

These 8 bytes are not wasted, we use them as a place holder
for various fields, depending on the socket type.

Tested:
 SYN flood hitting a 16 RX queues NIC.
 TCP listener using 16 sockets and SO_REUSEPORT
 and SO_INCOMING_CPU for proper siloing.

 Could process 6.0 Mpps SYN instead of 4.2 Mpps

 Kernel profile looked like :
    11.68%  [kernel]  [k] sha_transform
     6.51%  [kernel]  [k] __inet_lookup_listener
     5.07%  [kernel]  [k] __inet_lookup_established
     4.15%  [kernel]  [k] memcpy_erms
     3.46%  [kernel]  [k] ipt_do_table
     2.74%  [kernel]  [k] fib_table_lookup
     2.54%  [kernel]  [k] tcp_make_synack
     2.34%  [kernel]  [k] tcp_conn_request
     2.05%  [kernel]  [k] __netif_receive_skb_core
     2.03%  [kernel]  [k] kmem_cache_alloc
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8e5eb54d

net: SO_INCOMING_CPU setsockopt() support · 70da268b

Eric Dumazet authored Oct 08, 2015

SO_INCOMING_CPU as added in commit 2c8c56e1 was a getsockopt() command
to fetch incoming cpu handling a particular TCP flow after accept()

This commits adds setsockopt() support and extends SO_REUSEPORT selection
logic : If a TCP listener or UDP socket has this option set, a packet is
delivered to this socket only if CPU handling the packet matches the specified
one.

This allows to build very efficient TCP servers, using one listener per
RX queue, as the associated TCP listener should only accept flows handled
in softirq by the same cpu.
This provides optimal NUMA behavior and keep cpu caches hot.

Note that __inet_lookup_listener() still has to iterate over the list of
all listeners. Following patch puts sk_refcnt in a different cache line
to let this iteration hit only shared and read mostly cache lines.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

70da268b

packet: support per-packet fwmark for af_packet sendmsg · c7d39e32

Edward Jee authored Oct 08, 2015

Signed-off-by: Edward Hyunkoo Jee <edjee@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c7d39e32

sock: support per-packet fwmark · f28ea365

Edward Jee authored Oct 08, 2015

It's useful to allow users to set fwmark for an individual packet,
without changing the socket state. The function this patch adds in
sock layer can be used by the protocols that need such a feature.
Signed-off-by: Edward Hyunkoo Jee <edjee@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f28ea365

Merge branch 'bpf-unprivileged' · c1bf5fe0

David S. Miller authored Oct 12, 2015

Alexei Starovoitov says:

====================
bpf: unprivileged

v1-v2:
- this set logically depends on cb patch
  "bpf: fix cb access in socket filter programs":
  http://patchwork.ozlabs.org/patch/527391/
  which is must have to allow unprivileged programs.
  Thanks Daniel for finding that issue.
- refactored sysctl to be similar to 'modules_disabled'
- dropped bpf_trace_printk
- split tests into separate patch and added more tests
  based on discussion

v1 cover letter:
I think it is time to liberate eBPF from CAP_SYS_ADMIN.
As was discussed when eBPF was first introduced two years ago
the only piece missing in eBPF verifier is 'pointer leak detection'
to make it available to non-root users.
Patch 1 adds this pointer analysis.
The eBPF programs, obviously, need to see and operate on kernel addresses,
but with these extra checks they won't be able to pass these addresses
to user space.
Patch 2 adds accounting of kernel memory used by programs and maps.
It changes behavoir for existing root users, but I think it needs
to be done consistently for both root and non-root, since today
programs and maps are only limited by number of open FDs (RLIMIT_NOFILE).
Patch 2 accounts program's and map's kernel memory as RLIMIT_MEMLOCK.

Unprivileged eBPF is only meaningful for 'socket filter'-like programs.
eBPF programs for tracing and TC classifiers/actions will stay root only.

In parallel the bpf fuzzing effort is ongoing and so far
we've found only one verifier bug and that was already fixed.
The 'constant blinding' pass also being worked on.
It will obfuscate constant-like values that are part of eBPF ISA
to make jit spraying attacks even harder.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

c1bf5fe0

bpf: add unprivileged bpf tests · bf508877

Alexei Starovoitov authored Oct 07, 2015

Add new tests samples/bpf/test_verifier:

unpriv: return pointer
  checks that pointer cannot be returned from the eBPF program

unpriv: add const to pointer
unpriv: add pointer to pointer
unpriv: neg pointer
  checks that pointer arithmetic is disallowed

unpriv: cmp pointer with const
unpriv: cmp pointer with pointer
  checks that comparison of pointers is disallowed
  Only one case allowed 'void *value = bpf_map_lookup_elem(..); if (value == 0) ...'

unpriv: check that printk is disallowed
  since bpf_trace_printk is not available to unprivileged

unpriv: pass pointer to helper function
  checks that pointers cannot be passed to functions that expect integers
  If function expects a pointer the verifier allows only that type of pointer.
  Like 1st argument of bpf_map_lookup_elem() must be pointer to map.
  (applies to non-root as well)

unpriv: indirectly pass pointer on stack to helper function
  checks that pointer stored into stack cannot be used as part of key
  passed into bpf_map_lookup_elem()

unpriv: mangle pointer on stack 1
unpriv: mangle pointer on stack 2
  checks that writing into stack slot that already contains a pointer
  is disallowed

unpriv: read pointer from stack in small chunks
  checks that < 8 byte read from stack slot that contains a pointer is
  disallowed

unpriv: write pointer into ctx
  checks that storing pointers into skb->fields is disallowed

unpriv: write pointer into map elem value
  checks that storing pointers into element values is disallowed
  For example:
  int bpf_prog(struct __sk_buff *skb)
  {
    u32 key = 0;
    u64 *value = bpf_map_lookup_elem(&map, &key);
    if (value)
       *value = (u64) skb;
  }
  will be rejected.

unpriv: partial copy of pointer
  checks that doing 32-bit register mov from register containing
  a pointer is disallowed

unpriv: pass pointer to tail_call
  checks that passing pointer as an index into bpf_tail_call
  is disallowed

unpriv: cmp map pointer with zero
  checks that comparing map pointer with constant is disallowed

unpriv: write into frame pointer
  checks that frame pointer is read-only (applies to root too)

unpriv: cmp of frame pointer
  checks that R10 cannot be using in comparison

unpriv: cmp of stack pointer
  checks that Rx = R10 - imm is ok, but comparing Rx is not

unpriv: obfuscate stack pointer
  checks that Rx = R10 - imm is ok, but Rx -= imm is not
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bf508877

bpf: charge user for creation of BPF maps and programs · aaac3ba9

Alexei Starovoitov authored Oct 07, 2015

since eBPF programs and maps use kernel memory consider it 'locked' memory
from user accounting point of view and charge it against RLIMIT_MEMLOCK limit.
This limit is typically set to 64Kbytes by distros, so almost all
bpf+tracing programs would need to increase it, since they use maps,
but kernel charges maximum map size upfront.
For example the hash map of 1024 elements will be charged as 64Kbyte.
It's inconvenient for current users and changes current behavior for root,
but probably worth doing to be consistent root vs non-root.

Similar accounting logic is done by mmap of perf_event.
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

aaac3ba9