Commits · dea870242a9c4ea74b3ca0f2da3f864c47484cff · Kirill Smelkov / linux

31 Aug, 2015 28 commits

dsa: mv88e6xxx: Allow speed/duplex of port to be configured · dea87024

Andrew Lunn authored Aug 31, 2015

The current code sets user ports to perform auto negotiation using the
phy. CPU and DSA ports are configured to full duplex and the maximum speed
the switch supports.

There are however use cases where the CPU has a slower port, and where
user ports have SFP modules with fixed speed. In these cases, port
settings need to be read from a fixed_phy device. The switch driver then
needs to implement the adjust_link op, so the port settings can be
set.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


net: phy: Allow PHY devices to identify themselves as Ethernet switches, etc. · 5a11dd7d

Florian Fainelli authored Aug 31, 2015

Some Ethernet MAC drivers using the PHY library require the hardcoding
of link parameters when interfaced to a switch device, SFP module,
switch to switch port, etc. This has typically led to various ad-hoc
implementations looking like this:

- using a "fixed PHY" emulated device, which will provide link
  indication towards the Ethernet MAC driver and hardware

- pretending there is no PHY and hardcoding link parameters, a la mv643x_eth

Based on that, it is desirable to have the PHY drivers advertise the
correct link parameters towards their CPU Ethernet MAC drivers, just like
regular Ethernet PHYs do. However, Ethernet MAC drivers should be able
to tell whether this link should be monitored or not. In the context
of an Ethernet switch, SFP module, or switch to switch link, we do not
need to monitor this link since it should always be up.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>


mpls: fix mpls_net_init memory leak · 6ea3c9d5

Nikolay Aleksandrov authored Aug 31, 2015

Fix a memory leak in the mpls netns init function in case of failure. If
register_net_sysctl fails then we need to free the ctl_table.

Fixes: 7720c01f ("mpls: Add a sysctl to control the size of the mpls label table")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
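
The error-path pattern described above can be sketched in user-space C. This is only an illustration, not the mpls code: struct ctl_entry, fake_register() and the live_tables counter are hypothetical stand-ins, and malloc()/free() take the place of the kernel allocators.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a sysctl table entry; not the kernel type. */
struct ctl_entry {
    const char *name;
    int value;
};

static const struct ctl_entry template_table[] = {
    { "platform_labels", 0 },
    { NULL, 0 },
};

static int live_tables;        /* tables currently allocated (demo only) */
static int registration_fails; /* set non-zero to simulate a failure */

/* Stand-in for the registration call: returns NULL on failure. */
static void *fake_register(struct ctl_entry *table)
{
    return registration_fails ? NULL : table;
}

/* Duplicate the template, register it, and free the copy on failure. */
static int net_init_sketch(struct ctl_entry **out)
{
    struct ctl_entry *table = malloc(sizeof(template_table));

    if (!table)
        return -ENOMEM;
    memcpy(table, template_table, sizeof(template_table));
    live_tables++;

    if (!fake_register(table)) {
        free(table); /* the step whose absence would leak the table */
        live_tables--;
        return -ENOMEM;
    }
    *out = table;
    return 0;
}
```

With registration_fails set, net_init_sketch() returns -ENOMEM and live_tables drops back to zero, which is the property the fix is after.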


net: Add tos to validate source tracepoint · f0fa6e52

David Ahern authored Aug 31, 2015

TOS is another key aspect of the lookup passed to fib_validate_source.
Add it to the tracepoint.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


lib: move strncpy_from_unsafe() into mm/maccess.c · dbb7ee0e

Alexei Starovoitov authored Aug 31, 2015

To fix build errors:
kernel/built-in.o: In function `bpf_trace_printk':
bpf_trace.c:(.text+0x11a254): undefined reference to `strncpy_from_unsafe'
kernel/built-in.o: In function `fetch_memory_string':
trace_kprobe.c:(.text+0x11acf8): undefined reference to `strncpy_from_unsafe'

Move strncpy_from_unsafe() next to probe_kernel_read/write(),
which use the same memory access style.
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Fixes: 1a6877b9 ("lib: introduce strncpy_from_unsafe()")
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


Merge branch 'per-route-dctcp-receive-side' · 9dc30648

David S. Miller authored Aug 31, 2015

Daniel Borkmann says:

====================
tcp: receive-side per route dctcp handling

Original cover letter:

  Currently, the following case doesn't use DCTCP, even if it should:

    - responder has e.g. cubic as system wide default
    - 'ip route congctl dctcp $src' was set

  Then, DCTCP is NOT used if a DCTCP sender attempts to connect from a
  host in the $src range: ECT(0) is set, but listen_sk is not dctcp, so
  we fail the INET_ECN_is_not_ect sanity check.

  We also have to examine the dst used for the SYN/ACK reply to make
  this case work.

  In order to minimize additional cost, store the 'ecn is must have'
  information in the dst_features field.

  The set targets -next instead of -net since this doesn't seem to be a
  serious bug and to give the change more soak time until it hits Linus'
  tree.

v1 -> v2:
 - Addressed Dave's feedback, not exposing any bits to user space
 - Added patch 3 to reject incorrect configurations
 - Rest as is, rebased and retested
====================
Signed-off-by: David S. Miller <davem@davemloft.net>


tcp: use dctcp if enabled on the route to the initiator · c3a8d947

Daniel Borkmann authored Aug 31, 2015

Currently, the following case doesn't use DCTCP, even if it should:
A responder has e.g. Cubic as system wide default, but for a specific
route to the initiating host, DCTCP is being set in RTAX_CC_ALGO. The
initiating host then uses DCTCP as congestion control, but since the
initiator sets ECT(0), tcp_ecn_create_request() doesn't set ecn_ok,
and we have to fall back to Reno after 3WHS completes.

We were thinking about how to solve this in a minimal, non-intrusive
way without bloating tcp_ecn_create_request() needlessly: let's cache
the CA ecn option flag in RTAX_FEATURES. In other words, when ECT(0)
is set on the SYN packet, set ecn_ok=1 iff route RTAX_FEATURES
contains the unexposed (internal-only) DST_FEATURE_ECN_CA. This allows
doing only a single metric feature lookup inside tcp_ecn_create_request().

Joint work with Florian Westphal.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
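
The rule stated above (ECT(0) on the SYN plus the cached route flag) can be sketched as a single predicate. This is a simplified illustration, not tcp_ecn_create_request(): the flag's bit position (FEATURE_ECN_CA_SKETCH) and the function name are assumptions, and the other inputs the real function weighs, such as ordinary ECN negotiation and the listener's own congestion control, are left out.

```c
#include <stdbool.h>
#include <stdint.h>

#define ECN_FIELD_MASK 0x3u /* low two bits of the TOS byte */
#define ECN_ECT_0      0x2u /* ECT(0) codepoint */

/* Hypothetical bit position for the internal-only route feature flag. */
#define FEATURE_ECN_CA_SKETCH (1u << 31)

/*
 * Models only the rule from the message: a SYN carrying ECT(0) gets
 * ecn_ok iff the cached route features contain the "CA needs ECN" flag.
 * SYNs without ECT(0) are decided elsewhere and yield false here.
 */
static bool ect0_syn_allows_ecn(uint8_t syn_tos, uint32_t route_features)
{
    bool ect0 = (syn_tos & ECN_FIELD_MASK) == ECN_ECT_0;

    /* One feature-bit test stands in for the single metric lookup. */
    return ect0 && (route_features & FEATURE_ECN_CA_SKETCH) != 0;
}
```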


fib, fib6: reject invalid feature bits · b8d3e416

Daniel Borkmann authored Aug 31, 2015

Feature bits that are invalid should not be accepted by the kernel.
Only the lower 4 bits may be configured, but not the remaining ones.
Even of these 4, 2 are unused.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
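
A check of this kind can be sketched as a mask test. The mask name and the helper are hypothetical; the only thing taken from the message is that the configurable bits are the lower 4.

```c
#include <errno.h>
#include <stdint.h>

/* Assumption for this sketch: the configurable bits are bits 0..3. */
#define FEATURE_MASK_SKETCH 0xfu

/* Return 0 if only configurable bits are set, -EINVAL otherwise. */
static int check_feature_bits(uint32_t val)
{
    if (val & ~FEATURE_MASK_SKETCH)
        return -EINVAL;
    return 0;
}
```

Rejecting unknown bits up front keeps them available for later use, since no existing configuration can be relying on them being silently ignored.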


net: fib6: reduce indentation in ip6_convert_metrics · 1bb14807

Daniel Borkmann authored Aug 31, 2015

Reduce the indentation a bit, there's no need to artificially have
it increased.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>


net: fib: move metrics parsing to a helper · 6cf9dfd3

Florian Westphal authored Aug 31, 2015

fib_create_info() is already quite large, so before adding more
code to the metrics section move that to a helper, similar to
ip6_convert_metrics.
Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>


IGMP: Document igmp_link_local_mcast_reports · 87583ebb

Philip Downey authored Aug 31, 2015

Document the addition of a new sysctl variable which controls the
generation of IGMP reports for link local multicast groups in the
224.0.0.X range.

IGMP reports for local multicast groups can now be optionally
inhibited by setting the value to zero e.g.:
echo 0 > /proc/sys/net/ipv4/igmp_link_local_mcast_reports

To retain backwards compatibility, the previous behaviour remains the
default on system boot and can be restored by setting the value back to
non-zero.
Signed-off-by: Philip Downey <pdowney@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
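
The decision this sysctl controls can be sketched as follows. Both helpers are hypothetical and only model the behaviour described in the message: groups in 224.0.0.X are reported unless the knob is zero, and all other groups are unaffected.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if a host-byte-order IPv4 address lies in 224.0.0.0/24. */
static bool is_link_local_mcast(uint32_t addr_host_order)
{
    return (addr_host_order & 0xffffff00u) == 0xe0000000u;
}

/*
 * Hypothetical decision helper: suppress the report only when the group
 * is link local and the knob has been set to zero.
 */
static bool should_send_report(uint32_t group, int link_local_reports)
{
    if (is_link_local_mcast(group) && !link_local_reports)
        return false;
    return true;
}
```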


ip-tunnel: Use API to access tunnel metadata options. · 4c222798

Pravin B Shelar authored Aug 30, 2015

Currently the tun-info options pointer is used in a few cases to
pass options around. But tunnel options can be accessed using the
ip_tunnel_info_opts() API without using the pointer. The following
patch removes the redundant pointer and consistently makes use
of the API.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Reviewed-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


ipv4: fix 32b build · d1bfc625

Madalin Bucur authored Aug 31, 2015

Address remaining issue after 80ec1927.
Signed-off-by: Madalin Bucur <madalin.bucur@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


ipv4: Fix 32-bit build. · 80ec1927

David S. Miller authored Aug 30, 2015

   net/ipv4/af_inet.c: In function 'snmp_get_cpu_field64':
>> net/ipv4/af_inet.c:1486:26: error: 'offt' undeclared (first use in this function)
      v = *(((u64 *)bhptr) + offt);
                             ^
   net/ipv4/af_inet.c:1486:26: note: each undeclared identifier is reported only once for each function it appears in
   net/ipv4/af_inet.c: In function 'snmp_fold_field64':
>> net/ipv4/af_inet.c:1499:39: error: 'offct' undeclared (first use in this function)
      res += snmp_get_cpu_field(mib, cpu, offct, syncp_offset);
                                          ^
>> net/ipv4/af_inet.c:1499:10: error: too many arguments to function 'snmp_get_cpu_field'
      res += snmp_get_cpu_field(mib, cpu, offct, syncp_offset);
             ^
   net/ipv4/af_inet.c:1455:5: note: declared here
    u64 snmp_get_cpu_field(void __percpu *mib, int cpu, int offt)
        ^
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


netlink: rx mmap: fix POLLIN condition · 0ef70770

Ken-ichirou MATSUZAWA authored Aug 31, 2015

Poll() returns immediately after user space sets the kernel current frame
(ring->head) to SKIP, even though there is no new frame. And in the case
where all frames are VALID, if a user space program unintentionally sets
(only) the kernel current frame to UNUSED and then calls poll(), it will
not return immediately even though there are VALID frames.

To avoid situations like the above, I think we need to scan all frames
to find VALID frames at poll(), just as netlink_alloc_skb() and
netlink_forward_ring() find an UNUSED frame at skb allocation.
Signed-off-by: Ken-ichirou MATSUZAWA <chamas@h4.dion.ne.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
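
The two failure cases can be reproduced with a small model of the ring. The frame states and both helpers below are hypothetical (the kernel uses its own NL_MMAP_STATUS_* values), and the head-only check is inferred from the behaviour the message describes rather than copied from the source.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical frame states for this sketch. */
enum frame_status { FRAME_UNUSED, FRAME_VALID, FRAME_SKIP };

/* Head-only check, as inferred: readable if the head is not UNUSED. */
static bool readable_head_only(const enum frame_status *ring, size_t head)
{
    return ring[head] != FRAME_UNUSED;
}

/* Scan every frame and report readable only if some frame is VALID. */
static bool readable_scan_all(const enum frame_status *ring, size_t nframes)
{
    for (size_t i = 0; i < nframes; i++)
        if (ring[i] == FRAME_VALID)
            return true;
    return false;
}
```

A ring whose head is SKIP and whose other frames are UNUSED is readable for the head-only check but not for the scan. A ring whose head is UNUSED and whose other frames are VALID shows the opposite mismatch.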


Merge branch 'thunderx-features-fixes' · 793768f5

David S. Miller authored Aug 30, 2015

Aleksey Makarov says:

====================
net: thunderx: New features and fixes

v2:
  - The unused affinity_mask field of the structure cmp_queue
  has been deleted. (thanks to David Miller)
  - The unneeded initializers have been dropped. (thanks to Alexey Klimov)
  - The commit message "net: thunderx: Rework interrupt handling"
  has been fixed. (thanks to Alexey Klimov)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>


net: thunderx: Support for internal loopback mode · d77a2384

Sunil Goutham authored Aug 30, 2015

Support for setting VF's corresponding BGX LMAC in internal
loopback mode. This mode can be used for verifying basic HW
functionality such as packet I/O, RX checksum validation,
CQ/RBDR interrupts, stats etc. Useful when DUT has no external
network connectivity.

'loopback' mode can be enabled or disabled via ethtool.

Note: This feature is not supported when the number of VFs enabled is
more than the number of physical interfaces, i.e. active BGX LMACs.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


net: thunderx: Support for up to 96 queues for a VF · 92dc8769

Sunil Goutham authored Aug 30, 2015

This patch adds support for handling multiple qsets assigned to a
single VF, thereby increasing the number of queues from the earlier 8 to
the max number of CPUs in the system, i.e. 48 queues on a single node and
96 on a dual node system. The user doesn't have the option to choose which
Qsets/VFs are merged. Upon request from a VF, the PF assigns the next free
Qsets as secondary qsets. To maintain current behavior the number of queues
is kept at 8 by default, which can be increased via ethtool.

If the user wants to unbind the NICVF driver from a secondary Qset then it
should be done after tearing down the primary VF's interface.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com>
Signed-off-by: Robert Richter <rrichter@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


net: thunderx: Rework interrupt handling · 39ad6eea

Sunil Goutham authored Aug 30, 2015

Rework interrupt handler to avoid checking IRQ affinity of
CQ interrupts. Now separate handlers are registered for each IRQ
including RBDR. Register interrupt handlers for only those
which are being used. Add nicvf_dump_intr_status() and use it
in irq handlers.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


net: thunderx: Support for HW VLAN stripping · aa2e259b

Sunil Goutham authored Aug 30, 2015

This patch configures HW to strip 802.1Q header if found in a
receiving packet. The stripped VLAN ID and TCI information is
passed on to software via CQE_RX. Also sets netdev's 'vlan_features'
so that other HW offload features can be used for tagged packets.

This offload feature can be enabled or disabled via ethtool.

The network stack normally ignores RPS for 802.1Q packets, hence the low
throughput. With this offload enabled, throughput for tagged packets
will be almost the same as for normal packets.

Note: This patch doesn't enable HW VLAN insertion for transmit packets.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


net: thunderx: Receive hashing HW offload support · 38bb5d4f

Sunil Goutham authored Aug 30, 2015

Adding support for receive hashing HW offload by using RSS_ALG
and RSS_TAG fields of CQE_RX descriptor. Also removed dependency
on minimum receive queue count to configure RSS so that hash is
always generated.

This hash is used by RPS logic to distribute flows across multiple
CPUs. Offload can be disabled via ethtool.
Signed-off-by: Robert Richter <rrichter@cavium.com>
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


net: thunderx: mailboxes: remove code duplication · 6051cba7

Sunil Goutham authored Aug 30, 2015

Use the nicvf_send_msg_to_pf() function in the mailbox code.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: Robert Richter <rrichter@cavium.com>
Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


net: thunderx: Add receive error stats reporting via ethtool · a2dc5ded

Sunil Goutham authored Aug 30, 2015

Added ethtool support to dump receive packet error statistics reported
in CQE. Also made some small fixes.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


net: thunderx: fix MAINTAINERS · 322e5cc5

Aleksey Makarov authored Aug 30, 2015

The liquidio and thunder drivers have different maintainers.
Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


Merge branch 'snmp-stat-aggregation' · ef34c0f6

David S. Miller authored Aug 30, 2015

Raghavendra K T says:

====================
Optimize the snmp stat aggregation for large cpus

While creating 1000 containers, perf is showing a lot of time spent in
snmp_fold_field on a large cpu system.

The current patch tries to improve by reordering the statistics gathering.

Please note that similar overhead was also reported while creating
veth pairs  https://lkml.org/lkml/2013/3/19/556

Changes in V4:
 - remove 'item' variable and use IPSTATS_MIB_MAX to avoid sparse
   warning (Eric) also remove 'item' parameter (Joe)
 - add missing memset of padding.

Changes in V3:
 - use memset to initialize temp buffer in leaf function. (David)
 - use memcpy to copy the buffer data to stat instead of unaligned_put (Joe)
 - Move buffer definition to leaf function __snmp6_fill_stats64() (Eric)

Changes in V2:
 - Allocate the stat calculation buffer in stack. (Eric)

Setup:
160 cpu (20 core) baremetal powerpc system with 1TB memory

1000 docker containers were created with the command
docker run -itd  ubuntu:15.04  /bin/bash in a loop

observation:
Docker container creation time increased linearly from around 1.6 sec to
7.5 sec (at 1000 containers). perf data showed that creating veth interfaces,
which results in the below code path, was taking more time.

rtnl_fill_ifinfo
  -> inet6_fill_link_af
    -> inet6_fill_ifla6_attrs
      -> snmp_fold_field

proposed idea:
 currently __snmp6_fill_stats64 calls snmp_fold_field that walks
through per cpu data of an item (iteratively for around 36 items).
 The patch tries to aggregate the statistics by going through
all the items of each cpu sequentially, which reduces cache
misses.

Performance of docker creation improved by more than 2x
after the patch.

before the patch:
================
3f45ba571a42e925c4ec4aaee0e48d7610a9ed82a4c931f83324d41822cf6617
real	0m6.836s
user	0m0.095s
sys	0m0.011s

perf record -a docker run -itd  ubuntu:15.04  /bin/bash
=======================================================
    50.73%  docker           [kernel.kallsyms]       [k] snmp_fold_field
     9.07%  swapper          [kernel.kallsyms]       [k] snooze_loop
     3.49%  docker           [kernel.kallsyms]       [k] veth_stats_one
     2.85%  swapper          [kernel.kallsyms]       [k] _raw_spin_lock
     1.37%  docker           docker                  [.] backtrace_qsort
     1.31%  docker           docker                  [.] strings.FieldsFunc

  cache-misses:  2.7%

after the patch:
=============
9178273e9df399c8290b6c196e4aef9273be2876225f63b14a60cf97eacfafb5
real	0m3.249s
user	0m0.088s
sys	0m0.020s

perf record -a docker run -itd  ubuntu:15.04  /bin/bash
=======================================================
    10.57%  docker           docker                [.] scanblock
     8.37%  swapper          [kernel.kallsyms]     [k] snooze_loop
     6.91%  docker           [kernel.kallsyms]     [k] snmp_get_cpu_field
     6.67%  docker           [kernel.kallsyms]     [k] veth_stats_one
     3.96%  docker           docker                [.] runtime_MSpan_Sweep
     2.47%  docker           docker                [.] strings.FieldsFunc

  cache-misses:  1.41%

Please let me know if you have suggestions/comments.
Thanks Eric, Joe and David for the comments.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>


net: Optimize snmp stat aggregation by walking all the percpu data at once · a3a77372

Raghavendra K T authored Aug 30, 2015

Docker container creation time increased linearly from around 1.6 sec to
7.5 sec (at 1000 containers) and perf data showed 50% overhead in snmp_fold_field.

reason: currently __snmp6_fill_stats64 calls snmp_fold_field that walks
through per cpu data of an item (iteratively for around 36 items).

idea: This patch tries to aggregate the statistics by going through
all the items of each cpu sequentially which is reducing cache
misses.

Docker creation got faster by more than 2x after the patch.

Result:
                       Before           After
Docker creation time   6.836s           3.25s
cache miss             2.7%             1.41%

perf before:
    50.73%  docker           [kernel.kallsyms]       [k] snmp_fold_field
     9.07%  swapper          [kernel.kallsyms]       [k] snooze_loop
     3.49%  docker           [kernel.kallsyms]       [k] veth_stats_one
     2.85%  swapper          [kernel.kallsyms]       [k] _raw_spin_lock

perf after:
    10.57%  docker           docker                [.] scanblock
     8.37%  swapper          [kernel.kallsyms]     [k] snooze_loop
     6.91%  docker           [kernel.kallsyms]     [k] snmp_get_cpu_field
     6.67%  docker           [kernel.kallsyms]     [k] veth_stats_one

changes/ideas suggested:
Using buffer in stack (Eric), Usage of memset (David), Using memcpy in
place of unaligned_put (Joe).
Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
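
The change in loop order can be sketched with a plain two-dimensional array standing in for the per cpu data. The array, its sizes and the function names are assumptions made for illustration; the real code works on percpu MIB structures and also deals with 64-bit sync details that are omitted here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NCPUS  4  /* illustrative sizes, not taken from the patch */
#define NITEMS 36

static uint64_t percpu[NCPUS][NITEMS];

/* Old shape: one full walk over all CPUs for a single item. */
static uint64_t fold_one_item(size_t item)
{
    uint64_t sum = 0;

    for (size_t cpu = 0; cpu < NCPUS; cpu++)
        sum += percpu[cpu][item];
    return sum;
}

/* Old shape, continued: repeat that walk once per item. */
static void fill_item_major(uint64_t out[NITEMS])
{
    for (size_t item = 0; item < NITEMS; item++)
        out[item] = fold_one_item(item);
}

/* New shape: visit each CPU once and add all of its items in sequence. */
static void fill_cpu_major(uint64_t out[NITEMS])
{
    memset(out, 0, NITEMS * sizeof(out[0]));
    for (size_t cpu = 0; cpu < NCPUS; cpu++)
        for (size_t item = 0; item < NITEMS; item++)
            out[item] += percpu[cpu][item];
}
```

Both functions produce the same totals. The second one reads each CPU's block contiguously instead of jumping between CPUs for every item, which is where the reduction in cache misses is expected to come from.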


net: Introduce helper functions to get the per cpu data · c4c6bc31

Raghavendra K T authored Aug 30, 2015

Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 06fb4e70

David S. Miller authored Aug 30, 2015


30 Aug, 2015 5 commits

Merge branch 'ovs-vport-cleanup' · 2573d788

David S. Miller authored Aug 29, 2015

Pravin B Shelar says:

====================
openvswitch: Cleanup post vport conversion.

After converting all vports to netdev implementations there
is no need for some of the vport functionality.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>


openvswitch: Remove vport-net · a581b96d

Pravin B Shelar authored Aug 29, 2015

This structure is not used anymore.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


openvswitch: Remove vport stats. · 8c876639

Pravin B Shelar authored Aug 29, 2015

Since all vport types are now backed by netdev, we can directly
use netdev stats. The following patch removes the redundant stats
from vport.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


openvswitch: Remove egress_tun_info. · 3eedb41f

Pravin B Shelar authored Aug 29, 2015

tun info is passed using the skb-dst pointer. Now that we have
converted all vports to a netdev based implementation,
we can remove the redundant pointer to tun-info from OVS_CB.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


openvswitch: Remove vport get_name() · 24d43f32

Pravin B Shelar authored Aug 29, 2015

Remove unused get_name() function pointer from vport ops.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


29 Aug, 2015 7 commits

geneve: Use GRO cells infrastructure. · 8e816df8

Jesse Gross authored Aug 28, 2015

Geneve can benefit from GRO at the device level in a manner similar
to other tunnels, especially as hardware offloads are still emerging.

After this patch, aggregated frames are seen on the tunnel interface.
Single stream throughput nearly doubles in ideal circumstances (on
old hardware).
Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


openvswitch: retain parsed IPv6 header fields in flow on error skipping extension headers · c30da497

Simon Horman authored Aug 29, 2015

When an error occurs skipping IPv6 extension headers, retain the already
parsed IP protocol and IPv6 addresses in the flow. Also assume that the
packet is not a fragment in the absence of information to the contrary;
that is, always use the frag_off value set by ipv6_skip_exthdr().

This allows matching on the IP protocol and IPv6 addresses of packets
with malformed extension headers.
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


Merge branch 'for-upstream' of... · f5004a14

David S. Miller authored Aug 29, 2015

Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next

Johan Hedberg says:

====================
pull request: bluetooth-next 2015-08-28

One more bunch of Bluetooth patches for 4.3:

 - Crash fix for hci_bcm driver
 - Enhancements to hci_intel driver (e.g. baudrate configuration)
 - Fix for SCO link type after multiple connect attempts
 - Cleanups & minor fixes in a few other places

Please let me know if there are any issues pulling. Thanks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>


net/smsc911x: Fix deferred probe for interrupt · f892a84c

Tony Lindgren authored Aug 28, 2015

The interrupt handler may not be available when smsc911x probes if the
interrupt handler is a GPIO controller for example. Let's fix that
by adding handling for -EPROBE_DEFER.

Cc: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
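
The deferred-probe handling can be sketched as a small decision function. This is not the smsc911x code: probe_irq_step() is hypothetical, its argument stands for whatever the interrupt lookup returned, and EPROBE_DEFER_SKETCH mirrors the kernel-internal EPROBE_DEFER value because user space headers do not define it.

```c
#include <errno.h>

/* Kernel-internal errno value, defined here because user space lacks it. */
#define EPROBE_DEFER_SKETCH 517

/*
 * Hypothetical probe step: irq_lookup is the result of asking the
 * platform for the device's interrupt number.
 */
static int probe_irq_step(int irq_lookup)
{
    if (irq_lookup == -EPROBE_DEFER_SKETCH)
        return -EPROBE_DEFER_SKETCH; /* provider not ready: ask for a retry */
    if (irq_lookup <= 0)
        return -ENODEV;              /* genuinely no usable interrupt */
    return 0;                        /* continue with the rest of probe */
}
```

Passing the deferral code through unchanged, rather than folding it into a generic failure, is what lets the driver core retry the probe once the interrupt provider has appeared.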


Merge branch 'tnl-ipv4-ipv6' · 6d742324

David S. Miller authored Aug 29, 2015

Jiri Benc says:

====================
tunnels: fix incorrect IPv4/v6 headers interpretation

With tunneling, it is currently possible to get an IPv6 header and interpret
it as an IPv4 header, or to interpret an IPv6 address as an IPv4 address
(and vice versa). This leads to things like sending packets to an incorrect
address, the IPv6 flow label being interpreted as IP packet length, etc.

Fix several places where this can happen.

Most of this is net-next only. The third patch affects net, too, but it
doesn't seem there's anything in user space that sets the attribute at all
currently, thus net-next is fine.

Changelog:
v2: fixed geneve after incorrect rebase on top of Pravin's patches
====================
Signed-off-by: David S. Miller <davem@davemloft.net>


vxlan: do not receive IPv4 packets on IPv6 socket · a43a9ef6

Jiri Benc authored Aug 28, 2015

By default (subject to the sysctl settings), IPv6 sockets also listen for
IPv4 traffic. Vxlan is not prepared for that and expects an IPv6 header in
packets received through an IPv6 socket.

In addition, it's currently not possible to have both IPv4 and IPv6 vxlan
tunnels on the same port (unless the bindv6only sysctl is enabled), as it's not
possible to create and bind both IPv4 and IPv6 vxlan interfaces and there's
no way to specify both IPv4 and IPv6 remote/group IP addresses.

Set IPV6_V6ONLY on vxlan sockets to fix both of these issues. This is not
done globally in udp_tunnel, as l2tp and tipc seem to work okay when
receiving IPv4 packets on an IPv6 socket and people may rely on this behavior.
The other tunnels (geneve and fou) do not support IPv6.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
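
For reference, the user space equivalent of the option looks like the sketch below. It is not how the kernel configures vxlan sockets; the two helper names are hypothetical and only the standard socket calls are real.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a UDP/IPv6 socket restricted to IPv6 traffic; returns fd or -1. */
static int open_v6only_udp_socket(void)
{
    int one = 1;
    int fd = socket(AF_INET6, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &one, sizeof(one)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Read the option back: 1 if set, 0 if clear, -1 on error. */
static int socket_is_v6only(int fd)
{
    int val = 0;
    socklen_t len = sizeof(val);

    if (getsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &val, &len) < 0)
        return -1;
    return val ? 1 : 0;
}
```

With the option set, an IPv4 socket can be bound to the same port alongside the IPv6 one, which is the property the message relies on for running IPv4 and IPv6 tunnels side by side.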


fou: reject IPv6 config · b9b6695c

Jiri Benc authored Aug 28, 2015

fou does not really support IPv6 encapsulation. After a UDP socket is
created in fou_create, the encap_rcv callback is set either to fou_udp_recv
or to gue_udp_recv. Both of those unconditionally assume that the received
packet has an IPv4 header and access the data at network_header as if it were an
IPv4 header. This leads to the IPv6 flow label being interpreted as IP packet
length, etc.

Disallow configuring a fou tunnel as IPv6 until real IPv6 support is
added to fou.

CC: Tom Herbert <tom@herbertland.com>
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
