- 11 Aug, 2016 4 commits
-
-
Jason Wang authored
We clean up the skb_array in macvtap_put_queue() but still try to pop from it during macvtap_sock_destruct(). Fix this use-after-free by moving the skb_array cleanup to macvtap_sock_destruct() instead.

Fixes: 362899b8 ("macvtap: switch to use skb array")
Reported-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Martynas Pumputis authored
The creation of a tunnel vport (geneve, gre, vxlan) brings up a corresponding netdev, a multi-step operation which can fail. For example, changing a vxlan vport's netdev state to 'up' binds the vport's socket to a UDP port - if the binding fails (e.g. due to the port being in use), the error is currently ignored giving the appearance that the tunnel vport creation completed successfully. Signed-off-by: Martynas Pumputis <martynas@weave.works> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Fabian Frederick authored
s/gamc/gmac/ Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Parthasarathy Bhuvaragan authored
In commit cf6f7e1d ("tipc: dump monitor attributes"), I dereferenced a pointer before checking whether it is valid. The Smatch static checker reports this as:

  net/tipc/monitor.c:733 tipc_nl_add_monitor_peer() warn: variable dereferenced before check 'mon' (see line 731)

In this commit, we check for a valid monitor before proceeding with any other operation.

Fixes: cf6f7e1d ("tipc: dump monitor attributes")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 10 Aug, 2016 3 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
David S. Miller authored
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for your net tree, they are:

1) Use mod_timer_pending() to avoid reactivating a dead expectation in the h323 conntrack helper, from Liping Zhang.
2) Oneliner to fix a typo in the register name defined in the nf_tables header.
3) Don't try to look further when we find an inactive element with no descendants in the rbtree set implementation, otherwise we crash.
4) Handle valid zero CSeq in the SIP conntrack helper, from Christophe Leroy.
5) Don't display a trailing slash in conntrack helper with no classes via /proc/net/nf_conntrack_expect, from Liping Zhang.
6) Fix an expectation leak during creation from the nfqueue path, again from Liping Zhang.
7) Validate netlink port ID in verdict message from nfqueue, otherwise an injection can be possible. Again from Zhang.
8) Reject conntrack tuples with different transport protocol on original and reply tuples, also from Zhang.
9) Validate offset and length in nft_exthdr, make sure they fit in a u8, from Laura Garcia Liebana.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
-
Laura Garcia Liebana authored
Fix the direct assignment of offset and length attributes included in nft_exthdr structure from u32 data to u8. Signed-off-by: Laura Garcia Liebana <nevola@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
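The hazard behind this fix can be modeled outside the kernel. The sketch below is a Python model, not the nft_exthdr code: `assign_unchecked` mimics what a plain C assignment from a u32 to a u8 does (only the low 8 bits survive), and `assign_checked` rejects values that do not fit. Both function names and the use of `ValueError` are assumptions made for illustration; the error code the kernel returns is not modeled here.

```python
U8_MAX = 0xFF  # largest value a u8 can hold


def assign_unchecked(value_u32: int) -> int:
    """Model a direct C assignment from u32 to u8: high bits are silently dropped."""
    return value_u32 & U8_MAX


def assign_checked(value_u32: int) -> int:
    """Hypothetical validated variant: refuse values that do not fit in a u8."""
    if not 0 <= value_u32 <= U8_MAX:
        raise ValueError(f"{value_u32} does not fit in a u8")
    return value_u32
```

In this model an offset of 260 is stored as 4 by the unchecked path, while the checked path refuses it.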
-
Toshiaki Makita authored
Adding fdb entries pointing to the bridge device uses fdb_insert(), which lacks various checks and does not respect the added_by_user flag. As a result, some inconsistent behavior can happen:

* Adding temporary entries succeeds but results in permanent entries.
* Same goes for "dynamic" and "use".
* Changing the MAC address of the bridge device causes deletion of user-added entries.
* Replacing existing entries looks successful from userspace but actually is not, regardless of the NLM_F_EXCL flag.

Use the same logic as other entries and fix them.

Fixes: 3741873b ("bridge: allow adding of fdb entries pointing to the bridge device")
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 09 Aug, 2016 21 commits
-
-
Wenyou Yang authored
Disable all interrupts on suspend; they are re-enabled on resume. Otherwise, the suspend/resume process occasionally blocks.

Signed-off-by: Wenyou Yang <wenyou.yang@atmel.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sylwester Nawrocki authored
Commit b5a099c6 "net: ethernet: davicom: fix devicetree irq resource" causes an interrupt storm after the ethernet interface is activated on S3C24XX platform (ARM non-dt), due to the interrupt trigger type not being set properly. It seems, after adding parsing of IRQ flags in commit 7085a740 "drivers: platform: parse IRQ flags from resources", there is no path for non-dt platforms where irq_set_type callback could be invoked when we don't pass the trigger type flags to the request_irq() call. In case of a board where the regression is seen the interrupt trigger type flags are passed through a platform device's resource and it is not currently handled properly without passing the irq trigger type flags to the request_irq() call. In case of OF an of_irq_get() call within platform_get_irq() function seems to be ensuring required irq_chip setup, but there is no equivalent code for non OF/ACPI platforms. This patch mostly restores irq trigger type setting code which has been removed in commit ("net: ethernet: davicom: fix devicetree irq resource"). Fixes: b5a099c6 ("net: ethernet: davicom: fix devicetree irq resource") Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com> Acked-by: Robert Jarzmik <robert.jarzmik@free.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Zhu Yanjun authored
The message "803.ad" should be "802.3ad". Signed-off-by: Zhu Yanjun <zyjzyj2000@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Grygorii Strashko authored
Kmemleak reports the following false-positive memory leaks for each sk buffer allocated by CPSW (__netdev_alloc_skb_ip_align()) in cpsw_ndo_open() and cpsw_rx_handler():

unreferenced object 0xea915000 (size 2048):
  comm "systemd-network", pid 713, jiffies 4294938323 (age 102.180s)
  hex dump (first 32 bytes):
    00 58 91 ea ff ff ff ff ff ff ff ff ff ff ff ff  .X..............
    ff ff ff ff ff ff fd 0f 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<c0108680>] __kmalloc_track_caller+0x1a4/0x230
    [<c0529eb4>] __alloc_skb+0x68/0x16c
    [<c052c884>] __netdev_alloc_skb+0x40/0x104
    [<bf1ad29c>] cpsw_ndo_open+0x374/0x670 [ti_cpsw]
    [<c053c3d4>] __dev_open+0xb0/0x114
    [<c053c690>] __dev_change_flags+0x9c/0x14c
    [<c053c760>] dev_change_flags+0x20/0x50
    [<c054bdcc>] do_setlink+0x2cc/0x78c
    [<c054c358>] rtnl_setlink+0xcc/0x100
    [<c054b34c>] rtnetlink_rcv_msg+0x184/0x224
    [<c056467c>] netlink_rcv_skb+0xa8/0xc4
    [<c054b1c0>] rtnetlink_rcv+0x2c/0x34
    [<c0564018>] netlink_unicast+0x16c/0x1f8
    [<c0564498>] netlink_sendmsg+0x334/0x348
    [<c052015c>] sock_sendmsg+0x1c/0x2c
    [<c05213e0>] SyS_sendto+0xc0/0xe8

unreferenced object 0xec861780 (size 192):
  comm "softirq", pid 0, jiffies 4294938759 (age 109.540s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 b0 5a ed 00 00 00 00 00 00 00 00  ......Z.........
  backtrace:
    [<c0107830>] kmem_cache_alloc+0x190/0x208
    [<c052c768>] __build_skb+0x30/0x98
    [<c052c8fc>] __netdev_alloc_skb+0xb8/0x104
    [<bf1abc54>] cpsw_rx_handler+0x68/0x1e4 [ti_cpsw]
    [<bf11aa30>] __cpdma_chan_free+0xa8/0xc4 [davinci_cpdma]
    [<bf11ab98>] __cpdma_chan_process+0x14c/0x16c [davinci_cpdma]
    [<bf11abfc>] cpdma_chan_process+0x44/0x5c [davinci_cpdma]
    [<bf1adc78>] cpsw_rx_poll+0x1c/0x9c [ti_cpsw]
    [<c0539180>] net_rx_action+0x1f0/0x2ec
    [<c003881c>] __do_softirq+0x134/0x258
    [<c0038a00>] do_softirq+0x68/0x70
    [<c0038adc>] __local_bh_enable_ip+0xd4/0xe8
    [<c0640994>] _raw_spin_unlock_bh+0x30/0x34
    [<c05f4e9c>] igmp6_group_added+0x4c/0x1bc
    [<c05f6600>] ipv6_dev_mc_inc+0x398/0x434
    [<c05dba74>] addrconf_dad_work+0x224/0x39c

This happens because CPSW allocates sk buffers and then passes pointers to them to CPDMA, where they are stored in internal CPPI RAM (SRAM), which belongs to device MMIO space. Kmemleak does not scan IO memory and so reports memory leaks. Hence, explicitly mark the allocated sk buffers as false positives.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Lance Richardson authored
When executing the script included below, the netns delete operation hangs with the following message (repeated at 10 second intervals):

  kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1

This occurs because a reference to the lo interface in the "secure" netns is still held by a dst entry in the xfrm bundle cache in the init netns.

Address this problem by garbage collecting the tunnel netns flow cache when a cross-namespace vti interface receives a NETDEV_DOWN notification.

A more detailed description of the problem scenario (referencing commands in the script below):

(1) ip link add vti_test type vti local 1.1.1.1 remote 1.1.1.2 key 1

    The vti_test interface is created in the init namespace. vti_tunnel_init() attaches a struct ip_tunnel to the vti interface's netdev_priv(dev), setting the tunnel net to &init_net.

(2) ip link set vti_test netns secure

    The vti_test interface is moved to the "secure" netns. Note that the associated struct ip_tunnel still has tunnel->net set to &init_net.

(3) ip netns exec secure ping -c 4 -i 0.02 -I 192.168.100.1 192.168.200.1

    The first packet sent using the vti device causes xfrm_lookup() to be called as follows:

        dst = xfrm_lookup(tunnel->net, skb_dst(skb), fl, NULL, 0);

    Note that tunnel->net is the init namespace, while skb_dst(skb) references the vti_test interface in the "secure" namespace. The returned dst references an interface in the init namespace.

    Also note that the first parameter to xfrm_lookup() determines which flow cache is used to store the computed xfrm bundle, so after xfrm_lookup() returns there will be a cached bundle in the init namespace flow cache with a dst referencing a device in the "secure" namespace.

(4) ip netns del secure

    Kernel begins to delete the "secure" namespace. At some point the vti_test interface is deleted, at which point dst_ifdown() changes the dst->dev in the cached xfrm bundle flow from vti_test to lo (still in the "secure" namespace however). Since nothing has happened to cause the init namespace's flow cache to be garbage collected, this dst remains attached to the flow cache, so the kernel loops waiting for the last reference to lo to go away.

<Begin script>
ip link add br1 type bridge
ip link set dev br1 up
ip addr add dev br1 1.1.1.1/8

ip netns add secure
ip link add vti_test type vti local 1.1.1.1 remote 1.1.1.2 key 1
ip link set vti_test netns secure
ip netns exec secure ip link set vti_test up
ip netns exec secure ip link s lo up
ip netns exec secure ip addr add dev lo 192.168.100.1/24
ip netns exec secure ip route add 192.168.200.0/24 dev vti_test
ip xfrm policy flush
ip xfrm state flush
ip xfrm policy add dir out tmpl src 1.1.1.1 dst 1.1.1.2 \
   proto esp mode tunnel mark 1
ip xfrm policy add dir in tmpl src 1.1.1.2 dst 1.1.1.1 \
   proto esp mode tunnel mark 1
ip xfrm state add src 1.1.1.1 dst 1.1.1.2 proto esp spi 1 \
   mode tunnel enc des3_ede 0x112233445566778811223344556677881122334455667788
ip xfrm state add src 1.1.1.2 dst 1.1.1.1 proto esp spi 1 \
   mode tunnel enc des3_ede 0x112233445566778811223344556677881122334455667788

ip netns exec secure ping -c 4 -i 0.02 -I 192.168.100.1 192.168.200.1

ip netns del secure
<End script>

Reported-by: Hangbin Liu <haliu@redhat.com>
Reported-by: Jan Tluka <jtluka@redhat.com>
Signed-off-by: Lance Richardson <lrichard@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
David S. Miller authored
David Howells says:

====================
rxrpc: Miscellaneous fixes

Here are a bunch of miscellaneous fixes to AF_RXRPC:

 (*) Fix an uninitialised pointer.

 (*) Fix error handling when we fail to connect a call.

 (*) Fix a NULL pointer dereference.

 (*) Fix two occasions where a packet is accessed again after being queued for someone else to deal with.

 (*) Fix a missing skb free.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Howells authored
Under certain conditions, the data_ready handler will discard a packet. These need to be freed. Signed-off-by: David Howells <dhowells@redhat.com>
-
David Howells authored
Fix a use of a packet after it has been enqueued onto the packet processing queue in the data_ready handler. Once on a call's Rx queue, we mustn't touch it any more as it may be dequeued and freed by the call processor running on a work queue.

Save the values we need before enqueuing.

Without this, we can get an oops like the following:

BUG: unable to handle kernel NULL pointer dereference at 000000000000009c
IP: [<ffffffffa01854e8>] rxrpc_fast_process_packet+0x724/0xa11 [af_rxrpc]
PGD 0
Oops: 0000 [#1] SMP
Modules linked in: kafs(E) af_rxrpc(E) [last unloaded: af_rxrpc]
CPU: 2 PID: 0 Comm: swapper/2 Tainted: G E 4.7.0-fsdevel+ #1336
Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014
task: ffff88040d6863c0 task.stack: ffff88040d68c000
RIP: 0010:[<ffffffffa01854e8>] [<ffffffffa01854e8>] rxrpc_fast_process_packet+0x724/0xa11 [af_rxrpc]
RSP: 0018:ffff88041fb03a78 EFLAGS: 00010246
RAX: ffffffffffffffff RBX: ffff8803ff195b00 RCX: 0000000000000001
RDX: ffffffffa01854d1 RSI: 0000000000000008 RDI: ffff8803ff195b00
RBP: ffff88041fb03ab0 R08: 0000000000000000 R09: 0000000000000001
R10: ffff88041fb038c8 R11: 0000000000000000 R12: ffff880406874800
R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff88041fb00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000009c CR3: 0000000001c14000 CR4: 00000000001406e0
Stack:
 ffff8803ff195ea0 ffff880408348800 ffff880406874800 ffff8803ff195b00
 ffff880408348800 ffff8803ff195ed8 0000000000000000 ffff88041fb03af0
 ffffffffa0186072 0000000000000000 ffff8804054da000 0000000000000000
Call Trace:
 <IRQ>
 [<ffffffffa0186072>] rxrpc_data_ready+0x89d/0xbae [af_rxrpc]
 [<ffffffff814c94d7>] __sock_queue_rcv_skb+0x24c/0x2b2
 [<ffffffff8155c59a>] __udp_queue_rcv_skb+0x4b/0x1bd
 [<ffffffff8155e048>] udp_queue_rcv_skb+0x281/0x4db
 [<ffffffff8155ea8f>] __udp4_lib_rcv+0x7ed/0x963
 [<ffffffff8155ef9a>] udp_rcv+0x15/0x17
 [<ffffffff81531d86>] ip_local_deliver_finish+0x1c3/0x318
 [<ffffffff81532544>] ip_local_deliver+0xbb/0xc4
 [<ffffffff81531bc3>] ? inet_del_offload+0x40/0x40
 [<ffffffff815322a9>] ip_rcv_finish+0x3ce/0x42c
 [<ffffffff81532851>] ip_rcv+0x304/0x33d
 [<ffffffff81531edb>] ? ip_local_deliver_finish+0x318/0x318
 [<ffffffff814dff9d>] __netif_receive_skb_core+0x601/0x6e8
 [<ffffffff814e072e>] __netif_receive_skb+0x13/0x54
 [<ffffffff814e082a>] netif_receive_skb_internal+0xbb/0x17c
 [<ffffffff814e1838>] napi_gro_receive+0xf9/0x1bd
 [<ffffffff8144eb9f>] rtl8169_poll+0x32b/0x4a8
 [<ffffffff814e1c7b>] net_rx_action+0xe8/0x357
 [<ffffffff81051074>] __do_softirq+0x1aa/0x414
 [<ffffffff810514ab>] irq_exit+0x3d/0xb0
 [<ffffffff810184a2>] do_IRQ+0xe4/0xfc
 [<ffffffff81612053>] common_interrupt+0x93/0x93
 <EOI>
 [<ffffffff814af837>] ? cpuidle_enter_state+0x1ad/0x2be
 [<ffffffff814af832>] ? cpuidle_enter_state+0x1a8/0x2be
 [<ffffffff814af96a>] cpuidle_enter+0x12/0x14
 [<ffffffff8108956f>] call_cpuidle+0x39/0x3b
 [<ffffffff81089855>] cpu_startup_entry+0x230/0x35d
 [<ffffffff810312ea>] start_secondary+0xf4/0xf7

Signed-off-by: David Howells <dhowells@redhat.com>
-
David Howells authored
Once a packet has been posted to a connection in the data_ready handler, we mustn't try reposting if we then find that the connection is dying as the refcount has been given over to the dying connection and the packet might no longer exist. Losing the packet isn't a problem as the peer will retransmit. Signed-off-by: David Howells <dhowells@redhat.com>
-
David Howells authored
The call state machine processor sets up the message parameters for a UDP message that it might need to transmit in advance on the basis that there's a very good chance it's going to have to transmit either an ACK or an ABORT. This requires it to look in the connection struct to retrieve some of the parameters. However, if the call is complete, the call connection pointer may be NULL to dissuade the processor from transmitting a message. However, there are some situations where the processor is still going to be called - and it's still going to set up message parameters whether it needs them or not. This results in a NULL pointer dereference at: net/rxrpc/call_event.c:837 To fix this, skip the message pre-initialisation if there's no connection attached. Signed-off-by: David Howells <dhowells@redhat.com>
-
David Howells authored
If rxrpc_new_client_call() fails to make a connection, the call record that it allocated needs to be marked as RXRPC_CALL_RELEASED before it is passed to rxrpc_put_call() to indicate that it no longer has any attachment to the AF_RXRPC socket. Without this, an assertion failure may occur at: net/rxrpc/call_object:635 Signed-off-by: David Howells <dhowells@redhat.com>
-
Arnd Bergmann authored
A newly added bugfix caused an uninitialized variable to be used for printing debug output. This is harmless as long as the debug setting is disabled, but otherwise leads to an immediate crash. gcc warns about this when -Wmaybe-uninitialized is enabled: net/rxrpc/call_object.c: In function 'rxrpc_release_call': net/rxrpc/call_object.c:496:163: error: 'sp' may be used uninitialized in this function [-Werror=maybe-uninitialized] The initialization was removed but one of the users remains. This adds back the initialization. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: 372ee163 ("rxrpc: Fix races between skb free, ACK generation and replying") Signed-off-by: David Howells <dhowells@redhat.com>
-
Liping Zhang authored
Currently, a user can add a conntrack entry whose tuples have different l4proto values via nfnetlink. For example, the original tuple is TCP while the reply tuple is SCTP. This is an invalid combination, and we should report EINVAL to userspace.

Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
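The check described here amounts to comparing the transport protocol of the two directions. A minimal Python sketch of that rule, assuming a simplified tuple that carries only the L4 protocol number; `Tuple` and `check_tuples` are hypothetical names, not nfnetlink or conntrack symbols:

```python
import errno
from dataclasses import dataclass

IPPROTO_TCP = 6     # IANA protocol number for TCP
IPPROTO_SCTP = 132  # IANA protocol number for SCTP


@dataclass(frozen=True)
class Tuple:
    """Hypothetical stand-in for a conntrack tuple; only the L4 protocol is modeled."""
    protonum: int


def check_tuples(orig: Tuple, reply: Tuple) -> int:
    """Return 0 if both directions use the same L4 protocol, else -EINVAL."""
    if orig.protonum != reply.protonum:
        return -errno.EINVAL
    return 0
```

A TCP original tuple paired with an SCTP reply tuple is rejected in this model, while a TCP/TCP pair is accepted.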
-
Liping Zhang authored
As NFQNL_MSG_VERDICT_BATCH does, we should also reject the verdict request when its portid differs from the initial portid (it may come from another process).

Fixes: 97d32cf9 ("netfilter: nfnetlink_queue: batch verdict support")
Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
-
Liping Zhang authored
A user can use NFQA_EXP to attach expectations to conntracks, but we forget to put back the nf_conntrack_expect when it is inserted successfully, i.e. in this normal case, the expectation's use refcnt will be 3. So even if we unlink it and put it back later, the use refcnt is still 1, and the memory is leaked forever.

Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
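The reference counting in this message can be traced with a toy model. The Python sketch below is not the netfilter code: `Expectation`, `insert`, `unlink_and_put` and `attach` are hypothetical names, and the assumption that a successful insert takes two extra references is chosen only so that the counts match the numbers quoted above (3 after insertion, 1 left over after teardown).

```python
class Expectation:
    """Toy stand-in for nf_conntrack_expect; only the use count is modeled."""

    def __init__(self) -> None:
        self.use = 1  # reference held by the code that allocated the object

    def put(self) -> None:
        self.use -= 1

    @property
    def freed(self) -> bool:
        return self.use == 0


def insert(exp: Expectation) -> None:
    # Assumption: a successful insert takes two extra references,
    # which brings the count to 3 as the commit message describes.
    exp.use += 2


def unlink_and_put(exp: Expectation) -> None:
    # Later teardown drops the two references taken by insert().
    exp.put()
    exp.put()


def attach(put_after_insert: bool) -> Expectation:
    exp = Expectation()
    insert(exp)
    if put_after_insert:
        exp.put()  # the fix: drop the allocation reference once inserted
    return exp
```

Without the extra put, teardown leaves the count at 1 and the object is never freed; with it, the count reaches 0.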
-
Liping Zhang authored
The 'name' field in struct nf_conntrack_expect_policy{} is not a pointer, so checking whether it is NULL always evaluates to true. Even if the name is empty, the slash is always displayed, as follows:

  # cat /proc/net/nf_conntrack_expect
  297 l3proto = 2 proto=6 src=1.1.1.1 dst=2.2.2.2 sport=1 dport=1025 ftp/
                                                                        ^

Fixes: 3a8fc53a ("netfilter: nf_ct_helper: allocate 16 bytes for the helper and policy names")
Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
-
David S. Miller authored
Sudarsana Reddy Kalluru says: ==================== qed: dcbx fix series. The patch series contains the minor bug fixes for qed dcbx module. Please consider applying this to 'net' branch. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sudarsana Reddy Kalluru authored
App count is not updated while adding new app entry to the dcbx app table. Signed-off-by: Sudarsana Reddy Kalluru <sudarsana.kalluru@qlogic.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sudarsana Reddy Kalluru authored
MFW now supports the Selection field for IEEE mode. Add driver changes to use the newer MFW masks to read/write the port-id value. Signed-off-by: Sudarsana Reddy Kalluru <sudarsana.kalluru@qlogic.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sudarsana Reddy Kalluru authored
Ethtype value is being read incorrectly in ieee-dcbx mode. Use the correct mfw mask value. Signed-off-by: Sudarsana Reddy Kalluru <sudarsana.kalluru@qlogic.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sudarsana Reddy Kalluru authored
Endian-ness conversion is not needed for priority-to-TC field as the field is already being read/written by the driver in big-endian way. Signed-off-by: Sudarsana Reddy Kalluru <sudarsana.kalluru@qlogic.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 08 Aug, 2016 12 commits
-
-
Xin Long authored
Commit 52253db9 ("sctp: also point GSO head_skb to the sk when it's available") used event->chunk->head_skb to get the head_skb in sctp_ulpevent_set_owner(). But at that moment, the event->chunk was NULL, as it cloned the skb in sctp_ulpevent_make_rcvmsg(). Therefore, that patch didn't really work. This patch is to move the event->chunk initialization before calling sctp_ulpevent_receive_data() so that it uses event->chunk when it's valid. Fixes: 52253db9 ("sctp: also point GSO head_skb to the sk when it's available") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
pravin shelar authored
The vxlan driver has a bypass for local vxlan traffic, but it depends on the driver knowing all VNIs on the local system. This information is not available in the LWT case. Therefore, this patch disables encap bypass for LWT vxlan traffic.

Fixes: ee122c79 ("vxlan: Flow based tunneling")
Reported-by: Jakub Libosvar <jlibosva@redhat.com>
Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
pravin shelar authored
An LWT user can specify a destination as well as a source IP address for a given tunnel endpoint, but vxlan ignores the given source IP address. This patch uses both IP addresses to route the tunnel packet. This is consistent with other LWT implementations, like GENEVE and GRE.

Fixes: ee122c79 ("vxlan: Flow based tunneling")
Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Daniel Borkmann says: ==================== Few BPF helper related checksum fixes The set contains three fixes with regards to CHECKSUM_COMPLETE and BPF helper functions. For details please see individual patches. Thanks! v1 -> v2: - Fixed make htmldocs issue reported by kbuild bot. - Rest as is. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Daniel Borkmann authored
When having skbs on ingress with CHECKSUM_COMPLETE, tc BPF programs don't push rcsum of mac header back in and after BPF run back pull out again as opposed to some other subsystems (ovs, for example). For cases like q-in-q, meaning when a vlan tag for offloading is already present and we're about to push another one, then skb_vlan_push() pushes the inner one into the skb, increasing mac header and skb_postpush_rcsum()'ing the 4 bytes vlan header diff. Likewise, for the reverse operation in skb_vlan_pop() for the case where vlan header needs to be pulled out of the skb, we're decreasing the mac header and skb_postpull_rcsum()'ing the 4 bytes rcsum of the vlan header that was removed. However mangling the rcsum here will lead to hw csum failure for BPF case, since we're pulling or pushing data that was not part of the current rcsum. Changing tc BPF programs in general to push/pull rcsum around BPF_PROG_RUN() is also not really an option since current behaviour is ABI by now, but apart from that would also mean to do quite a bit of useless work in the sense that usually 12 bytes need to be rcsum pushed/pulled also when we don't need to touch this vlan related corner case. One way to fix it would be to push the necessary rcsum fixup down into vlan helpers that are (mostly) slow-path anyway. Fixes: 4e10df9a ("bpf: introduce bpf_skb_vlan_push/pop() helpers") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Daniel Borkmann authored
bpf_skb_store_bytes() invocations above L2 header need BPF_F_RECOMPUTE_CSUM flag for updates, so that CHECKSUM_COMPLETE will be fixed up along the way. Where we ran into an issue with bpf_skb_store_bytes() is when we did a single-byte update on the IPv6 hoplimit despite using BPF_F_RECOMPUTE_CSUM flag; simple ping via ICMPv6 triggered a hw csum failure as a result. The underlying issue has been tracked down to a buffer alignment issue. Meaning, that csum_partial() computations via skb_postpull_rcsum() and skb_postpush_rcsum() pair invoked had a wrong result since they operated on an odd address for the hoplimit, while other computations were done on an even address. This mix doesn't work as-is with skb_postpull_rcsum(), skb_postpush_rcsum() pair as it always expects at least half-word alignment of input buffers, which is normally the case. Thus, instead of these helpers using csum_sub() and (implicitly) csum_add(), we need to use csum_block_sub(), csum_block_add(), respectively. For unaligned offsets, they rotate the sum to align it to a half-word boundary again, otherwise they work the same as csum_sub() and csum_add(). Adding __skb_postpull_rcsum(), __skb_postpush_rcsum() variants that take the offset as an input and adapting bpf_skb_store_bytes() to them fixes the hw csum failures again. The skb_postpull_rcsum(), skb_postpush_rcsum() helpers use a 0 constant for offset so that the compiler optimizes the offset & 1 test away and generates the same code as with csum_sub()/_add(). Fixes: 608cd71a ("tc: bpf: generalize pedit action") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
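The alignment problem described above can be reproduced in a small userspace model. The Python sketch below is not the kernel implementation: the kernel helpers operate on 32-bit __wsum values, whereas this model works on folded 16-bit sums and uses a byte swap in place of the rotation, an assumption made for brevity. `update_byte` and `same_csum` are hypothetical helpers. The IPv6 hop limit sits at byte offset 7 of the header, an odd offset, which is why a single-byte update there exposes the issue.

```python
def fold(total: int) -> int:
    """Fold carries back in until the sum fits in 16 bits (one's-complement add)."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def csum_partial(data: bytes) -> int:
    """16-bit one's-complement sum of big-endian words; an odd tail is zero-padded."""
    if len(data) % 2:
        data = bytes(data) + b"\x00"
    return fold(sum((data[i] << 8) | data[i + 1] for i in range(0, len(data), 2)))


def csum_add(a: int, b: int) -> int:
    return fold(a + b)


def csum_sub(a: int, b: int) -> int:
    return csum_add(a, ~b & 0xFFFF)


def swap16(x: int) -> int:
    return ((x & 0xFF) << 8) | (x >> 8)


def csum_block_add(csum: int, csum2: int, offset: int) -> int:
    """Add csum2, rotating it by one byte first when the block starts at an odd offset."""
    if offset & 1:
        csum2 = swap16(csum2)
    return csum_add(csum, csum2)


def csum_block_sub(csum: int, csum2: int, offset: int) -> int:
    return csum_block_add(csum, ~csum2 & 0xFFFF, offset)


def update_byte(csum: int, old: int, new: int, offset: int, offset_aware: bool) -> int:
    """Incrementally patch a checksum after one byte at `offset` changes."""
    old_sum = csum_partial(bytes([old]))
    new_sum = csum_partial(bytes([new]))
    if offset_aware:
        return csum_block_add(csum_block_sub(csum, old_sum, offset), new_sum, offset)
    return csum_add(csum_sub(csum, old_sum), new_sum)


def same_csum(a: int, b: int) -> bool:
    """Compare one's-complement sums, treating 0x0000 and 0xFFFF as the same value."""
    return a % 0xFFFF == b % 0xFFFF
```

For a byte at an even offset both update paths agree with a full recomputation; at an odd offset only the offset-aware path does, which mirrors the reason the commit switches to csum_block_sub() and csum_block_add().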
-
Daniel Borkmann authored
Follow-up to commit f8ffad69 ("bpf: add skb_postpush_rcsum and fix dev_forward_skb occasions") to fix an issue for dev_queue_xmit() redirect locations which need CHECKSUM_COMPLETE fixups on ingress. For the same reasons as described in f8ffad69 already, we of course also need this here, since dev_queue_xmit() on a veth device will let us end up in the dev_forward_skb() helper again to cross namespaces. Latter then calls into skb_postpull_rcsum() to pull out L2 header, so that netif_rx_internal() sees CHECKSUM_COMPLETE as it is expected. That is, CHECKSUM_COMPLETE on ingress covering L2 _payload_, not L2 headers. Also here we have to address bpf_redirect() and bpf_clone_redirect(). Fixes: 3896d655 ("bpf: introduce bpf_clone_redirect() helper") Fixes: 27b29f63 ("bpf: add bpf_redirect() helper") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Paul Gortmaker authored
The call site for this function appears as:

  #ifdef DEBUG
          data->msg_enable = DEBUG;
          dump_eth_one(dev);
  #endif

...leading to the following warning for !DEBUG builds:

  drivers/net/ethernet/tundra/tsi108_eth.c:169:13: warning: 'dump_eth_one' defined but not used [-Wunused-function]
   static void dump_eth_one(struct net_device *dev)
               ^

...when using the arch/powerpc/configs/mpc7448_hpc2_defconfig

Put the function definition under the same #ifdef as the call site to avoid the warning.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Ido Schimmel says: ==================== mlxsw: DCB fixes Patches 1 and 2 fix a problem in which PAUSE frames settings are wrongly overridden when ieee_setpfc() gets called. Patch 3 adds a missing rollback in port's creation error path. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
We correctly execute mlxsw_sp_port_dcb_fini() when port is removed, but I missed its rollback in the error path of port creation, so add it. Fixes: f00817df ("mlxsw: spectrum: Introduce support for Data Center Bridging (DCB)") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
The PFCC register is used to configure both PAUSE and PFC frames. Therefore, when PFC frames are disabled we must make sure we don't mistakenly also disable PAUSE frames (which might be enabled). Fix this by packing the PFCC register with the current PAUSE settings. Note that this register is also accessed via ethtool ops, but there we are guaranteed to have PFC disabled. Fixes: d81a6bdb ("mlxsw: spectrum: Add IEEE 802.1Qbb PFC support") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
When ieee_setpfc() gets called, PAUSE frames are not necessarily disabled on the port. Check if PAUSE frames are disabled or enabled and configure the port's headroom buffer accordingly. Fixes: d81a6bdb ("mlxsw: spectrum: Add IEEE 802.1Qbb PFC support") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-