- 18 Dec, 2015 24 commits
-
-
Mike Snitzer authored
commit 172c2386 upstream. A thin-pool that is in out-of-data-space (OODS) mode may transition back to write mode -- without the admin adding more space to the thin-pool -- if/when blocks are released (either by deleting thin devices or discarding provisioned blocks). But as part of the thin-pool's earlier transition to out-of-data-space mode the thin-pool may have set the 'error_if_no_space' flag to true if the no_space_timeout expires without more space having been made available. That implementation detail, of changing the pool's error_if_no_space setting, needs to be reset back to the default that the user specified when the thin-pool's table was loaded. Otherwise we'll drop the user requested behaviour on the floor when this out-of-data-space to write mode transition occurs. Reported-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Acked-by: Joe Thornber <ejt@redhat.com> Fixes: 2c43fd26 ("dm thin: fix missing out-of-data-space to write mode transition if blocks are released") [ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
-
Clemens Ladisch authored
commit a91e627e upstream. One of the many faults of the QinHeng CH345 USB MIDI interface chip is that it does not handle received SysEx messages correctly -- every second event packet has a wrong code index number, which is the one from the last seen message, instead of 4. For example, the two messages "FE F0 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E F7" result in the following event packets: correct: CH345: 0F FE 00 00 0F FE 00 00 04 F0 01 02 04 F0 01 02 04 03 04 05 0F 03 04 05 04 06 07 08 04 06 07 08 04 09 0A 0B 0F 09 0A 0B 04 0C 0D 0E 04 0C 0D 0E 05 F7 00 00 05 F7 00 00 A class-compliant driver must interpret an event packet with CIN 15 as having a single data byte, so the other two bytes would be ignored. The message received by the host would then be missing two bytes out of six; in this example, "F0 01 02 03 06 07 08 09 0C 0D 0E F7". These corrupted SysEx event packages contain only data bytes, while the CH345 uses event packets with a correct CIN value only for messages with a status byte, so it is possible to distinguish between these two cases by checking for the presence of this status byte. (Other bugs in the CH345's input handling, such as the corruption resulting from running status, cannot be worked around.) Signed-off-by: Clemens Ladisch <clemens@ladisch.de> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
-
Clemens Ladisch authored
commit 1ca8b201 upstream. The CH345 USB MIDI chip has two output ports. However, they are multiplexed through one pin, and the number of ports cannot be reduced even for hardware that implements only one connector, so for those devices, data sent to either port ends up on the same hardware output. This becomes a problem when both ports are used at the same time, as longer MIDI commands (such as SysEx messages) are likely to be interrupted by messages from the other port, and thus to get lost. It would not be possible for the driver to detect how many ports the device actually has, except that in practice, _all_ devices built with the CH345 have only one port. So we can just ignore the device's descriptors, and hardcode one output port. Signed-off-by: Clemens Ladisch <clemens@ladisch.de> Signed-off-by: Takashi Iwai <tiwai@suse.de> [ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
-
Clemens Ladisch authored
commit 98d362be upstream. Signed-off-by: Clemens Ladisch <clemens@ladisch.de> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
-
Eric Sandeen authored
commit 9de67c3b upstream. Today, if we perform an xfs_growfs which adds allocation groups, mp->m_maxagi is not properly updated when the growfs is complete. Therefore inodes will continue to be allocated only in the AGs which existed prior to the growfs, and the new space won't be utilized. This is because of this path in xfs_growfs_data_private(): xfs_growfs_data_private xfs_initialize_perag(mp, nagcount, &nagimax); if (mp->m_flags & XFS_MOUNT_32BITINODES) index = xfs_set_inode32(mp); else index = xfs_set_inode64(mp); if (maxagi) *maxagi = index; where xfs_set_inode* iterates over the (old) agcount in mp->m_sb.sb_agblocks, which has not yet been updated in the growfs path. So "index" will be returned based on the old agcount, not the new one, and new AGs are not available for inode allocation. Fix this by explicitly passing the proper AG count (which xfs_initialize_perag() already has) down another level, so that xfs_set_inode* can make the proper decision about acceptable AGs for inode allocation in the potentially newly-added AGs. This has been broken since 3.7, when these two xfs_set_inode* functions were added in commit 2d2194f6. Prior to that, we looped over "agcount" not sb_agblocks in these calculations. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com> Cc: Ben Hutchings <ben@decadent.org.uk> Acked-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
-
Eric Dumazet authored
commit 602dd62d upstream. Dmitry Vyukov reported a memory leak using IPV6 SCTP sockets. We need to call inet6_destroy_sock() to properly release inet6 specific fields. Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
-
Konstantin Khlebnikov authored
commit 6adc5fd6 upstream. Proxy entries could have null pointer to net-device. Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com> Fixes: 84920c14 ("net: Allow ipv6 proxies and arp proxies be shown with iproute2") Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
-
Eric Dumazet authored
commit 45f6fad8 upstream. This patch addresses multiple problems : UDP/RAW sendmsg() need to get a stable struct ipv6_txoptions while socket is not locked : Other threads can change np->opt concurrently. Dmitry posted a syzkaller (http://github.com/google/syzkaller) program desmonstrating use-after-free. Starting with TCP/DCCP lockless listeners, tcp_v6_syn_recv_sock() and dccp_v6_request_recv_sock() also need to use RCU protection to dereference np->opt once (before calling ipv6_dup_options()) This patch adds full RCU protection to np->opt Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> [ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
-
Michal Kubeček authored
commit 264640fc upstream. If a fragmented multicast packet is received on an ethernet device which has an active macvlan on top of it, each fragment is duplicated and received both on the underlying device and the macvlan. If some fragments for macvlan are processed before the whole packet for the underlying device is reassembled, the "overlapping fragments" test in ip6_frag_queue() discards the whole fragment queue. To resolve this, add device ifindex to the search key and require it to match reassembling multicast packets and packets to link-local addresses. Note: similar patch has been already submitted by Yoshifuji Hideaki in http://patchwork.ozlabs.org/patch/220979/ but got lost and forgotten for some reason. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net> [ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
-
Aaro Koskinen authored
commit 3c25a860 upstream. Commit fcb26ec5 ("broadcom: move all PHY_ID's to header") updated broadcom_tbl to use PHY_IDs, but incorrectly replaced 0x0143bca0 with PHY_ID_BCM5482 (making a duplicate entry, and completely omitting the original). Fix that. Fixes: fcb26ec5 ("broadcom: move all PHY_ID's to header") Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
-
Nikolay Aleksandrov authored
commit 4c698046 upstream. Similar to ipv4, when destroying an mrt table the static mfc entries and the static devices are kept, which leads to devices that can never be destroyed (because of refcnt taken) and leaked memory. Make sure that everything is cleaned up on netns destruction. Fixes: 8229efda ("netns: ip6mr: enable namespace support in ipv6 multicast forwarding code") CC: Benjamin Thery <benjamin.thery@bull.net> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net> [ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
-
Nikolay Aleksandrov authored
commit 0e615e96 upstream. When destroying an mrt table the static mfc entries and the static devices are kept, which leads to devices that can never be destroyed (because of refcnt taken) and leaked memory, for example: unreferenced object 0xffff880034c144c0 (size 192): comm "mfc-broken", pid 4777, jiffies 4320349055 (age 46001.964s) hex dump (first 32 bytes): 98 53 f0 34 00 88 ff ff 98 53 f0 34 00 88 ff ff .S.4.....S.4.... ef 0a 0a 14 01 02 03 04 00 00 00 00 01 00 00 00 ................ backtrace: [<ffffffff815c1b9e>] kmemleak_alloc+0x4e/0xb0 [<ffffffff811ea6e0>] kmem_cache_alloc+0x190/0x300 [<ffffffff815931cb>] ip_mroute_setsockopt+0x5cb/0x910 [<ffffffff8153d575>] do_ip_setsockopt.isra.11+0x105/0xff0 [<ffffffff8153e490>] ip_setsockopt+0x30/0xa0 [<ffffffff81564e13>] raw_setsockopt+0x33/0x90 [<ffffffff814d1e14>] sock_common_setsockopt+0x14/0x20 [<ffffffff814d0b51>] SyS_setsockopt+0x71/0xc0 [<ffffffff815cdbf6>] entry_SYSCALL_64_fastpath+0x16/0x7a [<ffffffffffffffff>] 0xffffffffffffffff Make sure that everything is cleaned on netns destruction. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
-
Daniel Borkmann authored
commit 6900317f upstream. David and HacKurx reported a following/similar size overflow triggered in a grsecurity kernel, thanks to PaX's gcc size overflow plugin: (Already fixed in later grsecurity versions by Brad and PaX Team.) [ 1002.296137] PAX: size overflow detected in function scm_detach_fds net/core/scm.c:314 cicus.202_127 min, count: 4, decl: msg_controllen; num: 0; context: msghdr; [ 1002.296145] CPU: 0 PID: 3685 Comm: scm_rights_recv Not tainted 4.2.3-grsec+ #7 [ 1002.296149] Hardware name: Apple Inc. MacBookAir5,1/Mac-66F35F19FE2A0D05, [...] [ 1002.296153] ffffffff81c27366 0000000000000000 ffffffff81c27375 ffffc90007843aa8 [ 1002.296162] ffffffff818129ba 0000000000000000 ffffffff81c27366 ffffc90007843ad8 [ 1002.296169] ffffffff8121f838 fffffffffffffffc fffffffffffffffc ffffc90007843e60 [ 1002.296176] Call Trace: [ 1002.296190] [<ffffffff818129ba>] dump_stack+0x45/0x57 [ 1002.296200] [<ffffffff8121f838>] report_size_overflow+0x38/0x60 [ 1002.296209] [<ffffffff816a979e>] scm_detach_fds+0x2ce/0x300 [ 1002.296220] [<ffffffff81791899>] unix_stream_read_generic+0x609/0x930 [ 1002.296228] [<ffffffff81791c9f>] unix_stream_recvmsg+0x4f/0x60 [ 1002.296236] [<ffffffff8178dc00>] ? unix_set_peek_off+0x50/0x50 [ 1002.296243] [<ffffffff8168fac7>] sock_recvmsg+0x47/0x60 [ 1002.296248] [<ffffffff81691522>] ___sys_recvmsg+0xe2/0x1e0 [ 1002.296257] [<ffffffff81693496>] __sys_recvmsg+0x46/0x80 [ 1002.296263] [<ffffffff816934fc>] SyS_recvmsg+0x2c/0x40 [ 1002.296271] [<ffffffff8181a3ab>] entry_SYSCALL_64_fastpath+0x12/0x85 Further investigation showed that this can happen when an *odd* number of fds are being passed over AF_UNIX sockets. In these cases CMSG_LEN(i * sizeof(int)) and CMSG_SPACE(i * sizeof(int)), where i is the number of successfully passed fds, differ by 4 bytes due to the extra CMSG_ALIGN() padding in CMSG_SPACE() to an 8 byte boundary on 64 bit. The padding is used to align subsequent cmsg headers in the control buffer. When the control buffer passed in from the receiver side *lacks* these 4 bytes (e.g. due to buggy/wrong API usage), then msg->msg_controllen will overflow in scm_detach_fds(): int cmlen = CMSG_LEN(i * sizeof(int)); <--- cmlen w/o tail-padding err = put_user(SOL_SOCKET, &cm->cmsg_level); if (!err) err = put_user(SCM_RIGHTS, &cm->cmsg_type); if (!err) err = put_user(cmlen, &cm->cmsg_len); if (!err) { cmlen = CMSG_SPACE(i * sizeof(int)); <--- cmlen w/ 4 byte extra tail-padding msg->msg_control += cmlen; msg->msg_controllen -= cmlen; <--- iff no tail-padding space here ... } ... wrap-around F.e. it will wrap to a length of 18446744073709551612 bytes in case the receiver passed in msg->msg_controllen of 20 bytes, and the sender properly transferred 1 fd to the receiver, so that its CMSG_LEN results in 20 bytes and CMSG_SPACE in 24 bytes. In case of MSG_CMSG_COMPAT (scm_detach_fds_compat()), I haven't seen an issue in my tests as alignment seems always on 4 byte boundary. Same should be in case of native 32 bit, where we end up with 4 byte boundaries as well. In practice, passing msg->msg_controllen of 20 to recvmsg() while receiving a single fd would mean that on successful return, msg->msg_controllen is being set by the kernel to 24 bytes instead, thus more than the input buffer advertised. It could f.e. become an issue if such application later on zeroes or copies the control buffer based on the returned msg->msg_controllen elsewhere. Maximum number of fds we can send is a hard upper limit SCM_MAX_FD (253). Going over the code, it seems like msg->msg_controllen is not being read after scm_detach_fds() in scm_recv() anymore by the kernel, good! Relevant recvmsg() handler are unix_dgram_recvmsg() (unix_seqpacket_recvmsg()) and unix_stream_recvmsg(). Both return back to their recvmsg() caller, and ___sys_recvmsg() places the updated length, that is, new msg_control - old msg_control pointer into msg->msg_controllen (hence the 24 bytes seen in the example). Long time ago, Wei Yongjun fixed something related in commit 1ac70e7a ("[NET]: Fix function put_cmsg() which may cause usr application memory overflow"). RFC3542, section 20.2. says: The fields shown as "XX" are possible padding, between the cmsghdr structure and the data, and between the data and the next cmsghdr structure, if required by the implementation. While sending an application may or may not include padding at the end of last ancillary data in msg_controllen and implementations must accept both as valid. On receiving a portable application must provide space for padding at the end of the last ancillary data as implementations may copy out the padding at the end of the control message buffer and include it in the received msg_controllen. When recvmsg() is called if msg_controllen is too small for all the ancillary data items including any trailing padding after the last item an implementation may set MSG_CTRUNC. Since we didn't place MSG_CTRUNC for already quite a long time, just do the same as in 1ac70e7a to avoid an overflow. Btw, even man-page author got this wrong :/ See db939c9b26e9 ("cmsg.3: Fix error in SCM_RIGHTS code sample"). Some people must have copied this (?), thus it got triggered in the wild (reported several times during boot by David and HacKurx). No Fixes tag this time as pre 2002 (that is, pre history tree). Reported-by: David Sterba <dave@jikos.cz> Reported-by: HacKurx <hackurx@gmail.com> Cc: PaX Team <pageexec@freemail.hu> Cc: Emese Revfy <re.emese@gmail.com> Cc: Brad Spengler <spender@grsecurity.net> Cc: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Cc: Eric Dumazet <edumazet@google.com> Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
-
Eric Dumazet authored
commit 142a2e7e upstream. Dmitry provided a syzkaller (http://github.com/google/syzkaller) generated program that triggers the WARNING at net/ipv4/tcp.c:1729 in tcp_recvmsg() : WARN_ON(tp->copied_seq != tp->rcv_nxt && !(flags & (MSG_PEEK | MSG_TRUNC))); His program is specifically attempting a Cross SYN TCP exchange, that we support (for the pleasure of hackers ?), but it looks we lack proper tcp->copied_seq initialization. Thanks again Dmitry for your report and testings. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Tested-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
-
Eric Dumazet authored
commit 1b8e6a01 upstream. When a passive TCP is created, we eventually call tcp_md5_do_add() with sk pointing to the child. It is not owner by the user yet (we will add this socket into listener accept queue a bit later anyway) But we do own the spinlock, so amend the lockdep annotation to avoid following splat : [ 8451.090932] net/ipv4/tcp_ipv4.c:923 suspicious rcu_dereference_protected() usage! [ 8451.090932] [ 8451.090932] other info that might help us debug this: [ 8451.090932] [ 8451.090934] [ 8451.090934] rcu_scheduler_active = 1, debug_locks = 1 [ 8451.090936] 3 locks held by socket_sockopt_/214795: [ 8451.090936] #0: (rcu_read_lock){.+.+..}, at: [<ffffffff855c6ac1>] __netif_receive_skb_core+0x151/0xe90 [ 8451.090947] #1: (rcu_read_lock){.+.+..}, at: [<ffffffff85618143>] ip_local_deliver_finish+0x43/0x2b0 [ 8451.090952] #2: (slock-AF_INET){+.-...}, at: [<ffffffff855acda5>] sk_clone_lock+0x1c5/0x500 [ 8451.090958] [ 8451.090958] stack backtrace: [ 8451.090960] CPU: 7 PID: 214795 Comm: socket_sockopt_ [ 8451.091215] Call Trace: [ 8451.091216] <IRQ> [<ffffffff856fb29c>] dump_stack+0x55/0x76 [ 8451.091229] [<ffffffff85123b5b>] lockdep_rcu_suspicious+0xeb/0x110 [ 8451.091235] [<ffffffff8564544f>] tcp_md5_do_add+0x1bf/0x1e0 [ 8451.091239] [<ffffffff85645751>] tcp_v4_syn_recv_sock+0x1f1/0x4c0 [ 8451.091242] [<ffffffff85642b27>] ? tcp_v4_md5_hash_skb+0x167/0x190 [ 8451.091246] [<ffffffff85647c78>] tcp_check_req+0x3c8/0x500 [ 8451.091249] [<ffffffff856451ae>] ? tcp_v4_inbound_md5_hash+0x11e/0x190 [ 8451.091253] [<ffffffff85647170>] tcp_v4_rcv+0x3c0/0x9f0 [ 8451.091256] [<ffffffff85618143>] ? ip_local_deliver_finish+0x43/0x2b0 [ 8451.091260] [<ffffffff856181b6>] ip_local_deliver_finish+0xb6/0x2b0 [ 8451.091263] [<ffffffff85618143>] ? ip_local_deliver_finish+0x43/0x2b0 [ 8451.091267] [<ffffffff85618d38>] ip_local_deliver+0x48/0x80 [ 8451.091270] [<ffffffff85618510>] ip_rcv_finish+0x160/0x700 [ 8451.091273] [<ffffffff8561900e>] ip_rcv+0x29e/0x3d0 [ 8451.091277] [<ffffffff855c74b7>] __netif_receive_skb_core+0xb47/0xe90 Fixes: a8afca03 ("tcp: md5: protects md5sig_info with RCU") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
-
Bjørn Mork authored
commit 68242a5a upstream. Thomas reports " 4gsystems sells two total different LTE-surfsticks under the same name. .. The newer version of XS Stick W100 is from "omega" .. Under windows the driver switches to the same ID, and uses MI03\6 for network and MI01\6 for modem. .. echo "1c9e 9b01" > /sys/bus/usb/drivers/qmi_wwan/new_id echo "1c9e 9b01" > /sys/bus/usb-serial/drivers/option1/new_id T: Bus=01 Lev=01 Prnt=01 Port=03 Cnt=01 Dev#= 4 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=1c9e ProdID=9b01 Rev=02.32 S: Manufacturer=USB Modem S: Product=USB Modem S: SerialNumber= C: #Ifs= 5 Cfg#= 1 Atr=80 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=option I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=option I: If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan I: If#= 4 Alt= 0 #EPs= 2 Cls=08(stor.) Sub=06 Prot=50 Driver=usb-storage Now all important things are there: wwp0s29f7u2i3 (net), ttyUSB2 (at), cdc-wdm0 (qmi), ttyUSB1 (at) There is also ttyUSB0, but it is not usable, at least not for at. The device works well with qmi and ModemManager-NetworkManager. " Reported-by: Thomas Schäfer <tschaefer@t-online.de> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
-
Neil Horman authored
commit 41033f02 upstream. the OUTMCAST stat is double incremented, getting bumped once in the mcast code itself, and again in the common ip output path. Remove the mcast bump, as its not needed Validated by the reporter, with good results Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Reported-by: Claus Jensen <claus.jensen@microsemi.com> CC: Claus Jensen <claus.jensen@microsemi.com> CC: David Miller <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
-
Jason A. Donenfeld authored
commit b4fe85f9 upstream. Drivers like vxlan use the recently introduced udp_tunnel_xmit_skb/udp_tunnel6_xmit_skb APIs. udp_tunnel6_xmit_skb makes use of ip6tunnel_xmit, and ip6tunnel_xmit, after sending the packet, updates the struct stats using the usual u64_stats_update_begin/end calls on this_cpu_ptr(dev->tstats). udp_tunnel_xmit_skb makes use of iptunnel_xmit, which doesn't touch tstats, so drivers like vxlan, immediately after, call iptunnel_xmit_stats, which does the same thing - calls u64_stats_update_begin/end on this_cpu_ptr(dev->tstats). While vxlan is probably fine (I don't know?), calling a similar function from, say, an unbound workqueue, on a fully preemptable kernel causes real issues: [ 188.434537] BUG: using smp_processor_id() in preemptible [00000000] code: kworker/u8:0/6 [ 188.435579] caller is debug_smp_processor_id+0x17/0x20 [ 188.435583] CPU: 0 PID: 6 Comm: kworker/u8:0 Not tainted 4.2.6 #2 [ 188.435607] Call Trace: [ 188.435611] [<ffffffff8234e936>] dump_stack+0x4f/0x7b [ 188.435615] [<ffffffff81915f3d>] check_preemption_disabled+0x19d/0x1c0 [ 188.435619] [<ffffffff81915f77>] debug_smp_processor_id+0x17/0x20 The solution would be to protect the whole this_cpu_ptr(dev->tstats)/u64_stats_update_begin/end blocks with disabling preemption and then reenabling it. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
-
Daniel Borkmann authored
commit c72219b7 upstream. In case no struct sockaddr_ll has been passed to packet socket's sendmsg() when doing a TX_RING flush run, then skb->protocol is set to po->num instead, which is the protocol passed via socket(2)/bind(2). Applications only xmitting can go the path of allocating the socket as socket(PF_PACKET, <mode>, 0) and do a bind(2) on the TX_RING with sll_protocol of 0. That way, register_prot_hook() is neither called on creation nor on bind time, which saves cycles when there's no interest in capturing anyway. That leaves us however with po->num 0 instead and therefore the TX_RING flush run sets skb->protocol to 0 as well. Eric reported that this leads to problems when using tools like trafgen over bonding device. I.e. the bonding's hash function could invoke the kernel's flow dissector, which depends on skb->protocol being properly set. In the current situation, all the traffic is then directed to a single slave. Fix it up by inferring skb->protocol from the Ethernet header when not set and we have ARPHRD_ETHER device type. This is only done in case of SOCK_RAW and where we have a dev->hard_header_len length. In case of ARPHRD_ETHER devices, this is guaranteed to cover ETH_HLEN, and therefore being accessed on the skb after the skb_store_bits(). Reported-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> [ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
-
Daniel Borkmann authored
commit 8fd6c80d upstream. We concluded that the skb_probe_transport_header() should better be called unconditionally. Avoiding the call into the flow dissector has also not really much to do with the direct xmit mode. While it seems that only virtio_net code makes use of GSO from non RX/TX ring packet socket paths, we should probe for a transport header nevertheless before they hit devices. Reference: http://thread.gmane.org/gmane.linux.network/386173/Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
-
Rainer Weikusat authored
commit 7d267278 upstream. Rainer Weikusat <rweikusat@mobileactivedefense.com> writes: An AF_UNIX datagram socket being the client in an n:1 association with some server socket is only allowed to send messages to the server if the receive queue of this socket contains at most sk_max_ack_backlog datagrams. This implies that prospective writers might be forced to go to sleep despite none of the message presently enqueued on the server receive queue were sent by them. In order to ensure that these will be woken up once space becomes again available, the present unix_dgram_poll routine does a second sock_poll_wait call with the peer_wait wait queue of the server socket as queue argument (unix_dgram_recvmsg does a wake up on this queue after a datagram was received). This is inherently problematic because the server socket is only guaranteed to remain alive for as long as the client still holds a reference to it. In case the connection is dissolved via connect or by the dead peer detection logic in unix_dgram_sendmsg, the server socket may be freed despite "the polling mechanism" (in particular, epoll) still has a pointer to the corresponding peer_wait queue. There's no way to forcibly deregister a wait queue with epoll. Based on an idea by Jason Baron, the patch below changes the code such that a wait_queue_t belonging to the client socket is enqueued on the peer_wait queue of the server whenever the peer receive queue full condition is detected by either a sendmsg or a poll. A wake up on the peer queue is then relayed to the ordinary wait queue of the client socket via wake function. The connection to the peer wait queue is again dissolved if either a wake up is about to be relayed or the client socket reconnects or a dead peer is detected or the client socket is itself closed. This enables removing the second sock_poll_wait from unix_dgram_poll, thus avoiding the use-after-free, while still ensuring that no blocked writer sleeps forever. Signed-off-by: Rainer Weikusat <rweikusat@mobileactivedefense.com> Fixes: ec0d215f ("af_unix: fix 'poll for write'/connected DGRAM sockets") Reviewed-by: Jason Baron <jbaron@akamai.com> Signed-off-by: David S. Miller <davem@davemloft.net> [ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
-
Vineet Gupta authored
commit 30b9dbee upstream. Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
-
Sabrina Dubroca authored
commit e639b8d8 upstream. Reset pskb in macvlan_handle_frame in case skb_share_check returned a clone. Fixes: 8a4eb573 ("net: introduce rx_handler results and logic around that") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
-
Geert Uytterhoeven authored
commit cf897526 upstream. fs/cachefiles/rdwr.c: In function ‘cachefiles_write_page’: fs/cachefiles/rdwr.c:882: warning: ‘ret’ may be used uninitialized in this function If the jump to label "error" is taken, "ret" will indeed be uninitialized, and random stack data may be printed by the debug code. Fixes: 102f4d90 ("FS-Cache: Handle a write to the page immediately beyond the EOF marker") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
-
- 14 Dec, 2015 16 commits
-
-
Uwe Kleine-König authored
commit 705e63d2 upstream. There is a bit of a mess in the order of arguments to the ulpi write callback. There is int ulpi_write(struct ulpi *ulpi, u8 addr, u8 val) in drivers/usb/common/ulpi.c; struct usb_phy_io_ops { ... int (*write)(struct usb_phy *x, u32 val, u32 reg); } in include/linux/usb/phy.h. The callback registered by the musb driver has to comply to the latter, but up to now had "offset" first which effectively made the function broken for correct users. So flip the order and while at it also switch to the parameter names of struct usb_phy_io_ops's write. Fixes: ffb865b1 ("usb: musb: add ulpi access operations") Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
-
Noa Osherovich authored
commit d49c2197 upstream. The err variable wasn't set with the correct error value in some cases. Fixes: 47605df9 ('mlx4: Modify proxy/tunnel QP mechanism [..]') Signed-off-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> [ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
-
lucien authored
commit ed5a377d upstream. now sctp auth cannot work well when setting a hmacid manually, which is caused by that we didn't use the network order for hmacid, so fix it by adding the transformation in sctp_auth_ep_set_hmacs. even we set hmacid with the network order in userspace, it still can't work, because of this condition in sctp_auth_ep_set_hmacs(): if (id > SCTP_AUTH_HMAC_ID_MAX) return -EOPNOTSUPP; so this wasn't working before and thus it won't break compatibility. Fixes: 65b07e5d ("[SCTP]: API updates to suport SCTP-AUTH extensions.") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
-
Daniel Borkmann authored
commit 5cfb4c8d upstream. Since it's introduction in commit 69e3c75f ("net: TX_RING and packet mmap"), TX_RING could be used from SOCK_DGRAM and SOCK_RAW side. When used with SOCK_DGRAM only, the size_max > dev->mtu + reserve check should have reserve as 0, but currently, this is unconditionally set (in it's original form as dev->hard_header_len). I think this is not correct since tpacket_fill_skb() would then take dev->mtu and dev->hard_header_len into account for SOCK_DGRAM, the extra VLAN_HLEN could be possible in both cases. Presumably, the reserve code was copied from packet_snd(), but later on missed the check. Make it similar as we have it in packet_snd(). Fixes: 69e3c75f ("net: TX_RING and packet mmap") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
-
Daniel Borkmann authored
commit 3c70c132 upstream. Packet sockets can be used by various net devices and are not really restricted to ARPHRD_ETHER device types. However, when currently checking for the extra 4 bytes that can be transmitted in VLAN case, our assumption is that we generally probe on ARPHRD_ETHER devices. Therefore, before looking into Ethernet header, check the device type first. This also fixes the issue where non-ARPHRD_ETHER devices could have no dev->hard_header_len in TX_RING SOCK_RAW case, and thus the check would test unfilled linear part of the skb (instead of non-linear). Fixes: 57f89bfa ("network: Allow af_packet to transmit +4 bytes for VLAN packets.") Fixes: 52f1454f ("packet: allow to transmit +4 byte in TX_RING slot for VLAN case") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
-
Daniel Borkmann authored
commit efdfa2f7 upstream. In tpacket_fill_skb() commit c1aad275 ("packet: set transport header before doing xmit") and later on 40893fd0 ("net: switch to use skb_probe_transport_header()") was probing for a transport header on the skb from a ring buffer slot, but at a time, where the skb has _not even_ been filled with data yet. So that call into the flow dissector is pretty useless. Lets do it after we've set up the skb frags. Fixes: c1aad275 ("packet: set transport header before doing xmit") Reported-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
-
Kamal Mostafa authored
commit d7475de5 upstream. Use the local uapi headers to keep in sync with "recently" added #define's (e.g. SKF_AD_VLAN_TPID). Refactored CFLAGS, and bpf_asm doesn't need -I. Fixes: 3f356385 ("filter: bpf_asm: add minimal bpf asm tool") Signed-off-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
-
Luca Porzio authored
commit d3df0465 upstream. Anytime a write operation is performed with Reliable Write flag enabled, the eMMC device is enforced to bypass the cache and do a write to the underling NVM device by Jedec specification; this causes a performance penalty since write operations can't be optimized by the device cache. In our tests, we replayed a typical mobile daily trace pattern and found ~9% overall time reduction in trace replay by using this patch. Also the write ops within 4KB~64KB chunk size range get a 40~60% performance improvement by using the patch (as this range of write chunks are the ones affected by REQ_META). This patch has been discussed in the Mobile & Embedded Linux Storage Forum and it's the results of feedbacks from many people. We also checked with fsdevl and f2fs mailing list developers that this change in the usage of REQ_META is not affecting FS behavior and we got positive feedbacks. Reporting here the feedbacks: http://comments.gmane.org/gmane.linux.file-systems/97219 http://thread.gmane.org/gmane.linux.file-systems.f2fs/3178/focus=3183Signed-off-by: Bruce Ford <bford@micron.com> Signed-off-by: Luca Porzio <lporzio@micron.com> Fixes: ce39f9d1 ("mmc: support packed write command for eMMC4.5 devices") Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
-
Helge Deller authored
commit dcbf0d29 upstream. Drop the MADV_xxK_PAGES flags, which were never used and were from a proposed API which was never integrated into the generic Linux kernel code. Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
-
Junxiao Bi authored
commit 8f1eb487 upstream. New created file's mode is not masked with umask, and this makes umask not work for ocfs2 volume. Fixes: 702e5bc6 ("ocfs2: use generic posix ACL infrastructure") Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Cc: Gang He <ghe@suse.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
-
Richard Weinberger authored
commit 9d8a7652 upstream. sigsuspend() is nowhere used except in signal.c itself, so we can mark it static do not pollute the global namespace. But this patch is more than a boring cleanup patch, it fixes a real issue on UserModeLinux. UML has a special console driver to display ttys using xterm, or other terminal emulators, on the host side. Vegard reported that sometimes UML is unable to spawn a xterm and he's facing the following warning: WARNING: CPU: 0 PID: 908 at include/linux/thread_info.h:128 sigsuspend+0xab/0xc0() It turned out that this warning makes absolutely no sense as the UML xterm code calls sigsuspend() on the host side, at least it tries. But as the kernel itself offers a sigsuspend() symbol the linker choose this one instead of the glibc wrapper. Interestingly this code used to work since ever but always blocked signals on the wrong side. Some recent kernel change made the WARN_ON() trigger and uncovered the bug. It is a wonderful example of how much works by chance on computers. :-) Fixes: 68f3f16d ("new helper: sigsuspend()") Signed-off-by: Richard Weinberger <richard@nod.at> Reported-by: Vegard Nossum <vegard.nossum@oracle.com> Tested-by: Vegard Nossum <vegard.nossum@oracle.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
-
OGAWA Hirofumi authored
commit 928a4771 upstream. For the root directory, . and .. are faked (using dir_emit_dots()) and ctx->pos is reset from 2 to 0. A corrupted root directory could cause fat_get_entry() to fail, but ->iterate() (fat_readdir()) reports progress to the VFS (with ctx->pos rewound to 0), so any following calls to ->iterate() continue to return the same entries again and again. The result is that userspace will never see the end of the directory, causing e.g. 'ls' to hang in a getdents() loop. [hirofumi@mail.parknet.co.jp: cleanup and make sure to correct fake_offset] Reported-by: Vegard Nossum <vegard.nossum@oracle.com> Tested-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Richard Weinberger <richard.weinberger@gmail.com> Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
-
Jiri Slaby authored
commit 19cd80a2 upstream. It is not permitted to set task state before lock. usblp_wwait sets the state to TASK_INTERRUPTIBLE and calls mutex_lock_interruptible. Upon return from that function, the state will be TASK_RUNNING again. This is clearly a bug and a warning is generated with LOCKDEP too: WARNING: CPU: 1 PID: 5109 at kernel/sched/core.c:7404 __might_sleep+0x7d/0x90() do not call blocking ops when !TASK_RUNNING; state=1 set at [<ffffffffa0c588d0>] usblp_wwait+0xa0/0x310 [usblp] Modules linked in: ... CPU: 1 PID: 5109 Comm: captmon Tainted: G W 4.2.5-0.gef2823b-default #1 Hardware name: LENOVO 23252SG/23252SG, BIOS G2ET33WW (1.13 ) 07/24/2012 ffffffff81a4edce ffff880236ec7ba8 ffffffff81716651 0000000000000000 ffff880236ec7bf8 ffff880236ec7be8 ffffffff8106e146 0000000000000282 ffffffff81a50119 000000000000028b 0000000000000000 ffff8802dab7c508 Call Trace: ... [<ffffffff8106e1c6>] warn_slowpath_fmt+0x46/0x50 [<ffffffff8109a8bd>] __might_sleep+0x7d/0x90 [<ffffffff8171b20f>] mutex_lock_interruptible_nested+0x2f/0x4b0 [<ffffffffa0c588fc>] usblp_wwait+0xcc/0x310 [usblp] [<ffffffffa0c58bb2>] usblp_write+0x72/0x350 [usblp] [<ffffffff8121ed98>] __vfs_write+0x28/0xf0 ... Commit 7f477358 (usblp: Implement the ENOSPC convention) moved the set prior locking. So move it back after the lock. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Fixes: 7f477358 ("usblp: Implement the ENOSPC convention") Acked-By: Pete Zaitcev <zaitcev@yahoo.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
-
Bjørn Mork authored
commit 638148e2 upstream. Thomas reports " 4gsystems sells two total different LTE-surfsticks under the same name. .. The newer version of XS Stick W100 is from "omega" .. Under windows the driver switches to the same ID, and uses MI03\6 for network and MI01\6 for modem. .. echo "1c9e 9b01" > /sys/bus/usb/drivers/qmi_wwan/new_id echo "1c9e 9b01" > /sys/bus/usb-serial/drivers/option1/new_id T: Bus=01 Lev=01 Prnt=01 Port=03 Cnt=01 Dev#= 4 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=1c9e ProdID=9b01 Rev=02.32 S: Manufacturer=USB Modem S: Product=USB Modem S: SerialNumber= C: #Ifs= 5 Cfg#= 1 Atr=80 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=option I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=option I: If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan I: If#= 4 Alt= 0 #EPs= 2 Cls=08(stor.) Sub=06 Prot=50 Driver=usb-storage Now all important things are there: wwp0s29f7u2i3 (net), ttyUSB2 (at), cdc-wdm0 (qmi), ttyUSB1 (at) There is also ttyUSB0, but it is not usable, at least not for at. The device works well with qmi and ModemManager-NetworkManager. " Reported-by: Thomas Schäfer <tschaefer@t-online.de> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
-
Yang Shi authored
commit 92e788b7 upstream. As previously reported, some userspace applications depend on bogomips showed by /proc/cpuinfo. Although there is much less legacy impact on aarch64 than arm, it does break libvirt. This patch reverts commit 326b16db ("arm64: delay: don't bother reporting bogomips in /proc/cpuinfo"), but with some tweak due to context change and without the pr_info(). Fixes: 326b16db ("arm64: delay: don't bother reporting bogomips in /proc/cpuinfo") Signed-off-by: Yang Shi <yang.shi@linaro.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> [ luis: backported to 3.16: - file rename: cpuinfo.c -> setup.c - linux/delay.h is already included - adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
-
Andrew Cooper authored
commit 581b7f15 upstream. There appears to be no formal statement of what pv_irq_ops.save_fl() is supposed to return precisely. Native returns the full flags, while lguest and Xen only return the Interrupt Flag, and both have comments by the implementations stating that only the Interrupt Flag is looked at. This may have been true when initially implemented, but no longer is. To make matters worse, the Xen PVOP leaves the upper bits undefined, making the BUG_ON() undefined behaviour. Experimentally, this now trips for 32bit PV guests on Broadwell hardware. The BUG_ON() is consistent for an individual build, but not consistent for all builds. It has also been a sitting timebomb since SMAP support was introduced. Use native_save_fl() instead, which will obtain an accurate view of the AC flag. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Tested-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: <lguest@lists.ozlabs.org> Cc: Xen-devel <xen-devel@lists.xen.org> Link: http://lkml.kernel.org/r/1433323874-6927-1-git-send-email-andrew.cooper3@citrix.comSigned-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
-