- 19 Jun, 2013 5 commits
-
-
John W. Linville authored
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
-
-
Johannes Berg authored
Since my commit 3713b4e3 ("nl80211: allow splitting wiphy information in dumps"), nl80211_dump_wiphy() uses the global nl80211_fam.attrbuf for parsing the incoming data. This wouldn't be a problem if it only did so on the first dump iteration which is locked against other commands in generic netlink, but due to space constraints in cb->args (the needed state doesn't fit) I decided to always parse the original message. That's racy though since nl80211_fam.attrbuf could be used by some other parsing in generic netlink concurrently. For now, fix this by allocating a separate parse buffer (it's a bit too big for the stack, currently 1448 bytes on 64-bit). For -next, I'll change the code to parse into the global buffer in the first round only and then allocate a smaller buffer to keep the data in cb->args. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
-
stephen hemminger authored
The check introduced by: commit 26a41ae6 Author: stephen hemminger <stephen@networkplumber.org> Date: Mon Jun 17 12:09:58 2013 -0700 vxlan: only migrate dynamic FDB entries was not correct because it is checking flag about type of FDB entry, rather than the state (dynamic versus static). The confusion arises because vxlan is reusing values from bridge, and bridge is reusing values from neighbour table, and easy to get lost in translation. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Marc Kleine-Budde authored
The usb_8dev hardware has problems on some xhci USB hosts. The driver fails to read the firmware revision in the probe function. This leads to the following Oops: [ 3356.635912] kernel BUG at net/core/dev.c:5701! The driver tries to free the netdev, which has already been registered, without unregistering it. This patch fixes the problem by unregistering the netdev in the error path. Reported-by: Michael Olbrich <m.olbrich@pengutronix.de> Reviewed-by: Bernd Krumboeck <krumboeck@universalnet.at> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
-
- 18 Jun, 2013 3 commits
-
-
Matthias Schiffer authored
Since some refactoring in 5f5a0115, ndisc_send_redirect called ndisc_fill_redirect_hdr_option on the wrong skb, leading to data corruption or in the worst case a panic when the skb_put failed. Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Linus Lüssing authored
General Queries (the one with the Multicast Address field set to zero / '::') are supposed to have a Maximum Response Delay of [Query Response Interval], while for Multicast-Address-Specific Queries it is [Last Listener Query Interval] - not the other way round. (see RFC2710, section 7.3+7.8) Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Fernando Luis Vazquez Cao authored
As part of the push to add 802.1ad server provider tagging support to the kernel the VLAN features flags were renamed. Unfortunately the kernel name for the VLAN hardware acceleration features that the kernel shows user space was included in the rename, which broke ethtool (txvlan and rxvlan options do not work). This patch restores the original names, i.e. the original ABI. If we wanted to make clear to users that we are refering to CTAGs we can always change ethtool's short_name and long_name for these features (for example something along the lines of txvlan -> txvlan-ctag, tx-vlan-offload -> tx-vlan-ctag-offload). Cc: Patrick McHardy <kaber@trash.net> Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Reviewed-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 17 Jun, 2013 9 commits
-
-
Sebastian Siewior authored
after priv->cpts got allocated then this pointer should check to determine if the allocation succeeded or not. Cc: Mugunthan V N <mugunthanvnm@ti.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into wireless John W. Linville says: ==================== This will probably be the last batch of wireless fixes intended for 3.10. Many of these are one- or two-liners, and a couple of others are mostly relocating existing code to avoid races or to limit the code to effecting specific hardware, etc. The mac80211 fixes have a couple of exceptions to the above. Regarding those, Johannes says: "Following davem's complaint about my patch, here's a new pull request w/o the patch he was complaining about, but instead with the const fix rolled into the fix. I have a fix for radar detection, one for rate control and a workaround for broken HT APs which is a regression fix because we didn't rely on them to be correct before." Johannes also sends some iwlwifi fixes: "I picked up Nikolay's patch for the chain noise calibration bug that seems to have been there forever, a fix from Emmanuel for setting TX flags on BAR frames and a fix of my own to avoid printing request_module() errors if the kernel isn't even modular. We also have our own version of Stanislaw's fix for rate control." Along with those... Anderson Lizardo fixes a Bluetooth memory corruption bug when an MTU value is set to too small of a value. Arend van Spriel sends a revised brcmsmac bug that fixes a regression caused by a bad return value in an earlier patch. He also sends a brcmfmac fix to avoid an oops when loading the driver at boot. Daniel Drake fixes a race condition in btmrvl that causes hangs on suspend for OLPC hardware. Johan Hedberg adds a check to avoid sending a HCI_Delete_Stored_Link_Key command to devices that don't support them, avoiding some scary looking log spam. Stanislaw Gruszka gives us a fix for iwlegacy to be able to use rates higher than 1Mb/s on older wireless networks. He also sends an rt2x00 fix to reinstate older tx power handling behavior for some devices that didn't work well with the current code. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller authored
Pablo Neira Ayuso says: ==================== The following patchset contains Netfilter fixes. They are targeted to the TCP option targets, that have receive some scrinity in the last week. The changes are: * Fix TCPOPTSTRIP, it stopped working in the forward chain as tcp_hdr uses skb->transport_header, and we cannot use that in the forwarding case, from myself. * Fix default IPv6 MSS in TCPMSS in case of absence of TCP MSS options, from Phil Oester. * Fix missing fragmentation handling again in TCPMSS, from Phil Oester. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Johannes Berg authored
This is a very simple driver, based on the original vendor driver that Qualcomm/Atheros published/submitted previously, but reworked to make the code saner. However, it also lost a number of features (TSO/GSO, VLAN acceleration and multi- queue support) in the process, as well as debugging support features I didn't have any use for. The only thing I left is checksum offload. More features can obviously be added, but this seemed like a good start for having a driver in mainline at all. Johannes Stezenbach has verified that the driver works on AR8161, I have a AR8171 myself. The E2200 device ID I found on github in somebody's repository. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Haiyang Zhang authored
We should call __vlan_hwaccel_put_tag() only if the packet comes from vlan, otherwise VLAN_TAG_PRESENT will always be added. Reported-by: Olaf Hering <olaf@aepfle.de> Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
stephen hemminger authored
If skb_clone fails if out of memory then just skip the fanout. Problem was introduced in 3.10 with: commit 6681712d Author: David Stevens <dlstevens@us.ibm.com> Date: Fri Mar 15 04:35:51 2013 +0000 vxlan: generalize forwarding tables Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
stephen hemminger authored
Only migrate dynamic forwarding table entries, don't modify static entries. If packet received from incorrect source IP address assume it is an imposter and drop it. This patch applies only to -net, a different patch would be needed for earlier kernels since the NTF_SELF flag was introduced with 3.10. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
stephen hemminger authored
It is possible for a packet to arrive during vxlan_stop(), and have a dynamic entry created. Close this by checking if device is up. CPU1 CPU2 vxlan_stop vxlan_flush hash_lock acquired vxlan_encap_recv vxlan_snoop waiting for hash_lock hash_lock relased vxlan_flush done hash_lock acquired vxlan_fdb_create This is a day-one bug in vxlan goes back to 3.7. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
John W. Linville authored
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
-
- 16 Jun, 2013 1 commit
-
-
Al Viro authored
When you copy some code, you are supposed to read it. If nothing else, there's a chance to spot and fix an obvious bug instead of sharing it... X-Song: "I Got It From Agnes", by Tom Lehrer Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> [ Tom Lehrer? You're dating yourself, Al ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 15 Jun, 2013 16 commits
-
-
Linus Torvalds authored
-
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-socLinus Torvalds authored
Pull ARM SoC fixes from Olof Johansson: "These are a little later than I planned on since I got caught up with handling merges for 3.11 most of the week. Another week, another batch of fixes for arm-soc platforms. Again, nothing controversial. A few more than would be ideal, but all are valid fixes. In particular the prima2 panic patch is critical since it fixes a problem where multiplatform kernels panic on all but prima2 hardware." * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: ARM: SAMSUNG: pm: Adjust for pinctrl- and DT-enabled platforms ARM: prima2: fix incorrect panic usage arm: mvebu: armada-xp-{gp,openblocks-ax3-4}: specify PCIe range ARM: Kirkwood: handle mv88f6282 cpu in __kirkwood_variant(). ARM: omap3: clock: fix wrong container_of in clock36xx.c ARM: dts: OMAP5: Fix missing PWM capability to timer nodes ARM: dts: omap4-panda|sdp: Fix mux for twl6030 IRQ pin and msecure line ARM: dts: AM33xx: Fix properties on gpmc node arm: omap2: fix AM33xx hwmod infos for UART2 ARM: OMAP3: Fix iva2_pwrdm settings for 3703
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds authored
Pull networking fixes from David Miller: 1) Fix RTNL locking in batman-adv, from Matthias Schiffer. 2) Don't allow non-passthrough macvlan devices to set NOPROMISC via netlink, otherwise we can end up with corrupted promisc counter values on the device. From Michael S Tsirkin. 3) Fix stmmac driver build with debugging defines enabled, from Dinh Nguyen. 4) Make sure name string we give in socket address in AF_PACKET is NULL terminated, from Daniel Borkmann. 5) Fix leaking of two uninitialized bytes of memory to userspace in l2tp, from Guillaume Nault. 6) Clear IPCB(skb) before tunneling otherwise we touch dangling IP options state and crash. From Saurabh Mohan. 7) Fix suspend/resume for davinci_mdio by using suspend_late and resume_early. From Mugunthan V N. 8) Don't tag ip_tunnel_init_net and ip_tunnel_delete_net with __net_{init,exit}, they can be called outside of those contexts. From Eric Dumazet. 9) Fix RX length error in sh_eth driver, from Yoshihiro Shimoda. 10) Fix missing sctp_outq initialization in some code paths of SCTP stack, from Neil Horman. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (21 commits) sctp: fully initialize sctp_outq in sctp_outq_init netiucv: Hold rtnl between name allocation and device registration. tulip: Properly check dma mapping result net: sh_eth: fix incorrect RX length error if R8A7740 ip_tunnel: remove __net_init/exit from exported functions drivers: net: davinci_mdio: restore mdio clk divider in mdio resume drivers: net: davinci_mdio: moving mdio resume earlier than cpsw ethernet driver net/ipv4: ip_vti clear skb cb before tunneling. tg3: Wait for boot code to finish after power on l2tp: Fix sendmsg() return value l2tp: Fix PPP header erasure and memory leak bonding: fix igmp_retrans type and two related races bonding: reset master mac on first enslave failure packet: packet_getname_spkt: make sure string is always 0-terminated net: ethernet: stmicro: stmmac: Fix compile error when STMMAC_XMIT_DEBUG used be2net: Fix 32-bit DMA Mask handling xen-netback: don't de-reference vif pointer after having called xenvif_put() macvlan: don't touch promisc without passthrough batman-adv: Don't handle address updates when bla is disabled batman-adv: forward late OGMs from best next hop ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpcLinus Torvalds authored
Pull powerpc fixes from Benjamin Herrenschmidt: "So here are 3 fixes still for 3.10. Fixes are simple, bugs are nasty (though not recent regressions, nasty enough) and all targeted at stable" * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: powerpc: Fix missing/delayed calls to irq_work powerpc: Fix emulation of illegal instructions on PowerNV platform powerpc: Fix stack overflow crash in resume_kernel when ftracing
-
David Daney authored
Thanks to commit f91eb62f ("init: scream bloody murder if interrupts are enabled too early"), "bloody murder" is now being screamed. With a MIPS OCTEON config, we use on_each_cpu() in our irq_chip.irq_bus_sync_unlock() function. This gets called in early as a result of the time_init() call. Because the !SMP version of on_each_cpu() unconditionally enables irqs, we get: WARNING: at init/main.c:560 start_kernel+0x250/0x410() Interrupts were enabled early CPU: 0 PID: 0 Comm: swapper Not tainted 3.10.0-rc5-Cavium-Octeon+ #801 Call Trace: show_stack+0x68/0x80 warn_slowpath_common+0x78/0xb0 warn_slowpath_fmt+0x38/0x48 start_kernel+0x250/0x410 Suggested fix: Do what we already do in the SMP version of on_each_cpu(), and use local_irq_save/local_irq_restore. Because we need a flags variable, make it a static inline to avoid name space issues. [ Change from v1: Convert on_each_cpu to a static inline function, add #include <linux/irqflags.h> to avoid build breakage on some files. on_each_cpu_mask() and on_each_cpu_cond() suffer the same problem as on_each_cpu(), but they are not causing !SMP bugs for me, so I will defer changing them to a less urgent patch. ] Signed-off-by: David Daney <david.daney@cavium.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds authored
Pull VFS fixes from Al Viro: "Several fixes + obvious cleanup (you've missed a couple of open-coded can_lookup() back then)" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: snd_pcm_link(): fix a leak... use can_lookup() instead of direct checks of ->i_op->lookup move exit_task_namespaces() outside of exit_notify() fput: task_work_add() can fail if the caller has passed exit_task_work() ncpfs: fix rmdir returns Device or resource busy
-
git://oss.sgi.com/xfs/xfsLinus Torvalds authored
Pull xfs fixes from Ben Myers: - Remove noisy warnings about experimental support which spams the logs - Add padding to align directory and attr structures correctly - Set block number on child buffer on a root btree split - Disable verifiers during log recovery for non-CRC filesystems * tag 'for-linus-v3.10-rc6' of git://oss.sgi.com/xfs/xfs: xfs: don't shutdown log recovery on validation errors xfs: ensure btree root split sets blkno correctly xfs: fix implicit padding in directory and attr CRC formats xfs: don't emit v5 superblock warnings on write
-
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-miscLinus Torvalds authored
Pull char / misc fixes from Greg Kroah-Hartman: "Here are some small mei driver fixes for 3.10-rc6 that fix some reported problems" * tag 'char-misc-3.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: mei: me: clear interrupts on the resume path mei: nfc: fix nfc device freeing mei: init: Flush scheduled work before resetting the device
-
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usbLinus Torvalds authored
Pull USB fixes from Greg Kroah-Hartman: "Here are some small USB driver fixes that resolve some reported problems for 3.10-rc6 Nothing major, just 3 USB serial driver fixes, and two chipidea fixes" * tag 'usb-3.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: usb: chipidea: fix id change handling usb: chipidea: fix no transceiver case USB: pl2303: fix device initialisation at open USB: spcp8x5: fix device initialisation at open USB: f81232: fix device initialisation at open
-
Benjamin Herrenschmidt authored
When replaying interrupts (as a result of the interrupt occurring while soft-disabled), in the case of the decrementer, we are exclusively testing for a pending timer target. However we also use decrementer interrupts to trigger the new "irq_work", which in this case would be missed. This change the logic to force a replay in both cases of a timer boundary reached and a decrementer interrupt having actually occurred while disabled. The former test is still useful to catch cases where a CPU having been hard-disabled for a long time completely misses the interrupt due to a decrementer rollover. CC: <stable@vger.kernel.org> [v3.4+] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Tested-by: Steven Rostedt <rostedt@goodmis.org>
-
Paul Mackerras authored
Normally, the kernel emulates a few instructions that are unimplemented on some processors (e.g. the old dcba instruction), or privileged (e.g. mfpvr). The emulation of unimplemented instructions is currently not working on the PowerNV platform. The reason is that on these machines, unimplemented and illegal instructions cause a hypervisor emulation assist interrupt, rather than a program interrupt as on older CPUs. Our vector for the emulation assist interrupt just calls program_check_exception() directly, without setting the bit in SRR1 that indicates an illegal instruction interrupt. This fixes it by making the emulation assist interrupt set that bit before calling program_check_interrupt(). With this, old programs that use no-longer implemented instructions such as dcba now work again. CC: <stable@vger.kernel.org> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Michael Ellerman authored
It's possible for us to crash when running with ftrace enabled, eg: Bad kernel stack pointer bffffd12 at c00000000000a454 cpu 0x3: Vector: 300 (Data Access) at [c00000000ffe3d40] pc: c00000000000a454: resume_kernel+0x34/0x60 lr: c00000000000335c: performance_monitor_common+0x15c/0x180 sp: bffffd12 msr: 8000000000001032 dar: bffffd12 dsisr: 42000000 If we look at current's stack (paca->__current->stack) we see it is equal to c0000002ecab0000. Our stack is 16K, and comparing to paca->kstack (c0000002ecab3e30) we can see that we have overflowed our kernel stack. This leads to us writing over our struct thread_info, and in this case we have corrupted thread_info->flags and set _TIF_EMULATE_STACK_STORE. Dumping the stack we see: 3:mon> t c0000002ecab0000 [c0000002ecab0000] c00000000002131c .performance_monitor_exception+0x5c/0x70 [c0000002ecab0080] c00000000000335c performance_monitor_common+0x15c/0x180 --- Exception: f01 (Performance Monitor) at c0000000000fb2ec .trace_hardirqs_off+0x1c/0x30 [c0000002ecab0370] c00000000016fdb0 .trace_graph_entry+0xb0/0x280 (unreliable) [c0000002ecab0410] c00000000003d038 .prepare_ftrace_return+0x98/0x130 [c0000002ecab04b0] c00000000000a920 .ftrace_graph_caller+0x14/0x28 [c0000002ecab0520] c0000000000d6b58 .idle_cpu+0x18/0x90 [c0000002ecab05a0] c00000000000a934 .return_to_handler+0x0/0x34 [c0000002ecab0620] c00000000001e660 .timer_interrupt+0x160/0x300 [c0000002ecab06d0] c0000000000025dc decrementer_common+0x15c/0x180 --- Exception: 901 (Decrementer) at c0000000000104d4 .arch_local_irq_restore+0x74/0xa0 [c0000002ecab09c0] c0000000000fe044 .trace_hardirqs_on+0x14/0x30 (unreliable) [c0000002ecab0fb0] c00000000016fe3c .trace_graph_entry+0x13c/0x280 [c0000002ecab1050] c00000000003d038 .prepare_ftrace_return+0x98/0x130 [c0000002ecab10f0] c00000000000a920 .ftrace_graph_caller+0x14/0x28 [c0000002ecab1160] c0000000000161f0 .__ppc64_runlatch_on+0x10/0x40 [c0000002ecab11d0] c00000000000a934 .return_to_handler+0x0/0x34 --- Exception: 901 (Decrementer) at c0000000000104d4 .arch_local_irq_restore+0x74/0xa0 ... and so on __ppc64_runlatch_on() is called from RUNLATCH_ON in the exception entry path. At that point the irq state is not consistent, ie. interrupts are hard disabled (by the exception entry), but the paca soft-enabled flag may be out of sync. This leads to the local_irq_restore() in trace_graph_entry() actually enabling interrupts, which we do not want. Because we have not yet reprogrammed the decrementer we immediately take another decrementer exception, and recurse. The fix is twofold. Firstly make sure we call DISABLE_INTS before calling RUNLATCH_ON. The badly named DISABLE_INTS actually reconciles the irq state in the paca with the hardware, making it safe again to call local_irq_save/restore(). Although that should be sufficient to fix the bug, we also mark the runlatch routines as notrace. They are called very early in the exception entry and we are asking for trouble tracing them. They are also fairly uninteresting and tracing them just adds unnecessary overhead. [ This regression was introduced by fe1952fc "powerpc: Rework runlatch code" by myself --BenH ] CC: <stable@vger.kernel.org> [v3.4+] Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Al Viro authored
in case when snd_pcm_stream_linked(substream) is true, we end up leaking group. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
a couple of places got missed back when Linus has introduced that one... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Oleg Nesterov authored
exit_notify() does exit_task_namespaces() after forget_original_parent(). This was needed to ensure that ->nsproxy can't be cleared prematurely, an exiting child we are going to reparent can do do_notify_parent() and use the parent's (ours) pid_ns. However, after 32084504 "pidns: use task_active_pid_ns in do_notify_parent" ->nsproxy != NULL is no longer needed, we rely on task_active_pid_ns(). Move exit_task_namespaces() from exit_notify() to do_exit(), after exit_fs() and before exit_task_work(). This solves the problem reported by Andrey, free_ipc_ns()->shm_destroy() does fput() which needs task_work_add(). Note: this particular problem can be fixed if we change fput(), and that change makes sense anyway. But there is another reason to move the callsite. The original reason for exit_task_namespaces() from the middle of exit_notify() was subtle and it has already gone away, now this looks confusing. And this allows us do simplify exit_notify(), we can avoid unlock/lock(tasklist) and we can use ->exit_state instead of PF_EXITING in forget_original_parent(). Reported-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Oleg Nesterov authored
fput() assumes that it can't be called after exit_task_work() but this is not true, for example free_ipc_ns()->shm_destroy() can do this. In this case fput() silently leaks the file. Change it to fallback to delayed_fput_work if task_work_add() fails. The patch looks complicated but it is not, it changes the code from if (PF_KTHREAD) { schedule_work(...); return; } task_work_add(...) to if (!PF_KTHREAD) { if (!task_work_add(...)) return; /* fallback */ } schedule_work(...); As for shm_destroy() in particular, we could make another fix but I think this change makes sense anyway. There could be another similar user, it is not safe to assume that task_work_add() can't fail. Reported-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
- 14 Jun, 2013 6 commits
-
-
Dave Chinner authored
Unfortunately, we cannot guarantee that items logged multiple times and replayed by log recovery do not take objects back in time. When they are taken back in time, the go into an intermediate state which is corrupt, and hence verification that occurs on this intermediate state causes log recovery to abort with a corruption shutdown. Instead of causing a shutdown and unmountable filesystem, don't verify post-recovery items before they are written to disk. This is less than optimal, but there is no way to detect this issue for non-CRC filesystems If log recovery successfully completes, this will be undone and the object will be consistent by subsequent transactions that are replayed, so in most cases we don't need to take drastic action. For CRC enabled filesystems, leave the verifiers in place - we need to call them to recalculate the CRCs on the objects anyway. This recovery problem can be solved for such filesystems - we have a LSN stamped in all metadata at writeback time that we can to determine whether the item should be replayed or not. This is a separate piece of work, so is not addressed by this patch. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com> (cherry picked from commit 9222a9cf)
-
Dave Chinner authored
For CRC enabled filesystems, the BMBT is rooted in an inode, so it passes through a different code path on root splits than the freespace and inode btrees. This is much less traversed by xfstests than the other trees. When testing on a 1k block size filesystem, I've been seeing ASSERT failures in generic/234 like: XFS: Assertion failed: cur->bc_btnum != XFS_BTNUM_BMAP || cur->bc_private.b.allocated == 0, file: fs/xfs/xfs_btree.c, line: 317 which are generally preceded by a lblock check failure. I noticed this in the bmbt stats: $ pminfo -f xfs.btree.block_map xfs.btree.block_map.lookup value 39135 xfs.btree.block_map.compare value 268432 xfs.btree.block_map.insrec value 15786 xfs.btree.block_map.delrec value 13884 xfs.btree.block_map.newroot value 2 xfs.btree.block_map.killroot value 0 ..... Very little coverage of root splits and merges. Indeed, on a 4k filesystem, block_map.newroot and block_map.killroot are both zero. i.e. the code is not exercised at all, and it's the only generic btree infrastructure operation that is not exercised by a default run of xfstests. Turns out that on a 1k filesystem, generic/234 accounts for one of those two root splits, and that is somewhat of a smoking gun. In fact, it's the same problem we saw in the directory/attr code where headers are memcpy()d from one block to another without updating the self describing metadata. Simple fix - when copying the header out of the root block, make sure the block number is updated correctly. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com> (cherry picked from commit ade1335a)
-
Dave Chinner authored
Michael L. Semon has been testing CRC patches on a 32 bit system and been seeing assert failures in the directory code from xfs/080. Thanks to Michael's heroic efforts with printk debugging, we found that the problem was that the last free space being left in the directory structure was too small to fit a unused tag structure and it was being corrupted and attempting to log a region out of bounds. Hence the assert failure looked something like: ..... #5 calling xfs_dir2_data_log_unused() 36 32 #1 4092 4095 4096 #2 8182 8183 4096 XFS: Assertion failed: first <= last && last < BBTOB(bp->b_length), file: fs/xfs/xfs_trans_buf.c, line: 568 Where #1 showed the first region of the dup being logged (i.e. the last 4 bytes of a directory buffer) and #2 shows the corrupt values being calculated from the length of the dup entry which overflowed the size of the buffer. It turns out that the problem was not in the logging code, nor in the freespace handling code. It is an initial condition bug that only shows up on 32 bit systems. When a new buffer is initialised, where's the freespace that is set up: [ 172.316249] calling xfs_dir2_leaf_addname() from xfs_dir_createname() [ 172.316346] #9 calling xfs_dir2_data_log_unused() [ 172.316351] #1 calling xfs_trans_log_buf() 60 63 4096 [ 172.316353] #2 calling xfs_trans_log_buf() 4094 4095 4096 Note the offset of the first region being logged? It's 60 bytes into the buffer. Once I saw that, I pretty much knew that the bug was going to be caused by this. Essentially, all direct entries are rounded to 8 bytes in length, and all entries start with an 8 byte alignment. This means that we can decode inplace as variables are naturally aligned. With the directory data supposedly starting on a 8 byte boundary, and all entries padded to 8 bytes, the minimum freespace in a directory block is supposed to be 8 bytes, which is large enough to fit a unused data entry structure (6 bytes in size). The fact we only have 4 bytes of free space indicates a directory data block alignment problem. And what do you know - there's an implicit hole in the directory data block header for the CRC format, which means the header is 60 byte on 32 bit intel systems and 64 bytes on 64 bit systems. Needs padding. And while looking at the structures, I found the same problem in the attr leaf header. Fix them both. Note that this only affects 32 bit systems with CRCs enabled. Everything else is just fine. Note that CRC enabled filesystems created before this fix on such systems will not be readable with this fix applied. Reported-by: Michael L. Semon <mlsemon35@gmail.com> Debugged-by: Michael L. Semon <mlsemon35@gmail.com> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com> (cherry picked from commit 8a1fd295)
-
Dave Chinner authored
We write the superblock every 30s or so which results in the verifier being called. Right now that results in this output every 30s: XFS (vda): Version 5 superblock detected. This kernel has EXPERIMENTAL support enabled! Use of these features in this kernel is at your own risk! And spamming the logs. We don't need to check for whether we support v5 superblocks or whether there are feature bits we don't support set as these are only relevant when we first mount the filesytem. i.e. on superblock read. Hence for the write verification we can just skip all the checks (and hence verbose output) altogether. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com> (cherry picked from commit 34510185)
-
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfsLinus Torvalds authored
Pull btrfs fixes from Chris Mason: "This is an assortment of crash fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs: stop all workers before cleaning up roots Btrfs: fix use-after-free bug during umount Btrfs: init relocate extent_io_tree with a mapping btrfs: Drop inode if inode root is NULL Btrfs: don't delete fs_roots until after we cleanup the transaction
-
Tomas Winkler authored
We need to clear pending interrupts on the resume path. This brings the device into defined state before starting the reset flow This should solve suspend/resume issues: mei_me : wait hw ready failed. status = 0x0 mei_me : version message write failed Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-