- 11 Jul, 2016 40 commits
-
-
Alex Deucher authored
[ Upstream commit f971f226 ] bug: https://bugs.freedesktop.org/show_bug.cgi?id=94692Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Daniel Vetter authored
[ Upstream commit 72b9ff06 ] For drm_gem_object_unreference callers are required to hold dev->struct_mutex, which these paths don't. Enforcing this requirement has become a bit more strict with commit ef4c6270 Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Thu Oct 15 09:36:25 2015 +0200 drm/gem: Check locking in drm_gem_object_unreference Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Todd Previte authored
[ Upstream commit 396aa445 ] For test 4.2.2.5 to pass per the Link CTS Core 1.2 rev1.1 spec, the source device must attempt at least 7 times to read the EDID when it receives an I2C defer. The normal DRM code makes only 7 retries, regardless of whether or not the response is a native defer or an I2C defer. Test 4.2.2.5 fails since there are native defers interspersed with the I2C defers which results in less than 7 EDID read attempts. The solution is to add the numer of defers to the retry counter when an I2C DEFER is returned such that another read attempt will be made. This situation should normally only occur in compliance testing, however, as a worse case real-world scenario, it would result in 13 attempts ( 6 native defers, 7 I2C defers) for a single transaction to complete. The net result is a slightly slower response to an EDID read that shouldn't significantly impact overall performance. V2: - Added a check on the number of I2C Defers to limit the number of times that the retries variable will be decremented. This is to address review feedback regarding possible infinite loops from misbehaving sink devices. V3: - Fixed the limit value to 7 instead of 8 to get the correct retry count. - Combined the increment of the defer count into the if-statement V4: - Removed i915 tag from subject as the patch is not i915-specific V5: - Updated the for-loop to add the number of i2c defers to the retry counter such that the correct number of retry attempts will be made Signed-off-by: Todd Previte <tprevite@gmail.com> Cc: dri-devel@lists.freedesktop.org Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Sebastian Siewior authored
[ Upstream commit 08a5bb29 ] hugepd_free() used __get_cpu_var() once. Nothing ensured that the code accessing the variable did not migrate from one CPU to another and soon this was noticed by Tiejun Chen in 94b09d75 ("powerpc/hugetlb: Replace __get_cpu_var with get_cpu_var"). So we had it fixed. Christoph Lameter was doing his __get_cpu_var() replaces and forgot PowerPC. Then he noticed this and sent his fixed up batch again which got applied as 69111bac ("powerpc: Replace __get_cpu_var uses"). The careful reader will noticed one little detail: get_cpu_var() got replaced with this_cpu_ptr(). So now we have a put_cpu_var() which does a preempt_enable() and nothing that does preempt_disable() so we underflow the preempt counter. Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Christoph Lameter <cl@linux.com> Cc: stable@vger.kernel.org Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Xishi Qiu authored
[ Upstream commit 6f25a14a ] It is incorrect to use next_node to find a target node, it will return MAX_NUMNODES or invalid node. This will lead to crash in buddy system allocation. Fixes: c8721bbb ("mm: memory-hotplug: enable memory hotplug to handle hugepage") Signed-off-by: Xishi Qiu <qiuxishi@huawei.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Joonsoo Kim <js1304@gmail.com> Cc: David Rientjes <rientjes@google.com> Cc: "Laura Abbott" <lauraa@codeaurora.org> Cc: Hui Zhu <zhuhui@xiaomi.com> Cc: Wang Xiaoqiang <wangxq10@lzu.edu.cn> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Takashi Iwai authored
[ Upstream commit 4a07083e ] ALSA system timer backend stops the timer via del_timer() without sync and leaves del_timer_sync() at the close instead. This is because of the restriction by the design of ALSA timer: namely, the stop callback may be called from the timer handler, and calling the sync shall lead to a hangup. However, this also triggers a kernel BUG() when the timer is rearmed immediately after stopping without sync: kernel BUG at kernel/time/timer.c:966! Call Trace: <IRQ> [<ffffffff8239c94e>] snd_timer_s_start+0x13e/0x1a0 [<ffffffff8239e1f4>] snd_timer_interrupt+0x504/0xec0 [<ffffffff8122fca0>] ? debug_check_no_locks_freed+0x290/0x290 [<ffffffff8239ec64>] snd_timer_s_function+0xb4/0x120 [<ffffffff81296b72>] call_timer_fn+0x162/0x520 [<ffffffff81296add>] ? call_timer_fn+0xcd/0x520 [<ffffffff8239ebb0>] ? snd_timer_interrupt+0xec0/0xec0 .... It's the place where add_timer() checks the pending timer. It's clear that this may happen after the immediate restart without sync in our cases. So, the workaround here is just to use mod_timer() instead of add_timer(). This looks like a band-aid fix, but it's a right move, as snd_timer_interrupt() takes care of the continuous rearm of timer. Reported-by: Jiri Slaby <jslaby@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Nicolai Stange authored
[ Upstream commit e5435891 ] Despite what the DocBook comment to pkcs7_validate_trust() says, the *_trusted argument is never set to false. pkcs7_validate_trust() only positively sets *_trusted upon encountering a trusted PKCS#7 SignedInfo block. This is quite unfortunate since its callers, system_verify_data() for example, depend on pkcs7_validate_trust() clearing *_trusted on non-trust. Indeed, UBSAN splats when attempting to load the uninitialized local variable 'trusted' from system_verify_data() in pkcs7_validate_trust(): UBSAN: Undefined behaviour in crypto/asymmetric_keys/pkcs7_trust.c:194:14 load of value 82 is not a valid value for type '_Bool' [...] Call Trace: [<ffffffff818c4d35>] dump_stack+0xbc/0x117 [<ffffffff818c4c79>] ? _atomic_dec_and_lock+0x169/0x169 [<ffffffff8194113b>] ubsan_epilogue+0xd/0x4e [<ffffffff819419fa>] __ubsan_handle_load_invalid_value+0x111/0x158 [<ffffffff819418e9>] ? val_to_string.constprop.12+0xcf/0xcf [<ffffffff818334a4>] ? x509_request_asymmetric_key+0x114/0x370 [<ffffffff814b83f0>] ? kfree+0x220/0x370 [<ffffffff818312c2>] ? public_key_verify_signature_2+0x32/0x50 [<ffffffff81835e04>] pkcs7_validate_trust+0x524/0x5f0 [<ffffffff813c391a>] system_verify_data+0xca/0x170 [<ffffffff813c3850>] ? top_trace_array+0x9b/0x9b [<ffffffff81510b29>] ? __vfs_read+0x279/0x3d0 [<ffffffff8129372f>] mod_verify_sig+0x1ff/0x290 [...] The implication is that pkcs7_validate_trust() effectively grants trust when it really shouldn't have. Fix this by explicitly setting *_trusted to false at the very beginning of pkcs7_validate_trust(). Cc: <stable@vger.kernel.org> Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Guenter Roeck authored
[ Upstream commit 3c2e2266 ] arm:pxa_defconfig can result in the following crash if the max1111 driver is not instantiated. Unhandled fault: page domain fault (0x01b) at 0x00000000 pgd = c0004000 [00000000] *pgd=00000000 Internal error: : 1b [#1] PREEMPT ARM Modules linked in: CPU: 0 PID: 300 Comm: kworker/0:1 Not tainted 4.5.0-01301-g1701f680 #10 Hardware name: SHARP Akita Workqueue: events sharpsl_charge_toggle task: c390a000 ti: c391e000 task.ti: c391e000 PC is at max1111_read_channel+0x20/0x30 LR is at sharpsl_pm_pxa_read_max1111+0x2c/0x3c pc : [<c03aaab0>] lr : [<c0024b50>] psr: 20000013 ... [<c03aaab0>] (max1111_read_channel) from [<c0024b50>] (sharpsl_pm_pxa_read_max1111+0x2c/0x3c) [<c0024b50>] (sharpsl_pm_pxa_read_max1111) from [<c00262e0>] (spitzpm_read_devdata+0x5c/0xc4) [<c00262e0>] (spitzpm_read_devdata) from [<c0024094>] (sharpsl_check_battery_temp+0x78/0x110) [<c0024094>] (sharpsl_check_battery_temp) from [<c0024f9c>] (sharpsl_charge_toggle+0x48/0x110) [<c0024f9c>] (sharpsl_charge_toggle) from [<c004429c>] (process_one_work+0x14c/0x48c) [<c004429c>] (process_one_work) from [<c0044618>] (worker_thread+0x3c/0x5d4) [<c0044618>] (worker_thread) from [<c004a238>] (kthread+0xd0/0xec) [<c004a238>] (kthread) from [<c000a670>] (ret_from_fork+0x14/0x24) This can occur because the SPI controller driver (SPI_PXA2XX) is built as module and thus not necessarily loaded. While building SPI_PXA2XX into the kernel would make the problem disappear, it appears prudent to ensure that the driver is instantiated before accessing its data structures. Cc: Arnd Bergmann <arnd@arndb.de> Cc: stable@vger.kernel.org Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Takashi Iwai authored
[ Upstream commit 0ab1ace8 ] The commit [d507941b: ALSA: pcm: Correct PCM BUG error message] made the warning prefix back to "BUG:" due to its previous wrong prefix. But a kernel message containing "BUG:" seems taken as an Oops message wrongly by some brain-dead daemons, and it annoys users in the end. Instead of teaching daemons, change the string again to a more reasonable one. Fixes: 507941beb1e ('ALSA: pcm: Correct PCM BUG error message') Cc: <stable@vger.kernel.org> # v3.19+ Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Asai Thambi SP authored
[ Upstream commit 1b899eb4 ] commit cfc05bd3 upstream. Service thread does not detect the need for taskfile error hanlding. Fixed the flag condition to process taskfile error. Signed-off-by: Selvan Mani <smani@micron.com> Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Asai Thambi SP authored
[ Upstream commit 59cf70e2 ] When FTL rebuild is in progress, alloc_disk() initializes the disk but device node will be created by add_disk() only after successful completion of FTL rebuild. So, skip deletion of device node in removal path when FTL rebuild is in progress. Signed-off-by: Selvan Mani <smani@micron.com> Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com> Cc: stable@vger.kernel.org Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Sebastian Frias authored
[ Upstream commit 0b41ce99 ] Some UART HW has a single register combining UART_DLL/UART_DLM (this was probably forgotten in the change that introduced the callbacks, commit b32b19b8) Fixes: b32b19b8 ("[SERIAL] 8250: set divisor register correctly ...") Signed-off-by: Sebastian Frias <sf84@laposte.net> Reviewed-by: Peter Hurley <peter@hurleysoftware.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Grazvydas Ignotas authored
[ Upstream commit 5d74325a ] The patch that added Logitech Dual Action gamepad support forgot to update the special driver list for the device. This caused the logitech driver not to probe unless kernel module load order was favorable. Update the special driver list to fix it. Thanks to Simon Wood for the idea. Cc: Vitaly Katraew <zawullon@gmail.com> Fixes: 56d0c8b7 ("HID: add support for Logitech Dual Action gamepads") Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Jarkko Sakkinen authored
[ Upstream commit 8e0ee3c9 ] If the initialization fails before tpm_chip_register(), put_device() will be not called, which causes release callback not to be called. This patch fixes the issue by adding put_device() to devres list of the parent device. Fixes: 313d21ee ("tpm: device class for tpm") Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> cc: stable@vger.kernel.org Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Vladis Dronov authored
[ Upstream commit 836b34a9 ] create_fixed_stream_quirk(), snd_usb_parse_audio_interface() and create_uaxx_quirk() functions allocate the audioformat object by themselves and free it upon error before returning. However, once the object is linked to a stream, it's freed again in snd_usb_audio_pcm_free(), thus it'll be double-freed, eventually resulting in a memory corruption. This patch fixes these failures in the error paths by unlinking the audioformat object before freeing it. Based on a patch by Takashi Iwai <tiwai@suse.de> [Note for stable backports: this patch requires the commit 902eb7fd ('ALSA: usb-audio: Minor code cleanup in create_fixed_stream_quirk()')] Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1283358Reported-by: Ralf Spenneberg <ralf@spenneberg.net> Cc: <stable@vger.kernel.org> # see the note above Signed-off-by: Vladis Dronov <vdronov@redhat.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Takashi Iwai authored
[ Upstream commit 902eb7fd ] Just a minor code cleanup: unify the error paths. Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
DingXiang authored
[ Upstream commit 4df2bf46 ] Otherwise loading a "snapshot" table using the same device for the origin and COW devices, e.g.: echo "0 20971520 snapshot 253:3 253:3 P 8" | dmsetup create snap will trigger: BUG: unable to handle kernel NULL pointer dereference at 0000000000000098 [ 1958.979934] IP: [<ffffffffa040efba>] dm_exception_store_set_chunk_size+0x7a/0x110 [dm_snapshot] [ 1958.989655] PGD 0 [ 1958.991903] Oops: 0000 [#1] SMP ... [ 1959.059647] CPU: 9 PID: 3556 Comm: dmsetup Tainted: G IO 4.5.0-rc5.snitm+ #150 ... [ 1959.083517] task: ffff8800b9660c80 ti: ffff88032a954000 task.ti: ffff88032a954000 [ 1959.091865] RIP: 0010:[<ffffffffa040efba>] [<ffffffffa040efba>] dm_exception_store_set_chunk_size+0x7a/0x110 [dm_snapshot] [ 1959.104295] RSP: 0018:ffff88032a957b30 EFLAGS: 00010246 [ 1959.110219] RAX: 0000000000000000 RBX: 0000000000000008 RCX: 0000000000000001 [ 1959.118180] RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff880329334a00 [ 1959.126141] RBP: ffff88032a957b50 R08: 0000000000000000 R09: 0000000000000001 [ 1959.134102] R10: 000000000000000a R11: f000000000000000 R12: ffff880330884d80 [ 1959.142061] R13: 0000000000000008 R14: ffffc90001c13088 R15: ffff880330884d80 [ 1959.150021] FS: 00007f8926ba3840(0000) GS:ffff880333440000(0000) knlGS:0000000000000000 [ 1959.159047] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1959.165456] CR2: 0000000000000098 CR3: 000000032f48b000 CR4: 00000000000006e0 [ 1959.173415] Stack: [ 1959.175656] ffffc90001c13040 ffff880329334a00 ffff880330884ed0 ffff88032a957bdc [ 1959.183946] ffff88032a957bb8 ffffffffa040f225 ffff880329334a30 ffff880300000000 [ 1959.192233] ffffffffa04133e0 ffff880329334b30 0000000830884d58 00000000569c58cf [ 1959.200521] Call Trace: [ 1959.203248] [<ffffffffa040f225>] dm_exception_store_create+0x1d5/0x240 [dm_snapshot] [ 1959.211986] [<ffffffffa040d310>] snapshot_ctr+0x140/0x630 [dm_snapshot] [ 1959.219469] [<ffffffffa0005c44>] ? dm_split_args+0x64/0x150 [dm_mod] [ 1959.226656] [<ffffffffa0005ea7>] dm_table_add_target+0x177/0x440 [dm_mod] [ 1959.234328] [<ffffffffa0009203>] table_load+0x143/0x370 [dm_mod] [ 1959.241129] [<ffffffffa00090c0>] ? retrieve_status+0x1b0/0x1b0 [dm_mod] [ 1959.248607] [<ffffffffa0009e35>] ctl_ioctl+0x255/0x4d0 [dm_mod] [ 1959.255307] [<ffffffff813304e2>] ? memzero_explicit+0x12/0x20 [ 1959.261816] [<ffffffffa000a0c3>] dm_ctl_ioctl+0x13/0x20 [dm_mod] [ 1959.268615] [<ffffffff81215eb6>] do_vfs_ioctl+0xa6/0x5c0 [ 1959.274637] [<ffffffff81120d2f>] ? __audit_syscall_entry+0xaf/0x100 [ 1959.281726] [<ffffffff81003176>] ? do_audit_syscall_entry+0x66/0x70 [ 1959.288814] [<ffffffff81216449>] SyS_ioctl+0x79/0x90 [ 1959.294450] [<ffffffff8167e4ae>] entry_SYSCALL_64_fastpath+0x12/0x71 ... [ 1959.323277] RIP [<ffffffffa040efba>] dm_exception_store_set_chunk_size+0x7a/0x110 [dm_snapshot] [ 1959.333090] RSP <ffff88032a957b30> [ 1959.336978] CR2: 0000000000000098 [ 1959.344121] ---[ end trace b049991ccad1169e ]--- Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1195899 Cc: stable@vger.kernel.org Signed-off-by: Ding Xiang <dingxiang@huawei.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Arnd Bergmann authored
[ Upstream commit b9a1a743 ] ARM64 allmodconfig produces a bunch of warnings when building the samsung ASoC code: sound/soc/samsung/dmaengine.c: In function 'samsung_asoc_init_dma_data': sound/soc/samsung/dmaengine.c:53:32: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] playback_data->filter_data = (void *)playback->channel; sound/soc/samsung/dmaengine.c:60:31: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] capture_data->filter_data = (void *)capture->channel; We could easily shut up the warning by adding an intermediate cast, but there is a bigger underlying problem: The use of IORESOURCE_DMA to pass data from platform code to device drivers is dubious to start with, as what we really want is a pointer that can be passed into a filter function. Note that on s3c64xx, the pl08x DMA data is already a pointer, but gets cast to resource_size_t so we can pass it as a resource, and it then gets converted back to a pointer. In contrast, the data we pass for s3c24xx is an index into a device specific table, and we artificially convert that into a pointer for the filter function. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Krzysztof Hałasa authored
[ Upstream commit 54c6e2dd ] pci_create_root_bus() passes a "parent" pointer to pci_bus_assign_domain_nr(). When CONFIG_PCI_DOMAINS_GENERIC is defined, pci_bus_assign_domain_nr() dereferences that pointer. Many callers of pci_create_root_bus() supply a NULL "parent" pointer, which leads to a NULL pointer dereference error. 7c674700 ("PCI: Move domain assignment from arm64 to generic code") moved the "parent" dereference from arm64 to generic code. Only arm64 used that code (because only arm64 defined CONFIG_PCI_DOMAINS_GENERIC), and it always supplied a valid "parent" pointer. Other arches supplied NULL "parent" pointers but didn't defined CONFIG_PCI_DOMAINS_GENERIC, so they used a no-op version of pci_bus_assign_domain_nr(). 8c7d1474 ("ARM/PCI: Move to generic PCI domains") defined CONFIG_PCI_DOMAINS_GENERIC on ARM, and many ARM platforms use pci_common_init(), which supplies a NULL "parent" pointer. These platforms (cns3xxx, dove, footbridge, iop13xx, etc.) crash with a NULL pointer dereference like this while probing PCI: Unable to handle kernel NULL pointer dereference at virtual address 000000a4 PC is at pci_bus_assign_domain_nr+0x10/0x84 LR is at pci_create_root_bus+0x48/0x2e4 Kernel panic - not syncing: Attempted to kill init! [bhelgaas: changelog, add "Reported:" and "Fixes:" tags] Reported: http://forum.doozan.com/read.php?2,17868,22070,quote=1 Fixes: 8c7d1474 ("ARM/PCI: Move to generic PCI domains") Fixes: 7c674700 ("PCI: Move domain assignment from arm64 to generic code") Signed-off-by: Krzysztof Hałasa <khalasa@piap.pl> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> CC: stable@vger.kernel.org # v4.0+ Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Miklos Szeredi authored
[ Upstream commit 6343a212 ] (Another one for the f_path debacle.) ltp fcntl33 testcase caused an Oops in selinux_file_send_sigiotask. The reason is that generic_add_lease() used filp->f_path.dentry->inode while all the others use file_inode(). This makes a difference for files opened on overlayfs since the former will point to the overlay inode the latter to the underlying inode. So generic_add_lease() added the lease to the overlay inode and generic_delete_lease() removed it from the underlying inode. When the file was released the lease remained on the overlay inode's lock list, resulting in use after free. Reported-by: Eryu Guan <eguan@redhat.com> Fixes: 4bacc9c9 ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay") Cc: <stable@vger.kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Andrey Ulanov authored
[ Upstream commit e06b933e ] - m_start() in fs/namespace.c expects that ns->event is incremented each time a mount added or removed from ns->list. - umount_tree() removes items from the list but does not increment event counter, expecting that it's done before the function is called. - There are some codepaths that call umount_tree() without updating "event" counter. e.g. from __detach_mounts(). - When this happens m_start may reuse a cached mount structure that no longer belongs to ns->list (i.e. use after free which usually leads to infinite loop). This change fixes the above problem by incrementing global event counter before invoking umount_tree(). Change-Id: I622c8e84dcb9fb63542372c5dbf0178ee86bb589 Cc: stable@vger.kernel.org Signed-off-by: Andrey Ulanov <andreyu@google.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Trond Myklebust authored
[ Upstream commit e547f262 ] Olga Kornievskaia reports that the following test fails to trigger an OPEN_DOWNGRADE on the wire, and only triggers the final CLOSE. fd0 = open(foo, RDRW) -- should be open on the wire for "both" fd1 = open(foo, RDONLY) -- should be open on the wire for "read" close(fd0) -- should trigger an open_downgrade read(fd1) close(fd1) The issue is that we're missing a check for whether or not the current state transitioned from an O_RDWR state as opposed to having transitioned from a combination of O_RDONLY and O_WRONLY. Reported-by: Olga Kornievskaia <aglo@umich.edu> Fixes: cd9288ff ("NFSv4: Fix another bug in the close/open_downgrade code") Cc: stable@vger.kernel.org # 2.6.33+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Michael Holzheu authored
[ Upstream commit 5419447e ] This reverts commit 852ffd0f. There are use cases where an intermediate boot kernel (1) uses kexec to boot the final production kernel (2). For this scenario we should provide the original boot information to the production kernel (2). Therefore clearing the boot information during kexec() should not be done. Cc: stable@vger.kernel.org # v3.17+ Reported-by: Steffen Maier <maier@linux.vnet.ibm.com> Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Alexey Brodkin authored
[ Upstream commit 9bd54517 ] If CONFIG_ARC_DW2_UNWIND is disabled every time arc_unwind_core() gets called following message gets printed in debug console: ----------------->8--------------- CONFIG_ARC_DW2_UNWIND needs to be enabled ----------------->8--------------- That message makes sense if user indeed wants to see a backtrace or get nice function call-graphs in perf but what if user disabled unwinder for the purpose? Why pollute his debug console? So instead we'll warn user about possibly missing feature once and let him decide if that was what he or she really wanted. Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Cc: stable@vger.kernel.org Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Vineet Gupta authored
[ Upstream commit f52e126c ] With recent binutils update to support dwarf CFI pseudo-ops in gas, we now get .eh_frame vs. .debug_frame. Although the call frame info is exactly the same in both, the CIE differs, which the current kernel unwinder can't cope with. This broke both the kernel unwinder as well as loadable modules (latter because of a new unhandled relo R_ARC_32_PCREL from .rela.eh_frame in the module loader) The ideal solution would be to switch unwinder to .eh_frame. For now however we can make do by just ensureing .debug_frame is generated by removing -fasynchronous-unwind-tables .eh_frame generated with -gdwarf-2 -fasynchronous-unwind-tables .debug_frame generated with -gdwarf-2 Fixes STAR 9001058196 Cc: stable@vger.kernel.org Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Alan Stern authored
[ Upstream commit ab2a4bf8 ] The USB core contains a bug that can show up when a USB-3 host controller is removed. If the primary (USB-2) hcd structure is released before the shared (USB-3) hcd, the core will try to do a double-free of the common bandwidth_mutex. The problem was described in graphical form by Chung-Geol Kim, who first reported it: ================================================= At *remove USB(3.0) Storage sequence <1> --> <5> ((Problem Case)) ================================================= VOLD ------------------------------------|------------ (uevent) ________|_________ |<1> | |dwc3_otg_sm_work | |usb_put_hcd | |peer_hcd(kref=2)| |__________________| ________|_________ |<2> | |New USB BUS #2 | | | |peer_hcd(kref=1) | | | --(Link)-bandXX_mutex| | |__________________| | ___________________ | |<3> | | |dwc3_otg_sm_work | | |usb_put_hcd | | |primary_hcd(kref=1)| | |___________________| | _________|_________ | |<4> | | |New USB BUS #1 | | |hcd_release | | |primary_hcd(kref=0)| | | | | |bandXX_mutex(free) |<- |___________________| (( VOLD )) ______|___________ |<5> | | SCSI | |usb_put_hcd | |peer_hcd(kref=0) | |*hcd_release | |bandXX_mutex(free*)|<- double free |__________________| ================================================= This happens because hcd_release() frees the bandwidth_mutex whenever it sees a primary hcd being released (which is not a very good idea in any case), but in the course of releasing the primary hcd, it changes the pointers in the shared hcd in such a way that the shared hcd will appear to be primary when it gets released. This patch fixes the problem by changing hcd_release() so that it deallocates the bandwidth_mutex only when the _last_ hcd structure referencing it is released. The patch also removes an unnecessary test, so that when an hcd is released, both the shared_hcd and primary_hcd pointers in the hcd's peer will be cleared. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-by: Chung-Geol Kim <chunggeol.kim@samsung.com> Tested-by: Chung-Geol Kim <chunggeol.kim@samsung.com> CC: <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Al Viro authored
[ Upstream commit d20cb71d ] In "NFSv4: Move dentry instantiation into the NFSv4-specific atomic open code" unconditional d_drop() after the ->open_context() had been removed. It had been correct for success cases (there ->open_context() itself had been doing dcache manipulations), but not for error ones. Only one of those (ENOENT) got a compensatory d_drop() added in that commit, but in fact it should've been done for all errors. As it is, the case of O_CREAT non-exclusive open on a hashed negative dentry racing with e.g. symlink creation from another client ended up with ->open_context() getting an error and proceeding to call nfs_lookup(). On a hashed dentry, which would've instantly triggered BUG_ON() in d_materialise_unique() (or, these days, its equivalent in d_splice_alias()). Cc: stable@vger.kernel.org # v3.10+ Tested-by: Oleg Drokin <green@linuxhacker.ru> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
James Morse authored
[ Upstream commit 591d215a ] kvm provides kvm_vcpu_uninit(), which amongst other things, releases the last reference to the struct pid of the task that was last running the vcpu. On arm64 built with CONFIG_DEBUG_KMEMLEAK, starting a guest with kvmtool, then killing it with SIGKILL results (after some considerable time) in: > cat /sys/kernel/debug/kmemleak > unreferenced object 0xffff80007d5ea080 (size 128): > comm "lkvm", pid 2025, jiffies 4294942645 (age 1107.776s) > hex dump (first 32 bytes): > 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ > backtrace: > [<ffff8000001b30ec>] create_object+0xfc/0x278 > [<ffff80000071da34>] kmemleak_alloc+0x34/0x70 > [<ffff80000019fa2c>] kmem_cache_alloc+0x16c/0x1d8 > [<ffff8000000d0474>] alloc_pid+0x34/0x4d0 > [<ffff8000000b5674>] copy_process.isra.6+0x79c/0x1338 > [<ffff8000000b633c>] _do_fork+0x74/0x320 > [<ffff8000000b66b0>] SyS_clone+0x18/0x20 > [<ffff800000085cb0>] el0_svc_naked+0x24/0x28 > [<ffffffffffffffff>] 0xffffffffffffffff On x86 kvm_vcpu_uninit() is called on the path from kvm_arch_destroy_vm(), on arm no equivalent call is made. Add the call to kvm_arch_vcpu_free(). Signed-off-by: James Morse <james.morse@arm.com> Fixes: 749cf76c ("KVM: ARM: Initial skeleton to compile KVM support") Cc: <stable@vger.kernel.org> # 3.10+ Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Cyril Bur authored
[ Upstream commit 8e96a87c ] Userspace can quite legitimately perform an exec() syscall with a suspended transaction. exec() does not return to the old process, rather it load a new one and starts that, the expectation therefore is that the new process starts not in a transaction. Currently exec() is not treated any differently to any other syscall which creates problems. Firstly it could allow a new process to start with a suspended transaction for a binary that no longer exists. This means that the checkpointed state won't be valid and if the suspended transaction were ever to be resumed and subsequently aborted (a possibility which is exceedingly likely as exec()ing will likely doom the transaction) the new process will jump to invalid state. Secondly the incorrect attempt to keep the transactional state while still zeroing state for the new process creates at least two TM Bad Things. The first triggers on the rfid to return to userspace as start_thread() has given the new process a 'clean' MSR but the suspend will still be set in the hardware MSR. The second TM Bad Thing triggers in __switch_to() as the processor is still transactionally suspended but __switch_to() wants to zero the TM sprs for the new process. This is an example of the outcome of calling exec() with a suspended transaction. Note the first 700 is likely the first TM bad thing decsribed earlier only the kernel can't report it as we've loaded userspace registers. c000000000009980 is the rfid in fast_exception_return() Bad kernel stack pointer 3fffcfa1a370 at c000000000009980 Oops: Bad kernel stack pointer, sig: 6 [#1] CPU: 0 PID: 2006 Comm: tm-execed Not tainted NIP: c000000000009980 LR: 0000000000000000 CTR: 0000000000000000 REGS: c00000003ffefd40 TRAP: 0700 Not tainted MSR: 8000000300201031 <SF,ME,IR,DR,LE,TM[SE]> CR: 00000000 XER: 00000000 CFAR: c0000000000098b4 SOFTE: 0 PACATMSCRATCH: b00000010000d033 GPR00: 0000000000000000 00003fffcfa1a370 0000000000000000 0000000000000000 GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR12: 00003fff966611c0 0000000000000000 0000000000000000 0000000000000000 NIP [c000000000009980] fast_exception_return+0xb0/0xb8 LR [0000000000000000] (null) Call Trace: Instruction dump: f84d0278 e9a100d8 7c7b03a6 e84101a0 7c4ff120 e8410170 7c5a03a6 e8010070 e8410080 e8610088 e8810090 e8210078 <4c000024> 48000000 e8610178 88ed023b Kernel BUG at c000000000043e80 [verbose debug info unavailable] Unexpected TM Bad Thing exception at c000000000043e80 (msr 0x201033) Oops: Unrecoverable exception, sig: 6 [#2] CPU: 0 PID: 2006 Comm: tm-execed Tainted: G D task: c0000000fbea6d80 ti: c00000003ffec000 task.ti: c0000000fb7ec000 NIP: c000000000043e80 LR: c000000000015a24 CTR: 0000000000000000 REGS: c00000003ffef7e0 TRAP: 0700 Tainted: G D MSR: 8000000300201033 <SF,ME,IR,DR,RI,LE,TM[SE]> CR: 28002828 XER: 00000000 CFAR: c000000000015a20 SOFTE: 0 PACATMSCRATCH: b00000010000d033 GPR00: 0000000000000000 c00000003ffefa60 c000000000db5500 c0000000fbead000 GPR04: 8000000300001033 2222222222222222 2222222222222222 00000000ff160000 GPR08: 0000000000000000 800000010000d033 c0000000fb7e3ea0 c00000000fe00004 GPR12: 0000000000002200 c00000000fe00000 0000000000000000 0000000000000000 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 0000000000000000 0000000000000000 c0000000fbea7410 00000000ff160000 GPR24: c0000000ffe1f600 c0000000fbea8700 c0000000fbea8700 c0000000fbead000 GPR28: c000000000e20198 c0000000fbea6d80 c0000000fbeab680 c0000000fbea6d80 NIP [c000000000043e80] tm_restore_sprs+0xc/0x1c LR [c000000000015a24] __switch_to+0x1f4/0x420 Call Trace: Instruction dump: 7c800164 4e800020 7c0022a6 f80304a8 7c0222a6 f80304b0 7c0122a6 f80304b8 4e800020 e80304a8 7c0023a6 e80304b0 <7c0223a6> e80304b8 7c0123a6 4e800020 This fixes CVE-2016-5828. Fixes: bc2a9408 ("powerpc: Hook in new transactional memory code") Cc: stable@vger.kernel.org # v3.9+ Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Torsten Hilbrich authored
[ Upstream commit 63d2f95d ] The value `bytes' comes from the filesystem which is about to be mounted. We cannot trust that the value is always in the range we expect it to be. Check its value before using it to calculate the length for the crc32_le call. It value must be larger (or equal) sumoff + 4. This fixes a kernel bug when accidentially mounting an image file which had the nilfs2 magic value 0x3434 at the right offset 0x406 by chance. The bytes 0x01 0x00 were stored at 0x408 and were interpreted as a s_bytes value of 1. This caused an underflow when substracting sumoff + 4 (20) in the call to crc32_le. BUG: unable to handle kernel paging request at ffff88021e600000 IP: crc32_le+0x36/0x100 ... Call Trace: nilfs_valid_sb.part.5+0x52/0x60 [nilfs2] nilfs_load_super_block+0x142/0x300 [nilfs2] init_nilfs+0x60/0x390 [nilfs2] nilfs_mount+0x302/0x520 [nilfs2] mount_fs+0x38/0x160 vfs_kern_mount+0x67/0x110 do_mount+0x269/0xe00 SyS_mount+0x9f/0x100 entry_SYSCALL_64_fastpath+0x16/0x71 Link: http://lkml.kernel.org/r/1466778587-5184-2-git-send-email-konishi.ryusuke@lab.ntt.co.jpSigned-off-by: Torsten Hilbrich <torsten.hilbrich@secunet.com> Tested-by: Torsten Hilbrich <torsten.hilbrich@secunet.com> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
David Rientjes authored
[ Upstream commit a4f04f2c ] If the memory compaction free scanner cannot successfully split a free page (only possible due to per-zone low watermark), terminate the free scanner rather than continuing to scan memory needlessly. If the watermark is insufficient for a free page of order <= cc->order, then terminate the scanner since all future splits will also likely fail. This prevents the compaction freeing scanner from scanning all memory on very large zones (very noticeable for zones > 128GB, for instance) when all splits will likely fail while holding zone->lock. compaction_alloc() iterating a 128GB zone has been benchmarked to take over 400ms on some systems whereas any free page isolated and ready to be split ends up failing in split_free_page() because of the low watermark check and thus the iteration continues. The next time compaction occurs, the freeing scanner will likely start at the end of the zone again since no success was made previously and we get the same lengthy iteration until the zone is brought above the low watermark. All thp page faults can take >400ms in such a state without this fix. Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1606211820350.97086@chino.kir.corp.google.comSigned-off-by: David Rientjes <rientjes@google.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Minchan Kim <minchan@kernel.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Hugh Dickins <hughd@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Vlastimil Babka authored
[ Upstream commit 9fcd6d2e ] The compaction free scanner is looking for PageBuddy() pages and skipping all others. For large compound pages such as THP or hugetlbfs, we can save a lot of iterations if we skip them at once using their compound_order(). This is generally unsafe and we can read a bogus value of order due to a race, but if we are careful, the only danger is skipping too much. When tested with stress-highalloc from mmtests on 4GB system with 1GB hugetlbfs pages, the vmstat compact_free_scanned count decreased by at least 15%. Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Cc: Minchan Kim <minchan@kernel.org> Cc: Mel Gorman <mgorman@suse.de> Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: Michal Nazarewicz <mina86@mina86.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Christoph Lameter <cl@linux.com> Cc: Rik van Riel <riel@redhat.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Lukasz Odzioba authored
[ Upstream commit 8f182270 ] Currently we can have compound pages held on per cpu pagevecs, which leads to a lot of memory unavailable for reclaim when needed. In the systems with hundreads of processors it can be GBs of memory. On of the way of reproducing the problem is to not call munmap explicitly on all mapped regions (i.e. after receiving SIGTERM). After that some pages (with THP enabled also huge pages) may end up on lru_add_pvec, example below. void main() { #pragma omp parallel { size_t size = 55 * 1000 * 1000; // smaller than MEM/CPUS void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS , -1, 0); if (p != MAP_FAILED) memset(p, 0, size); //munmap(p, size); // uncomment to make the problem go away } } When we run it with THP enabled it will leave significant amount of memory on lru_add_pvec. This memory will be not reclaimed if we hit OOM, so when we run above program in a loop: for i in `seq 100`; do ./a.out; done many processes (95% in my case) will be killed by OOM. The primary point of the LRU add cache is to save the zone lru_lock contention with a hope that more pages will belong to the same zone and so their addition can be batched. The huge page is already a form of batched addition (it will add 512 worth of memory in one go) so skipping the batching seems like a safer option when compared to a potential excess in the caching which can be quite large and much harder to fix because lru_add_drain_all is way to expensive and it is not really clear what would be a good moment to call it. Similarly we can reproduce the problem on lru_deactivate_pvec by adding: madvise(p, size, MADV_FREE); after memset. This patch flushes lru pvecs on compound page arrival making the problem less severe - after applying it kill rate of above example drops to 0%, due to reducing maximum amount of memory held on pvec from 28MB (with THP) to 56kB per CPU. Suggested-by: Michal Hocko <mhocko@suse.com> Link: http://lkml.kernel.org/r/1466180198-18854-1-git-send-email-lukasz.odzioba@intel.comSigned-off-by: Lukasz Odzioba <lukasz.odzioba@intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Vladimir Davydov <vdavydov@parallels.com> Cc: Ming Li <mingli199x@qq.com> Cc: Minchan Kim <minchan@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Anthony Romano authored
[ Upstream commit b9b4bb26 ] When fallocate is interrupted it will undo a range that extends one byte past its range of allocated pages. This can corrupt an in-use page by zeroing out its first byte. Instead, undo using the inclusive byte range. Fixes: 1635f6a7 ("tmpfs: undo fallocation on failure") Link: http://lkml.kernel.org/r/1462713387-16724-1-git-send-email-anthony.romano@coreos.comSigned-off-by: Anthony Romano <anthony.romano@coreos.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Hugh Dickins <hughd@google.com> Cc: Brandon Philips <brandon@ifup.co> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Alan Stern authored
[ Upstream commit 7e8b3dfe ] The HOSTPC extension registers found in some EHCI implementations form a variable-length array, with one element for each port. Therefore the hostpc field in struct ehci_regs should be declared as a zero-length array, not a single-element array. This fixes a problem reported by UBSAN. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-by: Wilfried Klaebe <linux-kernel@lebenslange-mailadresse.de> Tested-by: Wilfried Klaebe <linux-kernel@lebenslange-mailadresse.de> CC: <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Steve French authored
[ Upstream commit 45e8a258 ] POSIX allows files with trailing spaces or a trailing period but SMB3 does not, so convert these using the normal Services For Mac mapping as we do for other reserved characters such as : < > | ? * This is similar to what Macs do for the same problem over SMB3. CC: Stable <stable@vger.kernel.org> Signed-off-by: Steve French <steve.french@primarydata.com> Acked-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Steve French authored
[ Upstream commit 4fcd1813 ] Azure server blocks clients that open a socket and don't do anything on it. In our reconnect scenarios, we can reconnect the tcp session and detect the socket is available but we defer the negprot and SMB3 session setup and tree connect reconnection until the next i/o is requested, but this looks suspicous to some servers who expect SMB3 negprog and session setup soon after a socket is created. In the echo thread, reconnect SMB3 sessions and tree connections that are disconnected. A later patch will replay persistent (and resilient) handle opens. CC: Stable <stable@vger.kernel.org> Signed-off-by: Steve French <steve.french@primarydata.com> Acked-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Weston Andros Adamson authored
[ Upstream commit 5e3a9888 ] pnfs_generic_commit_cancel_empty_pagelist calls nfs_commitdata_release, but that is wrong: nfs_commitdata_release puts the open context, something that isn't valid until nfs_init_commit is called, which is never the case when pnfs_generic_commit_cancel_empty_pagelist is called. This was introduced in "nfs: avoid race that crashes nfs_init_commit". Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Weston Andros Adamson authored
[ Upstream commit ade8febd ] Since the patch "NFS: Allow multiple commit requests in flight per file" we can run multiple simultaneous commits on the same inode. This introduced a race over collecting pages to commit that made it possible to call nfs_init_commit() with an empty list - which causes crashes like the one below. The fix is to catch this race and avoid calling nfs_init_commit and initiate_commit when there is no work to do. Here is the crash: [600522.076832] BUG: unable to handle kernel NULL pointer dereference at 0000000000000040 [600522.078475] IP: [<ffffffffa0479e72>] nfs_init_commit+0x22/0x130 [nfs] [600522.078745] PGD 4272b1067 PUD 4272cb067 PMD 0 [600522.078972] Oops: 0000 [#1] SMP [600522.079204] Modules linked in: nfsv3 nfs_layout_flexfiles rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache dcdbas ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw vmw_vsock_vmci_transport vsock bonding ipmi_devintf ipmi_msghandler coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel ppdev vmw_balloon parport_pc parport acpi_cpufreq vmw_vmci i2c_piix4 shpchp nfsd auth_rpcgss nfs_acl lockd grace sunrpc xfs libcrc32c vmwgfx drm_kms_helper ttm drm crc32c_intel serio_raw vmxnet3 [600522.081380] vmw_pvscsi ata_generic pata_acpi [600522.081809] CPU: 3 PID: 15667 Comm: /usr/bin/python Not tainted 4.1.9-100.pd.88.el7.x86_64 #1 [600522.082281] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/30/2014 [600522.082814] task: ffff8800bbbfa780 ti: ffff88042ae84000 task.ti: ffff88042ae84000 [600522.083378] RIP: 0010:[<ffffffffa0479e72>] [<ffffffffa0479e72>] nfs_init_commit+0x22/0x130 [nfs] [600522.083973] RSP: 0018:ffff88042ae87438 EFLAGS: 00010246 [600522.084571] RAX: 0000000000000000 RBX: ffff880003485e40 RCX: ffff88042ae87588 [600522.085188] RDX: 0000000000000000 RSI: ffff88042ae874b0 RDI: ffff880003485e40 [600522.085756] RBP: ffff88042ae87448 R08: ffff880003486010 R09: ffff88042ae874b0 [600522.086332] R10: 0000000000000000 R11: 0000000000000005 R12: ffff88042ae872d0 [600522.086905] R13: ffff88042ae874b0 R14: ffff880003485e40 R15: ffff88042704c840 [600522.087484] FS: 00007f4728ff2740(0000) GS:ffff88043fd80000(0000) knlGS:0000000000000000 [600522.088070] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [600522.088663] CR2: 0000000000000040 CR3: 000000042b6aa000 CR4: 00000000001406e0 [600522.089327] Stack: [600522.089926] 0000000000000001 ffff88042ae87588 ffff88042ae874f8 ffffffffa04f09fa [600522.090549] 0000000000017840 0000000000017840 ffff88042ae87588 ffff8803258d9930 [600522.091169] ffff88042ae87578 ffffffffa0563d80 0000000000000000 ffff88042704c840 [600522.091789] Call Trace: [600522.092420] [<ffffffffa04f09fa>] pnfs_generic_commit_pagelist+0x1da/0x320 [nfsv4] [600522.093052] [<ffffffffa0563d80>] ? ff_layout_commit_prepare_v3+0x30/0x30 [nfs_layout_flexfiles] [600522.093696] [<ffffffffa0562645>] ff_layout_commit_pagelist+0x15/0x20 [nfs_layout_flexfiles] [600522.094359] [<ffffffffa047bc78>] nfs_generic_commit_list+0xe8/0x120 [nfs] [600522.095032] [<ffffffffa047bd6a>] nfs_commit_inode+0xba/0x110 [nfs] [600522.095719] [<ffffffffa046ac54>] nfs_release_page+0x44/0xd0 [nfs] [600522.096410] [<ffffffff811a8122>] try_to_release_page+0x32/0x50 [600522.097109] [<ffffffff811bd4f1>] shrink_page_list+0x961/0xb30 [600522.097812] [<ffffffff811bdced>] shrink_inactive_list+0x1cd/0x550 [600522.098530] [<ffffffff811bea65>] shrink_lruvec+0x635/0x840 [600522.099250] [<ffffffff811bed60>] shrink_zone+0xf0/0x2f0 [600522.099974] [<ffffffff811bf312>] do_try_to_free_pages+0x192/0x470 [600522.100709] [<ffffffff811bf6ca>] try_to_free_pages+0xda/0x170 [600522.101464] [<ffffffff811b2198>] __alloc_pages_nodemask+0x588/0x970 [600522.102235] [<ffffffff811fbbd5>] alloc_pages_vma+0xb5/0x230 [600522.103000] [<ffffffff813a1589>] ? cpumask_any_but+0x39/0x50 [600522.103774] [<ffffffff811d6115>] wp_page_copy.isra.55+0x95/0x490 [600522.104558] [<ffffffff810e3438>] ? __wake_up+0x48/0x60 [600522.105357] [<ffffffff811d7d3b>] do_wp_page+0xab/0x4f0 [600522.106137] [<ffffffff810a1bbb>] ? release_task+0x36b/0x470 [600522.106902] [<ffffffff8126dbd7>] ? eventfd_ctx_read+0x67/0x1c0 [600522.107659] [<ffffffff811da2a8>] handle_mm_fault+0xc78/0x1900 [600522.108431] [<ffffffff81067ef1>] __do_page_fault+0x181/0x420 [600522.109173] [<ffffffff811446a6>] ? __audit_syscall_exit+0x1e6/0x280 [600522.109893] [<ffffffff810681c0>] do_page_fault+0x30/0x80 [600522.110594] [<ffffffff81024f36>] ? syscall_trace_leave+0xc6/0x120 [600522.111288] [<ffffffff81790a58>] page_fault+0x28/0x30 [600522.111947] Code: 5d c3 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 4c 8d 87 d0 01 00 00 48 89 e5 53 48 89 fb 48 83 ec 08 4c 8b 0e 49 8b 41 18 4c 39 ce <48> 8b 40 40 4c 8b 50 30 74 24 48 8b 87 d0 01 00 00 48 8b 7e 08 [600522.113343] RIP [<ffffffffa0479e72>] nfs_init_commit+0x22/0x130 [nfs] [600522.114003] RSP <ffff88042ae87438> [600522.114636] CR2: 0000000000000040 Fixes: af7cf057 (NFS: Allow multiple commit requests in flight per file) CC: stable@vger.kernel.org Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Trond Myklebust authored
[ Upstream commit 27571297 ] I'm not aware of any bugreports around this issue, but the locking around the pnfs_commit_bucket is inconsistent at best. This patch tightens it up by ensuring that the 'bucket->committing' list is always changed atomically w.r.t. the 'bucket->clseg' layout segment tracking. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-