1. 11 Jul, 2016 40 commits
    • Alex Deucher's avatar
      cd18a79a
    • Daniel Vetter's avatar
      drm/udl: Use unlocked gem unreferencing · 28a201e2
      Daniel Vetter authored
      [ Upstream commit 72b9ff06 ]
      
      For drm_gem_object_unreference callers are required to hold
      dev->struct_mutex, which these paths don't. Enforcing this requirement
      has become a bit more strict with
      
      commit ef4c6270
      Author: Daniel Vetter <daniel.vetter@ffwll.ch>
      Date:   Thu Oct 15 09:36:25 2015 +0200
      
          drm/gem: Check locking in drm_gem_object_unreference
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@intel.com>
      Signed-off-by: default avatarDave Airlie <airlied@redhat.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      28a201e2
    • Todd Previte's avatar
      drm: Fix for DP CTS test 4.2.2.5 - I2C DEFER handling · da38a239
      Todd Previte authored
      [ Upstream commit 396aa445 ]
      
      For test 4.2.2.5 to pass per the Link CTS Core 1.2 rev1.1 spec, the source
      device must attempt at least 7 times to read the EDID when it receives an
      I2C defer. The normal DRM code makes only 7 retries, regardless of whether
      or not the response is a native defer or an I2C defer. Test 4.2.2.5 fails
      since there are native defers interspersed with the I2C defers which
      results in less than 7 EDID read attempts.
      
      The solution is to add the numer of defers to the retry counter when an I2C
      DEFER is returned such that another read attempt will be made. This situation
      should normally only occur in compliance testing, however, as a worse case
      real-world scenario, it would result in 13 attempts ( 6 native defers, 7 I2C
      defers) for a single transaction to complete. The net result is a slightly
      slower response to an EDID read that shouldn't significantly impact overall
      performance.
      
      V2:
      - Added a check on the number of I2C Defers to limit the number
        of times that the retries variable will be decremented. This
        is to address review feedback regarding possible infinite loops
        from misbehaving sink devices.
      V3:
      - Fixed the limit value to 7 instead of 8 to get the correct retry
        count.
      - Combined the increment of the defer count into the if-statement
      V4:
      - Removed i915 tag from subject as the patch is not i915-specific
      V5:
      - Updated the for-loop to add the number of i2c defers to the retry
        counter such that the correct number of retry attempts will be
        made
      Signed-off-by: default avatarTodd Previte <tprevite@gmail.com>
      Cc: dri-devel@lists.freedesktop.org
      Reviewed-by: default avatarPaulo Zanoni <paulo.r.zanoni@intel.com>
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      da38a239
    • Sebastian Siewior's avatar
      powerpc/mm: Fixup preempt underflow with huge pages · 7e69c7b2
      Sebastian Siewior authored
      [ Upstream commit 08a5bb29 ]
      
      hugepd_free() used __get_cpu_var() once. Nothing ensured that the code
      accessing the variable did not migrate from one CPU to another and soon
      this was noticed by Tiejun Chen in 94b09d75 ("powerpc/hugetlb:
      Replace __get_cpu_var with get_cpu_var"). So we had it fixed.
      
      Christoph Lameter was doing his __get_cpu_var() replaces and forgot
      PowerPC. Then he noticed this and sent his fixed up batch again which
      got applied as 69111bac ("powerpc: Replace __get_cpu_var uses").
      
      The careful reader will noticed one little detail: get_cpu_var() got
      replaced with this_cpu_ptr(). So now we have a put_cpu_var() which does
      a preempt_enable() and nothing that does preempt_disable() so we
      underflow the preempt counter.
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Reviewed-by: default avatarAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      7e69c7b2
    • Xishi Qiu's avatar
      mm: fix invalid node in alloc_migrate_target() · 39bf5645
      Xishi Qiu authored
      [ Upstream commit 6f25a14a ]
      
      It is incorrect to use next_node to find a target node, it will return
      MAX_NUMNODES or invalid node.  This will lead to crash in buddy system
      allocation.
      
      Fixes: c8721bbb ("mm: memory-hotplug: enable memory hotplug to handle hugepage")
      Signed-off-by: default avatarXishi Qiu <qiuxishi@huawei.com>
      Acked-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Acked-by: default avatarNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: "Laura Abbott" <lauraa@codeaurora.org>
      Cc: Hui Zhu <zhuhui@xiaomi.com>
      Cc: Wang Xiaoqiang <wangxq10@lzu.edu.cn>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      39bf5645
    • Takashi Iwai's avatar
      ALSA: timer: Use mod_timer() for rearming the system timer · b9f2aab6
      Takashi Iwai authored
      [ Upstream commit 4a07083e ]
      
      ALSA system timer backend stops the timer via del_timer() without sync
      and leaves del_timer_sync() at the close instead.  This is because of
      the restriction by the design of ALSA timer: namely, the stop callback
      may be called from the timer handler, and calling the sync shall lead
      to a hangup.  However, this also triggers a kernel BUG() when the
      timer is rearmed immediately after stopping without sync:
       kernel BUG at kernel/time/timer.c:966!
       Call Trace:
        <IRQ>
        [<ffffffff8239c94e>] snd_timer_s_start+0x13e/0x1a0
        [<ffffffff8239e1f4>] snd_timer_interrupt+0x504/0xec0
        [<ffffffff8122fca0>] ? debug_check_no_locks_freed+0x290/0x290
        [<ffffffff8239ec64>] snd_timer_s_function+0xb4/0x120
        [<ffffffff81296b72>] call_timer_fn+0x162/0x520
        [<ffffffff81296add>] ? call_timer_fn+0xcd/0x520
        [<ffffffff8239ebb0>] ? snd_timer_interrupt+0xec0/0xec0
        ....
      
      It's the place where add_timer() checks the pending timer.  It's clear
      that this may happen after the immediate restart without sync in our
      cases.
      
      So, the workaround here is just to use mod_timer() instead of
      add_timer().  This looks like a band-aid fix, but it's a right move,
      as snd_timer_interrupt() takes care of the continuous rearm of timer.
      Reported-by: default avatarJiri Slaby <jslaby@suse.cz>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      b9f2aab6
    • Nicolai Stange's avatar
      PKCS#7: pkcs7_validate_trust(): initialize the _trusted output argument · fb7a806f
      Nicolai Stange authored
      [ Upstream commit e5435891 ]
      
      Despite what the DocBook comment to pkcs7_validate_trust() says, the
      *_trusted argument is never set to false.
      
      pkcs7_validate_trust() only positively sets *_trusted upon encountering
      a trusted PKCS#7 SignedInfo block.
      
      This is quite unfortunate since its callers, system_verify_data() for
      example, depend on pkcs7_validate_trust() clearing *_trusted on non-trust.
      
      Indeed, UBSAN splats when attempting to load the uninitialized local
      variable 'trusted' from system_verify_data() in pkcs7_validate_trust():
      
        UBSAN: Undefined behaviour in crypto/asymmetric_keys/pkcs7_trust.c:194:14
        load of value 82 is not a valid value for type '_Bool'
        [...]
        Call Trace:
          [<ffffffff818c4d35>] dump_stack+0xbc/0x117
          [<ffffffff818c4c79>] ? _atomic_dec_and_lock+0x169/0x169
          [<ffffffff8194113b>] ubsan_epilogue+0xd/0x4e
          [<ffffffff819419fa>] __ubsan_handle_load_invalid_value+0x111/0x158
          [<ffffffff819418e9>] ? val_to_string.constprop.12+0xcf/0xcf
          [<ffffffff818334a4>] ? x509_request_asymmetric_key+0x114/0x370
          [<ffffffff814b83f0>] ? kfree+0x220/0x370
          [<ffffffff818312c2>] ? public_key_verify_signature_2+0x32/0x50
          [<ffffffff81835e04>] pkcs7_validate_trust+0x524/0x5f0
          [<ffffffff813c391a>] system_verify_data+0xca/0x170
          [<ffffffff813c3850>] ? top_trace_array+0x9b/0x9b
          [<ffffffff81510b29>] ? __vfs_read+0x279/0x3d0
          [<ffffffff8129372f>] mod_verify_sig+0x1ff/0x290
          [...]
      
      The implication is that pkcs7_validate_trust() effectively grants trust
      when it really shouldn't have.
      
      Fix this by explicitly setting *_trusted to false at the very beginning
      of pkcs7_validate_trust().
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarNicolai Stange <nicstange@gmail.com>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      fb7a806f
    • Guenter Roeck's avatar
      hwmon: (max1111) Return -ENODEV from max1111_read_channel if not instantiated · bf15ba34
      Guenter Roeck authored
      [ Upstream commit 3c2e2266 ]
      
      arm:pxa_defconfig can result in the following crash if the max1111 driver
      is not instantiated.
      
      Unhandled fault: page domain fault (0x01b) at 0x00000000
      pgd = c0004000
      [00000000] *pgd=00000000
      Internal error: : 1b [#1] PREEMPT ARM
      Modules linked in:
      CPU: 0 PID: 300 Comm: kworker/0:1 Not tainted 4.5.0-01301-g1701f680 #10
      Hardware name: SHARP Akita
      Workqueue: events sharpsl_charge_toggle
      task: c390a000 ti: c391e000 task.ti: c391e000
      PC is at max1111_read_channel+0x20/0x30
      LR is at sharpsl_pm_pxa_read_max1111+0x2c/0x3c
      pc : [<c03aaab0>]    lr : [<c0024b50>]    psr: 20000013
      ...
      [<c03aaab0>] (max1111_read_channel) from [<c0024b50>]
      					(sharpsl_pm_pxa_read_max1111+0x2c/0x3c)
      [<c0024b50>] (sharpsl_pm_pxa_read_max1111) from [<c00262e0>]
      					(spitzpm_read_devdata+0x5c/0xc4)
      [<c00262e0>] (spitzpm_read_devdata) from [<c0024094>]
      					(sharpsl_check_battery_temp+0x78/0x110)
      [<c0024094>] (sharpsl_check_battery_temp) from [<c0024f9c>]
      					(sharpsl_charge_toggle+0x48/0x110)
      [<c0024f9c>] (sharpsl_charge_toggle) from [<c004429c>]
      					(process_one_work+0x14c/0x48c)
      [<c004429c>] (process_one_work) from [<c0044618>] (worker_thread+0x3c/0x5d4)
      [<c0044618>] (worker_thread) from [<c004a238>] (kthread+0xd0/0xec)
      [<c004a238>] (kthread) from [<c000a670>] (ret_from_fork+0x14/0x24)
      
      This can occur because the SPI controller driver (SPI_PXA2XX) is built as
      module and thus not necessarily loaded. While building SPI_PXA2XX into the
      kernel would make the problem disappear, it appears prudent to ensure that
      the driver is instantiated before accessing its data structures.
      
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      bf15ba34
    • Takashi Iwai's avatar
      ALSA: pcm: Avoid "BUG:" string for warnings again · 4c06bf7e
      Takashi Iwai authored
      [ Upstream commit 0ab1ace8 ]
      
      The commit [d507941b: ALSA: pcm: Correct PCM BUG error message]
      made the warning prefix back to "BUG:" due to its previous wrong
      prefix.  But a kernel message containing "BUG:" seems taken as an Oops
      message wrongly by some brain-dead daemons, and it annoys users in the
      end.  Instead of teaching daemons, change the string again to a more
      reasonable one.
      
      Fixes: 507941beb1e ('ALSA: pcm: Correct PCM BUG error message')
      Cc: <stable@vger.kernel.org> # v3.19+
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      4c06bf7e
    • Asai Thambi SP's avatar
      mtip32xx: Fix broken service thread handling · cab68096
      Asai Thambi SP authored
      [ Upstream commit 1b899eb4 ]
      
      commit cfc05bd3 upstream.
      
      Service thread does not detect the need for taskfile error hanlding. Fixed the
      flag condition to process taskfile error.
      Signed-off-by: default avatarSelvan Mani <smani@micron.com>
      Signed-off-by: default avatarAsai Thambi S P <asamymuthupa@micron.com>
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      cab68096
    • Asai Thambi SP's avatar
      mtip32xx: Fix for rmmod crash when drive is in FTL rebuild · 30dbed7b
      Asai Thambi SP authored
      [ Upstream commit 59cf70e2 ]
      
      When FTL rebuild is in progress, alloc_disk() initializes the disk
      but device node will be created by add_disk() only after successful
      completion of FTL rebuild. So, skip deletion of device node in
      removal path when FTL rebuild is in progress.
      Signed-off-by: default avatarSelvan Mani <smani@micron.com>
      Signed-off-by: default avatarAsai Thambi S P <asamymuthupa@micron.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      30dbed7b
    • Sebastian Frias's avatar
      8250: use callbacks to access UART_DLL/UART_DLM · c6aa8bae
      Sebastian Frias authored
      [ Upstream commit 0b41ce99 ]
      
      Some UART HW has a single register combining UART_DLL/UART_DLM
      (this was probably forgotten in the change that introduced the
      callbacks, commit b32b19b8)
      
      Fixes: b32b19b8 ("[SERIAL] 8250: set divisor register correctly ...")
      Signed-off-by: default avatarSebastian Frias <sf84@laposte.net>
      Reviewed-by: default avatarPeter Hurley <peter@hurleysoftware.com>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      c6aa8bae
    • Grazvydas Ignotas's avatar
      HID: logitech: fix Dual Action gamepad support · e4e503b2
      Grazvydas Ignotas authored
      [ Upstream commit 5d74325a ]
      
      The patch that added Logitech Dual Action gamepad support forgot to
      update the special driver list for the device. This caused the logitech
      driver not to probe unless kernel module load order was favorable.
      Update the special driver list to fix it. Thanks to Simon Wood for the
      idea.
      
      Cc: Vitaly Katraew <zawullon@gmail.com>
      Fixes: 56d0c8b7 ("HID: add support for Logitech Dual Action gamepads")
      Signed-off-by: default avatarGrazvydas Ignotas <notasas@gmail.com>
      Signed-off-by: default avatarJiri Kosina <jkosina@suse.cz>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      e4e503b2
    • Jarkko Sakkinen's avatar
      tpm: fix the cleanup of struct tpm_chip · eae5f796
      Jarkko Sakkinen authored
      [ Upstream commit 8e0ee3c9 ]
      
      If the initialization fails before tpm_chip_register(), put_device()
      will be not called, which causes release callback not to be called.
      This patch fixes the issue by adding put_device() to devres list of
      the parent device.
      
      Fixes: 313d21ee ("tpm: device class for tpm")
      Signed-off-by: default avatarJarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      cc: stable@vger.kernel.org
      Reviewed-by: default avatarJason Gunthorpe <jgunthorpe@obsidianresearch.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      eae5f796
    • Vladis Dronov's avatar
      ALSA: usb-audio: Fix double-free in error paths after snd_usb_add_audio_stream() call · 03e046ef
      Vladis Dronov authored
      [ Upstream commit 836b34a9 ]
      
      create_fixed_stream_quirk(), snd_usb_parse_audio_interface() and
      create_uaxx_quirk() functions allocate the audioformat object by themselves
      and free it upon error before returning. However, once the object is linked
      to a stream, it's freed again in snd_usb_audio_pcm_free(), thus it'll be
      double-freed, eventually resulting in a memory corruption.
      
      This patch fixes these failures in the error paths by unlinking the audioformat
      object before freeing it.
      
      Based on a patch by Takashi Iwai <tiwai@suse.de>
      
      [Note for stable backports:
       this patch requires the commit 902eb7fd ('ALSA: usb-audio: Minor
       code cleanup in create_fixed_stream_quirk()')]
      
      Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1283358Reported-by: default avatarRalf Spenneberg <ralf@spenneberg.net>
      Cc: <stable@vger.kernel.org> # see the note above
      Signed-off-by: default avatarVladis Dronov <vdronov@redhat.com>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      03e046ef
    • Takashi Iwai's avatar
      ALSA: usb-audio: Minor code cleanup in create_fixed_stream_quirk() · 00ef5df8
      Takashi Iwai authored
      [ Upstream commit 902eb7fd ]
      
      Just a minor code cleanup: unify the error paths.
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      00ef5df8
    • DingXiang's avatar
      dm snapshot: disallow the COW and origin devices from being identical · b5ba0d06
      DingXiang authored
      [ Upstream commit 4df2bf46 ]
      
      Otherwise loading a "snapshot" table using the same device for the
      origin and COW devices, e.g.:
      
      echo "0 20971520 snapshot 253:3 253:3 P 8" | dmsetup create snap
      
      will trigger:
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000098
      [ 1958.979934] IP: [<ffffffffa040efba>] dm_exception_store_set_chunk_size+0x7a/0x110 [dm_snapshot]
      [ 1958.989655] PGD 0
      [ 1958.991903] Oops: 0000 [#1] SMP
      ...
      [ 1959.059647] CPU: 9 PID: 3556 Comm: dmsetup Tainted: G          IO    4.5.0-rc5.snitm+ #150
      ...
      [ 1959.083517] task: ffff8800b9660c80 ti: ffff88032a954000 task.ti: ffff88032a954000
      [ 1959.091865] RIP: 0010:[<ffffffffa040efba>]  [<ffffffffa040efba>] dm_exception_store_set_chunk_size+0x7a/0x110 [dm_snapshot]
      [ 1959.104295] RSP: 0018:ffff88032a957b30  EFLAGS: 00010246
      [ 1959.110219] RAX: 0000000000000000 RBX: 0000000000000008 RCX: 0000000000000001
      [ 1959.118180] RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff880329334a00
      [ 1959.126141] RBP: ffff88032a957b50 R08: 0000000000000000 R09: 0000000000000001
      [ 1959.134102] R10: 000000000000000a R11: f000000000000000 R12: ffff880330884d80
      [ 1959.142061] R13: 0000000000000008 R14: ffffc90001c13088 R15: ffff880330884d80
      [ 1959.150021] FS:  00007f8926ba3840(0000) GS:ffff880333440000(0000) knlGS:0000000000000000
      [ 1959.159047] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 1959.165456] CR2: 0000000000000098 CR3: 000000032f48b000 CR4: 00000000000006e0
      [ 1959.173415] Stack:
      [ 1959.175656]  ffffc90001c13040 ffff880329334a00 ffff880330884ed0 ffff88032a957bdc
      [ 1959.183946]  ffff88032a957bb8 ffffffffa040f225 ffff880329334a30 ffff880300000000
      [ 1959.192233]  ffffffffa04133e0 ffff880329334b30 0000000830884d58 00000000569c58cf
      [ 1959.200521] Call Trace:
      [ 1959.203248]  [<ffffffffa040f225>] dm_exception_store_create+0x1d5/0x240 [dm_snapshot]
      [ 1959.211986]  [<ffffffffa040d310>] snapshot_ctr+0x140/0x630 [dm_snapshot]
      [ 1959.219469]  [<ffffffffa0005c44>] ? dm_split_args+0x64/0x150 [dm_mod]
      [ 1959.226656]  [<ffffffffa0005ea7>] dm_table_add_target+0x177/0x440 [dm_mod]
      [ 1959.234328]  [<ffffffffa0009203>] table_load+0x143/0x370 [dm_mod]
      [ 1959.241129]  [<ffffffffa00090c0>] ? retrieve_status+0x1b0/0x1b0 [dm_mod]
      [ 1959.248607]  [<ffffffffa0009e35>] ctl_ioctl+0x255/0x4d0 [dm_mod]
      [ 1959.255307]  [<ffffffff813304e2>] ? memzero_explicit+0x12/0x20
      [ 1959.261816]  [<ffffffffa000a0c3>] dm_ctl_ioctl+0x13/0x20 [dm_mod]
      [ 1959.268615]  [<ffffffff81215eb6>] do_vfs_ioctl+0xa6/0x5c0
      [ 1959.274637]  [<ffffffff81120d2f>] ? __audit_syscall_entry+0xaf/0x100
      [ 1959.281726]  [<ffffffff81003176>] ? do_audit_syscall_entry+0x66/0x70
      [ 1959.288814]  [<ffffffff81216449>] SyS_ioctl+0x79/0x90
      [ 1959.294450]  [<ffffffff8167e4ae>] entry_SYSCALL_64_fastpath+0x12/0x71
      ...
      [ 1959.323277] RIP  [<ffffffffa040efba>] dm_exception_store_set_chunk_size+0x7a/0x110 [dm_snapshot]
      [ 1959.333090]  RSP <ffff88032a957b30>
      [ 1959.336978] CR2: 0000000000000098
      [ 1959.344121] ---[ end trace b049991ccad1169e ]---
      
      Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1195899
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarDing Xiang <dingxiang@huawei.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      b5ba0d06
    • Arnd Bergmann's avatar
      ASoC: samsung: pass DMA channels as pointers · 8f828413
      Arnd Bergmann authored
      [ Upstream commit b9a1a743 ]
      
      ARM64 allmodconfig produces a bunch of warnings when building the
      samsung ASoC code:
      
      sound/soc/samsung/dmaengine.c: In function 'samsung_asoc_init_dma_data':
      sound/soc/samsung/dmaengine.c:53:32: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
         playback_data->filter_data = (void *)playback->channel;
      sound/soc/samsung/dmaengine.c:60:31: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
         capture_data->filter_data = (void *)capture->channel;
      
      We could easily shut up the warning by adding an intermediate cast,
      but there is a bigger underlying problem: The use of IORESOURCE_DMA
      to pass data from platform code to device drivers is dubious to start
      with, as what we really want is a pointer that can be passed into
      a filter function.
      
      Note that on s3c64xx, the pl08x DMA data is already a pointer, but
      gets cast to resource_size_t so we can pass it as a resource, and it
      then gets converted back to a pointer. In contrast, the data we pass
      for s3c24xx is an index into a device specific table, and we artificially
      convert that into a pointer for the filter function.
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Reviewed-by: default avatarKrzysztof Kozlowski <k.kozlowski@samsung.com>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      8f828413
    • Krzysztof Hałasa's avatar
      PCI: Allow a NULL "parent" pointer in pci_bus_assign_domain_nr() · 85aa23b0
      Krzysztof Hałasa authored
      [ Upstream commit 54c6e2dd ]
      
      pci_create_root_bus() passes a "parent" pointer to
      pci_bus_assign_domain_nr().  When CONFIG_PCI_DOMAINS_GENERIC is defined,
      pci_bus_assign_domain_nr() dereferences that pointer.  Many callers of
      pci_create_root_bus() supply a NULL "parent" pointer, which leads to a NULL
      pointer dereference error.
      
      7c674700 ("PCI: Move domain assignment from arm64 to generic code")
      moved the "parent" dereference from arm64 to generic code.  Only arm64 used
      that code (because only arm64 defined CONFIG_PCI_DOMAINS_GENERIC), and it
      always supplied a valid "parent" pointer.  Other arches supplied NULL
      "parent" pointers but didn't defined CONFIG_PCI_DOMAINS_GENERIC, so they
      used a no-op version of pci_bus_assign_domain_nr().
      
      8c7d1474 ("ARM/PCI: Move to generic PCI domains") defined
      CONFIG_PCI_DOMAINS_GENERIC on ARM, and many ARM platforms use
      pci_common_init(), which supplies a NULL "parent" pointer.
      These platforms (cns3xxx, dove, footbridge, iop13xx, etc.) crash
      with a NULL pointer dereference like this while probing PCI:
      
        Unable to handle kernel NULL pointer dereference at virtual address 000000a4
        PC is at pci_bus_assign_domain_nr+0x10/0x84
        LR is at pci_create_root_bus+0x48/0x2e4
        Kernel panic - not syncing: Attempted to kill init!
      
      [bhelgaas: changelog, add "Reported:" and "Fixes:" tags]
      Reported: http://forum.doozan.com/read.php?2,17868,22070,quote=1
      Fixes: 8c7d1474 ("ARM/PCI: Move to generic PCI domains")
      Fixes: 7c674700 ("PCI: Move domain assignment from arm64 to generic code")
      Signed-off-by: default avatarKrzysztof Hałasa <khalasa@piap.pl>
      Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      Acked-by: default avatarLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      CC: stable@vger.kernel.org	# v4.0+
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      85aa23b0
    • Miklos Szeredi's avatar
      locks: use file_inode() · a1f678e5
      Miklos Szeredi authored
      [ Upstream commit 6343a212 ]
      
      (Another one for the f_path debacle.)
      
      ltp fcntl33 testcase caused an Oops in selinux_file_send_sigiotask.
      
      The reason is that generic_add_lease() used filp->f_path.dentry->inode
      while all the others use file_inode().  This makes a difference for files
      opened on overlayfs since the former will point to the overlay inode the
      latter to the underlying inode.
      
      So generic_add_lease() added the lease to the overlay inode and
      generic_delete_lease() removed it from the underlying inode.  When the file
      was released the lease remained on the overlay inode's lock list, resulting
      in use after free.
      Reported-by: default avatarEryu Guan <eguan@redhat.com>
      Fixes: 4bacc9c9 ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Reviewed-by: default avatarJeff Layton <jlayton@redhat.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      a1f678e5
    • Andrey Ulanov's avatar
      namespace: update event counter when umounting a deleted dentry · 2119a62b
      Andrey Ulanov authored
      [ Upstream commit e06b933e ]
      
      - m_start() in fs/namespace.c expects that ns->event is incremented each
        time a mount added or removed from ns->list.
      - umount_tree() removes items from the list but does not increment event
        counter, expecting that it's done before the function is called.
      - There are some codepaths that call umount_tree() without updating
        "event" counter. e.g. from __detach_mounts().
      - When this happens m_start may reuse a cached mount structure that no
        longer belongs to ns->list (i.e. use after free which usually leads
        to infinite loop).
      
      This change fixes the above problem by incrementing global event counter
      before invoking umount_tree().
      
      Change-Id: I622c8e84dcb9fb63542372c5dbf0178ee86bb589
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarAndrey Ulanov <andreyu@google.com>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      2119a62b
    • Trond Myklebust's avatar
      NFS: Fix another OPEN_DOWNGRADE bug · 4a883b82
      Trond Myklebust authored
      [ Upstream commit e547f262 ]
      
      Olga Kornievskaia reports that the following test fails to trigger
      an OPEN_DOWNGRADE on the wire, and only triggers the final CLOSE.
      
      	fd0 = open(foo, RDRW)   -- should be open on the wire for "both"
      	fd1 = open(foo, RDONLY)  -- should be open on the wire for "read"
      	close(fd0) -- should trigger an open_downgrade
      	read(fd1)
      	close(fd1)
      
      The issue is that we're missing a check for whether or not the current
      state transitioned from an O_RDWR state as opposed to having transitioned
      from a combination of O_RDONLY and O_WRONLY.
      Reported-by: default avatarOlga Kornievskaia <aglo@umich.edu>
      Fixes: cd9288ff ("NFSv4: Fix another bug in the close/open_downgrade code")
      Cc: stable@vger.kernel.org # 2.6.33+
      Signed-off-by: default avatarTrond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: default avatarAnna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      4a883b82
    • Michael Holzheu's avatar
      Revert "s390/kdump: Clear subchannel ID to signal non-CCW/SCSI IPL" · 2bbd6a57
      Michael Holzheu authored
      [ Upstream commit 5419447e ]
      
      This reverts commit 852ffd0f.
      
      There are use cases where an intermediate boot kernel (1) uses kexec
      to boot the final production kernel (2). For this scenario we should
      provide the original boot information to the production kernel (2).
      Therefore clearing the boot information during kexec() should not
      be done.
      
      Cc: stable@vger.kernel.org # v3.17+
      Reported-by: default avatarSteffen Maier <maier@linux.vnet.ibm.com>
      Signed-off-by: default avatarMichael Holzheu <holzheu@linux.vnet.ibm.com>
      Reviewed-by: default avatarHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      2bbd6a57
    • Alexey Brodkin's avatar
      arc: unwind: warn only once if DW2_UNWIND is disabled · 49cacd2b
      Alexey Brodkin authored
      [ Upstream commit 9bd54517 ]
      
      If CONFIG_ARC_DW2_UNWIND is disabled every time arc_unwind_core()
      gets called following message gets printed in debug console:
      ----------------->8---------------
      CONFIG_ARC_DW2_UNWIND needs to be enabled
      ----------------->8---------------
      
      That message makes sense if user indeed wants to see a backtrace or
      get nice function call-graphs in perf but what if user disabled
      unwinder for the purpose? Why pollute his debug console?
      
      So instead we'll warn user about possibly missing feature once and
      let him decide if that was what he or she really wanted.
      Signed-off-by: default avatarAlexey Brodkin <abrodkin@synopsys.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarVineet Gupta <vgupta@synopsys.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      49cacd2b
    • Vineet Gupta's avatar
      ARC: unwind: ensure that .debug_frame is generated (vs. .eh_frame) · 7678c949
      Vineet Gupta authored
      [ Upstream commit f52e126c ]
      
      With recent binutils update to support dwarf CFI pseudo-ops in gas, we
      now get .eh_frame vs. .debug_frame. Although the call frame info is
      exactly the same in both, the CIE differs, which the current kernel
      unwinder can't cope with.
      
      This broke both the kernel unwinder as well as loadable modules (latter
      because of a new unhandled relo R_ARC_32_PCREL from .rela.eh_frame in
      the module loader)
      
      The ideal solution would be to switch unwinder to .eh_frame.
      For now however we can make do by just ensureing .debug_frame is
      generated by removing -fasynchronous-unwind-tables
      
       .eh_frame    generated with -gdwarf-2 -fasynchronous-unwind-tables
       .debug_frame generated with -gdwarf-2
      
      Fixes STAR 9001058196
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarVineet Gupta <vgupta@synopsys.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      7678c949
    • Alan Stern's avatar
      USB: don't free bandwidth_mutex too early · 74219215
      Alan Stern authored
      [ Upstream commit ab2a4bf8 ]
      
      The USB core contains a bug that can show up when a USB-3 host
      controller is removed.  If the primary (USB-2) hcd structure is
      released before the shared (USB-3) hcd, the core will try to do a
      double-free of the common bandwidth_mutex.
      
      The problem was described in graphical form by Chung-Geol Kim, who
      first reported it:
      
      =================================================
           At *remove USB(3.0) Storage
           sequence <1> --> <5> ((Problem Case))
      =================================================
                                        VOLD
      ------------------------------------|------------
                                       (uevent)
                                  ________|_________
                                 |<1>               |
                                 |dwc3_otg_sm_work  |
                                 |usb_put_hcd       |
                                 |peer_hcd(kref=2)|
                                 |__________________|
                                  ________|_________
                                 |<2>               |
                                 |New USB BUS #2    |
                                 |                  |
                                 |peer_hcd(kref=1)  |
                                 |                  |
                               --(Link)-bandXX_mutex|
                               | |__________________|
                               |
          ___________________  |
         |<3>                | |
         |dwc3_otg_sm_work   | |
         |usb_put_hcd        | |
         |primary_hcd(kref=1)| |
         |___________________| |
          _________|_________  |
         |<4>                | |
         |New USB BUS #1     | |
         |hcd_release        | |
         |primary_hcd(kref=0)| |
         |                   | |
         |bandXX_mutex(free) |<-
         |___________________|
                                     (( VOLD ))
                                  ______|___________
                                 |<5>               |
                                 |      SCSI        |
                                 |usb_put_hcd       |
                                 |peer_hcd(kref=0)  |
                                 |*hcd_release      |
                                 |bandXX_mutex(free*)|<- double free
                                 |__________________|
      
      =================================================
      
      This happens because hcd_release() frees the bandwidth_mutex whenever
      it sees a primary hcd being released (which is not a very good idea
      in any case), but in the course of releasing the primary hcd, it
      changes the pointers in the shared hcd in such a way that the shared
      hcd will appear to be primary when it gets released.
      
      This patch fixes the problem by changing hcd_release() so that it
      deallocates the bandwidth_mutex only when the _last_ hcd structure
      referencing it is released.  The patch also removes an unnecessary
      test, so that when an hcd is released, both the shared_hcd and
      primary_hcd pointers in the hcd's peer will be cleared.
      Signed-off-by: default avatarAlan Stern <stern@rowland.harvard.edu>
      Reported-by: default avatarChung-Geol Kim <chunggeol.kim@samsung.com>
      Tested-by: default avatarChung-Geol Kim <chunggeol.kim@samsung.com>
      CC: <stable@vger.kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      74219215
    • Al Viro's avatar
      make nfs_atomic_open() call d_drop() on all ->open_context() errors. · 3f3d3526
      Al Viro authored
      [ Upstream commit d20cb71d ]
      
      In "NFSv4: Move dentry instantiation into the NFSv4-specific atomic open code"
      unconditional d_drop() after the ->open_context() had been removed.  It had
      been correct for success cases (there ->open_context() itself had been doing
      dcache manipulations), but not for error ones.  Only one of those (ENOENT)
      got a compensatory d_drop() added in that commit, but in fact it should've
      been done for all errors.  As it is, the case of O_CREAT non-exclusive open
      on a hashed negative dentry racing with e.g. symlink creation from another
      client ended up with ->open_context() getting an error and proceeding to
      call nfs_lookup().  On a hashed dentry, which would've instantly triggered
      BUG_ON() in d_materialise_unique() (or, these days, its equivalent in
      d_splice_alias()).
      
      Cc: stable@vger.kernel.org # v3.10+
      Tested-by: default avatarOleg Drokin <green@linuxhacker.ru>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarTrond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: default avatarAnna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      3f3d3526
    • James Morse's avatar
      KVM: arm/arm64: Stop leaking vcpu pid references · d4b08964
      James Morse authored
      [ Upstream commit 591d215a ]
      
      kvm provides kvm_vcpu_uninit(), which amongst other things, releases the
      last reference to the struct pid of the task that was last running the vcpu.
      
      On arm64 built with CONFIG_DEBUG_KMEMLEAK, starting a guest with kvmtool,
      then killing it with SIGKILL results (after some considerable time) in:
      > cat /sys/kernel/debug/kmemleak
      > unreferenced object 0xffff80007d5ea080 (size 128):
      >  comm "lkvm", pid 2025, jiffies 4294942645 (age 1107.776s)
      >  hex dump (first 32 bytes):
      >    01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      >    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      >  backtrace:
      >    [<ffff8000001b30ec>] create_object+0xfc/0x278
      >    [<ffff80000071da34>] kmemleak_alloc+0x34/0x70
      >    [<ffff80000019fa2c>] kmem_cache_alloc+0x16c/0x1d8
      >    [<ffff8000000d0474>] alloc_pid+0x34/0x4d0
      >    [<ffff8000000b5674>] copy_process.isra.6+0x79c/0x1338
      >    [<ffff8000000b633c>] _do_fork+0x74/0x320
      >    [<ffff8000000b66b0>] SyS_clone+0x18/0x20
      >    [<ffff800000085cb0>] el0_svc_naked+0x24/0x28
      >    [<ffffffffffffffff>] 0xffffffffffffffff
      
      On x86 kvm_vcpu_uninit() is called on the path from kvm_arch_destroy_vm(),
      on arm no equivalent call is made. Add the call to kvm_arch_vcpu_free().
      Signed-off-by: default avatarJames Morse <james.morse@arm.com>
      Fixes: 749cf76c ("KVM: ARM: Initial skeleton to compile KVM support")
      Cc: <stable@vger.kernel.org> # 3.10+
      Acked-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: default avatarChristoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      d4b08964
    • Cyril Bur's avatar
      powerpc/tm: Always reclaim in start_thread() for exec() class syscalls · 848be477
      Cyril Bur authored
      [ Upstream commit 8e96a87c ]
      
      Userspace can quite legitimately perform an exec() syscall with a
      suspended transaction. exec() does not return to the old process, rather
      it load a new one and starts that, the expectation therefore is that the
      new process starts not in a transaction. Currently exec() is not treated
      any differently to any other syscall which creates problems.
      
      Firstly it could allow a new process to start with a suspended
      transaction for a binary that no longer exists. This means that the
      checkpointed state won't be valid and if the suspended transaction were
      ever to be resumed and subsequently aborted (a possibility which is
      exceedingly likely as exec()ing will likely doom the transaction) the
      new process will jump to invalid state.
      
      Secondly the incorrect attempt to keep the transactional state while
      still zeroing state for the new process creates at least two TM Bad
      Things. The first triggers on the rfid to return to userspace as
      start_thread() has given the new process a 'clean' MSR but the suspend
      will still be set in the hardware MSR. The second TM Bad Thing triggers
      in __switch_to() as the processor is still transactionally suspended but
      __switch_to() wants to zero the TM sprs for the new process.
      
      This is an example of the outcome of calling exec() with a suspended
      transaction. Note the first 700 is likely the first TM bad thing
      decsribed earlier only the kernel can't report it as we've loaded
      userspace registers. c000000000009980 is the rfid in
      fast_exception_return()
      
        Bad kernel stack pointer 3fffcfa1a370 at c000000000009980
        Oops: Bad kernel stack pointer, sig: 6 [#1]
        CPU: 0 PID: 2006 Comm: tm-execed Not tainted
        NIP: c000000000009980 LR: 0000000000000000 CTR: 0000000000000000
        REGS: c00000003ffefd40 TRAP: 0700   Not tainted
        MSR: 8000000300201031 <SF,ME,IR,DR,LE,TM[SE]>  CR: 00000000  XER: 00000000
        CFAR: c0000000000098b4 SOFTE: 0
        PACATMSCRATCH: b00000010000d033
        GPR00: 0000000000000000 00003fffcfa1a370 0000000000000000 0000000000000000
        GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
        GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
        GPR12: 00003fff966611c0 0000000000000000 0000000000000000 0000000000000000
        NIP [c000000000009980] fast_exception_return+0xb0/0xb8
        LR [0000000000000000]           (null)
        Call Trace:
        Instruction dump:
        f84d0278 e9a100d8 7c7b03a6 e84101a0 7c4ff120 e8410170 7c5a03a6 e8010070
        e8410080 e8610088 e8810090 e8210078 <4c000024> 48000000 e8610178 88ed023b
      
        Kernel BUG at c000000000043e80 [verbose debug info unavailable]
        Unexpected TM Bad Thing exception at c000000000043e80 (msr 0x201033)
        Oops: Unrecoverable exception, sig: 6 [#2]
        CPU: 0 PID: 2006 Comm: tm-execed Tainted: G      D
        task: c0000000fbea6d80 ti: c00000003ffec000 task.ti: c0000000fb7ec000
        NIP: c000000000043e80 LR: c000000000015a24 CTR: 0000000000000000
        REGS: c00000003ffef7e0 TRAP: 0700   Tainted: G      D
        MSR: 8000000300201033 <SF,ME,IR,DR,RI,LE,TM[SE]>  CR: 28002828  XER: 00000000
        CFAR: c000000000015a20 SOFTE: 0
        PACATMSCRATCH: b00000010000d033
        GPR00: 0000000000000000 c00000003ffefa60 c000000000db5500 c0000000fbead000
        GPR04: 8000000300001033 2222222222222222 2222222222222222 00000000ff160000
        GPR08: 0000000000000000 800000010000d033 c0000000fb7e3ea0 c00000000fe00004
        GPR12: 0000000000002200 c00000000fe00000 0000000000000000 0000000000000000
        GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
        GPR20: 0000000000000000 0000000000000000 c0000000fbea7410 00000000ff160000
        GPR24: c0000000ffe1f600 c0000000fbea8700 c0000000fbea8700 c0000000fbead000
        GPR28: c000000000e20198 c0000000fbea6d80 c0000000fbeab680 c0000000fbea6d80
        NIP [c000000000043e80] tm_restore_sprs+0xc/0x1c
        LR [c000000000015a24] __switch_to+0x1f4/0x420
        Call Trace:
        Instruction dump:
        7c800164 4e800020 7c0022a6 f80304a8 7c0222a6 f80304b0 7c0122a6 f80304b8
        4e800020 e80304a8 7c0023a6 e80304b0 <7c0223a6> e80304b8 7c0123a6 4e800020
      
      This fixes CVE-2016-5828.
      
      Fixes: bc2a9408 ("powerpc: Hook in new transactional memory code")
      Cc: stable@vger.kernel.org # v3.9+
      Signed-off-by: default avatarCyril Bur <cyrilbur@gmail.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      848be477
    • Torsten Hilbrich's avatar
      fs/nilfs2: fix potential underflow in call to crc32_le · cc6fd729
      Torsten Hilbrich authored
      [ Upstream commit 63d2f95d ]
      
      The value `bytes' comes from the filesystem which is about to be
      mounted.  We cannot trust that the value is always in the range we
      expect it to be.
      
      Check its value before using it to calculate the length for the crc32_le
      call.  It value must be larger (or equal) sumoff + 4.
      
      This fixes a kernel bug when accidentially mounting an image file which
      had the nilfs2 magic value 0x3434 at the right offset 0x406 by chance.
      The bytes 0x01 0x00 were stored at 0x408 and were interpreted as a
      s_bytes value of 1.  This caused an underflow when substracting sumoff +
      4 (20) in the call to crc32_le.
      
        BUG: unable to handle kernel paging request at ffff88021e600000
        IP:  crc32_le+0x36/0x100
        ...
        Call Trace:
          nilfs_valid_sb.part.5+0x52/0x60 [nilfs2]
          nilfs_load_super_block+0x142/0x300 [nilfs2]
          init_nilfs+0x60/0x390 [nilfs2]
          nilfs_mount+0x302/0x520 [nilfs2]
          mount_fs+0x38/0x160
          vfs_kern_mount+0x67/0x110
          do_mount+0x269/0xe00
          SyS_mount+0x9f/0x100
          entry_SYSCALL_64_fastpath+0x16/0x71
      
      Link: http://lkml.kernel.org/r/1466778587-5184-2-git-send-email-konishi.ryusuke@lab.ntt.co.jpSigned-off-by: default avatarTorsten Hilbrich <torsten.hilbrich@secunet.com>
      Tested-by: default avatarTorsten Hilbrich <torsten.hilbrich@secunet.com>
      Signed-off-by: default avatarRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      cc6fd729
    • David Rientjes's avatar
      mm, compaction: abort free scanner if split fails · 284f69fb
      David Rientjes authored
      [ Upstream commit a4f04f2c ]
      
      If the memory compaction free scanner cannot successfully split a free
      page (only possible due to per-zone low watermark), terminate the free
      scanner rather than continuing to scan memory needlessly.  If the
      watermark is insufficient for a free page of order <= cc->order, then
      terminate the scanner since all future splits will also likely fail.
      
      This prevents the compaction freeing scanner from scanning all memory on
      very large zones (very noticeable for zones > 128GB, for instance) when
      all splits will likely fail while holding zone->lock.
      
      compaction_alloc() iterating a 128GB zone has been benchmarked to take
      over 400ms on some systems whereas any free page isolated and ready to
      be split ends up failing in split_free_page() because of the low
      watermark check and thus the iteration continues.
      
      The next time compaction occurs, the freeing scanner will likely start
      at the end of the zone again since no success was made previously and we
      get the same lengthy iteration until the zone is brought above the low
      watermark.  All thp page faults can take >400ms in such a state without
      this fix.
      
      Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1606211820350.97086@chino.kir.corp.google.comSigned-off-by: default avatarDavid Rientjes <rientjes@google.com>
      Acked-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      284f69fb
    • Vlastimil Babka's avatar
      mm, compaction: skip compound pages by order in free scanner · 68385427
      Vlastimil Babka authored
      [ Upstream commit 9fcd6d2e ]
      
      The compaction free scanner is looking for PageBuddy() pages and
      skipping all others.  For large compound pages such as THP or hugetlbfs,
      we can save a lot of iterations if we skip them at once using their
      compound_order().  This is generally unsafe and we can read a bogus
      value of order due to a race, but if we are careful, the only danger is
      skipping too much.
      
      When tested with stress-highalloc from mmtests on 4GB system with 1GB
      hugetlbfs pages, the vmstat compact_free_scanned count decreased by at
      least 15%.
      Signed-off-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Acked-by: default avatarJoonsoo Kim <iamjoonsoo.kim@lge.com>
      Acked-by: default avatarMichal Nazarewicz <mina86@mina86.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      68385427
    • Lukasz Odzioba's avatar
      mm/swap.c: flush lru pvecs on compound page arrival · c5ad3318
      Lukasz Odzioba authored
      [ Upstream commit 8f182270 ]
      
      Currently we can have compound pages held on per cpu pagevecs, which
      leads to a lot of memory unavailable for reclaim when needed.  In the
      systems with hundreads of processors it can be GBs of memory.
      
      On of the way of reproducing the problem is to not call munmap
      explicitly on all mapped regions (i.e.  after receiving SIGTERM).  After
      that some pages (with THP enabled also huge pages) may end up on
      lru_add_pvec, example below.
      
        void main() {
        #pragma omp parallel
        {
      	size_t size = 55 * 1000 * 1000; // smaller than  MEM/CPUS
      	void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
      		MAP_PRIVATE | MAP_ANONYMOUS , -1, 0);
      	if (p != MAP_FAILED)
      		memset(p, 0, size);
      	//munmap(p, size); // uncomment to make the problem go away
        }
        }
      
      When we run it with THP enabled it will leave significant amount of
      memory on lru_add_pvec.  This memory will be not reclaimed if we hit
      OOM, so when we run above program in a loop:
      
      	for i in `seq 100`; do ./a.out; done
      
      many processes (95% in my case) will be killed by OOM.
      
      The primary point of the LRU add cache is to save the zone lru_lock
      contention with a hope that more pages will belong to the same zone and
      so their addition can be batched.  The huge page is already a form of
      batched addition (it will add 512 worth of memory in one go) so skipping
      the batching seems like a safer option when compared to a potential
      excess in the caching which can be quite large and much harder to fix
      because lru_add_drain_all is way to expensive and it is not really clear
      what would be a good moment to call it.
      
      Similarly we can reproduce the problem on lru_deactivate_pvec by adding:
      madvise(p, size, MADV_FREE); after memset.
      
      This patch flushes lru pvecs on compound page arrival making the problem
      less severe - after applying it kill rate of above example drops to 0%,
      due to reducing maximum amount of memory held on pvec from 28MB (with
      THP) to 56kB per CPU.
      Suggested-by: default avatarMichal Hocko <mhocko@suse.com>
      Link: http://lkml.kernel.org/r/1466180198-18854-1-git-send-email-lukasz.odzioba@intel.comSigned-off-by: default avatarLukasz Odzioba <lukasz.odzioba@intel.com>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Vladimir Davydov <vdavydov@parallels.com>
      Cc: Ming Li <mingli199x@qq.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      c5ad3318
    • Anthony Romano's avatar
      tmpfs: don't undo fallocate past its last page · c5bcec6c
      Anthony Romano authored
      [ Upstream commit b9b4bb26 ]
      
      When fallocate is interrupted it will undo a range that extends one byte
      past its range of allocated pages.  This can corrupt an in-use page by
      zeroing out its first byte.  Instead, undo using the inclusive byte
      range.
      
      Fixes: 1635f6a7 ("tmpfs: undo fallocation on failure")
      Link: http://lkml.kernel.org/r/1462713387-16724-1-git-send-email-anthony.romano@coreos.comSigned-off-by: default avatarAnthony Romano <anthony.romano@coreos.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Brandon Philips <brandon@ifup.co>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      c5bcec6c
    • Alan Stern's avatar
      USB: EHCI: declare hostpc register as zero-length array · 7f3724b8
      Alan Stern authored
      [ Upstream commit 7e8b3dfe ]
      
      The HOSTPC extension registers found in some EHCI implementations form
      a variable-length array, with one element for each port.  Therefore
      the hostpc field in struct ehci_regs should be declared as a
      zero-length array, not a single-element array.
      
      This fixes a problem reported by UBSAN.
      Signed-off-by: default avatarAlan Stern <stern@rowland.harvard.edu>
      Reported-by: default avatarWilfried Klaebe <linux-kernel@lebenslange-mailadresse.de>
      Tested-by: default avatarWilfried Klaebe <linux-kernel@lebenslange-mailadresse.de>
      CC: <stable@vger.kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      7f3724b8
    • Steve French's avatar
      File names with trailing period or space need special case conversion · 655d8d19
      Steve French authored
      [ Upstream commit 45e8a258 ]
      
      POSIX allows files with trailing spaces or a trailing period but
      SMB3 does not, so convert these using the normal Services For Mac
      mapping as we do for other reserved characters such as
      	: < > | ? *
      This is similar to what Macs do for the same problem over SMB3.
      
      CC: Stable <stable@vger.kernel.org>
      Signed-off-by: default avatarSteve French <steve.french@primarydata.com>
      Acked-by: default avatarPavel Shilovsky <pshilovsky@samba.org>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      655d8d19
    • Steve French's avatar
      Fix reconnect to not defer smb3 session reconnect long after socket reconnect · e20c888e
      Steve French authored
      [ Upstream commit 4fcd1813 ]
      
      Azure server blocks clients that open a socket and don't do anything on it.
      In our reconnect scenarios, we can reconnect the tcp session and
      detect the socket is available but we defer the negprot and SMB3 session
      setup and tree connect reconnection until the next i/o is requested, but
      this looks suspicous to some servers who expect SMB3 negprog and session
      setup soon after a socket is created.
      
      In the echo thread, reconnect SMB3 sessions and tree connections
      that are disconnected.  A later patch will replay persistent (and
      resilient) handle opens.
      
      CC: Stable <stable@vger.kernel.org>
      Signed-off-by: default avatarSteve French <steve.french@primarydata.com>
      Acked-by: default avatarPavel Shilovsky <pshilovsky@samba.org>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      e20c888e
    • Weston Andros Adamson's avatar
      pnfs_nfs: fix _cancel_empty_pagelist · eba391c7
      Weston Andros Adamson authored
      [ Upstream commit 5e3a9888 ]
      
      pnfs_generic_commit_cancel_empty_pagelist calls nfs_commitdata_release,
      but that is wrong: nfs_commitdata_release puts the open context, something
      that isn't valid until nfs_init_commit is called, which is never the case
      when pnfs_generic_commit_cancel_empty_pagelist is called.
      
      This was introduced in "nfs: avoid race that crashes nfs_init_commit".
      Signed-off-by: default avatarWeston Andros Adamson <dros@primarydata.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarTrond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: default avatarAnna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      eba391c7
    • Weston Andros Adamson's avatar
      nfs: avoid race that crashes nfs_init_commit · 691c507e
      Weston Andros Adamson authored
      [ Upstream commit ade8febd ]
      
      Since the patch "NFS: Allow multiple commit requests in flight per file"
      we can run multiple simultaneous commits on the same inode.  This
      introduced a race over collecting pages to commit that made it possible
      to call nfs_init_commit() with an empty list - which causes crashes like
      the one below.
      
      The fix is to catch this race and avoid calling nfs_init_commit and
      initiate_commit when there is no work to do.
      
      Here is the crash:
      
      [600522.076832] BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
      [600522.078475] IP: [<ffffffffa0479e72>] nfs_init_commit+0x22/0x130 [nfs]
      [600522.078745] PGD 4272b1067 PUD 4272cb067 PMD 0
      [600522.078972] Oops: 0000 [#1] SMP
      [600522.079204] Modules linked in: nfsv3 nfs_layout_flexfiles rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache dcdbas ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw vmw_vsock_vmci_transport vsock bonding ipmi_devintf ipmi_msghandler coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel ppdev vmw_balloon parport_pc parport acpi_cpufreq vmw_vmci i2c_piix4 shpchp nfsd auth_rpcgss nfs_acl lockd grace sunrpc xfs libcrc32c vmwgfx drm_kms_helper ttm drm crc32c_intel serio_raw vmxnet3
      [600522.081380]  vmw_pvscsi ata_generic pata_acpi
      [600522.081809] CPU: 3 PID: 15667 Comm: /usr/bin/python Not tainted 4.1.9-100.pd.88.el7.x86_64 #1
      [600522.082281] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/30/2014
      [600522.082814] task: ffff8800bbbfa780 ti: ffff88042ae84000 task.ti: ffff88042ae84000
      [600522.083378] RIP: 0010:[<ffffffffa0479e72>]  [<ffffffffa0479e72>] nfs_init_commit+0x22/0x130 [nfs]
      [600522.083973] RSP: 0018:ffff88042ae87438  EFLAGS: 00010246
      [600522.084571] RAX: 0000000000000000 RBX: ffff880003485e40 RCX: ffff88042ae87588
      [600522.085188] RDX: 0000000000000000 RSI: ffff88042ae874b0 RDI: ffff880003485e40
      [600522.085756] RBP: ffff88042ae87448 R08: ffff880003486010 R09: ffff88042ae874b0
      [600522.086332] R10: 0000000000000000 R11: 0000000000000005 R12: ffff88042ae872d0
      [600522.086905] R13: ffff88042ae874b0 R14: ffff880003485e40 R15: ffff88042704c840
      [600522.087484] FS:  00007f4728ff2740(0000) GS:ffff88043fd80000(0000) knlGS:0000000000000000
      [600522.088070] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [600522.088663] CR2: 0000000000000040 CR3: 000000042b6aa000 CR4: 00000000001406e0
      [600522.089327] Stack:
      [600522.089926]  0000000000000001 ffff88042ae87588 ffff88042ae874f8 ffffffffa04f09fa
      [600522.090549]  0000000000017840 0000000000017840 ffff88042ae87588 ffff8803258d9930
      [600522.091169]  ffff88042ae87578 ffffffffa0563d80 0000000000000000 ffff88042704c840
      [600522.091789] Call Trace:
      [600522.092420]  [<ffffffffa04f09fa>] pnfs_generic_commit_pagelist+0x1da/0x320 [nfsv4]
      [600522.093052]  [<ffffffffa0563d80>] ? ff_layout_commit_prepare_v3+0x30/0x30 [nfs_layout_flexfiles]
      [600522.093696]  [<ffffffffa0562645>] ff_layout_commit_pagelist+0x15/0x20 [nfs_layout_flexfiles]
      [600522.094359]  [<ffffffffa047bc78>] nfs_generic_commit_list+0xe8/0x120 [nfs]
      [600522.095032]  [<ffffffffa047bd6a>] nfs_commit_inode+0xba/0x110 [nfs]
      [600522.095719]  [<ffffffffa046ac54>] nfs_release_page+0x44/0xd0 [nfs]
      [600522.096410]  [<ffffffff811a8122>] try_to_release_page+0x32/0x50
      [600522.097109]  [<ffffffff811bd4f1>] shrink_page_list+0x961/0xb30
      [600522.097812]  [<ffffffff811bdced>] shrink_inactive_list+0x1cd/0x550
      [600522.098530]  [<ffffffff811bea65>] shrink_lruvec+0x635/0x840
      [600522.099250]  [<ffffffff811bed60>] shrink_zone+0xf0/0x2f0
      [600522.099974]  [<ffffffff811bf312>] do_try_to_free_pages+0x192/0x470
      [600522.100709]  [<ffffffff811bf6ca>] try_to_free_pages+0xda/0x170
      [600522.101464]  [<ffffffff811b2198>] __alloc_pages_nodemask+0x588/0x970
      [600522.102235]  [<ffffffff811fbbd5>] alloc_pages_vma+0xb5/0x230
      [600522.103000]  [<ffffffff813a1589>] ? cpumask_any_but+0x39/0x50
      [600522.103774]  [<ffffffff811d6115>] wp_page_copy.isra.55+0x95/0x490
      [600522.104558]  [<ffffffff810e3438>] ? __wake_up+0x48/0x60
      [600522.105357]  [<ffffffff811d7d3b>] do_wp_page+0xab/0x4f0
      [600522.106137]  [<ffffffff810a1bbb>] ? release_task+0x36b/0x470
      [600522.106902]  [<ffffffff8126dbd7>] ? eventfd_ctx_read+0x67/0x1c0
      [600522.107659]  [<ffffffff811da2a8>] handle_mm_fault+0xc78/0x1900
      [600522.108431]  [<ffffffff81067ef1>] __do_page_fault+0x181/0x420
      [600522.109173]  [<ffffffff811446a6>] ? __audit_syscall_exit+0x1e6/0x280
      [600522.109893]  [<ffffffff810681c0>] do_page_fault+0x30/0x80
      [600522.110594]  [<ffffffff81024f36>] ? syscall_trace_leave+0xc6/0x120
      [600522.111288]  [<ffffffff81790a58>] page_fault+0x28/0x30
      [600522.111947] Code: 5d c3 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 4c 8d 87 d0 01 00 00 48 89 e5 53 48 89 fb 48 83 ec 08 4c 8b 0e 49 8b 41 18 4c 39 ce <48> 8b 40 40 4c 8b 50 30 74 24 48 8b 87 d0 01 00 00 48 8b 7e 08
      [600522.113343] RIP  [<ffffffffa0479e72>] nfs_init_commit+0x22/0x130 [nfs]
      [600522.114003]  RSP <ffff88042ae87438>
      [600522.114636] CR2: 0000000000000040
      
      Fixes: af7cf057 (NFS: Allow multiple commit requests in flight per file)
      CC: stable@vger.kernel.org
      Signed-off-by: default avatarWeston Andros Adamson <dros@primarydata.com>
      Signed-off-by: default avatarAnna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      691c507e
    • Trond Myklebust's avatar
      pNFS: Tighten up locking around DS commit buckets · 94d06a43
      Trond Myklebust authored
      [ Upstream commit 27571297 ]
      
      I'm not aware of any bugreports around this issue, but the locking
      around the pnfs_commit_bucket is inconsistent at best. This patch
      tightens it up by ensuring that the 'bucket->committing' list is always
      changed atomically w.r.t. the 'bucket->clseg' layout segment tracking.
      Signed-off-by: default avatarTrond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      94d06a43