1. 26 Jul, 2019 40 commits
    • Marek Szyprowski's avatar
      media: s5p-mfc: Make additional clocks optional · c98a3afe
      Marek Szyprowski authored
      [ Upstream commit e08efef8 ]
      
      Since the beginning the second clock ('special', 'sclk') was optional and
      it is not available on some variants of Exynos SoCs (i.e. Exynos5420 with
      v7 of MFC hardware).
      
      However commit 1bce6fb3 ("[media] s5p-mfc: Rework clock handling")
      made handling of all specified clocks mandatory. This patch restores
      original behavior of the driver and fixes its operation on
      Exynos5420 SoCs.
      
      Fixes: 1bce6fb3 ("[media] s5p-mfc: Rework clock handling")
      Signed-off-by: default avatarMarek Szyprowski <m.szyprowski@samsung.com>
      Signed-off-by: default avatarHans Verkuil <hverkuil-cisco@xs4all.nl>
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      c98a3afe
    • Julian Anastasov's avatar
      ipvs: defer hook registration to avoid leaks · affa31e6
      Julian Anastasov authored
      [ Upstream commit cf47a0b8 ]
      
      syzkaller reports for memory leak when registering hooks [1]
      
      As we moved the nf_unregister_net_hooks() call into
      __ip_vs_dev_cleanup(), defer the nf_register_net_hooks()
      call, so that hooks are allocated and freed from same
      pernet_operations (ipvs_core_dev_ops).
      
      [1]
      BUG: memory leak
      unreferenced object 0xffff88810acd8a80 (size 96):
       comm "syz-executor073", pid 7254, jiffies 4294950560 (age 22.250s)
       hex dump (first 32 bytes):
         02 00 00 00 00 00 00 00 50 8b bb 82 ff ff ff ff  ........P.......
         00 00 00 00 00 00 00 00 00 77 bb 82 ff ff ff ff  .........w......
       backtrace:
         [<0000000013db61f1>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline]
         [<0000000013db61f1>] slab_post_alloc_hook mm/slab.h:439 [inline]
         [<0000000013db61f1>] slab_alloc_node mm/slab.c:3269 [inline]
         [<0000000013db61f1>] kmem_cache_alloc_node_trace+0x15b/0x2a0 mm/slab.c:3597
         [<000000001a27307d>] __do_kmalloc_node mm/slab.c:3619 [inline]
         [<000000001a27307d>] __kmalloc_node+0x38/0x50 mm/slab.c:3627
         [<0000000025054add>] kmalloc_node include/linux/slab.h:590 [inline]
         [<0000000025054add>] kvmalloc_node+0x4a/0xd0 mm/util.c:431
         [<0000000050d1bc00>] kvmalloc include/linux/mm.h:637 [inline]
         [<0000000050d1bc00>] kvzalloc include/linux/mm.h:645 [inline]
         [<0000000050d1bc00>] allocate_hook_entries_size+0x3b/0x60 net/netfilter/core.c:61
         [<00000000e8abe142>] nf_hook_entries_grow+0xae/0x270 net/netfilter/core.c:128
         [<000000004b94797c>] __nf_register_net_hook+0x9a/0x170 net/netfilter/core.c:337
         [<00000000d1545cbc>] nf_register_net_hook+0x34/0xc0 net/netfilter/core.c:464
         [<00000000876c9b55>] nf_register_net_hooks+0x53/0xc0 net/netfilter/core.c:480
         [<000000002ea868e0>] __ip_vs_init+0xe8/0x170 net/netfilter/ipvs/ip_vs_core.c:2280
         [<000000002eb2d451>] ops_init+0x4c/0x140 net/core/net_namespace.c:130
         [<000000000284ec48>] setup_net+0xde/0x230 net/core/net_namespace.c:316
         [<00000000a70600fa>] copy_net_ns+0xf0/0x1e0 net/core/net_namespace.c:439
         [<00000000ff26c15e>] create_new_namespaces+0x141/0x2a0 kernel/nsproxy.c:107
         [<00000000b103dc79>] copy_namespaces+0xa1/0xe0 kernel/nsproxy.c:165
         [<000000007cc008a2>] copy_process.part.0+0x11fd/0x2150 kernel/fork.c:2035
         [<00000000c344af7c>] copy_process kernel/fork.c:1800 [inline]
         [<00000000c344af7c>] _do_fork+0x121/0x4f0 kernel/fork.c:2369
      
      Reported-by: syzbot+722da59ccb264bc19910@syzkaller.appspotmail.com
      Fixes: 719c7d56 ("ipvs: Fix use-after-free in ip_vs_in")
      Signed-off-by: default avatarJulian Anastasov <ja@ssi.bg>
      Acked-by: default avatarSimon Horman <horms@verge.net.au>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      affa31e6
    • Arnd Bergmann's avatar
      ipsec: select crypto ciphers for xfrm_algo · a275d049
      Arnd Bergmann authored
      [ Upstream commit 597179b0 ]
      
      kernelci.org reports failed builds on arc because of what looks
      like an old missed 'select' statement:
      
      net/xfrm/xfrm_algo.o: In function `xfrm_probe_algs':
      xfrm_algo.c:(.text+0x1e8): undefined reference to `crypto_has_ahash'
      
      I don't see this in randconfig builds on other architectures, but
      it's fairly clear we want to select the hash code for it, like we
      do for all its other users. As Herbert points out, CRYPTO_BLKCIPHER
      is also required even though it has not popped up in build tests.
      
      Fixes: 17bc1970 ("ipsec: Use skcipher and ahash when probing algorithms")
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Acked-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarSteffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      a275d049
    • Julien Thierry's avatar
      arm64: Do not enable IRQs for ct_user_exit · 7e41783d
      Julien Thierry authored
      [ Upstream commit 9034f625 ]
      
      For el0_dbg and el0_error, DAIF bits get explicitly cleared before
      calling ct_user_exit.
      
      When context tracking is disabled, DAIF gets set (almost) immediately
      after. When context tracking is enabled, among the first things done
      is disabling IRQs.
      
      What is actually needed is:
      - PSR.D = 0 so the system can be debugged (should be already the case)
      - PSR.A = 0 so async error can be handled during context tracking
      
      Do not clear PSR.I in those two locations.
      Reviewed-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Acked-by: default avatarMark Rutland <mark.rutland@arm.com>
      Reviewed-by: default avatarJames Morse <james.morse@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarJulien Thierry <julien.thierry@arm.com>
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      7e41783d
    • Minwoo Im's avatar
      nvme-pci: adjust irq max_vector using num_possible_cpus() · 8d81f2b8
      Minwoo Im authored
      [ Upstream commit dad77d63 ]
      
      If the "irq_queues" are greater than num_possible_cpus(),
      nvme_calc_irq_sets() can have irq set_size for HCTX_TYPE_DEFAULT greater
      than it can be afforded.
      2039         affd->set_size[HCTX_TYPE_DEFAULT] = nrirqs - nr_read_queues;
      
      It might cause a WARN() from the irq_build_affinity_masks() like [1]:
      220         if (nr_present < numvecs)
      221                 WARN_ON(nr_present + nr_others < numvecs);
      
      This patch prevents it from the WARN() by adjusting the max_vector value
      from the nvme_setup_irqs().
      
      [1] WARN messages when modprobe nvme write_queues=32 poll_queues=0:
      root@target:~/nvme# nproc
      8
      root@target:~/nvme# modprobe nvme write_queues=32 poll_queues=0
      [   17.925326] nvme nvme0: pci function 0000:00:04.0
      [   17.940601] WARNING: CPU: 3 PID: 1030 at kernel/irq/affinity.c:221 irq_create_affinity_masks+0x222/0x330
      [   17.940602] Modules linked in: nvme nvme_core [last unloaded: nvme]
      [   17.940605] CPU: 3 PID: 1030 Comm: kworker/u17:4 Tainted: G        W         5.1.0+ #156
      [   17.940605] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
      [   17.940608] Workqueue: nvme-reset-wq nvme_reset_work [nvme]
      [   17.940609] RIP: 0010:irq_create_affinity_masks+0x222/0x330
      [   17.940611] Code: 4c 8d 4c 24 28 4c 8d 44 24 30 e8 c9 fa ff ff 89 44 24 18 e8 c0 38 fa ff 8b 44 24 18 44 8b 54 24 1c 5a 44 01 d0 41 39 c4 76 02 <0f> 0b 48 89 df 44 01 e5 e8 f1 ce 10 00 48 8b 34 24 44 89 f0 44 01
      [   17.940611] RSP: 0018:ffffc90002277c50 EFLAGS: 00010216
      [   17.940612] RAX: 0000000000000008 RBX: ffff88807ca48860 RCX: 0000000000000000
      [   17.940612] RDX: ffff88807bc03800 RSI: 0000000000000020 RDI: 0000000000000000
      [   17.940613] RBP: 0000000000000001 R08: ffffc90002277c78 R09: ffffc90002277c70
      [   17.940613] R10: 0000000000000008 R11: 0000000000000001 R12: 0000000000000020
      [   17.940614] R13: 0000000000025d08 R14: 0000000000000001 R15: ffff88807bc03800
      [   17.940614] FS:  0000000000000000(0000) GS:ffff88807db80000(0000) knlGS:0000000000000000
      [   17.940616] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   17.940617] CR2: 00005635e583f790 CR3: 000000000240a000 CR4: 00000000000006e0
      [   17.940617] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   17.940618] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [   17.940618] Call Trace:
      [   17.940622]  __pci_enable_msix_range+0x215/0x540
      [   17.940623]  ? kernfs_put+0x117/0x160
      [   17.940625]  pci_alloc_irq_vectors_affinity+0x74/0x110
      [   17.940626]  nvme_reset_work+0xc30/0x1397 [nvme]
      [   17.940628]  ? __switch_to_asm+0x34/0x70
      [   17.940628]  ? __switch_to_asm+0x40/0x70
      [   17.940629]  ? __switch_to_asm+0x34/0x70
      [   17.940630]  ? __switch_to_asm+0x40/0x70
      [   17.940630]  ? __switch_to_asm+0x34/0x70
      [   17.940631]  ? __switch_to_asm+0x40/0x70
      [   17.940632]  ? nvme_irq_check+0x30/0x30 [nvme]
      [   17.940633]  process_one_work+0x20b/0x3e0
      [   17.940634]  worker_thread+0x1f9/0x3d0
      [   17.940635]  ? cancel_delayed_work+0xa0/0xa0
      [   17.940636]  kthread+0x117/0x120
      [   17.940637]  ? kthread_stop+0xf0/0xf0
      [   17.940638]  ret_from_fork+0x3a/0x50
      [   17.940639] ---[ end trace aca8a131361cd42a ]---
      [   17.942124] nvme nvme0: 7/1/0 default/read/poll queues
      Signed-off-by: default avatarMinwoo Im <minwoo.im.dev@gmail.com>
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      8d81f2b8
    • Heiner Litz's avatar
      lightnvm: pblk: fix freeing of merged pages · dc7cb800
      Heiner Litz authored
      [ Upstream commit 510fd8ea ]
      
      bio_add_pc_page() may merge pages when a bio is padded due to a flush.
      Fix iteration over the bio to free the correct pages in case of a merge.
      Signed-off-by: default avatarHeiner Litz <hlitz@ucsc.edu>
      Reviewed-by: default avatarJavier González <javier@javigon.com>
      Signed-off-by: default avatarMatias Bjørling <mb@lightnvm.io>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      dc7cb800
    • Chaitanya Kulkarni's avatar
      nvme-pci: set the errno on ctrl state change error · e05c9c36
      Chaitanya Kulkarni authored
      [ Upstream commit e71afda4 ]
      
      This patch removes the confusing assignment of the variable result at
      the time of declaration and sets the value in error cases next to the
      places where the actual error is happening.
      
      Here we also set the result value to -ENODEV when we fail at the final
      ctrl state transition in nvme_reset_work(). Without this assignment
      result will hold 0 from nvme_setup_io_queue() and on failure 0 will be
      passed to he nvme_remove_dead_ctrl() from final state transition.
      Signed-off-by: default avatarChaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      e05c9c36
    • Minwoo Im's avatar
      nvme-pci: properly report state change failure in nvme_reset_work · bd3df654
      Minwoo Im authored
      [ Upstream commit cee6c269 ]
      
      If the state change to NVME_CTRL_CONNECTING fails, the dmesg is going to
      be like:
      
        [  293.689160] nvme nvme0: failed to mark controller CONNECTING
        [  293.689160] nvme nvme0: Removing after probe failure status: 0
      
      Even it prints the first line to indicate the situation, the second line
      is not proper because the status is 0 which means normally success of
      the previous operation.
      
      This patch makes it indicate the proper error value when it fails.
        [   25.932367] nvme nvme0: failed to mark controller CONNECTING
        [   25.932369] nvme nvme0: Removing after probe failure status: -16
      
      This situation is able to be easily reproduced by:
        root@target:~# rmmod nvme && modprobe nvme && rmmod nvme
      Signed-off-by: default avatarMinwoo Im <minwoo.im.dev@gmail.com>
      Reviewed-by: default avatarChaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      bd3df654
    • Anton Eidelman's avatar
      nvme: fix possible io failures when removing multipathed ns · 942c09be
      Anton Eidelman authored
      [ Upstream commit 2181e455 ]
      
      When a shared namespace is removed, we call blk_cleanup_queue()
      when the device can still be accessed as the current path and this can
      result in submission to a dying queue. Hence, direct_make_request()
      called by our mpath device may fail (propagating the failure to userspace).
      Instead, we want to failover this I/O to a different path if one exists.
      Thus, before we cleanup the request queue, we make sure that the device is
      cleared from the current path nor it can be selected again as such.
      
      Fix this by:
      - clear the ns from the head->list and synchronize rcu to make sure there is
        no concurrent path search that restores it as the current path
      - clear the mpath current path in order to trigger a subsequent path search
        and sync srcu to wait for any ongoing request submissions
      - safely continue to namespace removal and blk_cleanup_queue
      Signed-off-by: default avatarAnton Eidelman <anton@lightbitslabs.com>
      Signed-off-by: default avatarSagi Grimberg <sagi@grimberg.me>
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      942c09be
    • Pan Bian's avatar
      EDAC/sysfs: Fix memory leak when creating a csrow object · 3db69908
      Pan Bian authored
      [ Upstream commit 585fb3d9 ]
      
      In edac_create_csrow_object(), the reference to the object is not
      released when adding the device to the device hierarchy fails
      (device_add()). This may result in a memory leak.
      Signed-off-by: default avatarPan Bian <bianpan2016@163.com>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Reviewed-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: James Morse <james.morse@arm.com>
      Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
      Cc: linux-edac <linux-edac@vger.kernel.org>
      Link: https://lkml.kernel.org/r/1555554438-103953-1-git-send-email-bianpan2016@163.comSigned-off-by: default avatarSasha Levin <sashal@kernel.org>
      3db69908
    • Greg KH's avatar
      EDAC/sysfs: Drop device references properly · 4b036124
      Greg KH authored
      [ Upstream commit 7adc05d2 ]
      
      Do put_device() if device_add() fails.
      
       [ bp: do device_del() for the successfully created devices in
         edac_create_csrow_objects(), on the unwind path. ]
      Signed-off-by: default avatarGreg KH <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Link: https://lkml.kernel.org/r/20190427214925.GE16338@kroah.comSigned-off-by: default avatarSasha Levin <sashal@kernel.org>
      4b036124
    • Tudor Ambarus's avatar
      spi: fix ctrl->num_chipselect constraint · a2fd0a9b
      Tudor Ambarus authored
      [ Upstream commit f9481b08 ]
      
      at91sam9g25ek showed the following error at probe:
      atmel_spi f0000000.spi: Using dma0chan2 (tx) and dma0chan3 (rx)
      for DMA transfers
      atmel_spi: probe of f0000000.spi failed with error -22
      
      Commit 0a919ae4 ("spi: Don't call spi_get_gpio_descs() before device name is set")
      moved the calling of spi_get_gpio_descs() after ctrl->dev is set,
      but didn't move the !ctrl->num_chipselect check. When there are
      chip selects in the device tree, the spi-atmel driver lets the
      SPI core discover them when registering the SPI master.
      The ctrl->num_chipselect is thus expected to be set by
      spi_get_gpio_descs().
      
      Move the !ctlr->num_chipselect after spi_get_gpio_descs() as it was
      before the aforementioned commit. While touching this block, get rid
      of the explicit comparison with 0 and update the commenting style.
      
      Fixes: 0a919ae4 ("spi: Don't call spi_get_gpio_descs() before device name is set")
      Signed-off-by: default avatarTudor Ambarus <tudor.ambarus@microchip.com>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      a2fd0a9b
    • Rafael J. Wysocki's avatar
      ACPICA: Clear status of GPEs on first direct enable · 467ebe42
      Rafael J. Wysocki authored
      [ Upstream commit 44758baf ]
      
      ACPI GPEs (other than the EC one) can be enabled in two situations.
      First, the GPEs with existing _Lxx and _Exx methods are enabled
      implicitly by ACPICA during system initialization.  Second, the
      GPEs without these methods (like GPEs listed by _PRW objects for
      wakeup devices) need to be enabled directly by the code that is
      going to use them (e.g. ACPI power management or device drivers).
      
      In the former case, if the status of a given GPE is set to start
      with, its handler method (either _Lxx or _Exx) needs to be invoked
      to take care of the events (possibly) signaled before the GPE was
      enabled.  In the latter case, however, the first caller of
      acpi_enable_gpe() for a given GPE should not be expected to care
      about any events that might be signaled through it earlier.  In
      that case, it is better to clear the status of the GPE before
      enabling it, to prevent stale events from triggering unwanted
      actions (like spurious system resume, for example).
      
      For this reason, modify acpi_ev_add_gpe_reference() to take an
      additional boolean argument indicating whether or not the GPE
      status needs to be cleared when its reference counter changes from
      zero to one and make acpi_enable_gpe() pass TRUE to it through
      that new argument.
      
      Fixes: 18996f2d ("ACPICA: Events: Stop unconditionally clearing ACPI IRQs during suspend/resume")
      Reported-by: default avatarFurquan Shaikh <furquan@google.com>
      Tested-by: default avatarFurquan Shaikh <furquan@google.com>
      Tested-by: default avatarMika Westerberg <mika.westerberg@linux.intel.com>
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      467ebe42
    • Dennis Zhou's avatar
      blk-iolatency: only account submitted bios · c8632e34
      Dennis Zhou authored
      [ Upstream commit a3fb01ba ]
      
      As is, iolatency recognizes done_bio and cleanup as ending paths. If a
      request is marked REQ_NOWAIT and fails to get a request, the bio is
      cleaned up via rq_qos_cleanup() and ended in bio_wouldblock_error().
      This results in underflowing the inflight counter. Fix this by only
      accounting bios that were actually submitted.
      Signed-off-by: default avatarDennis Zhou <dennis@kernel.org>
      Cc: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      c8632e34
    • Qian Cai's avatar
      x86/cacheinfo: Fix a -Wtype-limits warning · 1d4bf99c
      Qian Cai authored
      [ Upstream commit 1b7aebf0 ]
      
      cpuinfo_x86.x86_model is an unsigned type, so comparing against zero
      will generate a compilation warning:
      
        arch/x86/kernel/cpu/cacheinfo.c: In function 'cacheinfo_amd_init_llc_id':
        arch/x86/kernel/cpu/cacheinfo.c:662:19: warning: comparison is always true \
          due to limited range of data type [-Wtype-limits]
      
      Remove the unnecessary lower bound check.
      
       [ bp: Massage. ]
      
      Fixes: 68091ee7 ("x86/CPU/AMD: Calculate last level cache ID from number of sharing threads")
      Signed-off-by: default avatarQian Cai <cai@lca.pw>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Reviewed-by: default avatarSean Christopherson <sean.j.christopherson@intel.com>
      Cc: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Pu Wen <puwen@hygon.cn>
      Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/1560954773-11967-1-git-send-email-cai@lca.pwSigned-off-by: default avatarSasha Levin <sashal@kernel.org>
      1d4bf99c
    • Ilias Apalodimas's avatar
      net: netsec: initialize tx ring on ndo_open · a98c1517
      Ilias Apalodimas authored
      [ Upstream commit 39e3622e ]
      
      Since we changed the Tx ring handling and now depends on bit31 to figure
      out the owner of the descriptor, we should initialize this every time
      the device goes down-up instead of doing it once on driver init. If the
      value is not correctly initialized the device won't have any available
      descriptors
      
      Changes since v1:
      - Typo fixes
      
      Fixes: 35e07d23 ("net: socionext: remove mmio reads on Tx")
      Signed-off-by: default avatarIlias Apalodimas <ilias.apalodimas@linaro.org>
      Acked-by: default avatarArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      a98c1517
    • Mika Westerberg's avatar
      PCI: Add missing link delays required by the PCIe spec · 3c795a8e
      Mika Westerberg authored
      [ Upstream commit c2bf1fc2 ]
      
      Currently Linux does not follow PCIe spec regarding the required delays
      after reset. A concrete example is a Thunderbolt add-in-card that
      consists of a PCIe switch and two PCIe endpoints:
      
        +-1b.0-[01-6b]----00.0-[02-6b]--+-00.0-[03]----00.0 TBT controller
                                        +-01.0-[04-36]-- DS hotplug port
                                        +-02.0-[37]----00.0 xHCI controller
                                        \-04.0-[38-6b]-- DS hotplug port
      
      The root port (1b.0) and the PCIe switch downstream ports are all PCIe
      gen3 so they support 8GT/s link speeds.
      
      We wait for the PCIe hierarchy to enter D3cold (runtime):
      
        pcieport 0000:00:1b.0: power state changed by ACPI to D3cold
      
      When it wakes up from D3cold, according to the PCIe 4.0 section 5.8 the
      PCIe switch is put to reset and its power is re-applied. This means that
      we must follow the rules in PCIe 4.0 section 6.6.1.
      
      For the PCIe gen3 ports we are dealing with here, the following applies:
      
        With a Downstream Port that supports Link speeds greater than 5.0
        GT/s, software must wait a minimum of 100 ms after Link training
        completes before sending a Configuration Request to the device
        immediately below that Port. Software can determine when Link training
        completes by polling the Data Link Layer Link Active bit or by setting
        up an associated interrupt (see Section 6.7.3.3).
      
      Translating this into the above topology we would need to do this (DLLLA
      stands for Data Link Layer Link Active):
      
        pcieport 0000:00:1b.0: wait for 100ms after DLLLA is set before access to 0000:01:00.0
        pcieport 0000:02:00.0: wait for 100ms after DLLLA is set before access to 0000:03:00.0
        pcieport 0000:02:02.0: wait for 100ms after DLLLA is set before access to 0000:37:00.0
      
      I've instrumented the kernel with additional logging so we can see the
      actual delays the kernel performs:
      
        pcieport 0000:00:1b.0: power state changed by ACPI to D0
        pcieport 0000:00:1b.0: waiting for D3cold delay of 100 ms
        pcieport 0000:00:1b.0: waking up bus
        pcieport 0000:00:1b.0: waiting for D3hot delay of 10 ms
        pcieport 0000:00:1b.0: restoring config space at offset 0x2c (was 0x60, writing 0x60)
        ...
        pcieport 0000:00:1b.0: PME# disabled
        pcieport 0000:01:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
        ...
        pcieport 0000:01:00.0: PME# disabled
        pcieport 0000:02:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
        ...
        pcieport 0000:02:00.0: PME# disabled
        pcieport 0000:02:01.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
        ...
        pcieport 0000:02:01.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407)
        pcieport 0000:02:01.0: PME# disabled
        pcieport 0000:02:02.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
        ...
        pcieport 0000:02:02.0: PME# disabled
        pcieport 0000:02:04.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
        ...
        pcieport 0000:02:04.0: PME# disabled
        pcieport 0000:02:01.0: PME# enabled
        pcieport 0000:02:01.0: waiting for D3hot delay of 10 ms
        pcieport 0000:02:04.0: PME# enabled
        pcieport 0000:02:04.0: waiting for D3hot delay of 10 ms
        thunderbolt 0000:03:00.0: restoring config space at offset 0x14 (was 0x0, writing 0x8a040000)
        ...
        thunderbolt 0000:03:00.0: PME# disabled
        xhci_hcd 0000:37:00.0: restoring config space at offset 0x10 (was 0x0, writing 0x73f00000)
        ...
        xhci_hcd 0000:37:00.0: PME# disabled
      
      For the switch upstream port (01:00.0) we wait for 100ms but not taking
      into account the DLLLA requirement. We then wait 10ms for D3hot -> D0
      transition of the root port and the two downstream hotplug ports. This
      means that we deviate from what the spec requires.
      
      Performing the same check for system sleep (s2idle) transitions we can
      see following when resuming from s2idle:
      
        pcieport 0000:00:1b.0: power state changed by ACPI to D0
        pcieport 0000:00:1b.0: restoring config space at offset 0x2c (was 0x60, writing 0x60)
        ...
        pcieport 0000:01:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
        ...
        pcieport 0000:02:02.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
        pcieport 0000:02:02.0: restoring config space at offset 0x2c (was 0x0, writing 0x0)
        pcieport 0000:02:01.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
        pcieport 0000:02:04.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
        pcieport 0000:02:02.0: restoring config space at offset 0x28 (was 0x0, writing 0x0)
        pcieport 0000:02:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
        pcieport 0000:02:02.0: restoring config space at offset 0x24 (was 0x10001, writing 0x1fff1)
        pcieport 0000:02:01.0: restoring config space at offset 0x2c (was 0x0, writing 0x60)
        pcieport 0000:02:02.0: restoring config space at offset 0x20 (was 0x0, writing 0x73f073f0)
        pcieport 0000:02:04.0: restoring config space at offset 0x2c (was 0x0, writing 0x60)
        pcieport 0000:02:01.0: restoring config space at offset 0x28 (was 0x0, writing 0x60)
        pcieport 0000:02:00.0: restoring config space at offset 0x2c (was 0x0, writing 0x0)
        pcieport 0000:02:02.0: restoring config space at offset 0x1c (was 0x101, writing 0x1f1)
        pcieport 0000:02:04.0: restoring config space at offset 0x28 (was 0x0, writing 0x60)
        pcieport 0000:02:01.0: restoring config space at offset 0x24 (was 0x10001, writing 0x1ff10001)
        pcieport 0000:02:00.0: restoring config space at offset 0x28 (was 0x0, writing 0x0)
        pcieport 0000:02:02.0: restoring config space at offset 0x18 (was 0x0, writing 0x373702)
        pcieport 0000:02:04.0: restoring config space at offset 0x24 (was 0x10001, writing 0x49f12001)
        pcieport 0000:02:01.0: restoring config space at offset 0x20 (was 0x0, writing 0x73e05c00)
        pcieport 0000:02:00.0: restoring config space at offset 0x24 (was 0x10001, writing 0x1fff1)
        pcieport 0000:02:04.0: restoring config space at offset 0x20 (was 0x0, writing 0x89f07400)
        pcieport 0000:02:01.0: restoring config space at offset 0x1c (was 0x101, writing 0x5151)
        pcieport 0000:02:00.0: restoring config space at offset 0x20 (was 0x0, writing 0x8a008a00)
        pcieport 0000:02:02.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020)
        pcieport 0000:02:04.0: restoring config space at offset 0x1c (was 0x101, writing 0x6161)
        pcieport 0000:02:01.0: restoring config space at offset 0x18 (was 0x0, writing 0x360402)
        pcieport 0000:02:00.0: restoring config space at offset 0x1c (was 0x101, writing 0x1f1)
        pcieport 0000:02:04.0: restoring config space at offset 0x18 (was 0x0, writing 0x6b3802)
        pcieport 0000:02:02.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407)
        pcieport 0000:02:00.0: restoring config space at offset 0x18 (was 0x0, writing 0x30302)
        pcieport 0000:02:01.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020)
        pcieport 0000:02:04.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020)
        pcieport 0000:02:00.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020)
        pcieport 0000:02:01.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407)
        pcieport 0000:02:04.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407)
        pcieport 0000:02:00.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407)
        xhci_hcd 0000:37:00.0: restoring config space at offset 0x10 (was 0x0, writing 0x73f00000)
        ...
        thunderbolt 0000:03:00.0: restoring config space at offset 0x14 (was 0x0, writing 0x8a040000)
      
      This is even worse. None of the mandatory delays are performed. If this
      would be S3 instead of s2idle then according to PCI FW spec 3.2 section
      4.6.8.  there is a specific _DSM that allows the OS to skip the delays
      but this platform does not provide the _DSM and does not go to S3 anyway
      so no firmware is involved that could already handle these delays.
      
      In this particular Intel Coffee Lake platform these delays are not
      actually needed because there is an additional delay as part of the ACPI
      power resource that is used to turn on power to the hierarchy but since
      that additional delay is not required by any of standards (PCIe, ACPI)
      it is not present in the Intel Ice Lake, for example where missing the
      mandatory delays causes pciehp to start tearing down the stack too early
      (links are not yet trained).
      
      For this reason, change the PCIe portdrv PM resume hooks so that they
      perform the mandatory delays before the downstream component gets
      resumed. We perform the delays before port services are resumed because
      otherwise pciehp might find that the link is not up (even if it is just
      training) and tears-down the hierarchy.
      Signed-off-by: default avatarMika Westerberg <mika.westerberg@linux.intel.com>
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      3c795a8e
    • Alexei Starovoitov's avatar
      bpf: fix callees pruning callers · 70cc29db
      Alexei Starovoitov authored
      [ Upstream commit eea1c227 ]
      
      The commit 7640ead9 partially resolved the issue of callees
      incorrectly pruning the callers.
      With introduction of bounded loops and jmps_processed heuristic
      single verifier state may contain multiple branches and calls.
      It's possible that new verifier state (for future pruning) will be
      allocated inside callee. Then callee will exit (still within the same
      verifier state). It will go back to the caller and there R6-R9 registers
      will be read and will trigger mark_reg_read. But the reg->live for all frames
      but the top frame is not set to LIVE_NONE. Hence mark_reg_read will fail
      to propagate liveness into parent and future walking will incorrectly
      conclude that the states are equivalent because LIVE_READ is not set.
      In other words the rule for parent/live should be:
      whenever register parentage chain is set the reg->live should be set to LIVE_NONE.
      is_state_visited logic already follows this rule for spilled registers.
      
      Fixes: 7640ead9 ("bpf: verifier: make sure callees don't prune with caller differences")
      Fixes: f4d7e40a ("bpf: introduce function calls (verification)")
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      70cc29db
    • Nilkanth Ahirrao's avatar
      ASoC: rsnd: fixup mod ID calculation in rsnd_ctu_probe_ · f89b29fa
      Nilkanth Ahirrao authored
      [ Upstream commit ac28ec07 ]
      
      commit c16015f3 ("ASoC: rsnd: add .get_id/.get_id_sub")
      introduces rsnd_ctu_id which calcualates and gives
      the main Device id of the CTU by dividing the id by 4.
      rsnd_mod_id uses this interface to get the CTU main
      Device id. But this commit forgets to revert the main
      Device id calcution previously done in rsnd_ctu_probe_
      which also divides the id by 4. This path corrects the
      same to get the correct main Device id.
      
      The issue is observered when rsnd_ctu_probe_ is done for CTU1
      
      Fixes: c16015f3 ("ASoC: rsnd: add .get_id/.get_id_sub")
      Signed-off-by: default avatarNilkanth Ahirrao <anilkanth@jp.adit-jv.com>
      Signed-off-by: default avatarSuresh Udipi <sudipi@jp.adit-jv.com>
      Signed-off-by: default avatarJiada Wang <jiada_wang@mentor.com>
      Acked-by: default avatarKuninori Morimoto <kuninori.morimoto.gx@renesas.com>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      f89b29fa
    • Denis Kirjanov's avatar
      ipoib: correcly show a VF hardware address · c545bda3
      Denis Kirjanov authored
      [ Upstream commit 64d701c6 ]
      
      in the case of IPoIB with SRIOV enabled hardware
      ip link show command incorrecly prints
      0 instead of a VF hardware address.
      
      Before:
      11: ib1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc pfifo_fast
      state UP mode DEFAULT group default qlen 256
          link/infiniband
      80:00:00:66:fe:80:00:00:00:00:00:00:24:8a:07:03:00:a4:3e:7c brd
      00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
          vf 0 MAC 00:00:00:00:00:00, spoof checking off, link-state disable,
      trust off, query_rss off
      ...
      After:
      11: ib1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc pfifo_fast
      state UP mode DEFAULT group default qlen 256
          link/infiniband
      80:00:00:66:fe:80:00:00:00:00:00:00:24:8a:07:03:00:a4:3e:7c brd
      00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
          vf 0     link/infiniband
      80:00:00:66:fe:80:00:00:00:00:00:00:24:8a:07:03:00:a4:3e:7c brd
      00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff, spoof
      checking off, link-state disable, trust off, query_rss off
      
      v1->v2: just copy an address without modifing ifla_vf_mac
      v2->v3: update the changelog
      Signed-off-by: default avatarDenis Kirjanov <kda@linux-powerpc.org>
      Acked-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      c545bda3
    • Mitch Williams's avatar
      iavf: allow null RX descriptors · 2a51e334
      Mitch Williams authored
      [ Upstream commit efa14c39 ]
      
      In some circumstances, the hardware can hand us a null receive
      descriptor, with no data attached but otherwise valid. Unfortunately,
      the driver was ill-equipped to handle such an event, and would stop
      processing packets at that point.
      
      To fix this, use the Descriptor Done bit instead of the size to
      determine whether or not a descriptor is ready to be processed. Add some
      checks to allow for unused buffers.
      Signed-off-by: default avatarMitch Williams <mitch.a.williams@intel.com>
      Tested-by: default avatarAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      2a51e334
    • Jason Wang's avatar
      vhost_net: disable zerocopy by default · 46ffa155
      Jason Wang authored
      [ Upstream commit 098eadce ]
      
      Vhost_net was known to suffer from HOL[1] issues which is not easy to
      fix. Several downstream disable the feature by default. What's more,
      the datapath was split and datacopy path got the support of batching
      and XDP support recently which makes it faster than zerocopy part for
      small packets transmission.
      
      It looks to me that disable zerocopy by default is more
      appropriate. It cold be enabled by default again in the future if we
      fix the above issues.
      
      [1] https://patchwork.kernel.org/patch/3787671/Signed-off-by: default avatarJason Wang <jasowang@redhat.com>
      Acked-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      46ffa155
    • Arnaldo Carvalho de Melo's avatar
      perf evsel: Make perf_evsel__name() accept a NULL argument · 0b0e0d74
      Arnaldo Carvalho de Melo authored
      [ Upstream commit fdbdd7e8 ]
      
      In which case it simply returns "unknown", like when it can't figure out
      the evsel->name value.
      
      This makes this code more robust and fixes a problem in 'perf trace'
      where a NULL evsel was being passed to a routine that only used the
      evsel for printing its name when a invalid syscall id was passed.
      Reported-by: default avatarLeo Yan <leo.yan@linaro.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-f30ztaasku3z935cn3ak3h53@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      0b0e0d74
    • Peter Zijlstra's avatar
      x86/atomic: Fix smp_mb__{before,after}_atomic() · 380ca130
      Peter Zijlstra authored
      [ Upstream commit 69d927bb ]
      
      Recent probing at the Linux Kernel Memory Model uncovered a
      'surprise'. Strongly ordered architectures where the atomic RmW
      primitive implies full memory ordering and
      smp_mb__{before,after}_atomic() are a simple barrier() (such as x86)
      fail for:
      
      	*x = 1;
      	atomic_inc(u);
      	smp_mb__after_atomic();
      	r0 = *y;
      
      Because, while the atomic_inc() implies memory order, it
      (surprisingly) does not provide a compiler barrier. This then allows
      the compiler to re-order like so:
      
      	atomic_inc(u);
      	*x = 1;
      	smp_mb__after_atomic();
      	r0 = *y;
      
      Which the CPU is then allowed to re-order (under TSO rules) like:
      
      	atomic_inc(u);
      	r0 = *y;
      	*x = 1;
      
      And this very much was not intended. Therefore strengthen the atomic
      RmW ops to include a compiler barrier.
      
      NOTE: atomic_{or,and,xor} and the bitops already had the compiler
      barrier.
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      380ca130
    • Geert Uytterhoeven's avatar
      integrity: Fix __integrity_init_keyring() section mismatch · 1d0d6eb4
      Geert Uytterhoeven authored
      [ Upstream commit 8c655784 ]
      
      With gcc-4.6.3:
      
          WARNING: vmlinux.o(.text.unlikely+0x24c64): Section mismatch in reference from the function __integrity_init_keyring() to the function .init.text:set_platform_trusted_keys()
          The function __integrity_init_keyring() references
          the function __init set_platform_trusted_keys().
          This is often because __integrity_init_keyring lacks a __init
          annotation or the annotation of set_platform_trusted_keys is wrong.
      
      Indeed, if the compiler decides not to inline __integrity_init_keyring(),
      a warning is issued.
      
      Fix this by adding the missing __init annotation.
      
      Fixes: 9dc92c45 ("integrity: Define a trusted platform keyring")
      Signed-off-by: default avatarGeert Uytterhoeven <geert@linux-m68k.org>
      Reviewed-by: default avatarNayna Jain <nayna@linux.ibm.com>
      Reviewed-by: default avatarJames Morris <jamorris@linux.microsoft.com>
      Signed-off-by: default avatarMimi Zohar <zohar@linux.ibm.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      1d0d6eb4
    • Kan Liang's avatar
      perf/x86/intel/uncore: Handle invalid event coding for free-running counter · 5536b0ee
      Kan Liang authored
      [ Upstream commit 543ac280 ]
      
      Counting with invalid event coding for free-running counter may cause
      OOPs, e.g. uncore_iio_free_running_0/event=1/.
      
      Current code only validate the event with free-running event format,
      event=0xff,umask=0xXY. Non-free-running event format never be checked
      for the PMU with free-running counters.
      
      Add generic hw_config() to check and reject the invalid event coding
      for free-running PMU.
      Signed-off-by: default avatarKan Liang <kan.liang@linux.intel.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: acme@kernel.org
      Cc: eranian@google.com
      Fixes: 0f519f03 ("perf/x86/intel/uncore: Support IIO free-running counters on SKX")
      Link: https://lkml.kernel.org/r/1556672028-119221-2-git-send-email-kan.liang@linux.intel.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      5536b0ee
    • Jiri Olsa's avatar
      perf/x86/intel: Disable check_msr for real HW · 826e73e7
      Jiri Olsa authored
      [ Upstream commit d0e1a507 ]
      
      Tom Vaden reported false failure of the check_msr() function, because
      some servers can do POST tracing and enable LBR tracing during
      bootup.
      
      Kan confirmed that check_msr patch was to fix a bug report in
      guest, so it's ok to disable it for real HW.
      Reported-by: default avatarTom Vaden <tom.vaden@hpe.com>
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: default avatarTom Vaden <tom.vaden@hpe.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Liang Kan <kan.liang@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lkml.kernel.org/r/20190616141313.GD2500@krava
      [ Readability edits. ]
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      826e73e7
    • Qian Cai's avatar
      sched/fair: Fix "runnable_avg_yN_inv" not used warnings · 7104a8f3
      Qian Cai authored
      [ Upstream commit 509466b7 ]
      
      runnable_avg_yN_inv[] is only used in kernel/sched/pelt.c but was
      included in several other places because they need other macros all
      came from kernel/sched/sched-pelt.h which was generated by
      Documentation/scheduler/sched-pelt. As the result, it causes compilation
      a lot of warnings,
      
        kernel/sched/sched-pelt.h:4:18: warning: 'runnable_avg_yN_inv' defined but not used [-Wunused-const-variable=]
        kernel/sched/sched-pelt.h:4:18: warning: 'runnable_avg_yN_inv' defined but not used [-Wunused-const-variable=]
        kernel/sched/sched-pelt.h:4:18: warning: 'runnable_avg_yN_inv' defined but not used [-Wunused-const-variable=]
        ...
      
      Silence it by appending the __maybe_unused attribute for it, so all
      generated variables and macros can still be kept in the same file.
      Signed-off-by: default avatarQian Cai <cai@lca.pw>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lkml.kernel.org/r/1559596304-31581-1-git-send-email-cai@lca.pwSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      7104a8f3
    • Gao Xiang's avatar
      sched/core: Add __sched tag for io_schedule() · 1ce5c914
      Gao Xiang authored
      [ Upstream commit e3b929b0 ]
      
      Non-inline io_schedule() was introduced in:
      
        commit 10ab5643 ("sched/core: Separate out io_schedule_prepare() and io_schedule_finish()")
      
      Keep in line with io_schedule_timeout(), otherwise "/proc/<pid>/wchan" will
      report io_schedule() rather than its callers when waiting for IO.
      Reported-by: default avatarJilong Kou <koujilong@huawei.com>
      Signed-off-by: default avatarGao Xiang <gaoxiang25@huawei.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Miao Xie <miaoxie@huawei.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 10ab5643 ("sched/core: Separate out io_schedule_prepare() and io_schedule_finish()")
      Link: https://lkml.kernel.org/r/20190603091338.2695-1-gaoxiang25@huawei.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      1ce5c914
    • Nicolas Dichtel's avatar
      xfrm: fix sa selector validation · db1af6e8
      Nicolas Dichtel authored
      [ Upstream commit b8d6d007 ]
      
      After commit b38ff407, the following command does not work anymore:
      $ ip xfrm state add src 10.125.0.2 dst 10.125.0.1 proto esp spi 34 reqid 1 \
        mode tunnel enc 'cbc(aes)' 0xb0abdba8b782ad9d364ec81e3a7d82a1 auth-trunc \
        'hmac(sha1)' 0xe26609ebd00acb6a4d51fca13e49ea78a72c73e6 96 flag align4
      
      In fact, the selector is not mandatory, allow the user to provide an empty
      selector.
      
      Fixes: b38ff407 ("xfrm: Fix xfrm sel prefix length validation")
      CC: Anirudh Gupta <anirudh.gupta@sophos.com>
      Signed-off-by: default avatarNicolas Dichtel <nicolas.dichtel@6wind.com>
      Acked-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarSteffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      db1af6e8
    • Tejun Heo's avatar
      blkcg, writeback: dead memcgs shouldn't contribute to writeback ownership arbitration · ded52121
      Tejun Heo authored
      [ Upstream commit 66311422 ]
      
      wbc_account_io() collects information on cgroup ownership of writeback
      pages to determine which cgroup should own the inode.  Pages can stay
      associated with dead memcgs but we want to avoid attributing IOs to
      dead blkcgs as much as possible as the association is likely to be
      stale.  However, currently, pages associated with dead memcgs
      contribute to the accounting delaying and/or confusing the
      arbitration.
      
      Fix it by ignoring pages associated with dead memcgs.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Jan Kara <jack@suse.cz>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      ded52121
    • Bob Liu's avatar
      block: null_blk: fix race condition for null_del_dev · 7b6ee182
      Bob Liu authored
      [ Upstream commit 7602843f ]
      
      Dulicate call of null_del_dev() will trigger null pointer error like below.
      The reason is a race condition between nullb_device_power_store() and
      nullb_group_drop_item().
      
        CPU#0                         CPU#1
        ----------------              -----------------
        do_rmdir()
         >configfs_rmdir()
          >client_drop_item()
           >nullb_group_drop_item()
                                      nullb_device_power_store()
      				>null_del_dev()
      
            >test_and_clear_bit(NULLB_DEV_FL_UP
             >null_del_dev()
             ^^^^^
             Duplicated null_dev_dev() triger null pointer error
      
      				>clear_bit(NULLB_DEV_FL_UP
      
      The fix could be keep the sequnce of clear NULLB_DEV_FL_UP and null_del_dev().
      
      [  698.613600] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
      [  698.613608] #PF error: [normal kernel read fault]
      [  698.613611] PGD 0 P4D 0
      [  698.613619] Oops: 0000 [#1] SMP PTI
      [  698.613627] CPU: 3 PID: 6382 Comm: rmdir Not tainted 5.0.0+ #35
      [  698.613631] Hardware name: LENOVO 20LJS2EV08/20LJS2EV08, BIOS R0SET33W (1.17 ) 07/18/2018
      [  698.613644] RIP: 0010:null_del_dev+0xc/0x110 [null_blk]
      [  698.613649] Code: 00 00 00 5b 41 5c 41 5d 41 5e 41 5f 5d c3 0f 0b eb 97 e8 47 bb 2a e8 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 54 53 <8b> 77 18 48 89 fb 4c 8b 27 48 c7 c7 40 57 1e c1 e8 bf c7 cb e8 48
      [  698.613654] RSP: 0018:ffffb887888bfde0 EFLAGS: 00010286
      [  698.613659] RAX: 0000000000000000 RBX: ffff9d436d92bc00 RCX: ffff9d43a9184681
      [  698.613663] RDX: ffffffffc11e5c30 RSI: 0000000068be6540 RDI: 0000000000000000
      [  698.613667] RBP: ffffb887888bfdf0 R08: 0000000000000001 R09: 0000000000000000
      [  698.613671] R10: ffffb887888bfdd8 R11: 0000000000000f16 R12: ffff9d436d92bc08
      [  698.613675] R13: ffff9d436d94e630 R14: ffffffffc11e5088 R15: ffffffffc11e5000
      [  698.613680] FS:  00007faa68be6540(0000) GS:ffff9d43d14c0000(0000) knlGS:0000000000000000
      [  698.613685] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  698.613689] CR2: 0000000000000018 CR3: 000000042f70c002 CR4: 00000000003606e0
      [  698.613693] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  698.613697] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  698.613700] Call Trace:
      [  698.613712]  nullb_group_drop_item+0x50/0x70 [null_blk]
      [  698.613722]  client_drop_item+0x29/0x40
      [  698.613728]  configfs_rmdir+0x1ed/0x300
      [  698.613738]  vfs_rmdir+0xb2/0x130
      [  698.613743]  do_rmdir+0x1c7/0x1e0
      [  698.613750]  __x64_sys_rmdir+0x17/0x20
      [  698.613759]  do_syscall_64+0x5a/0x110
      [  698.613768]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      Signed-off-by: default avatarBob Liu <bob.liu@oracle.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      7b6ee182
    • Yunsheng Lin's avatar
      net: hns3: delay ring buffer clearing during reset · 6d3e3e03
      Yunsheng Lin authored
      [ Upstream commit 3a30964a ]
      
      The driver may not be able to disable the ring through firmware
      when downing the netdev during reset process, which may cause
      hardware accessing freed buffer problem.
      
      This patch delays the ring buffer clearing to reset uninit
      process because hardware will not access the ring buffer after
      hardware reset is completed.
      
      Fixes: bb6b94a8 ("net: hns3: Add reset interface implementation in client")
      Signed-off-by: default avatarYunsheng Lin <linyunsheng@huawei.com>
      Signed-off-by: default avatarPeng Li <lipeng321@huawei.com>
      Signed-off-by: default avatarHuazhong Tan <tanhuazhong@huawei.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      6d3e3e03
    • Yunsheng Lin's avatar
      net: hns3: fix for skb leak when doing selftest · 09b3600e
      Yunsheng Lin authored
      [ Upstream commit 8f9eed1a ]
      
      If hns3_nic_net_xmit does not return NETDEV_TX_BUSY when doing
      a loopback selftest, the skb is not freed in hns3_clean_tx_ring
      or hns3_nic_net_xmit, which causes skb not freed problem.
      
      This patch fixes it by freeing skb when hns3_nic_net_xmit does
      not return NETDEV_TX_OK.
      
      Fixes: c39c4d98 ("net: hns3: Add mac loopback selftest support in hns3 driver")
      Signed-off-by: default avatarYunsheng Lin <linyunsheng@huawei.com>
      Signed-off-by: default avatarPeng Li <lipeng321@huawei.com>
      Signed-off-by: default avatarHuazhong Tan <tanhuazhong@huawei.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      09b3600e
    • Yunsheng Lin's avatar
      net: hns3: fix for dereferencing before null checking · 2da60b8e
      Yunsheng Lin authored
      [ Upstream commit 75718800 ]
      
      The netdev is dereferenced before null checking in the function
      hns3_setup_tc.
      
      This patch moves the dereferencing after the null checking.
      
      Fixes: 76ad4f0e ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
      Signed-off-by: default avatarYunsheng Lin <linyunsheng@huawei.com>
      Signed-off-by: default avatarPeng Li <lipeng321@huawei.com>
      Signed-off-by: default avatarHuazhong Tan <tanhuazhong@huawei.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      2da60b8e
    • Michal Kalderon's avatar
      qed: iWARP - Fix tc for MPA ll2 connection · 38d13b38
      Michal Kalderon authored
      [ Upstream commit cb94d52b ]
      
      The driver needs to assign a lossless traffic class for the MPA ll2
      connection to ensure no packets are dropped when returning from the
      driver as they will never be re-transmitted by the peer.
      
      Fixes: ae3488ff ("qed: Add ll2 connection for processing unaligned MPA packets")
      Signed-off-by: default avatarAriel Elior <ariel.elior@marvell.com>
      Signed-off-by: default avatarMichal Kalderon <michal.kalderon@marvell.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      38d13b38
    • Aaron Lewis's avatar
      x86/cpufeatures: Add FDP_EXCPTN_ONLY and ZERO_FCS_FDS · eb3e01b6
      Aaron Lewis authored
      [ Upstream commit cbb99c0f ]
      
      Add the CPUID enumeration for Intel's de-feature bits to accommodate
      passing these de-features through to kvm guests.
      
      These de-features are (from SDM vol 1, section 8.1.8):
       - X86_FEATURE_FDP_EXCPTN_ONLY: If CPUID.(EAX=07H,ECX=0H):EBX[bit 6] = 1, the
         data pointer (FDP) is updated only for the x87 non-control instructions that
         incur unmasked x87 exceptions.
       - X86_FEATURE_ZERO_FCS_FDS: If CPUID.(EAX=07H,ECX=0H):EBX[bit 13] = 1, the
         processor deprecates FCS and FDS; it saves each as 0000H.
      Signed-off-by: default avatarAaron Lewis <aaronlewis@google.com>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Reviewed-by: default avatarJim Mattson <jmattson@google.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Frederic Weisbecker <frederic@kernel.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: marcorr@google.com
      Cc: Peter Feiner <pfeiner@google.com>
      Cc: pshier@google.com
      Cc: Robert Hoo <robert.hu@linux.intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Lendacky <Thomas.Lendacky@amd.com>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/20190605220252.103406-1-aaronlewis@google.comSigned-off-by: default avatarSasha Levin <sashal@kernel.org>
      eb3e01b6
    • Waiman Long's avatar
      rcu: Force inlining of rcu_read_lock() · 32d7d88f
      Waiman Long authored
      [ Upstream commit 6da9f775 ]
      
      When debugging options are turned on, the rcu_read_lock() function
      might not be inlined. This results in lockdep's print_lock() function
      printing "rcu_read_lock+0x0/0x70" instead of rcu_read_lock()'s caller.
      For example:
      
      [   10.579995] =============================
      [   10.584033] WARNING: suspicious RCU usage
      [   10.588074] 4.18.0.memcg_v2+ #1 Not tainted
      [   10.593162] -----------------------------
      [   10.597203] include/linux/rcupdate.h:281 Illegal context switch in
      RCU read-side critical section!
      [   10.606220]
      [   10.606220] other info that might help us debug this:
      [   10.606220]
      [   10.614280]
      [   10.614280] rcu_scheduler_active = 2, debug_locks = 1
      [   10.620853] 3 locks held by systemd/1:
      [   10.624632]  #0: (____ptrval____) (&type->i_mutex_dir_key#5){.+.+}, at: lookup_slow+0x42/0x70
      [   10.633232]  #1: (____ptrval____) (rcu_read_lock){....}, at: rcu_read_lock+0x0/0x70
      [   10.640954]  #2: (____ptrval____) (rcu_read_lock){....}, at: rcu_read_lock+0x0/0x70
      
      These "rcu_read_lock+0x0/0x70" strings are not providing any useful
      information.  This commit therefore forces inlining of the rcu_read_lock()
      function so that rcu_read_lock()'s caller is instead shown.
      Signed-off-by: default avatarWaiman Long <longman@redhat.com>
      Signed-off-by: default avatarPaul E. McKenney <paulmck@linux.ibm.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      32d7d88f
    • Jerome Brunet's avatar
      ASoC: meson: axg-tdm: fix sample clock inversion · 031cf89b
      Jerome Brunet authored
      [ Upstream commit cb36ff78 ]
      
      The content of SND_SOC_DAIFMT_FORMAT_MASK is a number, not a bitfield,
      so the test to check if the format is i2s is wrong. Because of this the
      clock setting may be wrong. For example, the sample clock gets inverted
      in DSP B mode, when it should not.
      
      Fix the lrclk invert helper function
      
      Fixes: 1a11d88f ("ASoC: meson: add tdm formatter base driver")
      Signed-off-by: default avatarJerome Brunet <jbrunet@baylibre.com>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      031cf89b
    • Rajneesh Bhardwaj's avatar
      x86/cpu: Add Ice Lake NNPI to Intel family · c3ee70b8
      Rajneesh Bhardwaj authored
      [ Upstream commit e32d045c ]
      
      Add the CPUID model number of Ice Lake Neural Network Processor for Deep
      Learning Inference (ICL-NNPI) to the Intel family list. Ice Lake NNPI uses
      model number 0x9D and this will be documented in a future version of Intel
      Software Development Manual.
      Signed-off-by: default avatarRajneesh Bhardwaj <rajneesh.bhardwaj@linux.intel.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: bp@suse.de
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: platform-driver-x86@vger.kernel.org
      Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
      Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linux PM <linux-pm@vger.kernel.org>
      Link: https://lkml.kernel.org/r/20190606012419.13250-1-rajneesh.bhardwaj@linux.intel.comSigned-off-by: default avatarSasha Levin <sashal@kernel.org>
      c3ee70b8