1. 12 Feb, 2019 40 commits
    • Waiman Long's avatar
      mm/page_alloc.c: don't call kasan_free_pages() at deferred mem init · f73c7753
      Waiman Long authored
      [ Upstream commit 3c0c12cc ]
      
      When CONFIG_KASAN is enabled on large memory SMP systems, the deferrred
      pages initialization can take a long time.  Below were the reported init
      times on a 8-socket 96-core 4TB IvyBridge system.
      
        1) Non-debug kernel without CONFIG_KASAN
           [    8.764222] node 1 initialised, 132086516 pages in 7027ms
      
        2) Debug kernel with CONFIG_KASAN
           [  146.288115] node 1 initialised, 132075466 pages in 143052ms
      
      So the page init time in a debug kernel was 20X of the non-debug kernel.
      The long init time can be problematic as the page initialization is done
      with interrupt disabled.  In this particular case, it caused the
      appearance of following warning messages as well as NMI backtraces of all
      the cores that were doing the initialization.
      
      [   68.240049] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
      [   68.241000] rcu: 	25-...0: (100 ticks this GP) idle=b72/1/0x4000000000000000 softirq=915/915 fqs=16252
      [   68.241000] rcu: 	44-...0: (95 ticks this GP) idle=49a/1/0x4000000000000000 softirq=788/788 fqs=16253
      [   68.241000] rcu: 	54-...0: (104 ticks this GP) idle=03a/1/0x4000000000000000 softirq=721/825 fqs=16253
      [   68.241000] rcu: 	60-...0: (103 ticks this GP) idle=cbe/1/0x4000000000000000 softirq=637/740 fqs=16253
      [   68.241000] rcu: 	72-...0: (105 ticks this GP) idle=786/1/0x4000000000000000 softirq=536/641 fqs=16253
      [   68.241000] rcu: 	84-...0: (99 ticks this GP) idle=292/1/0x4000000000000000 softirq=537/537 fqs=16253
      [   68.241000] rcu: 	111-...0: (104 ticks this GP) idle=bde/1/0x4000000000000000 softirq=474/476 fqs=16253
      [   68.241000] rcu: 	(detected by 13, t=65018 jiffies, g=249, q=2)
      
      The long init time was mainly caused by the call to kasan_free_pages() to
      poison the newly initialized pages.  On a 4TB system, we are talking about
      almost 500GB of memory probably on the same node.
      
      In reality, we may not need to poison the newly initialized pages before
      they are ever allocated.  So KASAN poisoning of freed pages before the
      completion of deferred memory initialization is now disabled.  Those pages
      will be properly poisoned when they are allocated or freed after deferred
      pages initialization is done.
      
      With this change, the new page initialization time became:
      
      [   21.948010] node 1 initialised, 132075466 pages in 18702ms
      
      This was still about double the non-debug kernel time, but was much
      better than before.
      
      Link: http://lkml.kernel.org/r/1544459388-8736-1-git-send-email-longman@redhat.comSigned-off-by: default avatarWaiman Long <longman@redhat.com>
      Reviewed-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Pasha Tatashin <Pavel.Tatashin@microsoft.com>
      Cc: Oscar Salvador <osalvador@suse.de>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      f73c7753
    • Larry Chen's avatar
      ocfs2: improve ocfs2 Makefile · 066206bc
      Larry Chen authored
      [ Upstream commit 9e6aea22 ]
      
      Included file path was hard-wired in the ocfs2 makefile, which might
      causes some confusion when compiling ocfs2 as an external module.
      
      Say if we compile ocfs2 module as following.
      cp -r /kernel/tree/fs/ocfs2 /other/dir/ocfs2
      cd /other/dir/ocfs2
      make -C /path/to/kernel_source M=`pwd` modules
      
      Acutally, the compiler wil try to find included file in
      /kernel/tree/fs/ocfs2, rather than the directory /other/dir/ocfs2.
      
      To fix this little bug, we introduce the var $(src) provided by kbuild.
      $(src) means the absolute path of the running kbuild file.
      
      Link: http://lkml.kernel.org/r/20181108085546.15149-1-lchen@suse.comSigned-off-by: default avatarLarry Chen <lchen@suse.com>
      Reviewed-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Joseph Qi <jiangqi903@gmail.com>
      Cc: Changwei Ge <ge.changwei@h3c.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      066206bc
    • Junxiao Bi's avatar
      ocfs2: don't clear bh uptodate for block read · 69e63b49
      Junxiao Bi authored
      [ Upstream commit 70306d9d ]
      
      For sync io read in ocfs2_read_blocks_sync(), first clear bh uptodate flag
      and submit the io, second wait io done, last check whether bh uptodate, if
      not return io error.
      
      If two sync io for the same bh were issued, it could be the first io done
      and set uptodate flag, but just before check that flag, the second io came
      in and cleared uptodate, then ocfs2_read_blocks_sync() for the first io
      will return IO error.
      
      Indeed it's not necessary to clear uptodate flag, as the io end handler
      end_buffer_read_sync() will set or clear it based on io succeed or failed.
      
      The following message was found from a nfs server but the underlying
      storage returned no error.
      
      [4106438.567376] (nfsd,7146,3):ocfs2_get_suballoc_slot_bit:2780 ERROR: read block 1238823695 failed -5
      [4106438.567569] (nfsd,7146,3):ocfs2_get_suballoc_slot_bit:2812 ERROR: status = -5
      [4106438.567611] (nfsd,7146,3):ocfs2_test_inode_bit:2894 ERROR: get alloc slot and bit failed -5
      [4106438.567643] (nfsd,7146,3):ocfs2_test_inode_bit:2932 ERROR: status = -5
      [4106438.567675] (nfsd,7146,3):ocfs2_get_dentry:94 ERROR: test inode bit failed -5
      
      Same issue in non sync read ocfs2_read_blocks(), fixed it as well.
      
      Link: http://lkml.kernel.org/r/20181121020023.3034-4-junxiao.bi@oracle.comSigned-off-by: default avatarJunxiao Bi <junxiao.bi@oracle.com>
      Reviewed-by: default avatarChangwei Ge <ge.changwei@h3c.com>
      Reviewed-by: default avatarYiwen Jiang <jiangyiwen@huawei.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Joseph Qi <jiangqi903@gmail.com>
      Cc: Jun Piao <piaojun@huawei.com>
      Cc: Mark Fasheh <mfasheh@versity.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      69e63b49
    • Randy Dunlap's avatar
      arch/sh/boards/mach-kfr2r09/setup.c: fix struct mtd_oob_ops build warning · dc8bd7ed
      Randy Dunlap authored
      [ Upstream commit 440e7b37 ]
      
      arch/sh/boards/mach-kfr2r09/setup.c does not need to #include
      <mtd/onenand.h>, and doing so causes a build warning, so drop that header
      file.
      
      In file included from ../arch/sh/boards/mach-kfr2r09/setup.c:28:
      ../include/linux/mtd/onenand.h:225:12: warning: 'struct mtd_oob_ops' declared inside parameter list will not be visible outside of this definition or declaration
           struct mtd_oob_ops *ops);
      
      Link: http://lkml.kernel.org/r/702f0a25-c63e-6912-4640-6ab0f00afbc7@infradead.org
      Fixes: f3590dc3 ("media: arch: sh: kfr2r09: Use new renesas-ceu camera driver")
      Signed-off-by: default avatarRandy Dunlap <rdunlap@infradead.org>
      Reported-by: default avatarGeert Uytterhoeven <geert@linux-m68k.org>
      Suggested-by: default avatarMiquel Raynal <miquel.raynal@bootlin.com>
      Reviewed-by: default avatarMiquel Raynal <miquel.raynal@bootlin.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Jacopo Mondi <jacopo+renesas@jmondi.org>
      Cc: Magnus Damm <magnus.damm@gmail.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      dc8bd7ed
    • Marc Zyngier's avatar
      scripts/decode_stacktrace: only strip base path when a prefix of the path · fa3c7c09
      Marc Zyngier authored
      [ Upstream commit 67a28de4 ]
      
      Running something like:
      
      	decodecode vmlinux .
      
      leads to interested results where not only the leading "." gets stripped
      from the displayed paths, but also anywhere in the string, displaying
      something like:
      
      	kvm_vcpu_check_block (arch/arm64/kvm/virt/kvm/kvm_mainc:2141)
      
      which doesn't help further processing.
      
      Fix it by only stripping the base path if it is a prefix of the path.
      
      Link: http://lkml.kernel.org/r/20181210174659.31054-3-marc.zyngier@arm.comSigned-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      fa3c7c09
    • Jiri Olsa's avatar
      perf python: Do not force closing original perf descriptor in evlist.get_pollfd() · 395cbb9a
      Jiri Olsa authored
      [ Upstream commit a389aece ]
      
      Ondřej reported that when compiled with python3, the python extension
      regresses in evlist.get_pollfd function behaviour.
      
      The evlist.get_pollfd function creates file objects from evlist's fds
      and returns them in a list. The python3 version also sets them to 'close
      the original descriptor' when the object dies (is closed), by passing
      True via the 'closefd' arg in the PyFile_FromFd call.
      
      The python's closefd doc says:
      
        If closefd is False, the underlying file descriptor will be kept open
        when the file is closed.
      
      That's why the following line in python3 closes all evlist fds:
      
        evlist.get_pollfd()
      
      the returned list is immediately destroyed and that takes down the
      original events fds.
      
      Passing closefd as False to PyFile_FromFd to fix this.
      Reported-by: default avatarOndřej Lysoněk <olysonek@redhat.com>
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jaroslav Škarvada <jskarvad@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Fixes: 66dfdff0 ("perf tools: Add Python 3 support")
      Link: http://lkml.kernel.org/r/20181226112121.5285-1-jolsa@kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      395cbb9a
    • Ondrej Mosnacek's avatar
      cgroup: fix parsing empty mount option string · 4b5abffd
      Ondrej Mosnacek authored
      [ Upstream commit e250d91d ]
      
      This fixes the case where all mount options specified are consumed by an
      LSM and all that's left is an empty string. In this case cgroupfs should
      accept the string and not fail.
      
      How to reproduce (with SELinux enabled):
      
          # umount /sys/fs/cgroup/unified
          # mount -o context=system_u:object_r:cgroup_t:s0 -t cgroup2 cgroup2 /sys/fs/cgroup/unified
          mount: /sys/fs/cgroup/unified: wrong fs type, bad option, bad superblock on cgroup2, missing codepage or helper program, or other error.
          # dmesg | tail -n 1
          [   31.575952] cgroup: cgroup2: unknown option ""
      
      Fixes: 67e9c74b ("cgroup: replace __DEVEL__sane_behavior with cgroup2 fs type")
      [NOTE: should apply on top of commit 5136f636 ("cgroup: implement "nsdelegate" mount option"), older versions need manual rebase]
      Suggested-by: default avatarStephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: default avatarOndrej Mosnacek <omosnace@redhat.com>
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      4b5abffd
    • Sahitya Tummala's avatar
      f2fs: fix sbi->extent_list corruption issue · df13b036
      Sahitya Tummala authored
      [ Upstream commit e4589fa5 ]
      
      When there is a failure in f2fs_fill_super() after/during
      the recovery of fsync'd nodes, it frees the current sbi and
      retries again. This time the mount is successful, but the files
      that got recovered before retry, still holds the extent tree,
      whose extent nodes list is corrupted since sbi and sbi->extent_list
      is freed up. The list_del corruption issue is observed when the
      file system is getting unmounted and when those recoverd files extent
      node is being freed up in the below context.
      
      list_del corruption. prev->next should be fffffff1e1ef5480, but was (null)
      <...>
      kernel BUG at kernel/msm-4.14/lib/list_debug.c:53!
      lr : __list_del_entry_valid+0x94/0xb4
      pc : __list_del_entry_valid+0x94/0xb4
      <...>
      Call trace:
      __list_del_entry_valid+0x94/0xb4
      __release_extent_node+0xb0/0x114
      __free_extent_tree+0x58/0x7c
      f2fs_shrink_extent_tree+0xdc/0x3b0
      f2fs_leave_shrinker+0x28/0x7c
      f2fs_put_super+0xfc/0x1e0
      generic_shutdown_super+0x70/0xf4
      kill_block_super+0x2c/0x5c
      kill_f2fs_super+0x44/0x50
      deactivate_locked_super+0x60/0x8c
      deactivate_super+0x68/0x74
      cleanup_mnt+0x40/0x78
      __cleanup_mnt+0x1c/0x28
      task_work_run+0x48/0xd0
      do_notify_resume+0x678/0xe98
      work_pending+0x8/0x14
      
      Fix this by not creating extents for those recovered files if shrinker is
      not registered yet. Once mount is successful and shrinker is registered,
      those files can have extents again.
      Signed-off-by: default avatarSahitya Tummala <stummala@codeaurora.org>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      df13b036
    • Kangjie Lu's avatar
      niu: fix missing checks of niu_pci_eeprom_read · 4d6b5b08
      Kangjie Lu authored
      [ Upstream commit 26fd962b ]
      
      niu_pci_eeprom_read() may fail, so we should check its return value
      before using the read data.
      Signed-off-by: default avatarKangjie Lu <kjlu@umn.edu>
      Acked-by: default avatarShannon Nelson <shannon.lee.nelson@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      4d6b5b08
    • Anton Ivanov's avatar
      um: Avoid marking pages with "changed protection" · 1e480ee6
      Anton Ivanov authored
      [ Upstream commit 8892d854 ]
      
      Changing protection is a very high cost operation in UML
      because in addition to an extra syscall it also interrupts
      mmap merge sequences generated by the tlb.
      
      While the condition is not particularly common it is worth
      avoiding.
      Signed-off-by: default avatarAnton Ivanov <anton.ivanov@cambridgegreys.com>
      Signed-off-by: default avatarRichard Weinberger <richard@nod.at>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      1e480ee6
    • Sahitya Tummala's avatar
      f2fs: fix use-after-free issue when accessing sbi->stat_info · 69e7f877
      Sahitya Tummala authored
      [ Upstream commit 60aa4d55 ]
      
      iput() on sbi->node_inode can update sbi->stat_info
      in the below context, if the f2fs_write_checkpoint()
      has failed with error.
      
      f2fs_balance_fs_bg+0x1ac/0x1ec
      f2fs_write_node_pages+0x4c/0x260
      do_writepages+0x80/0xbc
      __writeback_single_inode+0xdc/0x4ac
      writeback_single_inode+0x9c/0x144
      write_inode_now+0xc4/0xec
      iput+0x194/0x22c
      f2fs_put_super+0x11c/0x1e8
      generic_shutdown_super+0x70/0xf4
      kill_block_super+0x2c/0x5c
      kill_f2fs_super+0x44/0x50
      deactivate_locked_super+0x60/0x8c
      deactivate_super+0x68/0x74
      cleanup_mnt+0x40/0x78
      
      Fix this by moving f2fs_destroy_stats() further below iput() in
      both f2fs_put_super() and f2fs_fill_super() paths.
      Signed-off-by: default avatarSahitya Tummala <stummala@codeaurora.org>
      Reviewed-by: default avatarChao Yu <yuchao0@huawei.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      69e7f877
    • Ronnie Sahlberg's avatar
      cifs: check ntwrk_buf_start for NULL before dereferencing it · 5d3b4cd8
      Ronnie Sahlberg authored
      [ Upstream commit 59a63e47 ]
      
      RHBZ: 1021460
      
      There is an issue where when multiple threads open/close the same directory
      ntwrk_buf_start might end up being NULL, causing the call to smbCalcSize
      later to oops with a NULL deref.
      
      The real bug is why this happens and why this can become NULL for an
      open cfile, which should not be allowed.
      This patch tries to avoid a oops until the time when we fix the underlying
      issue.
      Signed-off-by: default avatarRonnie Sahlberg <lsahlber@redhat.com>
      Signed-off-by: default avatarSteve French <stfrench@microsoft.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      5d3b4cd8
    • Stefan Roese's avatar
      MIPS: ralink: Select CONFIG_CPU_MIPSR2_IRQ_VI on MT7620/8 · 2cdf5246
      Stefan Roese authored
      [ Upstream commit 0b153944 ]
      
      Testing has shown, that when using mainline U-Boot on MT7688 based
      boards, the system may hang or crash while mounting the root-fs. The
      main issue here is that mainline U-Boot configures EBase to a value
      near the end of system memory. And with CONFIG_CPU_MIPSR2_IRQ_VI
      disabled, trap_init() will not allocate a new area to place the
      exception handler. The original value will be used and the handler
      will be copied to this location, which might already be used by some
      userspace application.
      
      The MT7688 supports VI - its config3 register is 0x00002420, so VInt
      (Bit 5) is set. But without setting CONFIG_CPU_MIPSR2_IRQ_VI this
      bit will not be evaluated to result in "cpu_has_vi" being set. This
      patch now selects CONFIG_CPU_MIPSR2_IRQ_VI on MT7620/8 which results
      trap_init() to allocate some memory for the exception handler.
      
      Please note that this issue was not seen with the Mediatek U-Boot
      version, as it does not touch EBase (stays at default of 0x8000.0000).
      This is strictly also not correct as the kernel (_text) resides
      here.
      Signed-off-by: default avatarStefan Roese <sr@denx.de>
      [paul.burton@mips.com: s/beeing/being/]
      Signed-off-by: default avatarPaul Burton <paul.burton@mips.com>
      Cc: John Crispin <blogic@openwrt.org>
      Cc: Daniel Schwierzeck <daniel.schwierzeck@gmail.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      2cdf5246
    • Nathan Chancellor's avatar
      crypto: ux500 - Use proper enum in hash_set_dma_transfer · 1dc571ff
      Nathan Chancellor authored
      [ Upstream commit 5ac93f80 ]
      
      Clang warns when one enumerated type is implicitly converted to another:
      
      drivers/crypto/ux500/hash/hash_core.c:169:4: warning: implicit
      conversion from enumeration type 'enum dma_data_direction' to different
      enumeration type 'enum dma_transfer_direction' [-Wenum-conversion]
                              direction, DMA_CTRL_ACK | DMA_PREP_INTERRUPT);
                              ^~~~~~~~~
      1 warning generated.
      
      dmaengine_prep_slave_sg expects an enum from dma_transfer_direction.
      We know that the only direction supported by this function is
      DMA_TO_DEVICE because of the check at the top of this function so we can
      just use the equivalent value from dma_transfer_direction.
      
      DMA_TO_DEVICE = DMA_MEM_TO_DEV = 1
      Signed-off-by: default avatarNathan Chancellor <natechancellor@gmail.com>
      Reviewed-by: default avatarNick Desaulniers <ndesaulniers@google.com>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      1dc571ff
    • Nathan Chancellor's avatar
      crypto: ux500 - Use proper enum in cryp_set_dma_transfer · 2b020c09
      Nathan Chancellor authored
      [ Upstream commit 9d880c59 ]
      
      Clang warns when one enumerated type is implicitly converted to another:
      
      drivers/crypto/ux500/cryp/cryp_core.c:559:5: warning: implicit
      conversion from enumeration type 'enum dma_data_direction' to different
      enumeration type 'enum dma_transfer_direction' [-Wenum-conversion]
                                      direction, DMA_CTRL_ACK);
                                      ^~~~~~~~~
      drivers/crypto/ux500/cryp/cryp_core.c:583:5: warning: implicit
      conversion from enumeration type 'enum dma_data_direction' to different
      enumeration type 'enum dma_transfer_direction' [-Wenum-conversion]
                                      direction,
                                      ^~~~~~~~~
      2 warnings generated.
      
      dmaengine_prep_slave_sg expects an enum from dma_transfer_direction.
      Because we know the value of the dma_data_direction enum from the
      switch statement, we can just use the proper value from
      dma_transfer_direction so there is no more conversion.
      
      DMA_TO_DEVICE = DMA_MEM_TO_DEV = 1
      DMA_FROM_DEVICE = DMA_DEV_TO_MEM = 2
      Signed-off-by: default avatarNathan Chancellor <natechancellor@gmail.com>
      Reviewed-by: default avatarNick Desaulniers <ndesaulniers@google.com>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      2b020c09
    • Michael Ellerman's avatar
      seq_buf: Make seq_buf_puts() null-terminate the buffer · 6201e8ad
      Michael Ellerman authored
      [ Upstream commit 0464ed24 ]
      
      Currently seq_buf_puts() will happily create a non null-terminated
      string for you in the buffer. This is particularly dangerous if the
      buffer is on the stack.
      
      For example:
      
        char buf[8];
        char secret = "secret";
        struct seq_buf s;
      
        seq_buf_init(&s, buf, sizeof(buf));
        seq_buf_puts(&s, "foo");
        printk("Message is %s\n", buf);
      
      Can result in:
      
        Message is fooªªªªªsecret
      
      We could require all users to memset() their buffer to zero before
      use. But that seems likely to be forgotten and lead to bugs.
      
      Instead we can change seq_buf_puts() to always leave the buffer in a
      null-terminated state.
      
      The only downside is that this makes the buffer 1 character smaller
      for seq_buf_puts(), but that seems like a good trade off.
      
      Link: http://lkml.kernel.org/r/20181019042109.8064-1-mpe@ellerman.id.auAcked-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      6201e8ad
    • Kangjie Lu's avatar
      hwmon: (lm80) fix a missing check of bus read in lm80 probe · 040f7697
      Kangjie Lu authored
      [ Upstream commit 9aa3aa15 ]
      
      In lm80_probe(), if lm80_read_value() fails, it returns a negative
      error number which is stored to data->fan[f_min] and will be further
      used. We should avoid using the data if the read fails.
      
      The fix checks if lm80_read_value() fails, and if so, returns with the
      error number.
      Signed-off-by: default avatarKangjie Lu <kjlu@umn.edu>
      Signed-off-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      040f7697
    • Kangjie Lu's avatar
      hwmon: (lm80) fix a missing check of the status of SMBus read · 8dcf8478
      Kangjie Lu authored
      [ Upstream commit c9c63915 ]
      
      If lm80_read_value() fails, it returns a negative number instead of the
      correct read data. Therefore, we should avoid using the data if it
      fails.
      
      The fix checks if lm80_read_value() fails, and if so, returns with the
      error number.
      Signed-off-by: default avatarKangjie Lu <kjlu@umn.edu>
      [groeck: One variable for return values is enough]
      Signed-off-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      8dcf8478
    • Stanislav Fomichev's avatar
      perf build: Don't unconditionally link the libbfd feature test to -liberty and -lz · 303d29d8
      Stanislav Fomichev authored
      [ Upstream commit 14541b1e ]
      
      Current libbfd feature test unconditionally links against -liberty and -lz.
      While it's required on some systems (e.g. opensuse), it's completely
      unnecessary on the others, where only -lbdf is sufficient (debian).
      This patch streamlines (and renames) the following feature checks:
      
      feature-libbfd           - only link against -lbfd (debian),
                                 see commit 2cf90407 ("perf tools: Fix bfd
      			   dependency libraries detection")
      feature-libbfd-liberty   - link against -lbfd and -liberty
      feature-libbfd-liberty-z - link against -lbfd, -liberty and -lz (opensuse),
                                 see commit 280e7c48 ("perf tools: fix BFD
      			   detection on opensuse")
      
      (feature-liberty{,-z} were renamed to feature-libbfd-liberty{,z}
      for clarity)
      
      The main motivation is to fix this feature test for bpftool which is
      currently broken on debian (libbfd feature shows OFF, but we still
      unconditionally link against -lbfd and it works).
      
      Tested on debian with only -lbfd installed (without -liberty); I'd
      appreciate if somebody on the other systems can test this new detection
      method.
      Signed-off-by: default avatarStanislav Fomichev <sdf@google.com>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/4dfc634cfcfb236883971b5107cf3c28ec8a31be.1542328222.git.sdf@google.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      303d29d8
    • Chris Perl's avatar
      NFS: nfs_compare_mount_options always compare auth flavors. · 8c642d71
      Chris Perl authored
      [ Upstream commit 594d1644 ]
      
      This patch removes the check from nfs_compare_mount_options to see if a
      `sec' option was passed for the current mount before comparing auth
      flavors and instead just always compares auth flavors.
      
      Consider the following scenario:
      
      You have a server with the address 192.168.1.1 and two exports /export/a
      and /export/b.  The first export supports `sys' and `krb5' security, the
      second just `sys'.
      
      Assume you start with no mounts from the server.
      
      The following results in EIOs being returned as the kernel nfs client
      incorrectly thinks it can share the underlying `struct nfs_server's:
      
      $ mkdir /tmp/{a,b}
      $ sudo mount -t nfs -o vers=3,sec=krb5 192.168.1.1:/export/a /tmp/a
      $ sudo mount -t nfs -o vers=3          192.168.1.1:/export/b /tmp/b
      $ df >/dev/null
      df: ‘/tmp/b’: Input/output error
      Signed-off-by: default avatarChris Perl <cperl@janestreet.com>
      Signed-off-by: default avatarAnna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      8c642d71
    • Jim Mattson's avatar
      kvm: Change offset in kvm_write_guest_offset_cached to unsigned · ad9241f2
      Jim Mattson authored
      [ Upstream commit 7a86dab8 ]
      
      Since the offset is added directly to the hva from the
      gfn_to_hva_cache, a negative offset could result in an out of bounds
      write. The existing BUG_ON only checks for addresses beyond the end of
      the gfn_to_hva_cache, not for addresses before the start of the
      gfn_to_hva_cache.
      
      Note that all current call sites have non-negative offsets.
      
      Fixes: 4ec6e863 ("kvm: Introduce kvm_write_guest_offset_cached()")
      Reported-by: default avatarCfir Cohen <cfir@google.com>
      Signed-off-by: default avatarJim Mattson <jmattson@google.com>
      Reviewed-by: default avatarCfir Cohen <cfir@google.com>
      Reviewed-by: default avatarPeter Shier <pshier@google.com>
      Reviewed-by: default avatarKrish Sadhukhan <krish.sadhukhan@oracle.com>
      Reviewed-by: default avatarSean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      ad9241f2
    • Mahesh Salgaonkar's avatar
      powerpc/fadump: Do not allow hot-remove memory from fadump reserved area. · fc090081
      Mahesh Salgaonkar authored
      [ Upstream commit 0db6896f ]
      
      For fadump to work successfully there should not be any holes in reserved
      memory ranges where kernel has asked firmware to move the content of old
      kernel memory in event of crash. Now that fadump uses CMA for reserved
      area, this memory area is now not protected from hot-remove operations
      unless it is cma allocated. Hence, fadump service can fail to re-register
      after the hot-remove operation, if hot-removed memory belongs to fadump
      reserved region. To avoid this make sure that memory from fadump reserved
      area is not hot-removable if fadump is registered.
      
      However, if user still wants to remove that memory, he can do so by
      manually stopping fadump service before hot-remove operation.
      Signed-off-by: default avatarMahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      fc090081
    • Vitaly Kuznetsov's avatar
      KVM: x86: svm: report MSR_IA32_MCG_EXT_CTL as unsupported · 84a4572b
      Vitaly Kuznetsov authored
      [ Upstream commit e87555e5 ]
      
      AMD doesn't seem to implement MSR_IA32_MCG_EXT_CTL and svm code in kvm
      knows nothing about it, however, this MSR is among emulated_msrs and
      thus returned with KVM_GET_MSR_INDEX_LIST. The consequent KVM_GET_MSRS,
      of course, fails.
      
      Report the MSR as unsupported to not confuse userspace.
      Signed-off-by: default avatarVitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      84a4572b
    • Martin Blumenstingl's avatar
      pinctrl: meson: meson8b: fix the GPIO function for the GPIOAO pins · a41cd69e
      Martin Blumenstingl authored
      [ Upstream commit 2b745ac3 ]
      
      The GPIOAO pins (as well as the two exotic GPIO_BSD_EN and GPIO_TEST_N)
      only belong to the pin controller in the AO domain. With the current
      definition these pins cannot be referred to in .dts files as group
      (which is possible on GXBB and GXL for example).
      
      Add a separate "gpio_aobus" function to fix the mapping between the pin
      controller and the GPIO pins in the AO domain. This is similar to how
      the GXBB and GXL drivers implement this functionality.
      
      Fixes: 9dab1868 ("pinctrl: amlogic: Make driver independent from two-domain configuration")
      Signed-off-by: default avatarMartin Blumenstingl <martin.blumenstingl@googlemail.com>
      Signed-off-by: default avatarLinus Walleij <linus.walleij@linaro.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      a41cd69e
    • Martin Blumenstingl's avatar
      pinctrl: meson: meson8: fix the GPIO function for the GPIOAO pins · 098fa489
      Martin Blumenstingl authored
      [ Upstream commit 42f9b48c ]
      
      The GPIOAO pins (as well as the two exotic GPIO_BSD_EN and GPIO_TEST_N)
      only belong to the pin controller in the AO domain. With the current
      definition these pins cannot be referred to in .dts files as group
      (which is possible on GXBB and GXL for example).
      
      Add a separate "gpio_aobus" function to fix the mapping between the pin
      controller and the GPIO pins in the AO domain. This is similar to how
      the GXBB and GXL drivers implement this functionality.
      
      Fixes: 9dab1868 ("pinctrl: amlogic: Make driver independent from two-domain configuration")
      Signed-off-by: default avatarMartin Blumenstingl <martin.blumenstingl@googlemail.com>
      Signed-off-by: default avatarLinus Walleij <linus.walleij@linaro.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      098fa489
    • Christophe Leroy's avatar
      powerpc/mm: Fix reporting of kernel execute faults on the 8xx · 279eb1d9
      Christophe Leroy authored
      [ Upstream commit ffca395b ]
      
      On the 8xx, no-execute is set via PPP bits in the PTE. Therefore
      a no-exec fault generates DSISR_PROTFAULT error bits,
      not DSISR_NOEXEC_OR_G.
      
      This patch adds DSISR_PROTFAULT in the test mask.
      
      Fixes: d3ca5874 ("powerpc/mm: Fix reporting of kernel execute faults")
      Signed-off-by: default avatarChristophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      279eb1d9
    • Noralf Trønnes's avatar
      fbdev: fbcon: Fix unregister crash when more than one framebuffer · 9e9435bc
      Noralf Trønnes authored
      [ Upstream commit 2122b405 ]
      
      When unregistering fbdev using unregister_framebuffer(), any bound
      console will unbind automatically. This is working fine if this is the
      only framebuffer, resulting in a switch to the dummy console. However if
      there is a fb0 and I unregister fb1 having a bound console, I eventually
      get a crash. The fastest way for me to trigger the crash is to do a
      reboot, resulting in this splat:
      
      [   76.478825] WARNING: CPU: 0 PID: 527 at linux/kernel/workqueue.c:1442 __queue_work+0x2d4/0x41c
      [   76.478849] Modules linked in: raspberrypi_hwmon gpio_backlight backlight bcm2835_rng rng_core [last unloaded: tinydrm]
      [   76.478916] CPU: 0 PID: 527 Comm: systemd-udevd Not tainted 4.20.0-rc4+ #4
      [   76.478933] Hardware name: BCM2835
      [   76.478949] Backtrace:
      [   76.478995] [<c010d388>] (dump_backtrace) from [<c010d670>] (show_stack+0x20/0x24)
      [   76.479022]  r6:00000000 r5:c0bc73be r4:00000000 r3:6fb5bf81
      [   76.479060] [<c010d650>] (show_stack) from [<c08e82f4>] (dump_stack+0x20/0x28)
      [   76.479102] [<c08e82d4>] (dump_stack) from [<c0120070>] (__warn+0xec/0x12c)
      [   76.479134] [<c011ff84>] (__warn) from [<c01201e4>] (warn_slowpath_null+0x4c/0x58)
      [   76.479165]  r9:c0eb6944 r8:00000001 r7:c0e927f8 r6:c0bc73be r5:000005a2 r4:c0139e84
      [   76.479197] [<c0120198>] (warn_slowpath_null) from [<c0139e84>] (__queue_work+0x2d4/0x41c)
      [   76.479222]  r6:d7666a00 r5:c0e918ee r4:dbc4e700
      [   76.479251] [<c0139bb0>] (__queue_work) from [<c013a02c>] (queue_work_on+0x60/0x88)
      [   76.479281]  r10:c0496bf8 r9:00000100 r8:c0e92ae0 r7:00000001 r6:d9403700 r5:d7666a00
      [   76.479298]  r4:20000113
      [   76.479348] [<c0139fcc>] (queue_work_on) from [<c0496c28>] (cursor_timer_handler+0x30/0x54)
      [   76.479374]  r7:d8a8fabc r6:c0e08088 r5:d8afdc5c r4:d8a8fabc
      [   76.479413] [<c0496bf8>] (cursor_timer_handler) from [<c0178744>] (call_timer_fn+0x100/0x230)
      [   76.479435]  r4:c0e9192f r3:d758a340
      [   76.479465] [<c0178644>] (call_timer_fn) from [<c0178980>] (expire_timers+0x10c/0x12c)
      [   76.479495]  r10:40000000 r9:c0e9192f r8:c0e92ae0 r7:d8afdccc r6:c0e19280 r5:c0496bf8
      [   76.479513]  r4:d8a8fabc
      [   76.479541] [<c0178874>] (expire_timers) from [<c0179630>] (run_timer_softirq+0xa8/0x184)
      [   76.479570]  r9:00000001 r8:c0e19280 r7:00000000 r6:c0e08088 r5:c0e1a3e0 r4:c0e19280
      [   76.479603] [<c0179588>] (run_timer_softirq) from [<c0102404>] (__do_softirq+0x1ac/0x3fc)
      [   76.479632]  r10:c0e91680 r9:d8afc020 r8:0000000a r7:00000100 r6:00000001 r5:00000002
      [   76.479650]  r4:c0eb65ec
      [   76.479686] [<c0102258>] (__do_softirq) from [<c0124d10>] (irq_exit+0xe8/0x168)
      [   76.479716]  r10:d8d1a9b0 r9:d8afc000 r8:00000001 r7:d949c000 r6:00000000 r5:c0e8b3f0
      [   76.479734]  r4:00000000
      [   76.479764] [<c0124c28>] (irq_exit) from [<c016b72c>] (__handle_domain_irq+0x94/0xb0)
      [   76.479793] [<c016b698>] (__handle_domain_irq) from [<c01021dc>] (bcm2835_handle_irq+0x3c/0x48)
      [   76.479823]  r8:d8afdebc r7:d8afddfc r6:ffffffff r5:c0e089f8 r4:d8afddc8 r3:d8afddc8
      [   76.479851] [<c01021a0>] (bcm2835_handle_irq) from [<c01019f0>] (__irq_svc+0x70/0x98)
      
      The problem is in the console rebinding in fbcon_fb_unbind(). It uses the
      virtual console index as the new framebuffer index to bind the console(s)
      to. The correct way is to use the con2fb_map lookup table to find the
      framebuffer index.
      
      Fixes: cfafca80 ("fbdev: fbcon: console unregistration from unregister_framebuffer")
      Signed-off-by: default avatarNoralf Trønnes <noralf@tronnes.org>
      Reviewed-by: default avatarMikulas Patocka <mpatocka@redhat.com>
      Acked-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: default avatarBartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      9e9435bc
    • Lenny Szubowicz's avatar
      ACPI/APEI: Clear GHES block_status before panic() · 00e391ec
      Lenny Szubowicz authored
      [ Upstream commit 98cff8b2 ]
      
      In __ghes_panic() clear the block status in the APEI generic
      error status block for that generic hardware error source before
      calling panic() to prevent a second panic() in the crash kernel
      for exactly the same fatal error.
      
      Otherwise ghes_probe(), running in the crash kernel, would see
      an unhandled error in the APEI generic error status block and
      panic again, thereby precluding any crash dump.
      Signed-off-by: default avatarLenny Szubowicz <lszubowi@redhat.com>
      Signed-off-by: default avatarDavid Arcari <darcari@redhat.com>
      Tested-by: default avatarTyler Baicar <baicar.tyler@gmail.com>
      Acked-by: default avatarBorislav Petkov <bp@suse.de>
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      00e391ec
    • Kai-Heng Feng's avatar
      igb: Fix an issue that PME is not enabled during runtime suspend · d657e82f
      Kai-Heng Feng authored
      [ Upstream commit 1fb3a7a7 ]
      
      I210 ethernet card doesn't wakeup when a cable gets plugged. It's
      because its PME is not set.
      
      Since commit 42eca230 ("PCI: Don't touch card regs after runtime
      suspend D3"), if the PCI state is saved, pci_pm_runtime_suspend() stops
      calling pci_finish_runtime_suspend(), which enables the PCI PME.
      
      To fix the issue, let's not to save PCI states when it's runtime
      suspend, to let the PCI subsystem enables PME.
      
      Fixes: 42eca230 ("PCI: Don't touch card regs after runtime suspend D3")
      Signed-off-by: default avatarKai-Heng Feng <kai.heng.feng@canonical.com>
      Tested-by: default avatarAaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      d657e82f
    • Young Xiao's avatar
      ice: Do not enable NAPI on q_vectors that have no rings · d01b26e7
      Young Xiao authored
      [ Upstream commit eec90376 ]
      
      If ice driver has q_vectors w/ active NAPI that has no rings,
      then this will result in a divide by zero error. To correct it
      I am updating the driver code so that we only support NAPI on
      q_vectors that have 1 or more rings allocated to them.
      
      See commit 13a8cd19 ("i40e: Do not enable NAPI on q_vectors
      that have no rings") for detail.
      Signed-off-by: default avatarYoung Xiao <YangX92@hotmail.com>
      Acked-by: default avatarAnirudh Venkataramanan <anirudh.venkataramanan@intel.com>
      Tested-by: default avatarAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      d01b26e7
    • Konstantin Khorenko's avatar
      i40e: define proper net_device::neigh_priv_len · 8a35f678
      Konstantin Khorenko authored
      [ Upstream commit 31389b53 ]
      
      Out of bound read reported by KASan.
      
      i40iw_net_event() reads unconditionally 16 bytes from
      neigh->primary_key while the memory allocated for
      "neighbour" struct is evaluated in neigh_alloc() as
      
        tbl->entry_size + dev->neigh_priv_len
      
      where "dev" is a net_device.
      
      But the driver does not setup dev->neigh_priv_len and
      we read beyond the neigh entry allocated memory,
      so the patch in the next mail fixes this.
      Signed-off-by: default avatarKonstantin Khorenko <khorenko@virtuozzo.com>
      Tested-by: default avatarAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      8a35f678
    • Peter Rosin's avatar
      fbdev: fbmem: behave better with small rotated displays and many CPUs · 6584e6c0
      Peter Rosin authored
      [ Upstream commit f75df8d4 ]
      
      Blitting an image with "negative" offsets is not working since there
      is no clipping. It hopefully just crashes. For the bootup logo, there
      is protection so that blitting does not happen as the image is drawn
      further and further to the right (ROTATE_UR) or further and further
      down (ROTATE_CW). There is however no protection when drawing in the
      opposite directions (ROTATE_UD and ROTATE_CCW).
      
      Add back this protection.
      
      The regression is 20-odd years old but the mindless warning-killing
      mentality displayed in commit 34bdb666 ("fbdev: fbmem: remove
      positive test on unsigned values") is also to blame, methinks.
      
      Fixes: 448d4797 ("fbdev: fb_do_show_logo() updates")
      Signed-off-by: default avatarPeter Rosin <peda@axentia.se>
      Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
      Cc: Fabian Frederick <ffrederick@users.sourceforge.net>
      Cc: Geert Uytterhoeven <geert+renesas@glider.be>
      cc: Geoff Levand <geoff@infradead.org>
      Cc: James Simmons <jsimmons@users.sf.net>
      Signed-off-by: default avatarBartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      6584e6c0
    • Guoqing Jiang's avatar
      md: fix raid10 hang issue caused by barrier · 54541a75
      Guoqing Jiang authored
      [ Upstream commit e820d55c ]
      
      When both regular IO and resync IO happen at the same time,
      and if we also need to split regular. Then we can see tasks
      hang due to barrier.
      
      1. resync thread
      [ 1463.757205] INFO: task md1_resync:5215 blocked for more than 480 seconds.
      [ 1463.757207]       Not tainted 4.19.5-1-default #1
      [ 1463.757209] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [ 1463.757212] md1_resync      D    0  5215      2 0x80000000
      [ 1463.757216] Call Trace:
      [ 1463.757223]  ? __schedule+0x29a/0x880
      [ 1463.757231]  ? raise_barrier+0x8d/0x140 [raid10]
      [ 1463.757236]  schedule+0x78/0x110
      [ 1463.757243]  raise_barrier+0x8d/0x140 [raid10]
      [ 1463.757248]  ? wait_woken+0x80/0x80
      [ 1463.757257]  raid10_sync_request+0x1f6/0x1e30 [raid10]
      [ 1463.757265]  ? _raw_spin_unlock_irq+0x22/0x40
      [ 1463.757284]  ? is_mddev_idle+0x125/0x137 [md_mod]
      [ 1463.757302]  md_do_sync.cold.78+0x404/0x969 [md_mod]
      [ 1463.757311]  ? wait_woken+0x80/0x80
      [ 1463.757336]  ? md_rdev_init+0xb0/0xb0 [md_mod]
      [ 1463.757351]  md_thread+0xe9/0x140 [md_mod]
      [ 1463.757358]  ? _raw_spin_unlock_irqrestore+0x2e/0x60
      [ 1463.757364]  ? __kthread_parkme+0x4c/0x70
      [ 1463.757369]  kthread+0x112/0x130
      [ 1463.757374]  ? kthread_create_worker_on_cpu+0x40/0x40
      [ 1463.757380]  ret_from_fork+0x3a/0x50
      
      2. regular IO
      [ 1463.760679] INFO: task kworker/0:8:5367 blocked for more than 480 seconds.
      [ 1463.760683]       Not tainted 4.19.5-1-default #1
      [ 1463.760684] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [ 1463.760687] kworker/0:8     D    0  5367      2 0x80000000
      [ 1463.760718] Workqueue: md submit_flushes [md_mod]
      [ 1463.760721] Call Trace:
      [ 1463.760731]  ? __schedule+0x29a/0x880
      [ 1463.760741]  ? wait_barrier+0xdd/0x170 [raid10]
      [ 1463.760746]  schedule+0x78/0x110
      [ 1463.760753]  wait_barrier+0xdd/0x170 [raid10]
      [ 1463.760761]  ? wait_woken+0x80/0x80
      [ 1463.760768]  raid10_write_request+0xf2/0x900 [raid10]
      [ 1463.760774]  ? wait_woken+0x80/0x80
      [ 1463.760778]  ? mempool_alloc+0x55/0x160
      [ 1463.760795]  ? md_write_start+0xa9/0x270 [md_mod]
      [ 1463.760801]  ? try_to_wake_up+0x44/0x470
      [ 1463.760810]  raid10_make_request+0xc1/0x120 [raid10]
      [ 1463.760816]  ? wait_woken+0x80/0x80
      [ 1463.760831]  md_handle_request+0x121/0x190 [md_mod]
      [ 1463.760851]  md_make_request+0x78/0x190 [md_mod]
      [ 1463.760860]  generic_make_request+0x1c6/0x470
      [ 1463.760870]  raid10_write_request+0x77a/0x900 [raid10]
      [ 1463.760875]  ? wait_woken+0x80/0x80
      [ 1463.760879]  ? mempool_alloc+0x55/0x160
      [ 1463.760895]  ? md_write_start+0xa9/0x270 [md_mod]
      [ 1463.760904]  raid10_make_request+0xc1/0x120 [raid10]
      [ 1463.760910]  ? wait_woken+0x80/0x80
      [ 1463.760926]  md_handle_request+0x121/0x190 [md_mod]
      [ 1463.760931]  ? _raw_spin_unlock_irq+0x22/0x40
      [ 1463.760936]  ? finish_task_switch+0x74/0x260
      [ 1463.760954]  submit_flushes+0x21/0x40 [md_mod]
      
      So resync io is waiting for regular write io to complete to
      decrease nr_pending (conf->barrier++ is called before waiting).
      The regular write io splits another bio after call wait_barrier
      which call nr_pending++, then the splitted bio would continue
      with raid10_write_request -> wait_barrier, so the splitted bio
      has to wait for barrier to be zero, then deadlock happens as
      follows.
      
      	resync io		regular io
      
      	raise_barrier
      				wait_barrier
      				generic_make_request
      				wait_barrier
      
      To resolve the issue, we need to call allow_barrier to decrease
      nr_pending before generic_make_request since regular IO is not
      issued to underlying devices, and wait_barrier is called again
      to ensure no internal IO happening.
      
      Fixes: fc9977dd ("md/raid10: simplify the splitting of requests.")
      Reported-and-tested-by: default avatarSiniša Bandin <sinisa@4net.rs>
      Signed-off-by: default avatarGuoqing Jiang <gqjiang@suse.com>
      Signed-off-by: default avatarShaohua Li <shli@fb.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      54541a75
    • Alexey Khoroshilov's avatar
      video: clps711x-fb: release disp device node in probe() · 8d138565
      Alexey Khoroshilov authored
      [ Upstream commit fdac7513 ]
      
      clps711x_fb_probe() increments refcnt of disp device node by
      of_parse_phandle() and leaves it undecremented on both
      successful and error paths.
      
      Found by Linux Driver Verification project (linuxtesting.org).
      Signed-off-by: default avatarAlexey Khoroshilov <khoroshilov@ispras.ru>
      Cc: Alexander Shiyan <shc_work@mail.ru>
      Signed-off-by: default avatarBartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      8d138565
    • Wenjing Liu's avatar
      drm/amd/display: validate extended dongle caps · 8f0132db
      Wenjing Liu authored
      [ Upstream commit 99b922f9 ]
      
      [why]
      Some dongle doesn't have a valid extended dongle caps,
      but we still set the extended dongle caps to be valid.
      This causes validation fails for all timing.
      
      [how]
      If no dp_hdmi_max_pixel_clk is provided,
      don't use extended dongle caps.
      Signed-off-by: default avatarWenjing Liu <Wenjing.Liu@amd.com>
      Reviewed-by: default avatarAric Cyr <Aric.Cyr@amd.com>
      Reviewed-by: default avatarJun Lei <Jun.Lei@amd.com>
      Acked-by: default avatarAbdoulaye Berthe <Abdoulaye.Berthe@amd.com>
      Acked-by: default avatarLeo Li <sunpeng.li@amd.com>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      8f0132db
    • Nathan Chancellor's avatar
      drbd: Avoid Clang warning about pointless switch statment · bd9218ab
      Nathan Chancellor authored
      [ Upstream commit a52c5a16 ]
      
      There are several warnings from Clang about no case statement matching
      the constant 0:
      
      In file included from drivers/block/drbd/drbd_receiver.c:48:
      In file included from drivers/block/drbd/drbd_int.h:48:
      In file included from ./include/linux/drbd_genl_api.h:54:
      In file included from ./include/linux/genl_magic_struct.h:236:
      ./include/linux/drbd_genl.h:321:1: warning: no case matching constant
      switch condition '0'
      GENL_struct(DRBD_NLA_HELPER, 24, drbd_helper_info,
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      ./include/linux/genl_magic_struct.h:220:10: note: expanded from macro
      'GENL_struct'
              switch (0) {
                      ^
      
      Silence this warning by adding a 'case 0:' statement. Additionally,
      adjust the alignment of the statements in the ct_assert_unique macro to
      avoid a checkpatch warning.
      
      This solution was originally sent by Arnd Bergmann with a default case
      statement: https://lore.kernel.org/patchwork/patch/756723/
      
      Link: https://github.com/ClangBuiltLinux/linux/issues/43Suggested-by: default avatarLars Ellenberg <lars.ellenberg@linbit.com>
      Signed-off-by: default avatarNathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      bd9218ab
    • Lars Ellenberg's avatar
      drbd: skip spurious timeout (ping-timeo) when failing promote · 66345d53
      Lars Ellenberg authored
      [ Upstream commit 9848b6dd ]
      
      If you try to promote a Secondary while connected to a Primary
      and allow-two-primaries is NOT set, we will wait for "ping-timeout"
      to give this node a chance to detect a dead primary,
      in case the cluster manager noticed faster than we did.
      
      But if we then are *still* connected to a Primary,
      we fail (after an additional timeout of ping-timout).
      
      This change skips the spurious second timeout.
      
      Most people won't notice really,
      since "ping-timeout" by default is half a second.
      
      But in some installations, ping-timeout may be 10 or 20 seconds or more,
      and spuriously delaying the error return becomes annoying.
      Signed-off-by: default avatarLars Ellenberg <lars.ellenberg@linbit.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      66345d53
    • Lars Ellenberg's avatar
      drbd: disconnect, if the wrong UUIDs are attached on a connected peer · af70af5b
      Lars Ellenberg authored
      [ Upstream commit b17b5960 ]
      
      With "on-no-data-accessible suspend-io", DRBD requires the next attach
      or connect to be to the very same data generation uuid tag it lost last.
      
      If we first lost connection to the peer,
      then later lost connection to our own disk,
      we would usually refuse to re-connect to the peer,
      because it presents the wrong data set.
      
      However, if the peer first connects without a disk,
      and then attached its disk, we accepted that same wrong data set,
      which would be "unexpected" by any user of that DRBD
      and cause "undefined results" (read: very likely data corruption).
      
      The fix is to forcefully disconnect as soon as we notice that the peer
      attached to the "wrong" dataset.
      Signed-off-by: default avatarLars Ellenberg <lars.ellenberg@linbit.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      af70af5b
    • Roland Kammerer's avatar
      drbd: narrow rcu_read_lock in drbd_sync_handshake · 3d67b428
      Roland Kammerer authored
      [ Upstream commit d29e89e3 ]
      
      So far there was the possibility that we called
      genlmsg_new(GFP_NOIO)/mutex_lock() while holding an rcu_read_lock().
      
      This included cases like:
      
      drbd_sync_handshake (acquire the RCU lock)
        drbd_asb_recover_1p
          drbd_khelper
            drbd_bcast_event
              genlmsg_new(GFP_NOIO) --> may sleep
      
      drbd_sync_handshake (acquire the RCU lock)
        drbd_asb_recover_1p
          drbd_khelper
            notify_helper
              genlmsg_new(GFP_NOIO) --> may sleep
      
      drbd_sync_handshake (acquire the RCU lock)
        drbd_asb_recover_1p
          drbd_khelper
            notify_helper
              mutex_lock --> may sleep
      
      While using GFP_ATOMIC whould have been possible in the first two cases,
      the real fix is to narrow the rcu_read_lock.
      Reported-by: default avatarJia-Ju Bai <baijiaju1990@163.com>
      Reviewed-by: default avatarLars Ellenberg <lars.ellenberg@linbit.com>
      Signed-off-by: default avatarRoland Kammerer <roland.kammerer@linbit.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      3d67b428
    • Miroslav Lichvar's avatar
      mlx5: update timecounter at least twice per counter overflow · 8d317b0a
      Miroslav Lichvar authored
      [ Upstream commit 5d867836 ]
      
      The timecounter needs to be updated at least once in half of the
      cyclecounter interval to prevent timecounter_cyc2time() interpreting a
      new timestamp as an old value and causing a backward jump.
      
      This would be an issue if the timecounter multiplier was so small that
      the update interval would not be limited by the 64-bit overflow in
      multiplication.
      
      Shorten the calculated interval to make sure the timecounter is updated
      in time even when the system clock is slowed down by up to 10%, the
      multiplier is increased by up to 10%, and the scheduled overflow check
      is late by 15%.
      
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Ariel Levkovich <lariel@mellanox.com>
      Cc: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: default avatarMiroslav Lichvar <mlichvar@redhat.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      8d317b0a