  1. 26 Jun, 2020 10 commits
    • mm, slab: fix sign conversion problem in memcg_uncharge_slab() · d7670879
      Waiman Long authored
      It was found that running the LTP test on a PowerPC system could produce
      erroneous values in /proc/meminfo, like:
      
        MemTotal:       531915072 kB
        MemFree:        507962176 kB
        MemAvailable:   1100020596352 kB
      
      Using bisection, the problem is tracked down to commit 9c315e4d ("mm:
      memcg/slab: cache page number in memcg_(un)charge_slab()").
      
      In memcg_uncharge_slab() with an "int order" argument:
      
        unsigned int nr_pages = 1 << order;
          :
        mod_lruvec_state(lruvec, cache_vmstat_idx(s), -nr_pages);
      
      The mod_lruvec_state() function will eventually call the
      __mod_zone_page_state() which accepts a long argument.  Depending on the
      compiler and how inlining is done, "-nr_pages" may be treated as a
      negative number or a very large positive number.  Apparently, it was
      treated as a large positive number in that PowerPC system leading to
      incorrect stat counts.  This problem hasn't been seen in x86-64 yet,
      perhaps the gcc compiler there has some slight difference in behavior.
      
      It is fixed by making nr_pages a signed value.  For consistency, a similar
      change is applied to memcg_charge_slab() as well.
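
      A minimal userspace sketch of the conversion described above, assuming
      a 64-bit long. apply_delta() is a hypothetical stand-in for a helper
      that takes a long delta; it is not a kernel function, and the sketch
      only shows how C converts the negated value.

```c
#include <assert.h>

/* Hypothetical stand-in for a stat helper that accepts a long delta. */
static long apply_delta(long counter, long delta)
{
    return counter + delta;
}

/* Buggy shape: negating an unsigned int wraps modulo 2^32, and the
 * wrapped value is widened to long without sign extension. */
static long uncharge_unsigned(long counter, int order)
{
    assert(order >= 0 && order < 31);
    unsigned int nr_pages = 1u << order;
    return apply_delta(counter, -nr_pages);
}

/* Fixed shape: a signed nr_pages negates to a small negative int,
 * which sign-extends when widened to long. */
static long uncharge_signed(long counter, int order)
{
    assert(order >= 0 && order < 31);
    int nr_pages = 1 << order;
    return apply_delta(counter, -nr_pages);
}
```

      With a 64-bit long, uncharge_signed(1024, 3) yields 1016, while
      uncharge_unsigned(1024, 3) adds 2^32 - 8 to the counter instead of
      subtracting 8.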
      
      Link: http://lkml.kernel.org/r/20200620184719.10994-1-longman@redhat.com
      Fixes: 9c315e4d ("mm: memcg/slab: cache page number in memcg_(un)charge_slab()").
      Signed-off-by: Waiman Long <longman@redhat.com>
      Acked-by: Roman Gushchin <guro@fb.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • lib: fix test_hmm.c reference after free · 786ae133
      Randy Dunlap authored
      Coccinelle scripts report the following errors:
      
        lib/test_hmm.c:523:20-26: ERROR: reference preceded by free on line 521
        lib/test_hmm.c:524:21-27: ERROR: reference preceded by free on line 521
        lib/test_hmm.c:523:28-35: ERROR: devmem is NULL but dereferenced.
        lib/test_hmm.c:524:29-36: ERROR: devmem is NULL but dereferenced.
      
      Fix these by using the local variable 'res' instead of devmem.
      
      Link: http://lkml.kernel.org/r/c845c158-9c65-9665-0d0b-00342846dd07@infradead.org
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
      Cc: Jérôme Glisse <jglisse@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: fix value of OCFS2_INVALID_SLOT · 9277f833
      Junxiao Bi authored
      In the ocfs2 disk layout, the slot number is 16 bits, but in the ocfs2
      implementation it is 32 bits.  Usually this does not cause any issue,
      because the slot number is converted from u16 to u32.  However,
      OCFS2_INVALID_SLOT was defined as -1, so when an invalid slot number
      was read from disk its value was (u16)-1, which was then converted to
      u32.  As a result, the following check in get_local_system_inode() was
      always skipped:
      
       static struct inode **get_local_system_inode(struct ocfs2_super *osb,
                                                     int type,
                                                     u32 slot)
       {
       	BUG_ON(slot == OCFS2_INVALID_SLOT);
      	...
       }
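
      The mismatch can be sketched in userspace C.  The two macros below are
      illustrative assumptions (the old sentinel seen as a u32, and a
      sentinel matching the 16-bit on-disk encoding); they are not copied
      from the ocfs2 headers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the old sentinel, -1 compared against a 32-bit slot. */
#define OLD_INVALID_SLOT ((uint32_t)-1)
/* Assumption: a sentinel that matches the 16-bit on-disk value. */
#define NEW_INVALID_SLOT ((uint16_t)-1)

/* Widening a 16-bit on-disk slot zero-extends: 0xffff -> 0x0000ffff. */
static uint32_t slot_from_disk(uint16_t on_disk_slot)
{
    return on_disk_slot;
}

static int is_invalid_old(uint32_t slot) { return slot == OLD_INVALID_SLOT; }
static int is_invalid_new(uint32_t slot) { return slot == NEW_INVALID_SLOT; }

/* Usage: the on-disk "invalid" value is only caught by the 16-bit sentinel. */
void slot_self_check(void)
{
    uint32_t slot = slot_from_disk(0xffff);

    assert(!is_invalid_old(slot)); /* 0x0000ffff != 0xffffffff */
    assert(is_invalid_new(slot));  /* 0x0000ffff == 0xffff     */
}
```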
      
      Link: http://lkml.kernel.org/r/20200616183829.87211-5-junxiao.bi@oracle.com
      Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
      Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Changwei Ge <gechangwei@live.cn>
      Cc: Gang He <ghe@suse.com>
      Cc: Jun Piao <piaojun@huawei.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: fix panic on nfs server over ocfs2 · e5a15e17
      Junxiao Bi authored
      The following kernel panic was captured when running an nfs server over
      ocfs2.  At that time ocfs2_test_inode_bit() was checking whether the
      inode located at "blkno" 5 was valid.  That is the ocfs2 root inode:
      its "suballoc_slot" was OCFS2_INVALID_SLOT (65535) and it was allocated
      from //global_inode_alloc, but the code wrongly assumed that it came
      from a per-slot inode allocator, which caused an array overflow and
      triggered the kernel panic.
      
        BUG: unable to handle kernel paging request at 0000000000001088
        IP: [<ffffffff816f6898>] _raw_spin_lock+0x18/0xf0
        PGD 1e06ba067 PUD 1e9e7d067 PMD 0
        Oops: 0002 [#1] SMP
        CPU: 6 PID: 24873 Comm: nfsd Not tainted 4.1.12-124.36.1.el6uek.x86_64 #2
        Hardware name: Huawei CH121 V3/IT11SGCA1, BIOS 3.87 02/02/2018
        RIP: _raw_spin_lock+0x18/0xf0
        RSP: e02b:ffff88005ae97908  EFLAGS: 00010206
        RAX: ffff88005ae98000 RBX: 0000000000001088 RCX: 0000000000000000
        RDX: 0000000000020000 RSI: 0000000000000009 RDI: 0000000000001088
        RBP: ffff88005ae97928 R08: 0000000000000000 R09: ffff880212878e00
        R10: 0000000000007ff0 R11: 0000000000000000 R12: 0000000000001088
        R13: ffff8800063c0aa8 R14: ffff8800650c27d0 R15: 000000000000ffff
        FS:  0000000000000000(0000) GS:ffff880218180000(0000) knlGS:ffff880218180000
        CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 0000000000001088 CR3: 00000002033d0000 CR4: 0000000000042660
        Call Trace:
          igrab+0x1e/0x60
          ocfs2_get_system_file_inode+0x63/0x3a0 [ocfs2]
          ocfs2_test_inode_bit+0x328/0xa00 [ocfs2]
          ocfs2_get_parent+0xba/0x3e0 [ocfs2]
          reconnect_path+0xb5/0x300
          exportfs_decode_fh+0xf6/0x2b0
          fh_verify+0x350/0x660 [nfsd]
          nfsd4_putfh+0x4d/0x60 [nfsd]
          nfsd4_proc_compound+0x3d3/0x6f0 [nfsd]
          nfsd_dispatch+0xe0/0x290 [nfsd]
          svc_process_common+0x412/0x6a0 [sunrpc]
          svc_process+0x123/0x210 [sunrpc]
          nfsd+0xff/0x170 [nfsd]
          kthread+0xcb/0xf0
          ret_from_fork+0x61/0x90
        Code: 83 c2 02 0f b7 f2 e8 18 dc 91 ff 66 90 eb bf 0f 1f 40 00 55 48 89 e5 41 56 41 55 41 54 53 0f 1f 44 00 00 48 89 fb ba 00 00 02 00 <f0> 0f c1 17 89 d0 45 31 e4 45 31 ed c1 e8 10 66 39 d0 41 89 c6
        RIP   _raw_spin_lock+0x18/0xf0
        CR2: 0000000000001088
        ---[ end trace 7264463cd1aac8f9 ]---
        Kernel panic - not syncing: Fatal exception
      
      Link: http://lkml.kernel.org/r/20200616183829.87211-4-junxiao.bi@oracle.com
      Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
      Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
      Cc: Changwei Ge <gechangwei@live.cn>
      Cc: Gang He <ghe@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Jun Piao <piaojun@huawei.com>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: load global_inode_alloc · 7569d3c7
      Junxiao Bi authored
      Set global_inode_alloc as OCFS2_FIRST_ONLINE_SYSTEM_INODE, that will
      make it load during mount.  It can be used to test whether some
      global/system inodes are valid.  One use case is that nfsd will test
      whether root inode is valid.
      
      Link: http://lkml.kernel.org/r/20200616183829.87211-3-junxiao.bi@oracle.com
      Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
      Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
      Cc: Changwei Ge <gechangwei@live.cn>
      Cc: Gang He <ghe@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Jun Piao <piaojun@huawei.com>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: avoid inode removal while nfsd is accessing it · 4cd9973f
      Junxiao Bi authored
      Patch series "ocfs2: fix nfsd over ocfs2 issues", v2.
      
      This is a series of patches to fix issues with nfsd over ocfs2.  Patch 1
      avoids an inode being removed while nfsd accesses it; patches 2 & 3 fix
      a panic issue.
      
      This patch (of 4):
      
      When nfsd is getting a file dentry using a handle, or the parent dentry
      of some dentry, a cluster lock is used to prevent the inode from being
      removed from another node, but it could still be removed from the local
      node, so use a rw lock to avoid this.
      
      Link: http://lkml.kernel.org/r/20200616183829.87211-1-junxiao.bi@oracle.com
      Link: http://lkml.kernel.org/r/20200616183829.87211-2-junxiao.bi@oracle.com
      Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
      Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
      Cc: Changwei Ge <gechangwei@live.cn>
      Cc: Gang He <ghe@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Jun Piao <piaojun@huawei.com>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kexec: do not verify the signature without the lockdown or mandatory signature · fd7af71b
      Lianbo Jiang authored
      Signature verification is an important security feature that protects
      the system from being attacked with a kernel of unknown origin.  Kexec
      rebooting is a way to replace the running kernel, hence it needs to be
      secured carefully.
      
      In the current code of handling signature verification of kexec kernel,
      the logic is very twisted.  It mixes signature verification, IMA
      signature appraising and kexec lockdown.
      
      If KEXEC_SIG_FORCE is not set and the kexec kernel image lacks a
      signature, the supported crypto, or a key, we don't treat this as an
      error unless kexec lockdown is in effect.  IMA is considered another
      kind of signature appraisal method.
      
      If kexec kernel image has signature/crypto/key, it has to go through the
      signature verification and pass.  Otherwise it's seen as verification
      failure, and won't be loaded.
      
      It seems a kexec kernel image with an unqualified signature is treated
      as even worse than one without a signature at all, which sounds very
      unreasonable.  E.g., if people can load an unsigned kernel or a kernel
      signed with an expired key, which one is more dangerous?
      
      So, here, let's simplify the logic to improve code readability.  If
      KEXEC_SIG_FORCE or kexec lockdown is enabled, signature verification
      is mandated.  Otherwise, we lift the bar for any kernel image.
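
      The simplified rule can be sketched as a small decision function.
      This is an illustration under assumptions, not the kernel code: the
      function name, its parameters, and the convention that verify_ret is
      0 on success and negative on any verification failure are all
      hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical policy helper.
 * Returns 0 if the image may be loaded, or the (negative) verification
 * result if loading must be refused.
 */
static int kexec_sig_policy(bool sig_force, bool locked_down, int verify_ret)
{
    assert(verify_ret <= 0);

    if (verify_ret == 0)
        return 0;              /* valid signature: always acceptable */
    if (sig_force || locked_down)
        return verify_ret;     /* verification is mandatory: refuse  */
    return 0;                  /* otherwise any image may be loaded  */
}
```

      Under this sketch, an unsigned image and an image signed with an
      expired key are treated the same way: both load when nothing mandates
      verification, and both are refused when it is mandated.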
      
      Link: http://lkml.kernel.org/r/20200602045952.27487-1-lijiang@redhat.com
      Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
      Reviewed-by: Jiri Bohac <jbohac@suse.cz>
      Acked-by: Dave Young <dyoung@redhat.com>
      Acked-by: Baoquan He <bhe@redhat.com>
      Cc: James Morris <jmorris@namei.org>
      Cc: Matthew Garrett <mjg59@google.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm, compaction: make capture control handling safe wrt interrupts · b9e20f0d
      Vlastimil Babka authored
      Hugh reports:
      
       "While stressing compaction, one run oopsed on NULL capc->cc in
        __free_one_page()'s task_capc(zone): compact_zone_order() had been
        interrupted, and a page was being freed in the return from interrupt.
      
        Though you would not expect it from the source, both gccs I was using
        (4.8.1 and 7.5.0) had chosen to compile compact_zone_order() with the
        ".cc = &cc" implemented by mov %rbx,-0xb0(%rbp) immediately before
        callq compact_zone - long after the "current->capture_control =
        &capc". An interrupt in between those finds capc->cc NULL (zeroed by
        an earlier rep stos).
      
        This could presumably be fixed by a barrier() before setting
        current->capture_control in compact_zone_order(); but would also need
        more care on return from compact_zone(), in order not to risk leaking
        a page captured by interrupt just before capture_control is reset.
      
        Maybe that is the preferable fix, but I felt safer for task_capc() to
        exclude the rather surprising possibility of capture at interrupt
        time"
      
      I have checked that gcc10 also behaves the same.
      
      The advantage of the fix in compact_zone_order() is that we don't add
      another test in the page freeing hot path, and that it might prevent
      future problems if we stop exposing pointers to uninitialized structures
      in current task.
      
      So this patch implements the suggestion for compact_zone_order() with
      barrier() (and WRITE_ONCE() to prevent store tearing) for setting
      current->capture_control, and prevents page leaking with
      WRITE_ONCE/READ_ONCE in the proper order.
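
      A userspace sketch of the ordering described above, assuming GCC or
      Clang.  The barrier(), WRITE_ONCE() and READ_ONCE() macros are
      simplified stand-ins for the kernel ones, capture_slot stands in for
      current->capture_control, and fake_interrupt() is a hypothetical
      function called directly to play the role of an interrupt that frees
      a page.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel macros (GCC/Clang extensions). */
#define barrier()         __asm__ __volatile__("" ::: "memory")
#define WRITE_ONCE(x, v)  (*(volatile __typeof__(x) *)&(x) = (v))
#define READ_ONCE(x)      (*(volatile __typeof__(x) *)&(x))

struct compact_control { int order; };
struct capture_control { struct compact_control *cc; void *page; };

/* Stand-in for current->capture_control. */
static struct capture_control *capture_slot;
static int dummy_page;

/* Hypothetical "interrupt": captures a page if capture is published. */
static void fake_interrupt(void)
{
    struct capture_control *capc = READ_ONCE(capture_slot);

    if (capc && capc->cc->order >= 0 && !capc->page)
        capc->page = &dummy_page;
}

static void *compact_sketch(int order)
{
    struct compact_control cc = { .order = order };
    struct capture_control capc = { .cc = &cc, .page = NULL };

    /* Finish initializing both structs before exposing the pointer. */
    barrier();
    WRITE_ONCE(capture_slot, &capc);

    fake_interrupt();   /* an interrupt may run anywhere in this window */

    /* Hide the pointer first, then read the captured page, so that a
     * late capture cannot be missed and leaked. */
    WRITE_ONCE(capture_slot, NULL);
    return READ_ONCE(capc.page);
}
```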
      
      Link: http://lkml.kernel.org/r/20200616082649.27173-1-vbabka@suse.cz
      Fixes: 5e1f0f09 ("mm, compaction: capture a page under direct compaction")
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      Reported-by: Hugh Dickins <hughd@google.com>
      Suggested-by: Hugh Dickins <hughd@google.com>
      Acked-by: Hugh Dickins <hughd@google.com>
      Cc: Alex Shi <alex.shi@linux.alibaba.com>
      Cc: Li Wang <liwang@redhat.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: <stable@vger.kernel.org>	[5.1+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: do_swap_page(): fix up the error code · 545b1b07
      Michal Hocko authored
      do_swap_page() returns error codes from the VM_FAULT* space.  try_charge()
      might return -ENOMEM, though, and then do_swap_page() simply returns 0
      which means a success.
      
      We almost never return ENOMEM for a GFP_KERNEL single page charge,
      except for async OOM handling (oom_disabled v1).  So this needs
      translation to VM_FAULT_OOM, otherwise the page fault path will not
      notify the userspace and wait for an action.
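
      The translation between the two error spaces can be sketched as
      follows.  SKETCH_VM_FAULT_OOM and charge_err_to_fault() are
      hypothetical names used only for illustration; the kernel defines its
      own VM_FAULT_* bits and performs the translation inline.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the kernel's VM_FAULT_OOM bit. */
#define SKETCH_VM_FAULT_OOM 0x1u

/*
 * Map a charge result (0 or a negative errno such as -ENOMEM) into the
 * fault-code space, where 0 means success.  Passing the errno through
 * unchanged, or dropping it, would hide the failure from the fault path.
 */
static unsigned int charge_err_to_fault(int err)
{
    assert(err <= 0);
    return err ? SKETCH_VM_FAULT_OOM : 0;
}
```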
      
      Link: http://lkml.kernel.org/r/20200617090238.GL9499@dhcp22.suse.cz
      Fixes: 4c6355b2 ("mm: memcontrol: charge swapin pages on instantiation")
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Alex Shi <alex.shi@linux.alibaba.com>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Roman Gushchin <guro@fb.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • openrisc: fix boot oops when DEBUG_VM is enabled · 313a5257
      Stafford Horne authored
      Since v5.8-rc1 OpenRISC Linux fails to boot when DEBUG_VM is enabled.
      This has been bisected to commit 42fc5414 ("mmap locking API: add
      mmap_assert_locked() and mmap_assert_write_locked()").
      
      The added locking checks exposed the issue that OpenRISC was not taking
      this mmap lock during page walks for DMA operations.  This patch
      locks and unlocks the mmap lock for page walking.
      
      Link: http://lkml.kernel.org/r/20200617090247.1680188-1-shorne@gmail.com
      Fixes: 42fc5414 ("mmap locking API: add mmap_assert_locked() and mmap_assert_write_locked()")
      Signed-off-by: Stafford Horne <shorne@gmail.com>
      Reviewed-by: Michel Lespinasse <walken@google.com>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Steven Price <steven.price@arm.com>
      Cc: Thomas Hellstrom <thellstrom@vmware.com>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 25 Jun, 2020 3 commits
    • Merge tag 's390-5.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 908f7d12
      Linus Torvalds authored
      Pull s390 fixes from Heiko Carstens:
      
       - Fix kernel crash on system call single stepping.
      
       - Make sure early program check handler is executed with DAT on to
         avoid an endless program check loop.
      
       - Add __GFP_NOWARN flag to debug feature to avoid user triggerable
         allocation failure messages.
      
      * tag 's390-5.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390/debug: avoid kernel warning on too large number of pages
        s390/kasan: fix early pgm check handler execution
        s390: fix system call single stepping
    • Merge tag 'sound-5.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · a4d3712b
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "A collection of small fixes gathered in the last two weeks.
      
        The major changes here are fixes for the recent DPCM regressions found
        on i.MX and Qualcomm platforms and fixes for resource leaks in ASoC
        DAI registrations.
      
        Other than those are mostly device-specific fixes including the usual
        USB- and HD-audio quirks, and a fix for syzkaller case and ID updates
        for new Intel platforms"
      
      * tag 'sound-5.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (32 commits)
        ALSA: usb-audio: Fix OOB access of mixer element list
        ALSA: usb-audio: add quirk for Samsung USBC Headset (AKG)
        ALSA: usb-audio: Add registration quirk for Kingston HyperX Cloud Flight S
        ASoC: rockchip: Fix a reference count leak.
        ASoC: amd: closing specific instance.
        ALSA: hda: Intel: add missing PCI IDs for ICL-H, TGL-H and EKL
        ASoC: hdac_hda: fix memleak with regmap not freed on remove
        ASoC: SOF: Intel: add PCI IDs for ICL-H and TGL-H
        ASoC: SOF: Intel: add PCI ID for CometLake-S
        ASoC: Intel: SOF: merge COMETLAKE_LP and COMETLAKE_H
        ALSA: hda/realtek: Add mute LED and micmute LED support for HP systems
        ALSA: usb-audio: Fix potential use-after-free of streams
        ALSA: hda/realtek - Add quirk for MSI GE63 laptop
        ASoC: fsl_ssi: Fix bclk calculation for mono channel
        ASoC: SOF: Intel: hda: Clear RIRB status before reading WP
        ASoC: rt1015: Update rt1015 default register value according to spec modification.
        ASoC: qcom: common: set correct directions for dailinks
        ASoc: q6afe: add support to get port direction
        ASoC: soc-pcm: fix checks for multi-cpu FE dailinks
        ASoC: rt5682: Let dai clks be registered whether mclk exists or not
        ...
    • Merge tag 'erofs-for-5.8-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs · 8be3a53e
      Linus Torvalds authored
      Pull erofs fix from Gao Xiang:
       "Fix a regression which uses potential uninitialized high 32-bit value
        unexpectedly recently observed with specific compiler options"
      
      * tag 'erofs-for-5.8-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
        erofs: fix partially uninitialized misuse in z_erofs_onlinepage_fixup
  3. 24 Jun, 2020 4 commits
    • Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost · fc10807d
      Linus Torvalds authored
      Pull virtio fixes from Michael Tsirkin:
       "Fixes all over the place.
      
        This includes a couple of tests that I would normally defer, but since
        they have already been helpful in catching some bugs, don't build for
        any users at all, and having them upstream makes life easier for
        everyone, I think it's ok even at this late stage"
      
      * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
        tools/virtio: Use tools/include/list.h instead of stubs
        tools/virtio: Reset index in virtio_test --reset.
        tools/virtio: Extract virtqueue initialization in vq_reset
        tools/virtio: Use __vring_new_virtqueue in virtio_test.c
        tools/virtio: Add --reset
        tools/virtio: Add --batch=random option
        tools/virtio: Add --batch option
        virtio-mem: add memory via add_memory_driver_managed()
        virtio-mem: silence a static checker warning
        vhost_vdpa: Fix potential underflow in vhost_vdpa_mmap()
        vdpa: fix typos in the comments for __vdpa_alloc_device()
    • Merge tag 'for-linus-2020-06-24' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux · fbb58011
      Linus Torvalds authored
      Pull thread fix from Christian Brauner:
       "This fixes a regression introduced with 303cc571 ("nsproxy: attach
        to namespaces via pidfds").
      
        The LTP testsuite reported a regression where users would now see
        EBADF returned instead of EINVAL when an fd was passed that referred
        to an open file but the file was not a namespace file.
      
        Fix this by continuing to report EINVAL and add a regression test"
      
      * tag 'for-linus-2020-06-24' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
        tests: test for setns() EINVAL regression
        nsproxy: restore EINVAL for non-namespace file descriptor
    • ALSA: usb-audio: Fix OOB access of mixer element list · 220345e9
      Takashi Iwai authored
      The USB-audio mixer code holds a linked list of usb_mixer_elem_list,
      and several operations are performed for each mixer element.  A few of
      them (snd_usb_mixer_notify_id() and snd_usb_mixer_interrupt_v2())
      assume each mixer element to be a usb_mixer_elem_info object that is a
      subclass of usb_mixer_elem_list, cast it via container_of(), and access
      its members.  This may result in an out-of-bounds access when a
      non-standard list element has been added, as spotted by syzkaller
      recently.
      
      This patch adds a new field, is_std_info, in usb_mixer_elem_list to
      indicate whether the element is of the usb_mixer_elem_info type, and
      skips the access to such an element when it is not.
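
      The pattern can be sketched in userspace C.  The struct layouts,
      the channels field, and the function names below are simplified
      assumptions for illustration; only the is_std_info flag and the use
      of container_of() come from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified list head; is_std_info marks elements embedded in elem_info. */
struct elem_list {
    struct elem_list *next;
    bool is_std_info;
};

/* Simplified "subclass" that embeds the list head. */
struct elem_info {
    struct elem_list head;
    int channels;
};

/* Only downcast elements that are flagged as elem_info objects. */
static int sum_std_channels(struct elem_list *list)
{
    int total = 0;

    for (; list; list = list->next) {
        if (!list->is_std_info)
            continue;   /* downcasting here would read out of bounds */
        total += container_of(list, struct elem_info, head)->channels;
    }
    return total;
}

/* Usage: one bare list element followed by one standard element. */
int demo_sum(void)
{
    struct elem_info std = {
        .head = { .next = NULL, .is_std_info = true },
        .channels = 2,
    };
    struct elem_list bare = { .next = &std.head, .is_std_info = false };
    int sum = sum_std_channels(&bare);

    assert(sum == 2);
    return sum;
}
```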
      
      Reported-by: syzbot+fb14314433463ad51625@syzkaller.appspotmail.com
      Reported-by: syzbot+2405ca3401e943c538b5@syzkaller.appspotmail.com
      Cc: <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/20200624122340.9615-1-tiwai@suse.de
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
    • erofs: fix partially uninitialized misuse in z_erofs_onlinepage_fixup · 3c597282
      Gao Xiang authored
      Hongyu reported that "id != index" in z_erofs_onlinepage_fixup() is
      easily triggered in a specific aarch64 environment, which had not been
      seen before.
      
      After digging into that, I found that the high 32 bits of page->private
      were set to 0xaaaaaaaa rather than 0 (due to z_erofs_onlinepage_init
      behavior with specific compiler options). Actually we only use the low
      32 bits to keep the page information since page->private is only 4
      bytes on most 32-bit platforms. However z_erofs_onlinepage_fixup()
      uses the upper 32 bits by mistake.
      
      Let's fix it now.
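
      The class of bug can be sketched as follows.  The shift value, the
      encoding, and the function names are hypothetical and are not taken
      from the erofs sources; the sketch only shows how stale upper bits
      leak into a value decoded from the full 64-bit word.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical encoding: low 32 bits hold (index << SHIFT) | count. */
#define SKETCH_INDEX_SHIFT 2

/* Buggy shape: decodes the whole 64-bit word, upper garbage included. */
static uint64_t id_from_full_word(uint64_t priv)
{
    return priv >> SKETCH_INDEX_SHIFT;
}

/* Fixed shape: truncates to the low 32 bits before decoding. */
static uint32_t id_from_low_half(uint64_t priv)
{
    return (uint32_t)priv >> SKETCH_INDEX_SHIFT;
}

/* Usage: index 5, count 1, with 0xaaaaaaaa left in the upper half. */
void id_self_check(void)
{
    uint64_t priv = 0xaaaaaaaa00000000ULL | (5u << SKETCH_INDEX_SHIFT) | 1u;

    assert(id_from_low_half(priv) == 5);
    assert(id_from_full_word(priv) != 5);
}
```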

      Reported-and-tested-by: Hongyu Jin <hongyu.jin@unisoc.com>
      Fixes: 3883a79a ("staging: erofs: introduce VLE decompression support")
      Cc: <stable@vger.kernel.org> # 4.19+
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Link: https://lore.kernel.org/r/20200618234349.22553-1-hsiangkao@aol.com
      Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
  4. 23 Jun, 2020 11 commits
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 26e122e9
      Linus Torvalds authored
      Pull kvm fixes from Paolo Bonzini:
       "All bugfixes except for a couple cleanup patches"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: VMX: Remove vcpu_vmx's defunct copy of host_pkru
        KVM: x86: allow TSC to differ by NTP correction bounds without TSC scaling
        KVM: X86: Fix MSR range of APIC registers in X2APIC mode
        KVM: VMX: Stop context switching MSR_IA32_UMWAIT_CONTROL
        KVM: nVMX: Plumb L2 GPA through to PML emulation
        KVM: x86/mmu: Avoid mixing gpa_t with gfn_t in walk_addr_generic()
        KVM: LAPIC: ensure APIC map is up to date on concurrent update requests
        kvm: lapic: fix broken vcpu hotplug
        Revert "KVM: VMX: Micro-optimize vmexit time when not exposing PMU"
        KVM: VMX: Add helpers to identify interrupt type from intr_info
        kvm/svm: disable KCSAN for svm_vcpu_run()
        KVM: MIPS: Fix a build error for !CPU_LOONGSON64
    • Merge tag 'for-5.8-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 3e08a952
      Linus Torvalds authored
      Pull btrfs fixes from David Sterba:
       "A number of fixes, located in two areas, one performance fix and one
        fixup for better integration with another patchset.
      
         - bug fixes in nowait aio:
             - fix snapshot creation hang after nowait-aio was used
             - fix failure to write to prealloc extent past EOF
             - don't block when extent range is locked
      
         - block group fixes:
             - relocation failure when scrub runs in parallel
             - refcount fix when removing fails
             - fix race between removal and creation
             - space accounting fixes
      
         - reinstate fast path check for log tree at unlink time, fixes
           performance drop up to 30% in REAIM
      
         - kzfree/kfree fixup to ease treewide patchset renaming kzfree"
      
      * tag 'for-5.8-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        btrfs: use kfree() in btrfs_ioctl_get_subvol_info()
        btrfs: fix RWF_NOWAIT writes blocking on extent locks and waiting for IO
        btrfs: fix RWF_NOWAIT write not failling when we need to cow
        btrfs: fix failure of RWF_NOWAIT write into prealloc extent beyond eof
        btrfs: fix hang on snapshot creation after RWF_NOWAIT write
        btrfs: check if a log root exists before locking the log_mutex on unlink
        btrfs: fix bytes_may_use underflow when running balance and scrub in parallel
        btrfs: fix data block group relocation failure due to concurrent scrub
        btrfs: fix race between block group removal and block group creation
        btrfs: fix a block group ref counter leak after failure to remove block group
    • ALSA: usb-audio: add quirk for Samsung USBC Headset (AKG) · a32a1fc9
      Macpaul Lin authored
      We've found that the Samsung USBC Headset (AKG) (VID: 0x04e8, PID:
      0xa051) needs a tiny delay after each class compliant request.
      Otherwise the device might not be recognized every time.

      Signed-off-by: Chihhao Chen <chihhao.chen@mediatek.com>
      Signed-off-by: Macpaul Lin <macpaul.lin@mediatek.com>
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/1592910203-24035-1-git-send-email-macpaul.lin@mediatek.com
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
    • s390/debug: avoid kernel warning on too large number of pages · 827c4913
      Christian Borntraeger authored
      When specifying insanely large debug buffers a kernel warning is
      printed. The debug code does handle the error gracefully, though.
      Instead of duplicating the check let us silence the warning to
      avoid crashes when panic_on_warn is used.

      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
    • s390/kasan: fix early pgm check handler execution · 998f5bbe
      Vasily Gorbik authored
      Currently if early_pgm_check_handler is called it ends up in a pgm check
      loop. The problem is that early_pgm_check_handler is instrumented by
      KASAN but executed without the DAT flag enabled, which leads to an
      addressing exception when KASAN checks try to access shadow memory.
      
      Fix that by executing early handlers with DAT flag on under KASAN as
      expected.

      Reported-and-tested-by: Alexander Egorenkov <egorenar@linux.ibm.com>
      Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
    • s390: fix system call single stepping · e64a1618
      Sven Schnelle authored
      When single stepping an svc instruction on s390, the kernel is entered
      with a PER program check interruption. The program check handler then
      jumps to the system call handler by reloading the PSW. The code didn't
      set GPR13 to the thread pointer in struct task_struct. This made the
      kernel access invalid memory while trying to fetch the syscall function
      address. Fix this by always assigning GPR13 after .Lsysc_per.
      
      Fixes: 0b0ed657 ("s390: remove critical section cleanup from entry.S")
      Reported-and-tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
      Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
    • Christoffer Nielsen's avatar
      ALSA: usb-audio: Add registration quirk for Kingston HyperX Cloud Flight S · 73094608
      Christoffer Nielsen authored
      Similar to the Kingston HyperX AMP, the Kingston HyperX Cloud
      Flight S (0951:16ea) uses two interfaces, but only the second
      interface contains the capture stream. This patch delays the
      registration until the second interface appears.
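
      The delayed registration described above can be pictured as a small
      lookup table keyed by USB vendor and product ID. The following is only
      a sketch under stated assumptions: the struct, table, and function
      names are hypothetical and are not the driver's actual code, and the
      interface number 2 is assumed for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical quirk entry: hold back card registration until the
 * given interface number has been probed. */
struct reg_quirk {
	uint16_t vendor;
	uint16_t product;
	unsigned int register_at_iface;
};

/* Illustrative table; the interface number is an assumption. */
static const struct reg_quirk reg_quirks[] = {
	{ 0x0951, 0x16ea, 2 },
};

/* Return true if the card should be registered while probing `iface`.
 * Devices without a quirk entry are registered immediately. */
static bool should_register_card(uint16_t vendor, uint16_t product,
				 unsigned int iface)
{
	for (size_t i = 0; i < sizeof(reg_quirks) / sizeof(reg_quirks[0]); i++) {
		if (reg_quirks[i].vendor == vendor &&
		    reg_quirks[i].product == product)
			return iface == reg_quirks[i].register_at_iface;
	}
	return true;
}
```

      With this shape, probing an earlier interface of a listed device
      returns false, so registration waits until the interface carrying the
      capture stream shows up.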
      Signed-off-by: Christoffer Nielsen <cn@obviux.dk>
      Cc: <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/CAOtG2YHOM3zy+ed9KS-J4HkZo_QGzcUG9MigSp4e4_-13r6B=Q@mail.gmail.com
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      73094608
    • Sean Christopherson's avatar
      KVM: VMX: Remove vcpu_vmx's defunct copy of host_pkru · e4553b49
      Sean Christopherson authored
      Remove vcpu_vmx.host_pkru, which got left behind when PKRU support was
      moved to common x86 code.
      
      No functional change intended.
      
      Fixes: 37486135 ("KVM: x86: Fix pkru save/restore when guest CR4.PKE=0, move it to x86.c")
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Message-Id: <20200617034123.25647-1-sean.j.christopherson@intel.com>
      Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Reviewed-by: Jim Mattson <jmattson@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      e4553b49
    • Marcelo Tosatti's avatar
      KVM: x86: allow TSC to differ by NTP correction bounds without TSC scaling · 26769f96
      Marcelo Tosatti authored
      The Linux TSC calibration procedure is subject to small variations
      (it's common to see +-1 kHz difference between reboots on a given CPU, for example).
      
      So migrating a guest between two hosts with identical processors can fail if
      the calibrated TSC differs slightly between them.
      
      Without TSC scaling, the current kernel interface will either return an error
      (if user_tsc_khz <= tsc_khz) or enable TSC catchup mode.
      
      This change enables the following TSC tolerance check to
      accept KVM_SET_TSC_KHZ within tsc_tolerance_ppm (which is 250ppm by default).
      
              /*
               * Compute the variation in TSC rate which is acceptable
               * within the range of tolerance and decide if the
               * rate being applied is within that bounds of the hardware
               * rate.  If so, no scaling or compensation need be done.
               */
              thresh_lo = adjust_tsc_khz(tsc_khz, -tsc_tolerance_ppm);
              thresh_hi = adjust_tsc_khz(tsc_khz, tsc_tolerance_ppm);
              if (user_tsc_khz < thresh_lo || user_tsc_khz > thresh_hi) {
                      pr_debug("kvm: requested TSC rate %u falls outside tolerance [%u,%u]\n", user_tsc_khz, thresh_lo, thresh_hi);
                      use_scaling = 1;
              }
      
      An NTP daemon in the guest can correct this difference (NTP can correct up to 500ppm).
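
      To make the tolerance window concrete, here is a userspace sketch of
      the arithmetic behind the check quoted above. It assumes that
      adjust_tsc_khz() scales a rate by (1000000 + ppm) / 1000000; the
      helper tsc_needs_scaling() is a hypothetical name used only for
      illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Scale a rate in kHz by (1e6 + ppm) / 1e6, using a 64-bit
 * intermediate so the multiplication cannot overflow. */
static uint32_t adjust_tsc_khz(uint32_t khz, int32_t ppm)
{
	uint64_t v = (uint64_t)khz * (uint64_t)(1000000 + ppm);

	return (uint32_t)(v / 1000000);
}

/* Hypothetical helper: true when the requested rate lies outside the
 * tolerance window around the host rate. */
static bool tsc_needs_scaling(uint32_t host_khz, uint32_t user_khz,
			      int32_t tolerance_ppm)
{
	uint32_t thresh_lo = adjust_tsc_khz(host_khz, -tolerance_ppm);
	uint32_t thresh_hi = adjust_tsc_khz(host_khz, tolerance_ppm);

	return user_khz < thresh_lo || user_khz > thresh_hi;
}
```

      For a host calibrated at 2000000 kHz with the default 250ppm, the
      window is [1999500, 2000500] kHz, so a 1 kHz calibration difference
      falls well inside it.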
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
      Message-Id: <20200616114741.GA298183@fuller.cnet>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      26769f96
    • Xiaoyao Li's avatar
      KVM: X86: Fix MSR range of APIC registers in X2APIC mode · bf10bd0b
      Xiaoyao Li authored
      Only MSR address range 0x800 through 0x8ff is architecturally reserved
      and dedicated for accessing APIC registers in x2APIC mode.
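
      As an illustration of the range described above, a minimal sketch of
      the check follows. The macro and function names are hypothetical and
      are not KVM's own; the offset mapping follows the architectural rule
      that an x2APIC MSR address is 0x800 plus the xAPIC MMIO offset shifted
      right by 4.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the architecturally reserved x2APIC range. */
#define X2APIC_MSR_FIRST 0x800u
#define X2APIC_MSR_LAST  0x8ffu

/* True only for MSRs inside 0x800..0x8ff. */
static bool is_x2apic_msr(uint32_t msr)
{
	return msr >= X2APIC_MSR_FIRST && msr <= X2APIC_MSR_LAST;
}

/* Map an x2APIC MSR index to the matching xAPIC MMIO register offset.
 * Only meaningful when is_x2apic_msr(msr) is true. */
static uint32_t x2apic_msr_to_reg_offset(uint32_t msr)
{
	return (msr - X2APIC_MSR_FIRST) << 4;
}
```

      Under this check, an address such as 0x900 is not treated as an APIC
      register access.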
      
      Fixes: 0105d1a5 ("KVM: x2apic interface to lapic")
      Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
      Message-Id: <20200616073307.16440-1-xiaoyao.li@intel.com>
      Cc: stable@vger.kernel.org
      Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Reviewed-by: Jim Mattson <jmattson@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      bf10bd0b
    • Sean Christopherson's avatar
      KVM: VMX: Stop context switching MSR_IA32_UMWAIT_CONTROL · bf09fb6c
      Sean Christopherson authored
      Remove support for context switching between the guest's and host's
      desired UMWAIT_CONTROL.  Propagating the guest's value to hardware isn't
      required for correct functionality, e.g. KVM intercepts reads and writes
      to the MSR, and the latency effects of the settings controlled by the
      MSR are not architecturally visible.
      
      As a general rule, KVM should not allow the guest to control power
      management settings unless explicitly enabled by userspace, e.g. see
      KVM_CAP_X86_DISABLE_EXITS.  E.g. Intel's SDM explicitly states that C0.2
      can improve the performance of SMT siblings.  A devious guest could
      disable C0.2 so as to improve the performance of their workloads to the
      detriment of workloads running in the host or on other VMs.
      
      Wholesale removal of UMWAIT_CONTROL context switching also fixes a race
      condition where updates from the host may cause KVM to enter the guest
      with the incorrect value.  Because updates are propagated to all
      CPUs via IPI (SMP function callback), the value in hardware may be
      stale with respect to the cached value and KVM could enter the guest
      with the wrong value in hardware.  As above, the guest can't observe the
      bad value, but it's a weird and confusing wart in the implementation.
      
      Removal also fixes the unnecessary usage of VMX's atomic load/store MSR
      lists.  Using the lists is only necessary for MSRs that are required for
      correct functionality immediately upon VM-Enter/VM-Exit, e.g. EFER on
      old hardware, or for MSRs that need to-the-uop precision, e.g. perf
      related MSRs.  For UMWAIT_CONTROL, the effects are only visible in the
      kernel via TPAUSE/delay(), and KVM doesn't do any form of delay in
      vcpu_vmx_run().  Using the atomic lists is undesirable as they are more
      expensive than direct RDMSR/WRMSR.
      
      Furthermore, even if giving the guest control of the MSR is legitimate,
      e.g. in pass-through scenarios, it's not clear that the benefits would
      outweigh the overhead.  E.g. saving and restoring an MSR across a VMX
      roundtrip costs ~250 cycles, and if the guest diverged from the host
      that cost would be paid on every run of the guest.  In other words, if
      there is a legitimate use case then it should be enabled by a new
      per-VM capability.
      
      Note, KVM still needs to emulate MSR_IA32_UMWAIT_CONTROL so that it can
      correctly expose other WAITPKG features to the guest, e.g. TPAUSE,
      UMWAIT and UMONITOR.
      
      Fixes: 6e3ba4ab ("KVM: vmx: Emulate MSR IA32_UMWAIT_CONTROL")
      Cc: stable@vger.kernel.org
      Cc: Jingqi Liu <jingqi.liu@intel.com>
      Cc: Tao Xu <tao3.xu@intel.com>
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Message-Id: <20200623005135.10414-1-sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      bf09fb6c
  5. 22 Jun, 2020 12 commits