1. 06 Jan, 2022 7 commits
    • Oleksandr Tyshchenko's avatar
      arm/xen: Read extended regions from DT and init Xen resource · b2371587
      Oleksandr Tyshchenko authored
      This patch implements arch_xen_unpopulated_init() on Arm where
      the extended regions (if any) are gathered from DT and inserted
      into specific Xen resource to be used as unused address space
      for Xen scratch pages by unpopulated-alloc code.
      
      The extended region (safe range) is a region of guest physical
      address space which is unused and could be safely used to create
      grant/foreign mappings instead of wasting real RAM pages from
      the domain memory for establishing these mappings.
      
      The extended regions are chosen by the hypervisor at the domain
      creation time and advertised to it via "reg" property under
      hypervisor node in the guest device-tree. As region 0 is reserved
      for grant table space (always present), the indexes for extended
      regions are 1...N.
      
      If arch_xen_unpopulated_init() fails for some reason the default
      behaviour will be restored (allocate xenballooned pages).
      
      This patch also removes XEN_UNPOPULATED_ALLOC dependency on x86.
      Signed-off-by: default avatarOleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
      Reviewed-by: default avatarStefano Stabellini <sstabellini@kernel.org>
      Link: https://lore.kernel.org/r/1639080336-26573-6-git-send-email-olekstysh@gmail.comSigned-off-by: default avatarJuergen Gross <jgross@suse.com>
      b2371587
    • Oleksandr Tyshchenko's avatar
      xen/unpopulated-alloc: Add mechanism to use Xen resource · d1a928ea
      Oleksandr Tyshchenko authored
      The main reason of this change is that unpopulated-alloc
      code cannot be used in its current form on Arm, but there
      is a desire to reuse it to avoid wasting real RAM pages
      for the grant/foreign mappings.
      
      The problem is that system "iomem_resource" is used for
      the address space allocation, but the really unallocated
      space can't be figured out precisely by the domain on Arm
      without hypervisor involvement. For example, not all device
      I/O regions are known by the time domain starts creating
      grant/foreign mappings. And following the advise from
      "iomem_resource" we might end up reusing these regions by
      a mistake. So, the hypervisor which maintains the P2M for
      the domain is in the best position to provide unused regions
      of guest physical address space which could be safely used
      to create grant/foreign mappings.
      
      Introduce new helper arch_xen_unpopulated_init() which purpose
      is to create specific Xen resource based on the memory regions
      provided by the hypervisor to be used as unused space for Xen
      scratch pages. If arch doesn't define arch_xen_unpopulated_init()
      the default "iomem_resource" will be used.
      
      Update the arguments list of allocate_resource() in fill_list()
      to always allocate a region from the hotpluggable range
      (maximum possible addressable physical memory range for which
      the linear mapping could be created). If arch doesn't define
      arch_get_mappable_range() the default range (0,-1) will be used.
      
      The behaviour on x86 won't be changed by current patch as both
      arch_xen_unpopulated_init() and arch_get_mappable_range()
      are not implemented for it.
      
      Also fallback to allocate xenballooned pages (balloon out RAM
      pages) if we do not have any suitable resource to work with
      (target_resource is invalid) and as the result we won't be able
      to provide unpopulated pages on a request.
      Signed-off-by: default avatarOleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
      Reviewed-by: default avatarStefano Stabellini <sstabellini@kernel.org>
      Link: https://lore.kernel.org/r/1639080336-26573-5-git-send-email-olekstysh@gmail.comSigned-off-by: default avatarJuergen Gross <jgross@suse.com>
      d1a928ea
    • Oleksandr Tyshchenko's avatar
      xen/balloon: Bring alloc(free)_xenballooned_pages helpers back · 9dd060af
      Oleksandr Tyshchenko authored
      This patch rolls back some of the changes introduced by commit
      121f2fac "xen/balloon: rename alloc/free_xenballooned_pages"
      in order to make possible to still allocate xenballooned pages
      if CONFIG_XEN_UNPOPULATED_ALLOC is enabled.
      
      On Arm the unpopulated pages will be allocated on top of extended
      regions provided by Xen via device-tree (the subsequent patches
      will add required bits to support unpopulated-alloc feature on Arm).
      The problem is that extended regions feature has been introduced
      into Xen quite recently (during 4.16 release cycle). So this
      effectively means that Linux must only use unpopulated-alloc on Arm
      if it is running on "new Xen" which advertises these regions.
      But, it will only be known after parsing the "hypervisor" node
      at boot time, so before doing that we cannot assume anything.
      
      In order to keep working if CONFIG_XEN_UNPOPULATED_ALLOC is enabled
      and the extended regions are not advertised (Linux is running on
      "old Xen", etc) we need the fallback to alloc_xenballooned_pages().
      
      This way we wouldn't reduce the amount of memory usable (wasting
      RAM pages) for any of the external mappings anymore (and eliminate
      XSA-300) with "new Xen", but would be still functional ballooning
      out RAM pages with "old Xen".
      
      Also rename alloc(free)_xenballooned_pages to xen_alloc(free)_ballooned_pages
      and make xen_alloc(free)_unpopulated_pages static inline in xen.h
      if CONFIG_XEN_UNPOPULATED_ALLOC is disabled.
      Signed-off-by: default avatarOleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
      Reviewed-by: default avatarStefano Stabellini <sstabellini@kernel.org>
      Link: https://lore.kernel.org/r/1639080336-26573-4-git-send-email-olekstysh@gmail.comSigned-off-by: default avatarJuergen Gross <jgross@suse.com>
      9dd060af
    • Oleksandr Tyshchenko's avatar
      arm/xen: Switch to use gnttab_setup_auto_xlat_frames() for DT · 5e1cdb8e
      Oleksandr Tyshchenko authored
      Read the start address of the grant table space from DT
      (region 0).
      
      This patch mostly restores behaviour before commit 3cf4095d
      ("arm/xen: Use xen_xlate_map_ballooned_pages to setup grant table")
      but trying not to break the ACPI support added after that commit.
      So the patch touches DT part only and leaves the ACPI part with
      xen_xlate_map_ballooned_pages(). Also in order to make a code more
      resilient use a fallback to xen_xlate_map_ballooned_pages() if grant
      table region wasn't found.
      
      This is a preparation for using Xen extended region feature
      where unused regions of guest physical address space (provided
      by the hypervisor) will be used to create grant/foreign/whatever
      mappings instead of wasting real RAM pages from the domain memory
      for establishing these mappings.
      
      The immediate benefit of this change:
      - Avoid superpage shattering in Xen P2M when establishing
        stage-2 mapping (GFN <-> MFN) for the grant table space
      - Avoid wasting real RAM pages (reducing the amount of memory
        usuable) for mapping grant table space
      - The grant table space is always mapped at the exact
        same place (region 0 is reserved for the grant table)
      Signed-off-by: default avatarOleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
      Reviewed-by: default avatarStefano Stabellini <sstabellini@kernel.org>
      Link: https://lore.kernel.org/r/1639080336-26573-3-git-send-email-olekstysh@gmail.comSigned-off-by: default avatarJuergen Gross <jgross@suse.com>
      5e1cdb8e
    • Oleksandr Tyshchenko's avatar
      xen/unpopulated-alloc: Drop check for virt_addr_valid() in fill_list() · fbf3a5c3
      Oleksandr Tyshchenko authored
      If memremap_pages() succeeds the range is guaranteed to have proper page
      table, there is no need for an additional virt_addr_valid() check.
      Signed-off-by: default avatarOleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
      Reviewed-by: default avatarBoris Ostrovsky <boris.ostrovsky@oracle.com>
      Link: https://lore.kernel.org/r/1639080336-26573-2-git-send-email-olekstysh@gmail.comSigned-off-by: default avatarJuergen Gross <jgross@suse.com>
      fbf3a5c3
    • Jan Beulich's avatar
      xen/x86: obtain upper 32 bits of video frame buffer address for Dom0 · 335e4dd6
      Jan Beulich authored
      The hypervisor has been supplying this information for a couple of major
      releases. Make use of it. The need to set a flag in the capabilities
      field also points out that the prior setting of that field from the
      hypervisor interface's gbl_caps one was wrong, so that code gets deleted
      (there's also no equivalent of this in native boot code).
      Signed-off-by: default avatarJan Beulich <jbeulich@suse.com>
      Reviewed-by: default avatarBoris Ostrovsky <boris.ostrovsky@oracle.com>
      
      Link: https://lore.kernel.org/r/a3df8bf3-d044-b7bb-3383-cd5239d6d4af@suse.comSigned-off-by: default avatarJuergen Gross <jgross@suse.com>
      335e4dd6
    • Oleksandr Andrushchenko's avatar
      xen/gntdev: fix unmap notification order · ce2f46f3
      Oleksandr Andrushchenko authored
      While working with Xen's libxenvchan library I have faced an issue with
      unmap notifications sent in wrong order if both UNMAP_NOTIFY_SEND_EVENT
      and UNMAP_NOTIFY_CLEAR_BYTE were requested: first we send an event channel
      notification and then clear the notification byte which renders in the below
      inconsistency (cli_live is the byte which was requested to be cleared on unmap):
      
      [  444.514243] gntdev_put_map UNMAP_NOTIFY_SEND_EVENT map->notify.event 6
      libxenvchan_is_open cli_live 1
      [  444.515239] __unmap_grant_pages UNMAP_NOTIFY_CLEAR_BYTE at 14
      
      Thus it is not possible to reliably implement the checks like
      - wait for the notification (UNMAP_NOTIFY_SEND_EVENT)
      - check the variable (UNMAP_NOTIFY_CLEAR_BYTE)
      because it is possible that the variable gets checked before it is cleared
      by the kernel.
      
      To fix that we need to re-order the notifications, so the variable is first
      gets cleared and then the event channel notification is sent.
      With this fix I can see the correct order of execution:
      
      [   54.522611] __unmap_grant_pages UNMAP_NOTIFY_CLEAR_BYTE at 14
      [   54.537966] gntdev_put_map UNMAP_NOTIFY_SEND_EVENT map->notify.event 6
      libxenvchan_is_open cli_live 0
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarOleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
      Reviewed-by: default avatarBoris Ostrovsky <boris.ostrovsky@oracle.com>
      Link: https://lore.kernel.org/r/20211210092817.580718-1-andr2000@gmail.comSigned-off-by: default avatarJuergen Gross <jgross@suse.com>
      ce2f46f3
  2. 02 Jan, 2022 6 commits
    • Linus Torvalds's avatar
      Linux 5.16-rc8 · c9e6606c
      Linus Torvalds authored
      c9e6606c
    • Linus Torvalds's avatar
      Merge tag 'perf-tools-fixes-for-v5.16-2022-01-02' of... · 24a0b220
      Linus Torvalds authored
      Merge tag 'perf-tools-fixes-for-v5.16-2022-01-02' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
      
      Pull perf tools fixes from Arnaldo Carvalho de Melo:
      
       - Fix TUI exit screen refresh race condition in 'perf top'.
      
       - Fix parsing of Intel PT VM time correlation arguments.
      
       - Honour CPU filtering command line request of a script's switch events
         in 'perf script'.
      
       - Fix printing of switch events in Intel PT python script.
      
       - Fix duplicate alias events list printing in 'perf list', noticed on
         heterogeneous arm64 systems.
      
       - Fix return value of ids__new(), users expect NULL for failure, not
         ERR_PTR(-ENOMEM).
      
      * tag 'perf-tools-fixes-for-v5.16-2022-01-02' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
        perf top: Fix TUI exit screen refresh race condition
        perf pmu: Fix alias events list
        perf scripts python: intel-pt-events.py: Fix printing of switch events
        perf script: Fix CPU filtering of a script's switch events
        perf intel-pt: Fix parsing of VM time correlation arguments
        perf expr: Fix return value of ids__new()
      24a0b220
    • Linus Torvalds's avatar
      Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 859431ac
      Linus Torvalds authored
      Pull i2c fixes from Wolfram Sang:
       "Better input validation for compat ioctls and a documentation bugfix
        for 5.16"
      
      * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        Docs: Fixes link to I2C specification
        i2c: validate user data in compat ioctl
      859431ac
    • Linus Torvalds's avatar
      Merge tag 'x86_urgent_for_v5.16_rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 1286cc48
      Linus Torvalds authored
      Pull x86 fix from Borislav Petkov:
      
       - Use the proper CONFIG symbol in a preprocessor check.
      
      * tag 'x86_urgent_for_v5.16_rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/build: Use the proper name CONFIG_FW_LOADER
      1286cc48
    • yaowenbin's avatar
      perf top: Fix TUI exit screen refresh race condition · 64f18d2d
      yaowenbin authored
      When the following command is executed several times, a coredump file is
      generated.
      
      	$ timeout -k 9 5 perf top -e task-clock
      	*******
      	*******
      	*******
      	0.01%  [kernel]                  [k] __do_softirq
      	0.01%  libpthread-2.28.so        [.] __pthread_mutex_lock
      	0.01%  [kernel]                  [k] __ll_sc_atomic64_sub_return
      	double free or corruption (!prev) perf top --sort comm,dso
      	timeout: the monitored command dumped core
      
      When we terminate "perf top" using sending signal method,
      SLsmg_reset_smg() called. SLsmg_reset_smg() resets the SLsmg screen
      management routines by freeing all memory allocated while it was active.
      
      However SLsmg_reinit_smg() maybe be called by another thread.
      
      SLsmg_reinit_smg() will free the same memory accessed by
      SLsmg_reset_smg(), thus it results in a double free.
      
      SLsmg_reinit_smg() is called already protected by ui__lock, so we fix
      the problem by adding pthread_mutex_trylock of ui__lock when calling
      SLsmg_reset_smg().
      Signed-off-by: default avatarWenyu Liu <liuwenyu7@huawei.com>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: wuxu.wu@huawei.com
      Link: http://lore.kernel.org/lkml/a91e3943-7ddc-f5c0-a7f5-360f073c20e6@huawei.comSigned-off-by: default avatarHewenliang <hewenliang4@huawei.com>
      Signed-off-by: default avataryaowenbin <yaowenbin1@huawei.com>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      64f18d2d
    • John Garry's avatar
      perf pmu: Fix alias events list · e0257a01
      John Garry authored
      Commit 0e0ae874 ("perf list: Display hybrid PMU events with cpu
      type") changes the event list for uncore PMUs or arm64 heterogeneous CPU
      systems, such that duplicate aliases are incorrectly listed per PMU
      (which they should not be), like:
      
        # perf list
        ...
        unc_cbo_cache_lookup.any_es
        [Unit: uncore_cbox L3 Lookup any request that access cache and found
        line in E or S-state]
        unc_cbo_cache_lookup.any_es
        [Unit: uncore_cbox L3 Lookup any request that access cache and found
        line in E or S-state]
        unc_cbo_cache_lookup.any_i
        [Unit: uncore_cbox L3 Lookup any request that access cache and found
        line in I-state]
        unc_cbo_cache_lookup.any_i
        [Unit: uncore_cbox L3 Lookup any request that access cache and found
        line in I-state]
        ...
      
      Notice how the events are listed twice.
      
      The named commit changed how we remove duplicate events, in that events
      for different PMUs are not treated as duplicates. I suppose this is to
      handle how "Each hybrid pmu event has been assigned with a pmu name".
      
      Fix PMU alias listing by restoring behaviour to remove duplicates for
      non-hybrid PMUs.
      
      Fixes: 0e0ae874 ("perf list: Display hybrid PMU events with cpu type")
      Signed-off-by: default avatarJohn Garry <john.garry@huawei.com>
      Tested-by: default avatarZhengjun Xing <zhengjun.xing@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/1640103090-140490-1-git-send-email-john.garry@huawei.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      e0257a01
  3. 01 Jan, 2022 1 commit
  4. 31 Dec, 2021 12 commits
    • Mel Gorman's avatar
      mm: vmscan: reduce throttling due to a failure to make progress -fix · 80082938
      Mel Gorman authored
      Hugh Dickins reported the following
      
      	My tmpfs swapping load (tweaked to use huge pages more heavily
      	than in real life) is far from being a realistic load: but it was
      	notably slowed down by your throttling mods in 5.16-rc, and this
      	patch makes it well again - thanks.
      
      	But: it very quickly hit NULL pointer until I changed that last
      	line to
      
              if (first_pgdat)
                      consider_reclaim_throttle(first_pgdat, sc);
      
      The likely issue is that huge pages are a major component of the test
      workload.  When this is the case, first_pgdat may never get set if
      compaction is ready to continue due to this check
      
              if (IS_ENABLED(CONFIG_COMPACTION) &&
                  sc->order > PAGE_ALLOC_COSTLY_ORDER &&
                  compaction_ready(zone, sc)) {
                      sc->compaction_ready = true;
                      continue;
              }
      
      If this was true for every zone in the zonelist, first_pgdat would never
      get set resulting in a NULL pointer exception.
      
      Link: https://lkml.kernel.org/r/20211209095453.GM3366@techsingularity.net
      Fixes: 1b4e3f26 ("mm: vmscan: Reduce throttling due to a failure to make progress")
      Signed-off-by: default avatarMel Gorman <mgorman@techsingularity.net>
      Reported-by: default avatarHugh Dickins <hughd@google.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Darrick J. Wong <djwong@kernel.org>
      Cc: Shakeel Butt <shakeelb@google.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      80082938
    • Mel Gorman's avatar
      mm: vmscan: Reduce throttling due to a failure to make progress · 1b4e3f26
      Mel Gorman authored
      Mike Galbraith, Alexey Avramov and Darrick Wong all reported similar
      problems due to reclaim throttling for excessive lengths of time.  In
      Alexey's case, a memory hog that should go OOM quickly stalls for
      several minutes before stalling.  In Mike and Darrick's cases, a small
      memcg environment stalled excessively even though the system had enough
      memory overall.
      
      Commit 69392a40 ("mm/vmscan: throttle reclaim when no progress is
      being made") introduced the problem although commit a19594ca
      ("mm/vmscan: increase the timeout if page reclaim is not making
      progress") made it worse.  Systems at or near an OOM state that cannot
      be recovered must reach OOM quickly and memcg should kill tasks if a
      memcg is near OOM.
      
      To address this, only stall for the first zone in the zonelist, reduce
      the timeout to 1 tick for VMSCAN_THROTTLE_NOPROGRESS and only stall if
      the scan control nr_reclaimed is 0, kswapd is still active and there
      were excessive pages pending for writeback.  If kswapd has stopped
      reclaiming due to excessive failures, do not stall at all so that OOM
      triggers relatively quickly.  Similarly, if an LRU is simply congested,
      only lightly throttle similar to NOPROGRESS.
      
      Alexey's original case was the most straight forward
      
      	for i in {1..3}; do tail /dev/zero; done
      
      On vanilla 5.16-rc1, this test stalled heavily, after the patch the test
      completes in a few seconds similar to 5.15.
      
      Alexey's second test case added watching a youtube video while tail runs
      10 times.  On 5.15, playback only jitters slightly, 5.16-rc1 stalls a
      lot with lots of frames missing and numerous audio glitches.  With this
      patch applies, the video plays similarly to 5.15.
      
      [lkp@intel.com: Fix W=1 build warning]
      
      Link: https://lore.kernel.org/r/99e779783d6c7fce96448a3402061b9dc1b3b602.camel@gmx.de
      Link: https://lore.kernel.org/r/20211124011954.7cab9bb4@mail.inbox.lv
      Link: https://lore.kernel.org/r/20211022144651.19914-1-mgorman@techsingularity.net
      Link: https://lore.kernel.org/r/20211202150614.22440-1-mgorman@techsingularity.net
      Link: https://linux-regtracking.leemhuis.info/regzbot/regression/20211124011954.7cab9bb4@mail.inbox.lv/Reported-and-tested-by: default avatarAlexey Avramov <hakavlad@inbox.lv>
      Reported-and-tested-by: default avatarMike Galbraith <efault@gmx.de>
      Reported-and-tested-by: default avatarDarrick J. Wong <djwong@kernel.org>
      Reported-by: default avatarkernel test robot <lkp@intel.com>
      Acked-by: default avatarHugh Dickins <hughd@google.com>
      Tracked-by: default avatarThorsten Leemhuis <regressions@leemhuis.info>
      Fixes: 69392a40 ("mm/vmscan: throttle reclaim when no progress is being made")
      Signed-off-by: default avatarMel Gorman <mgorman@techsingularity.net>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      1b4e3f26
    • Linus Torvalds's avatar
      Merge branch 'akpm' (patches from Andrew) · f87bcc88
      Linus Torvalds authored
      Merge misc mm fixes from Andrew Morton:
       "2 patches.
      
        Subsystems affected by this patch series: mm (userfaultfd and damon)"
      
      * akpm:
        mm/damon/dbgfs: fix 'struct pid' leaks in 'dbgfs_target_ids_write()'
        userfaultfd/selftests: fix hugetlb area allocations
      f87bcc88
    • Linus Torvalds's avatar
      Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · e46227bf
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "Three fixes, all in drivers. The lpfc one doesn't look exploitable,
        but nasty things could happen in string operations if mybuf ends up
        with an on stack unterminated string"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: vmw_pvscsi: Set residual data length conditionally
        scsi: libiscsi: Fix UAF in iscsi_conn_get_param()/iscsi_conn_teardown()
        scsi: lpfc: Terminate string in lpfc_debugfs_nvmeio_trc_write()
      e46227bf
    • SeongJae Park's avatar
      mm/damon/dbgfs: fix 'struct pid' leaks in 'dbgfs_target_ids_write()' · ebb3f994
      SeongJae Park authored
      DAMON debugfs interface increases the reference counts of 'struct pid's
      for targets from the 'target_ids' file write callback
      ('dbgfs_target_ids_write()'), but decreases the counts only in DAMON
      monitoring termination callback ('dbgfs_before_terminate()').
      
      Therefore, when 'target_ids' file is repeatedly written without DAMON
      monitoring start/termination, the reference count is not decreased and
      therefore memory for the 'struct pid' cannot be freed.  This commit
      fixes this issue by decreasing the reference counts when 'target_ids' is
      written.
      
      Link: https://lkml.kernel.org/r/20211229124029.23348-1-sj@kernel.org
      Fixes: 4bc05954 ("mm/damon: implement a debugfs-based user space interface")
      Signed-off-by: default avatarSeongJae Park <sj@kernel.org>
      Cc: <stable@vger.kernel.org>	[5.15+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      ebb3f994
    • Mike Kravetz's avatar
      userfaultfd/selftests: fix hugetlb area allocations · f5c73297
      Mike Kravetz authored
      Currently, userfaultfd selftest for hugetlb as run from run_vmtests.sh
      or any environment where there are 'just enough' hugetlb pages will
      always fail with:
      
        testing events (fork, remap, remove):
      		ERROR: UFFDIO_COPY error: -12 (errno=12, line=616)
      
      The ENOMEM error code implies there are not enough hugetlb pages.
      However, there are free hugetlb pages but they are all reserved.  There
      is a basic problem with the way the test allocates hugetlb pages which
      has existed since the test was originally written.
      
      Due to the way 'cleanup' was done between different phases of the test,
      this issue was masked until recently.  The issue was uncovered by commit
      8ba6e864 ("userfaultfd/selftests: reinitialize test context in each
      test").
      
      For the hugetlb test, src and dst areas are allocated as PRIVATE
      mappings of a hugetlb file.  This means that at mmap time, pages are
      reserved for the src and dst areas.  At the start of event testing (and
      other tests) the src area is populated which results in allocation of
      huge pages to fill the area and consumption of reserves associated with
      the area.  Then, a child is forked to fault in the dst area.  Note that
      the dst area was allocated in the parent and hence the parent owns the
      reserves associated with the mapping.  The child has normal access to
      the dst area, but can not use the reserves created/owned by the parent.
      Thus, if there are no other huge pages available allocation of a page
      for the dst by the child will fail.
      
      Fix by not creating reserves for the dst area.  In this way the child
      can use free (non-reserved) pages.
      
      Also, MAP_PRIVATE of a file only makes sense if you are interested in
      the contents of the file before making a COW copy.  The test does not do
      this.  So, just use MAP_ANONYMOUS | MAP_HUGETLB to create an anonymous
      hugetlb mapping.  There is no need to create a hugetlb file in the
      non-shared case.
      
      Link: https://lkml.kernel.org/r/20211217172919.7861-1-mike.kravetz@oracle.comSigned-off-by: default avatarMike Kravetz <mike.kravetz@oracle.com>
      Cc: Axel Rasmussen <axelrasmussen@google.com>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Mina Almasry <almasrymina@google.com>
      Cc: Shuah Khan <shuah@kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      f5c73297
    • Deep Majumder's avatar
      Docs: Fixes link to I2C specification · c116fe1e
      Deep Majumder authored
      The link to the I2C specification is broken. Although
      "https://www.nxp.com" hosts Rev 7 (2021) of this specification, it is
      behind a login-wall. Thus, an additional link has been added (which
      doesn't require a login) and the NXP official docs link has been
      updated.
      Signed-off-by: default avatarDeep Majumder <deep@fastmail.in>
      [wsa: minor updates to text and commit message]
      Signed-off-by: default avatarWolfram Sang <wsa@kernel.org>
      c116fe1e
    • Pavel Skripkin's avatar
      i2c: validate user data in compat ioctl · bb436283
      Pavel Skripkin authored
      Wrong user data may cause warning in i2c_transfer(), ex: zero msgs.
      Userspace should not be able to trigger warnings, so this patch adds
      validation checks for user data in compact ioctl to prevent reported
      warnings
      
      Reported-and-tested-by: syzbot+e417648b303855b91d8a@syzkaller.appspotmail.com
      Fixes: 7d5cb456 ("i2c compat ioctls: move to ->compat_ioctl()")
      Signed-off-by: default avatarPavel Skripkin <paskripkin@gmail.com>
      Signed-off-by: default avatarWolfram Sang <wsa@kernel.org>
      bb436283
    • Leo L. Schwab's avatar
      Input: spaceball - fix parsing of movement data packets · bc7ec917
      Leo L. Schwab authored
      The spaceball.c module was not properly parsing the movement reports
      coming from the device.  The code read axis data as signed 16-bit
      little-endian values starting at offset 2.
      
      In fact, axis data in Spaceball movement reports are signed 16-bit
      big-endian values starting at offset 3.  This was determined first by
      visually inspecting the data packets, and later verified by consulting:
      http://spacemice.org/pdf/SpaceBall_2003-3003_Protocol.pdf
      
      If this ever worked properly, it was in the time before Git...
      Signed-off-by: default avatarLeo L. Schwab <ewhac@ewhac.org>
      Link: https://lore.kernel.org/r/20211221101630.1146385-1-ewhac@ewhac.org
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarDmitry Torokhov <dmitry.torokhov@gmail.com>
      bc7ec917
    • Pavel Skripkin's avatar
      Input: appletouch - initialize work before device registration · 9f3ccdc3
      Pavel Skripkin authored
      Syzbot has reported warning in __flush_work(). This warning is caused by
      work->func == NULL, which means missing work initialization.
      
      This may happen, since input_dev->close() calls
      cancel_work_sync(&dev->work), but dev->work initalization happens _after_
      input_register_device() call.
      
      So this patch moves dev->work initialization before registering input
      device
      
      Fixes: 5a6eb676 ("Input: appletouch - improve powersaving for Geyser3 devices")
      Reported-and-tested-by: syzbot+b88c5eae27386b252bbd@syzkaller.appspotmail.com
      Signed-off-by: default avatarPavel Skripkin <paskripkin@gmail.com>
      Link: https://lore.kernel.org/r/20211230141151.17300-1-paskripkin@gmail.com
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarDmitry Torokhov <dmitry.torokhov@gmail.com>
      9f3ccdc3
    • Linus Torvalds's avatar
      Merge tag 'drm-fixes-2021-12-31' of git://anongit.freedesktop.org/drm/drm · 4f3d93c6
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "This is a bit bigger than I'd like, however it has two weeks of amdgpu
        fixes in it, since they missed last week, which was very small.
      
        The nouveau regression is probably the biggest fix in here, and it
        needs to go into 5.15 as well, two i915 fixes, and then a scattering
        of amdgpu fixes. The biggest fix in there is for a fencing NULL
        pointer dereference, the rest are pretty minor.
      
        For the misc team, I've pulled the two misc fixes manually since I'm
        not sure what is happening at this time of year!
      
        The amdgpu maintainers have the outstanding runpm regression to fix
        still, they are just working through the last bits of it now.
      
        Summary:
      
        nouveau:
         - fencing regression fix
      
        i915:
         - Fix possible uninitialized variable
         - Fix composite fence seqno icrement on each fence creation
      
        amdgpu:
         - Fencing fix
         - XGMI fix
         - VCN regression fix
         - IP discovery regression fixes
         - Fix runpm documentation
         - Suspend/resume fixes
         - Yellow Carp display fixes
         - MCLK power management fix
         - dma-buf fix"
      
      * tag 'drm-fixes-2021-12-31' of git://anongit.freedesktop.org/drm/drm:
        drm/amd/display: Changed pipe split policy to allow for multi-display pipe split
        drm/amd/display: Fix USB4 null pointer dereference in update_psp_stream_config
        drm/amd/display: Set optimize_pwr_state for DCN31
        drm/amd/display: Send s0i2_rdy in stream_count == 0 optimization
        drm/amd/display: Added power down for DCN10
        drm/amd/display: fix B0 TMDS deepcolor no dislay issue
        drm/amdgpu: no DC support for headless chips
        drm/amdgpu: put SMU into proper state on runpm suspending for BOCO capable platform
        drm/amdgpu: always reset the asic in suspend (v2)
        drm/amd/pm: skip setting gfx cgpg in the s0ix suspend-resume
        drm/i915: Increment composite fence seqno
        drm/i915: Fix possible uninitialized variable in parallel extension
        drm/amdgpu: fix runpm documentation
        drm/nouveau: wait for the exclusive fence after the shared ones v2
        drm/amdgpu: add support for IP discovery gc_info table v2
        drm/amdgpu: When the VCN(1.0) block is suspended, powergating is explicitly enabled
        drm/amd/pm: Fix xgmi link control on aldebaran
        drm/amdgpu: introduce new amdgpu_fence object to indicate the job embedded fence
        drm/amdgpu: fix dropped backing store handling in amdgpu_dma_buf_move_notify
      4f3d93c6
    • Dave Airlie's avatar
      Merge branch 'drm-misc-fixes' of ssh://git.freedesktop.org/git/drm/drm-misc into drm-fixes · ce9b333c
      Dave Airlie authored
      This merges two fixes that haven't been sent to me yet, but I wanted to get in.
      
      One amdgpu fix, but one nouveau regression fixer.
      Signed-off-by: default avatarDave Airlie <airlied@redhat.com>
      ce9b333c
  5. 30 Dec, 2021 14 commits