1. 31 Oct, 2023 1 commit
    • Merge branch 'for-6.7/cxl-rch-eh' into cxl/next · 7f946e6d
      Dan Williams authored
      Restricted CXL Host (RCH) Error Handling undoes the topology munging of
      CXL 1.1 to enable some AER recovery, and lands some base infrastructure
      for handling Root-Complex-Event-Collectors (RCECs) with CXL. Finally
      include this long-running series for v6.7.
  2. 28 Oct, 2023 19 commits
  3. 27 Oct, 2023 5 commits
    • tools/testing/cxl: Slow down the mock firmware transfer · 8f61d48c
      Vishal Verma authored
      The cxl-cli unit test for firmware update does operations like starting
      an asynchronous firmware update, making sure it is in progress, and
      attempting to cancel it. In some cases, such as with no or minimal
      dynamic debugging turned on, the firmware update completes too quickly,
      not allowing the test to have a chance to verify it was in progress.
      This caused a failure of the signature:
      
        expected fw_update_in_progress:true
        test/cxl-update-firmware.sh: failed at line 88
      
      Fix this by adding a delay (~1.5 - 2 ms) to each firmware transfer
      request handled by the mocked interface.
      Reported-by: Dan Williams <dan.j.williams@intel.com>
      Tested-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
      Link: https://lore.kernel.org/r/20231026-vv-fw_upd_test_fix-v2-1-5282fd193883@intel.com
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • cxl/region: Fix x1 root-decoder granularity calculations · 98a04c7a
      Jim Harris authored
      The root decoder granularity must match the value from the CFMWS, which
      may not be the region's granularity for non-interleaved root decoders.
      
      So when calculating granularities for host bridge decoders, use the
      region's granularity instead of the root decoder's granularity to ensure
      the correct granularities are set for the host bridge decoders and any
      downstream switch decoders.
      
      Test configuration is 1 host bridge * 2 switches * 2 endpoints per switch.
      
      The region was created with a granularity of 2048 using the following command line:
      
      cxl create-region -m -d decoder0.0 -w 4 mem0 mem2 mem1 mem3 \
      		  -g 2048 -s 2048M
      
      Use "cxl list -PDE | grep granularity" to get a view of the granularity
      set at each level of the topology.
      
      Before this patch:
              "interleave_granularity":2048,
              "interleave_granularity":2048,
          "interleave_granularity":512,
              "interleave_granularity":2048,
              "interleave_granularity":2048,
          "interleave_granularity":512,
      "interleave_granularity":256,
      
      After:
              "interleave_granularity":2048,
              "interleave_granularity":2048,
          "interleave_granularity":4096,
              "interleave_granularity":2048,
              "interleave_granularity":2048,
          "interleave_granularity":4096,
      "interleave_granularity":2048,
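
The before and after listings are consistent with a simple rule: each decoder's granularity is its parent's granularity multiplied by the parent's interleave ways. The sketch below is only an illustration of that reading, not the driver's code. The function name child_granularity() is hypothetical, and it assumes the least-indented line in each listing is the host bridge decoder and the next level up in indentation is the switch decoders.

```c
/*
 * Assumed rule, inferred from the listings above: a decoder's
 * granularity is the parent granularity times the parent's
 * interleave ways. child_granularity() is a hypothetical name.
 */
static unsigned int child_granularity(unsigned int parent_ig,
				      unsigned int parent_iw)
{
	return parent_ig * parent_iw;
}
```

Starting from the root decoder's 256 with one way gives 256 at the host bridge and 512 at the switches, which matches the "before" output. Starting from the region's 2048 gives 2048 and 4096, which matches the "after" output.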
      
      Fixes: 27b3f8d1 ("cxl/region: Program target lists")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Jim Harris <jim.harris@samsung.com>
      Link: https://lore.kernel.org/r/169824893473.1403938.16110924262989774582.stgit@bgt-140510-bm03.eng.stellus.in
      [djbw: fixup the prebuilt cxl_test region]
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • cxl/region: Fix cxl_region_rwsem lock held when returning to user space · 3531b27f
      Li Zhijian authored
      Add a missed "goto out" so the lock is released on error, which cleans up this splat:
      
          WARNING: lock held when returning to user space!
          6.6.0-rc3-lizhijian+ #213 Not tainted
          ------------------------------------------------
          cxl/673 is leaving the kernel with locks still held!
          1 lock held by cxl/673:
           #0: ffffffffa013b9d0 (cxl_region_rwsem){++++}-{3:3}, at: commit_store+0x7d/0x3e0 [cxl_core]
      
      In terms of the user-visible impact of this bug for backports:
      
      cxl_region_invalidate_memregion() on x86 invokes wbinvd which is a
      problematic instruction for virtualized environments. So, on virtualized
      x86, cxl_region_invalidate_memregion() returns an error. This failure
      case got missed because CXL memory-expander device passthrough is not a
      production use case, and emulation of CXL devices is typically limited
      to kernel development builds with CONFIG_CXL_REGION_INVALIDATION_TEST=y,
      which makes cxl_region_invalidate_memregion() succeed.
      
      In other words, the expected exposure of this bug is limited to CXL
      subsystem development environments using QEMU that neglected
      CONFIG_CXL_REGION_INVALIDATION_TEST=y.
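
The class of bug described here, an early return that skips the unlock, can be sketched in plain C. This is a minimal illustration under stated assumptions, not the driver's code: struct fake_lock, invalidate_sketch() and commit_sketch() are hypothetical stand-ins, and the lock is just a counter so the example stays self-contained.

```c
#include <errno.h>

/* Hypothetical stand-in for a lock; it only counts holders. */
struct fake_lock {
	int held;
};

static void fake_lock_acquire(struct fake_lock *l) { l->held++; }
static void fake_lock_release(struct fake_lock *l) { l->held--; }

/* Hypothetical step that can fail, standing in for cache invalidation. */
static int invalidate_sketch(int should_fail)
{
	return should_fail ? -ENXIO : 0;
}

static int commit_sketch(struct fake_lock *l, int should_fail)
{
	int rc;

	fake_lock_acquire(l);

	rc = invalidate_sketch(should_fail);
	if (rc)
		goto out; /* a bare "return rc" here would leave the lock held */

	/* ... the remaining work under the lock would go here ... */
out:
	fake_lock_release(l);
	return rc;
}
```

Routing every exit through the single "out" label keeps the acquire and release paired on both the success and the error path.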
      
      Fixes: d1257d09 ("cxl/region: Move cache invalidation before region teardown, and before setup")
      Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
      Reviewed-by: Ira Weiny <ira.weiny@intel.com>
      Link: https://lore.kernel.org/r/20231025085450.2514906-1-lizhijian@fujitsu.com
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • cxl/region: Use cxl_calc_interleave_pos() for auto-discovery · 0cf36a85
      Alison Schofield authored
      For auto-discovered regions the driver must assign each target to
      a valid position in the region interleave set based on the decoder
      topology.
      
      The current implementation fails to parse valid decode topologies,
      as it does not consider the child offset into a parent port. The sort
      put all targets of one port ahead of another port when an interleave
      was expected, causing the region assembly to fail.
      
      Replace the existing relative sort with cxl_calc_interleave_pos() that
      finds the exact position in a region interleave for an endpoint based
      on a walk up the ancestral tree from endpoint to root decoder.
      
      cxl_calc_interleave_pos() was introduced in a prior patch, so the work
      here is to use it in cxl_region_sort_targets().
      
      Remove the obsoleted helper functions from the prior sort.
      
      Testing passes on pre-production hardware with BIOS defined regions
      that natively trigger this autodiscovery path of the region driver.
      Testing passes a CXL unit test using the dev_dbg() calculation test
      (see cxl_region_attach()) across an expanded set of region configs:
      1, 1, 1+1, 1+1+1, 2, 2+2, 2+2+2, 2+2+2+2, 4, 4+4, where each number
      represents the count of endpoints per host bridge.
      
      Fixes: a32320b7 ("cxl/region: Add region autodiscovery")
      Reported-by: Dmytro Adamenko <dmytro.adamenko@intel.com>
      Signed-off-by: Alison Schofield <alison.schofield@intel.com>
      Reviewed-by: Dave Jiang <dave.jiang@intel.com>
      Reviewed-by: Jim Harris <jim.harris@samsung.com>
      Link: https://lore.kernel.org/r/3946cc55ddc19678733eddc9de2c317749f43f3b.1698263080.git.alison.schofield@intel.com
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • cxl/region: Calculate a target position in a region interleave · a3e00c96
      Alison Schofield authored
      Introduce a calculation to find a target's position in a region
      interleave. Perform a self-test of the calculation on user-defined
      regions.
      
      The region driver uses the kernel sort() function to put region
      targets in relative order. Positions are assigned based on each
      target's index in that sorted list. That relative sort doesn't
      consider the offset of a port into its parent port, which causes
      some auto-discovered regions to fail creation. In one failure case,
      a 2 + 2 config (2 host bridges each with 2 endpoints), the sort
      puts all the targets of one port ahead of another port when they
      were expected to be interleaved.
      
      In preparation for repairing the autodiscovery region assembly,
      introduce a new method for discovering a target position in the
      region interleave.
      
      cxl_calc_interleave_pos() adds a method to find the target position by
      ascending from an endpoint to a root decoder. The calculation starts
      with the endpoint's local position and position in the parent port. It
      traverses towards the root decoder and examines both position and ways
      in order to allow the position to be refined all the way to the root
      decoder.
      
      This calculation: position = position * parent_ways + parent_pos;
      applied iteratively yields the correct position.
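
As a rough illustration of how that formula composes across levels, the sketch below applies it while walking from an endpoint toward the root decoder. It is not the driver's implementation: struct level and calc_pos_sketch() are hypothetical names, and starting the accumulated position at 0 before the first level is an assumption of this sketch.

```c
#include <stddef.h>

/* Hypothetical description of one step up the tree from an endpoint. */
struct level {
	int parent_ways; /* interleave ways of the decoder in the parent port */
	int parent_pos;  /* position of this port among that decoder's targets */
};

/*
 * Apply position = position * parent_ways + parent_pos at each level,
 * ordered from the endpoint's immediate parent up to the root decoder.
 * Starting from 0 is an assumption made for this illustration.
 */
static int calc_pos_sketch(const struct level *levels, size_t n)
{
	int pos = 0;

	for (size_t i = 0; i < n; i++)
		pos = pos * levels[i].parent_ways + levels[i].parent_pos;

	return pos;
}
```

For the 2 + 2 config mentioned above, an endpoint at index 1 under host bridge 0 walks through {2, 1} and then {2, 0}, giving (0 * 2 + 1) * 2 + 0 = 2, while index 0 under host bridge 1 gives 1. The two host bridges therefore alternate in the region order instead of one bridge's targets all coming first.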
      
      Include a self-test that exercises this new position calculation against
      every successfully configured user-defined region.
      Signed-off-by: Alison Schofield <alison.schofield@intel.com>
      Link: https://lore.kernel.org/r/0ac32c75cf81dd8b86bf07d70ff139d33c2300bc.1698263080.git.alison.schofield@intel.com
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  4. 26 Oct, 2023 1 commit
  5. 24 Oct, 2023 1 commit
    • cxl/region: Do not try to cleanup after cxl_region_setup_targets() fails · 0718588c
      Jim Harris authored
      Commit 5e42bcbc ("cxl/region: decrement ->nr_targets on error in
      cxl_region_attach()") tried to avoid 'eiw' initialization errors when
      ->nr_targets exceeded 16, by just decrementing ->nr_targets when
      cxl_region_setup_targets() failed.
      
      Commit 86987c76 ("cxl/region: Cleanup target list on attach error")
      extended that cleanup to also clear cxled->pos and p->targets[pos]. The
      initialization error was incidentally fixed separately by:
      Commit 8d428542 ("cxl/region: Fix port setup uninitialized variable
      warnings") which was merged a few days after 5e42bcbc.
      
      But now the original cleanup when cxl_region_setup_targets() fails
      prevents endpoint and switch decoder resources from being reused:
      
      1) the cleanup does not set the decoder's region to NULL, which results
         in future dpa_size_store() calls returning -EBUSY
      2) the decoder is not properly freed, which results in future commit
         errors associated with the upstream switch
      
      Now that the initialization errors have been fixed separately, the proper
      cleanup for this case is to just return immediately. Then the resources
      associated with this target get cleaned up as normal when the failed
      region is deleted.
      
      The ->nr_targets decrement in the error case also helped prevent
      a p->targets[] array overflow, so add a new check to guard against
      that overflow.
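
A bounds check of this kind can be sketched as follows. This is a minimal illustration, not the patch itself: struct region_params_sketch, attach_target_sketch() and MAX_TARGETS_SKETCH are hypothetical names, the capacity of 16 is taken from the "exceeded 16" remark above, and the -EINVAL return value is an assumption.

```c
#include <errno.h>

#define MAX_TARGETS_SKETCH 16 /* assumed capacity of the targets array */

/* Hypothetical, simplified stand-in for the region parameters. */
struct region_params_sketch {
	int nr_targets;
	const void *targets[MAX_TARGETS_SKETCH];
};

static int attach_target_sketch(struct region_params_sketch *p,
				const void *target)
{
	/* Refuse to write past the end of targets[]. */
	if (p->nr_targets >= MAX_TARGETS_SKETCH)
		return -EINVAL; /* errno choice is an assumption */

	p->targets[p->nr_targets++] = target;
	return 0;
}
```

With the check in place, a failed attach no longer needs to decrement nr_targets to stay within the array, since every later attach is validated before it writes.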
      
      Tested by trying to create an invalid region for a 2 switch * 2 endpoint
      topology, and then following up with creating a valid region.
      
      Fixes: 5e42bcbc ("cxl/region: decrement ->nr_targets on error in cxl_region_attach()")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Jim Harris <jim.harris@samsung.com>
      Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
      Acked-by: Dan Carpenter <dan.carpenter@linaro.org>
      Reviewed-by: Dave Jiang <dave.jiang@intel.com>
      Link: https://lore.kernel.org/r/169703589120.1202031.14696100866518083806.stgit@bgt-140510-bm03.eng.stellus.in
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  6. 09 Oct, 2023 3 commits
  7. 06 Oct, 2023 5 commits
  8. 01 Oct, 2023 5 commits
    • Linux 6.6-rc4 · 8a749fd1
      Linus Torvalds authored
    • Merge tag 'kbuild-fixes-v6.6-2' of... · e81a2dab
      Linus Torvalds authored
      Merge tag 'kbuild-fixes-v6.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
      
      Pull Kbuild fixes from Masahiro Yamada:
      
       - Fix the module compression with xz so the in-kernel decompressor
         works
      
       - Document a kconfig idiom to express an optional dependency between
         modules
      
       - Make modpost, when W=1 is given, detect broken drivers that reference
         .exit.* sections
      
       - Remove unused code
      
      * tag 'kbuild-fixes-v6.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
        kbuild: remove stale code for 'source' symlink in packaging scripts
        modpost: Don't let "driver"s reference .exit.*
        vmlinux.lds.h: remove unused CPU_KEEP and CPU_DISCARD macros
        modpost: add missing else to the "of" check
        Documentation: kbuild: explain handling optional dependencies
        kbuild: Use CRC32 and a 1MiB dictionary for XZ compressed modules
    • Merge tag 'mm-hotfixes-stable-2023-10-01-08-34' of... · d2c52315
      Linus Torvalds authored
      Merge tag 'mm-hotfixes-stable-2023-10-01-08-34' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
      
      Pull misc fixes from Andrew Morton:
       "Fourteen hotfixes, eleven of which are cc:stable. The remainder
        pertain to issues which were introduced after 6.5"
      
      * tag 'mm-hotfixes-stable-2023-10-01-08-34' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
        Crash: add lock to serialize crash hotplug handling
        selftests/mm: fix awk usage in charge_reserved_hugetlb.sh and hugetlb_reparenting_test.sh that may cause error
        mm: mempolicy: keep VMA walk if both MPOL_MF_STRICT and MPOL_MF_MOVE are specified
        mm/damon/vaddr-test: fix memory leak in damon_do_test_apply_three_regions()
        mm, memcg: reconsider kmem.limit_in_bytes deprecation
        mm: zswap: fix potential memory corruption on duplicate store
        arm64: hugetlb: fix set_huge_pte_at() to work with all swap entries
        mm: hugetlb: add huge page size param to set_huge_pte_at()
        maple_tree: add MAS_UNDERFLOW and MAS_OVERFLOW states
        maple_tree: add mas_is_active() to detect in-tree walks
        nilfs2: fix potential use after free in nilfs_gccache_submit_read_data()
        mm: abstract moving to the next PFN
        mm: report success more often from filemap_map_folio_range()
        fs: binfmt_elf_efpic: fix personality for ELF-FDPIC
    • Merge tag 'char-misc-6.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · 8f633369
      Linus Torvalds authored
      Pull misc driver fix from Greg KH:
       "Here is a single, much requested, fix for a set of misc drivers to
        resolve a much reported regression in the -rc series that has also
        propagated back to the stable releases. Sorry for the delay, lots of
        conference travel for a few weeks put me very far behind in patch
        wrangling.
      
        It has been reported by many to resolve the reported problem, and has
        been in linux-next with no reported issues"
      
      * tag 'char-misc-6.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
        misc: rtsx: Fix some platforms can not boot and move the l1ss judgment to probe
    • Merge tag 'tty-6.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · 3abd15e2
      Linus Torvalds authored
      Pull tty / serial driver fixes from Greg KH:
       "Here are two tty/serial driver fixes for 6.6-rc4 that resolve some
        reported regressions:
      
         - revert a n_gsm change that ended up causing problems
      
         - 8250_port fix for irq data
      
        both have been in linux-next for over a week with no reported
        problems"
      
      * tag 'tty-6.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
        Revert "tty: n_gsm: fix UAF in gsm_cleanup_mux"
        serial: 8250_port: Check IRQ data before use