1. 28 Jun, 2024 11 commits
  2. 27 Jun, 2024 29 commits
    • drm/drm_file: Fix pid refcounting race · 4f2a129b
      Jann Horn authored
      <maarten.lankhorst@linux.intel.com>, Maxime Ripard
      <mripard@kernel.org>, Thomas Zimmermann <tzimmermann@suse.de>
      
      filp->pid is supposed to be a refcounted pointer; however, before this
      patch, drm_file_update_pid() only increments the refcount of a struct
      pid after storing a pointer to it in filp->pid and dropping the
      dev->filelist_mutex, making the following race possible:
      
      process A               process B
      =========               =========
                              begin drm_file_update_pid
                              mutex_lock(&dev->filelist_mutex)
                              rcu_replace_pointer(filp->pid, <pid B>, 1)
                              mutex_unlock(&dev->filelist_mutex)
      begin drm_file_update_pid
      mutex_lock(&dev->filelist_mutex)
      rcu_replace_pointer(filp->pid, <pid A>, 1)
      mutex_unlock(&dev->filelist_mutex)
      get_pid(<pid A>)
      synchronize_rcu()
      put_pid(<pid B>)   *** pid B reaches refcount 0 and is freed here ***
                              get_pid(<pid B>)   *** UAF ***
                              synchronize_rcu()
                              put_pid(<pid A>)
      
      As far as I know, this race can only occur with CONFIG_PREEMPT_RCU=y
      because it requires RCU to detect a quiescent state in code that is not
      explicitly calling into the scheduler.
      
      This race leads to use-after-free of a "struct pid".
      It is probably somewhat hard to hit because process A has to pass
      through a synchronize_rcu() operation while process B is between
      mutex_unlock() and get_pid().
      
      Fix it by ensuring that by the time a pointer to the current task's pid
      is stored in the file, an extra reference to the pid has been taken.
      
      This fix also removes the condition for synchronize_rcu(); I think
      that optimization is unnecessary complexity, since in that case we
      would usually have bailed out on the lockless check above.
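
The ordering problem can be replayed deterministically in a toy model. This is only a sketch: `Pid`, `File`, and `swap` are hypothetical stand-ins for struct pid, the DRM file, and rcu_replace_pointer(), and locking and RCU grace periods are left out.

```python
class Pid:
    """Toy stand-in for a refcounted struct pid (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.refcount = 1      # reference held by the owning task
        self.freed = False

    def get(self):
        if self.freed:
            raise RuntimeError(f"use-after-free on pid {self.name}")
        self.refcount += 1

    def put(self):
        self.refcount -= 1
        if self.refcount == 0:
            self.freed = True


class File:
    """Toy stand-in for the file; it owns one reference to its pid."""

    def __init__(self, pid):
        pid.get()
        self.pid = pid


def swap(filp, new):
    """Model of rcu_replace_pointer(): store `new`, return the old pointer."""
    old, filp.pid = filp.pid, new
    return old


def buggy_interleaving():
    """Replay the race from the table: the pointer is stored before get_pid()."""
    pid_x, pid_a, pid_b = Pid("X"), Pid("A"), Pid("B")
    filp = File(pid_x)
    swap(filp, pid_b)            # B: store <pid B>, no reference taken yet
    old_a = swap(filp, pid_a)    # A: store <pid A>, old pointer is <pid B>
    pid_a.get()                  # A: get_pid(<pid A>)
    old_a.put()                  # A: put_pid(<pid B>), refcount drops to 0
    try:
        pid_b.get()              # B: get_pid(<pid B>) on a freed object
    except RuntimeError as exc:
        return str(exc)
    return "no error"


def fixed_interleaving():
    """Same two updates, but each reference is taken before the store."""
    pid_x, pid_a, pid_b = Pid("X"), Pid("A"), Pid("B")
    filp = File(pid_x)
    pid_b.get()                  # B: reference first
    old_b = swap(filp, pid_b)    # B: then publish
    pid_a.get()                  # A: reference first
    old_a = swap(filp, pid_a)    # A: then publish, old pointer is <pid B>
    old_a.put()                  # drops the reference B took for the file
    old_b.put()                  # drops the file's old reference to <pid X>
    return pid_a.refcount, pid_b.refcount, pid_b.freed
```

In this model `buggy_interleaving()` reports the use-after-free on pid B, while `fixed_interleaving()` leaves pid B alive with only its task reference.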
      
      Fixes: 1c7a387f ("drm: Update file owner during use")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Jann Horn <jannh@google.com>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      4f2a129b
    • Merge tag 'drm-intel-fixes-2024-06-27' of... · 3e6d5e11
      Dave Airlie authored
      Merge tag 'drm-intel-fixes-2024-06-27' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes
      
      drm/i915 fixes for v6.10-rc6:
      - Fix potential UAF due to race on fence register revocation
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      From: Jani Nikula <jani.nikula@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/87ikxudcpd.fsf@intel.com
      3e6d5e11
    • Merge tag 'pm-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · ef8abe96
      Linus Torvalds authored
      Pull power management fix from Rafael Wysocki:
       "Modify the intel_pstate driver to use HWP to initialize the ITMT
        scheduler extension if ACPI CPPC cannot be used for that, which is the
        case on some hybrid x86 systems (Rafael Wysocki)"
      
      * tag 'pm-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        cpufreq: intel_pstate: Use HWP to initialize ITMT if CPPC is missing
      ef8abe96
    • Merge tag 'thermal-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 92572d2c
      Linus Torvalds authored
      Pull thermal control fix from Rafael Wysocki:
       "Replace an earlier fix for a recent regression in the Step-Wise
        thermal governor that was not effective in all of the relevant cases
        (Rafael Wysocki)"
      
      * tag 'thermal-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        thermal: gov_step_wise: Go straight to instance->lower when mitigation is over
      92572d2c
    • Merge tag 'io_uring-6.10-20240627' of git://git.kernel.dk/linux · 0f47788b
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
       "Removal of a struct member that's unused since the 6.10 merge window,
        and a fix for a regression in SQPOLL wakeups, bringing it back to how
        it worked before the SQPOLL local task_work"
      
      * tag 'io_uring-6.10-20240627' of git://git.kernel.dk/linux:
        io_uring: signal SQPOLL task_work with TWA_SIGNAL_NO_IPI
        io_uring: remove dead struct io_submit_state member
      0f47788b
    • Merge tag 'nvme-6.10-2024-06-27' of git://git.infradead.org/nvme into block-6.10 · cab598bc
      Jens Axboe authored
      Pull NVMe fixes from Keith:
      
      "nvme fixes for Linux 6.10
      
       - Fabrics fixes (Hannes)
       - Missing module description (Jeff)
       - Clang warning fix (Nathan)"
      
      * tag 'nvme-6.10-2024-06-27' of git://git.infradead.org/nvme:
        nvmet-fc: Remove __counted_by from nvmet_fc_tgt_queue.fod[]
        nvmet: make 'tsas' attribute idempotent for RDMA
        nvme: fixup comment for nvme RDMA Provider Type
        nvme-apple: add missing MODULE_DESCRIPTION()
        nvmet: do not return 'reserved' for empty TSAS values
        nvme: fix NVME_NS_DEAC may incorrectly identifying the disk as EXT_LBA.
      cab598bc
    • Merge tag 's390-6.10-7' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 6d6444ba
      Linus Torvalds authored
      Pull s390 updates from Alexander Gordeev:
      
       - Add missing virt_to_phys() conversion for directed interrupt bit
         vectors
      
       - Fix broken configuration change notifications for virtio-ccw
      
       - Fix sclp_init() cleanup path on failure and, as a result, fix a list
         double add warning
      
       - Fix unconditional adjusting of GOT entries containing undefined weak
         symbols that resolve to zero
      
      * tag 's390-6.10-7' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390/boot: Do not adjust GOT entries for undef weak sym
        s390/sclp: Fix sclp_init() cleanup on failure
        s390/virtio_ccw: Fix config change notifications
        s390/pci: Add missing virt_to_phys() for directed DIBV
      6d6444ba
    • Merge tag 'asm-generic-fixes-6.10' of... · adfbe364
      Linus Torvalds authored
      Merge tag 'asm-generic-fixes-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
      
      Pull asm-generic fixes from Arnd Bergmann:
       "These are some bugfixes for system call ABI issues I found while
        working on a cleanup series. None of these are urgent since these bugs
        have gone unnoticed for many years, but I think we probably want to
        backport them all to stable kernels, so it makes sense to have the
        fixes included as early as possible.
      
        One more fix addresses a compile-time warning in kallsyms that was
        uncovered by a patch I did to enable additional warnings in 6.10. I
        had mistakenly thought that this fix was already merged through the
        module tree, but as Geert pointed out it was still missing"
      
      * tag 'asm-generic-fixes-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
        kallsyms: rework symbol lookup return codes
        linux/syscalls.h: add missing __user annotations
        syscalls: mmap(): use unsigned offset type consistently
        s390: remove native mmap2() syscall
        hexagon: fix fadvise64_64 calling conventions
        csky, hexagon: fix broken sys_sync_file_range
        sh: rework sync_file_range ABI
        powerpc: restore some missing spu syscalls
        parisc: use generic sys_fanotify_mark implementation
        parisc: use correct compat recv/recvfrom syscalls
        sparc: fix compat recv/recvfrom syscalls
        sparc: fix old compat_sys_select()
        syscalls: fix compat_sys_io_pgetevents_time64 usage
        ftruncate: pass a signed offset
      adfbe364
    • Merge tag 'for-6.10-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 66e55ff1
      Linus Torvalds authored
      Pull btrfs fixes from David Sterba:
      
       - fix quota root leak after quota disable failure
      
       - fix condition when checking if a zone can be added as free
      
       - allocate inode in NOFS context during logging or tree-log replay
      
       - handle raid-stripe-tree lookup correctly during scrub
      
      * tag 'for-6.10-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        btrfs: qgroup: fix quota root leak after quota disable failure
        btrfs: scrub: handle RST lookup error correctly
        btrfs: zoned: fix initial free space detection
        btrfs: use NOFS context when getting inodes during logging and log replay
      66e55ff1
    • Merge tag 'net-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · fd19d4a4
      Linus Torvalds authored
      Pull networking fixes from Paolo Abeni:
       "Including fixes from can, bpf and netfilter.
      
        There are a bunch of regressions addressed here, but hopefully nothing
        spectacular. We are still waiting for the driver fix from Intel, mentioned
        by Jakub in the previous networking pull.
      
        Current release - regressions:
      
         - core: add softirq safety to netdev_rename_lock
      
         - tcp: fix tcp_rcv_fastopen_synack() to enter TCP_CA_Loss for failed
           TFO
      
         - batman-adv: fix RCU race at module unload time
      
        Previous releases - regressions:
      
         - openvswitch: get related ct labels from its master if it is not
           confirmed
      
         - eth: bonding: fix incorrect software timestamping report
      
         - eth: mlxsw: fix memory corruptions on spectrum-4 systems
      
         - eth: ionic: use dev_consume_skb_any outside of napi
      
        Previous releases - always broken:
      
         - netfilter: fully validate NFT_DATA_VALUE on store to data registers
      
         - unix: several fixes for OoB data
      
         - tcp: fix race for duplicate reqsk on identical SYN
      
         - bpf:
             - fix may_goto with negative offset
             - fix the corner case with may_goto and jump to the 1st insn
             - fix overrunning reservations in ringbuf
      
         - can:
             - j1939: recover socket queue on CAN bus error during BAM
               transmission
             - mcp251xfd: fix infinite loop when xmit fails
      
         - dsa: microchip: monitor potential faults in half-duplex mode
      
         - eth: vxlan: pull inner IP header in vxlan_xmit_one()
      
         - eth: ionic: fix kernel panic due to multi-buffer handling
      
        Misc:
      
         - selftest: unix tests refactor and a lot of new cases added"
      
      * tag 'net-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (61 commits)
        net: mana: Fix possible double free in error handling path
        selftest: af_unix: Check SIOCATMARK after every send()/recv() in msg_oob.c.
        af_unix: Fix wrong ioctl(SIOCATMARK) when consumed OOB skb is at the head.
        selftest: af_unix: Check EPOLLPRI after every send()/recv() in msg_oob.c
        selftest: af_unix: Check SIGURG after every send() in msg_oob.c
        selftest: af_unix: Add SO_OOBINLINE test cases in msg_oob.c
        af_unix: Don't stop recv() at consumed ex-OOB skb.
        selftest: af_unix: Add non-TCP-compliant test cases in msg_oob.c.
        af_unix: Don't stop recv(MSG_DONTWAIT) if consumed OOB skb is at the head.
        af_unix: Stop recv(MSG_PEEK) at consumed OOB skb.
        selftest: af_unix: Add msg_oob.c.
        selftest: af_unix: Remove test_unix_oob.c.
        tracing/net_sched: NULL pointer dereference in perf_trace_qdisc_reset()
        netfilter: nf_tables: fully validate NFT_DATA_VALUE on store to data registers
        net: usb: qmi_wwan: add Telit FN912 compositions
        tcp: fix tcp_rcv_fastopen_synack() to enter TCP_CA_Loss for failed TFO
        ionic: use dev_consume_skb_any outside of napi
        net: dsa: microchip: fix wrong register write when masking interrupt
        Fix race for duplicate reqsk on identical SYN
        ibmvnic: Add tx check to prevent skb leak
        ...
      fd19d4a4
    • Merge tag 'sound-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 3c1d29e5
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "This became bigger than usual, as it receives a pile of pending ASoC
        fixes. Most of the changes are for device-specific issues while there are
        a few core fixes that are all rather trivial:
      
         - DMA-engine sync fixes
      
         - Continued MIDI2 conversion fixes
      
         - Various ASoC Intel SOF fixes
      
         - A series of ASoC topology fixes for memory handling
      
         - AMD ACP fix, curing a recent regression, too
      
         - Platform / codec-specific fixes for mediatek, atmel, realtek, etc"
      
      * tag 'sound-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (40 commits)
        ASoC: rt5645: fix issue of random interrupt from push-button
        ALSA: seq: Fix missing MSB in MIDI2 SPP conversion
        ASoC: amd: yc: Fix non-functional mic on ASUS M5602RA
        ALSA: hda/realtek: fix mute/micmute LEDs don't work for EliteBook 645/665 G11.
        ALSA: hda/realtek: Fix conflicting quirk for PCI SSID 17aa:3820
        ALSA: dmaengine_pcm: terminate dmaengine before synchronize
        ALSA: hda/relatek: Enable Mute LED on HP Laptop 15-gw0xxx
        ALSA: PCM: Allow resume only for suspended streams
        ALSA: seq: Fix missing channel at encoding RPN/NRPN MIDI2 messages
        ASoC: mediatek: mt8195: Add platform entry for ETDM1_OUT_BE dai link
        ASoC: fsl-asoc-card: set priv->pdev before using it
        ASoC: amd: acp: move chip->flag variable assignment
        ASoC: amd: acp: remove i2s configuration check in acp_i2s_probe()
        ASoC: amd: acp: add a null check for chip_pdev structure
        ASoC: Intel: soc-acpi: mtl: fix speaker no sound on Dell SKU 0C64
        ASoC: q6apm-lpass-dai: close graph on prepare errors
        ASoC: cs35l56: Disconnect ASP1 TX sources when ASP1 DAI is hooked up
        ASoC: topology: Fix route memory corruption
        ASoC: rt722-sdca-sdw: add debounce time for type detection
        ASoC: SOF: sof-audio: Skip unprepare for in-use widgets on error rollback
        ...
      3c1d29e5
    • kallsyms: rework symbol lookup return codes · 7e1f4eb9
      Arnd Bergmann authored
      Building with W=1 in some configurations produces a false positive
      warning for kallsyms:
      
      kernel/kallsyms.c: In function '__sprint_symbol.isra':
      kernel/kallsyms.c:503:17: error: 'strcpy' source argument is the same as destination [-Werror=restrict]
        503 |                 strcpy(buffer, name);
            |                 ^~~~~~~~~~~~~~~~~~~~
      
      This originally showed up while building with -O3, but later started
      happening in other configurations as well, depending on inlining
      decisions. The underlying issue is that the local 'name' variable is
      always initialized to be the same as 'buffer' in the called functions
      that fill the buffer, which gcc notices while inlining, though it could
      see that the address check always skips the copy.
      
      The calling conventions here are rather unusual, as all of the internal
      lookup functions (bpf_address_lookup, ftrace_mod_address_lookup,
      ftrace_func_address_lookup, module_address_lookup and
      kallsyms_lookup_buildid) already use the provided buffer and either return
      the address of that buffer to indicate success, or NULL for failure,
      but the callers are written to also expect an arbitrary other buffer
      to be returned.
      
      Rework the calling conventions to return the length of the filled buffer
      instead of its address, which is simpler and easier to follow as well
      as avoiding the warning. Leave only the kallsyms_lookup() calling conventions
      unchanged, since that is called from 16 different functions and
      adapting this would be a much bigger change.
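
The reworked convention can be sketched in a few lines. This is only an illustration: `SYMBOLS` and `lookup_into` are made-up names, not kernel interfaces. The helper fills the caller's buffer and returns the number of bytes written, with 0 meaning that nothing was found.

```python
# Hypothetical symbol table, for illustration only.
SYMBOLS = {0x1000: b"start_kernel", 0x2000: b"schedule"}


def lookup_into(addr, buffer):
    """Fill `buffer` with the name for `addr`; return its length, or 0 on failure."""
    name = SYMBOLS.get(addr)
    if name is None:
        return 0
    n = min(len(name), len(buffer))
    buffer[:n] = name[:n]        # always writes into the caller's buffer
    return n


buf = bytearray(32)
length = lookup_into(0x2000, buf)
found = bytes(buf[:length])
missing = lookup_into(0xDEAD, buf)
```

Because the helper never hands back a pointer, the caller has no second buffer to compare against or copy from, which is the situation that triggered the warning.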
      
      Link: https://lore.kernel.org/lkml/20200107214042.855757-1-arnd@arndb.de/
      Link: https://lore.kernel.org/lkml/20240326130647.7bfb1d92@gandalf.local.home/
      Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
      Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      7e1f4eb9
    • gpiolib: cdev: Ignore reconfiguration without direction · b4403963
      Kent Gibson authored
      linereq_set_config() behaves badly when direction is not set.
      The configuration validation is borrowed from linereq_create(), where,
      to verify the intent of the user, the direction must be set in order to
      effect a change to the electrical configuration of a line. But, when
      applied to reconfiguration, that validation does not allow for the unset
      direction case, making it possible to clear flags set previously without
      specifying the line direction.
      
      Adding to the inconsistency, those changes are not immediately applied by
      linereq_set_config(), but will take effect when the line value is next read
      or set.
      
      For example, by requesting a configuration with no flags set, an output
      line with GPIO_V2_LINE_FLAG_ACTIVE_LOW and GPIO_V2_LINE_FLAG_OPEN_DRAIN
      set could have those flags cleared, inverting the sense of the line and
      changing the line drive to push-pull on the next line value set.
      
      Skip the reconfiguration of lines for which the direction is not set, and
      only reconfigure the lines for which direction is set.
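
A minimal sketch of the "skip lines without direction" rule, assuming a per-line flags dictionary. The flag constants and `set_config` are illustrative placeholders and are not taken from the uAPI header.

```python
# Placeholder flag bits for the sketch; the real values live in the GPIO uAPI header.
ACTIVE_LOW = 1 << 1
INPUT = 1 << 2
OUTPUT = 1 << 3
OPEN_DRAIN = 1 << 6
DIRECTION_MASK = INPUT | OUTPUT


def set_config(current_flags, requested_flags):
    """Return the new per-line flags, leaving lines without a requested direction untouched."""
    new_flags = {}
    for offset, flags in current_flags.items():
        requested = requested_flags.get(offset, 0)
        if requested & DIRECTION_MASK:
            new_flags[offset] = requested   # direction given: reconfigure the line
        else:
            new_flags[offset] = flags       # no direction: skip, keep existing flags
    return new_flags


lines = {0: OUTPUT | ACTIVE_LOW | OPEN_DRAIN, 1: INPUT}
result = set_config(lines, {0: 0, 1: OUTPUT})
```

Line 0 keeps its active-low, open-drain output configuration because the request carried no direction, while line 1 is switched to output.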
      
      Fixes: a54756cb ("gpiolib: cdev: support GPIO_V2_LINE_SET_CONFIG_IOCTL")
      Signed-off-by: Kent Gibson <warthog618@gmail.com>
      Link: https://lore.kernel.org/r/20240626052925.174272-3-warthog618@gmail.com
      Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
      b4403963
    • gpiolib: cdev: Disallow reconfiguration without direction (uAPI v1) · 9919cce6
      Kent Gibson authored
      linehandle_set_config() behaves badly when direction is not set.
      The configuration validation is borrowed from linehandle_create(), where,
      to verify the intent of the user, the direction must be set in order
      to effect a change to the electrical configuration of a line. But, when
      applied to reconfiguration, that validation does not allow for the unset
      direction case, making it possible to clear flags set previously without
      specifying the line direction.
      
      Adding to the inconsistency, those changes are not immediately applied by
      linehandle_set_config(), but will take effect when the line value is next
      read or set.
      
      For example, by requesting a configuration with no flags set, an output
      line with GPIOHANDLE_REQUEST_ACTIVE_LOW and GPIOHANDLE_REQUEST_OPEN_DRAIN
      requested could have those flags cleared, inverting the sense of the line
      and changing the line drive to push-pull on the next line value set.
      
      Ensure the intent of the user by disallowing configurations which do not
      have direction set, returning an error to userspace to indicate that the
      configuration is invalid.
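
For the v1 uAPI the same situation is rejected rather than skipped. A minimal sketch follows; the flag constants, `handle_set_config`, and the choice of EINVAL as the error code are assumptions made for illustration.

```python
import errno

# Placeholder flag bits for the sketch; the real values live in the GPIO uAPI header.
REQUEST_INPUT = 1 << 0
REQUEST_OUTPUT = 1 << 1
REQUEST_ACTIVE_LOW = 1 << 2
REQUEST_OPEN_DRAIN = 1 << 3


def handle_set_config(state, lflags):
    """Apply `lflags` to `state`, or return a negative errno if no direction is set."""
    if not lflags & (REQUEST_INPUT | REQUEST_OUTPUT):
        return -errno.EINVAL     # assumed error code for "invalid configuration"
    state["flags"] = lflags
    return 0


state = {"flags": REQUEST_OUTPUT | REQUEST_ACTIVE_LOW | REQUEST_OPEN_DRAIN}
rejected = handle_set_config(state, 0)
flags_after_reject = state["flags"]
accepted = handle_set_config(state, REQUEST_INPUT)
```

The empty request fails and leaves the previously requested flags in place; a request with a direction is applied as before.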
      
      And, for clarity, use lflags, a local copy of gcnf.flags, throughout when
      dealing with the requested flags, rather than a mixture of both.
      
      Fixes: e588bb1e ("gpio: add new SET_CONFIG ioctl() to gpio chardev")
      Signed-off-by: Kent Gibson <warthog618@gmail.com>
      Link: https://lore.kernel.org/r/20240626052925.174272-2-warthog618@gmail.com
      Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
      9919cce6
    • Merge tag 'nf-24-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf · b62cb6a7
      Paolo Abeni authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      The following patchset contains two Netfilter fixes for net:
      
      Patch #1 fixes CONFIG_SYSCTL=n for a patch coming in the previous PR
      	 to move the sysctl toggle to enable SRv6 netfilter hooks from
      	 nf_conntrack to the core, from Jianguo Wu.
      
      Patch #2 fixes a possible pointer leak to userspace due to insufficient
      	 validation of NFT_DATA_VALUE.
      
      Linus found this pointer leak to userspace via zdi-disclosures@ and
      forwarded the notice to the Netfilter maintainers. He appears as the
      reporter because whoever found this issue never approached the Netfilter
      maintainers, either via security@ or in private.
      
      netfilter pull request 24-06-27
      
      * tag 'nf-24-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
        netfilter: nf_tables: fully validate NFT_DATA_VALUE on store to data registers
        netfilter: fix undefined reference to 'netfilter_lwtunnel_*' when CONFIG_SYSCTL=n
      ====================
      
      Link: https://patch.msgid.link/20240626233845.151197-1-pablo@netfilter.org
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      b62cb6a7
    • net: mana: Fix possible double free in error handling path · 1864b822
      Ma Ke authored
      When auxiliary_device_add() returns an error, the error path calls
      auxiliary_device_uninit(), whose release callback adev_release()
      calls kfree(madev). We shouldn't call kfree(madev) again
      in the error handling path. Set 'madev' to NULL.
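
The ownership hand-off can be modelled by counting frees. This is a toy sketch: `Allocation`, `uninit`, and `add_device` are hypothetical names, and the `is not None` check stands in for kfree(NULL) being a no-op.

```python
class Allocation:
    """Toy heap object that counts how often it is freed."""

    def __init__(self):
        self.free_count = 0

    def free(self):
        self.free_count += 1


def uninit(madev):
    """Model of the uninit call: the release callback frees the object."""
    madev.free()


def add_device(fail_add, clear_pointer):
    """Return how many times the allocation was freed on this path."""
    madev = Allocation()
    tracked = madev              # kept only so the sketch can inspect the count
    if fail_add:
        uninit(madev)            # release callback already freed madev
        if clear_pointer:
            madev = None         # the fix: forget the pointer after uninit
        if madev is not None:    # shared error path; freeing NULL is a no-op
            madev.free()
    return tracked.free_count
```

Without clearing the pointer the failing path frees the object twice; with it, exactly once.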
      
      Fixes: a69839d4 ("net: mana: Add support for auxiliary device")
      Signed-off-by: Ma Ke <make24@iscas.ac.cn>
      Link: https://patch.msgid.link/20240625130314.2661257-1-make24@iscas.ac.cn
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      1864b822
    • iommu/amd: Fix GT feature enablement again · 150bdf5f
      Vasant Hegde authored
      The current code configures GCR3 even when a device is attached to an
      identity domain, so that SVA can be supported with an identity domain.
      This means the attach-device path updates the Guest Translation related
      bits in the DTE.

      Commit de111f6b ("iommu/amd: Enable Guest Translation after reading
      IOMMU feature register") failed to enable the Control[GT] bit in the
      resume path. This causes certain laptops to fail to resume after suspend.

      This is because, in the resume path, the control register (GT disabled)
      and the DTE (guest translation related bits enabled) are inconsistent,
      and the IOMMU hardware raises ILLEGAL_DEV_TABLE_ENTRY.
      
      Fix it by enabling GT bit in resume path.
      Reported-by: Błażej Szczygieł <spaz16@wp.pl>
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=218975
      Fixes: de111f6b ("iommu/amd: Enable Guest Translation after reading IOMMU feature register")
      Tested-by: Błażej Szczygieł <spaz16@wp.pl>
      Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
      Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
      Link: https://lore.kernel.org/r/20240621101533.20216-1-vasant.hegde@amd.com
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      150bdf5f
    • iommu/vt-d: Fix missed device TLB cache tag · 041be271
      Lu Baolu authored
      When a domain is attached to a device, the required cache tags are
      assigned to the domain so that the related caches can be flushed
      whenever it is needed. The device TLB cache tag is created based
      on whether the ats_enabled field of the device's iommu data is set.
      This creates an ordered dependency between cache tag assignment and
      ATS enabling.
      
      The device TLB cache tag would not be created if device's ATS is
      enabled after the cache tag assignment. This causes devices with PCI
      ATS support to malfunction.
      
      The ATS control is exclusively owned by the iommu driver. Hence, move
      cache_tag_assign_domain() after PCI ATS enabling to make sure that the
      device TLB cache tag is created for the domain.
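
The ordering dependency can be shown with a small model. It is a sketch only: the tag names and the `assign_cache_tags` and `attach` helpers are hypothetical, and the device is reduced to a dictionary.

```python
def assign_cache_tags(dev):
    """Create the tag list; a device TLB tag exists only if ATS is already enabled."""
    tags = ["iotlb"]
    if dev["ats_enabled"]:
        tags.append("devtlb")
    return tags


def attach(dev, tags_first):
    """Attach a device, assigning tags either before or after ATS is enabled."""
    dev = dict(dev, ats_enabled=False)
    if tags_first:
        tags = assign_cache_tags(dev)                # old order: ATS not yet on
        dev["ats_enabled"] = dev["ats_supported"]
    else:
        dev["ats_enabled"] = dev["ats_supported"]    # new order: enable ATS first
        tags = assign_cache_tags(dev)
    return tags


old_order = attach({"ats_supported": True}, tags_first=True)
new_order = attach({"ats_supported": True}, tags_first=False)
```

Only the second ordering yields a device TLB tag for an ATS-capable device, matching the move described above.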
      
      Fixes: 3b1d9e2b ("iommu/vt-d: Add cache tag assignment interface")
      Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
      Reviewed-by: Kevin Tian <kevin.tian@intel.com>
      Link: https://lore.kernel.org/r/20240620062940.201786-1-baolu.lu@linux.intel.com
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      041be271
    • iommu/amd: Invalidate cache before removing device from domain list · c362f32a
      Vasant Hegde authored
      Commit 87a6f1f2 ("iommu/amd: Introduce per-device domain ID to fix
      potential TLB aliasing issue") introduced per device domain ID when
      domain is configured with v2 page table. And in invalidation path, it
      uses per device structure (dev_data->gcr3_info.domid) to get the domain ID.
      
      In detach_device() path, current code tries to invalidate IOMMU cache
      after removing dev_data from domain device list. This means when domain
      is configured with v2 page table, amd_iommu_domain_flush_all() will not be
      able to invalidate cache as device is already removed from domain device
      list.
      
      This is causing change domain tests (changing domain type from identity to DMA)
      to fail with IO_PAGE_FAULT issue.
      
      Hence invalidate cache and update DTE before updating data structures.
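
Why the order matters can be seen in a toy model where the flush walks the domain's device list. The `flush_all` and `detach` helpers and the dictionary layout are hypothetical simplifications.

```python
def flush_all(domain, flushed):
    """Flush using the per-device domain ID of every device still on the list."""
    for dev in domain["devices"]:
        flushed.append(dev["domid"])


def detach(domain, dev, flush_first):
    """Detach `dev`; return the domain IDs that were actually flushed."""
    flushed = []
    if flush_first:
        flush_all(domain, flushed)        # new order: device is still listed
        domain["devices"].remove(dev)
    else:
        domain["devices"].remove(dev)     # old order: device already gone
        flush_all(domain, flushed)
    return flushed


dev_a = {"domid": 7}
old_order = detach({"devices": [dev_a]}, dev_a, flush_first=False)
new_order = detach({"devices": [dev_a]}, dev_a, flush_first=True)
```

With the old order nothing is flushed for the departing device, which is consistent with the stale-translation faults reported in the domain-change tests.
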
      Reported-by: FahHean Lee <fahhean.lee@amd.com>
      Reported-by: Dheeraj Kumar Srivastava <dheerajkumar.srivastava@amd.com>
      Fixes: 87a6f1f2 ("iommu/amd: Introduce per-device domain ID to fix potential TLB aliasing issue")
      Tested-by: Dheeraj Kumar Srivastava <dheerajkumar.srivastava@amd.com>
      Tested-by: Sairaj Arun Kodilkar <sairaj.arunkodilkar@amd.com>
      Tested-by: FahHean Lee <fahhean.lee@amd.com>
      Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
      Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
      Link: https://lore.kernel.org/r/20240620060552.13984-1-vasant.hegde@amd.com
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      c362f32a
    • Merge branch 'af_unix-fix-bunch-of-msg_oob-bugs-and-add-new-tests' · 3f4d9e4f
      Paolo Abeni authored
      Kuniyuki Iwashima says:
      
      ====================
      af_unix: Fix bunch of MSG_OOB bugs and add new tests.
      
      This series rewrites the selftest for AF_UNIX MSG_OOB and fixes a
      bunch of bugs where AF_UNIX behaves differently from TCP.

      Note that the test discovered a few more bugs on the TCP side, which
      will be fixed in another series.
      ====================
      
      Link: https://lore.kernel.org/r/20240625013645.45034-1-kuniyu@amazon.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      3f4d9e4f
    • selftest: af_unix: Check SIOCATMARK after every send()/recv() in msg_oob.c. · 91b7186c
      Kuniyuki Iwashima authored
      To catch regressions, check ioctl(SIOCATMARK) after every
      send() and recv() call.
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      91b7186c
    • af_unix: Fix wrong ioctl(SIOCATMARK) when consumed OOB skb is at the head. · e400cfa3
      Kuniyuki Iwashima authored
      Even if OOB data is recv()ed, ioctl(SIOCATMARK) must return 1 when the
      OOB skb is at the head of the receive queue and no new OOB data is queued.
      
      Without fix:
      
        #  RUN           msg_oob.no_peek.oob ...
        # msg_oob.c:305:oob:Expected answ[0] (0) == oob_head (1)
        # oob: Test terminated by assertion
        #          FAIL  msg_oob.no_peek.oob
        not ok 2 msg_oob.no_peek.oob
      
      With fix:
      
        #  RUN           msg_oob.no_peek.oob ...
        #            OK  msg_oob.no_peek.oob
        ok 2 msg_oob.no_peek.oob
      
      Fixes: 314001f0 ("af_unix: Add OOB support")
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      e400cfa3
    • selftest: af_unix: Check EPOLLPRI after every send()/recv() in msg_oob.c · 48a99837
      Kuniyuki Iwashima authored
      When OOB data is in recvq, we can detect it with epoll by checking
      EPOLLPRI.
      
      This patch adds checks for EPOLLPRI after every send() and recv() in
      all test cases.
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      48a99837
    • selftest: af_unix: Check SIGURG after every send() in msg_oob.c · d02689e6
      Kuniyuki Iwashima authored
      When data is sent with MSG_OOB, SIGURG is sent to a process if the
      receiver socket has set its owner to the process by ioctl(FIOSETOWN)
      or fcntl(F_SETOWN).
      
      This patch adds SIGURG check after every send(MSG_OOB) call.
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      d02689e6
    • selftest: af_unix: Add SO_OOBINLINE test cases in msg_oob.c · 436352e8
      Kuniyuki Iwashima authored
      When SO_OOBINLINE is enabled on a socket, MSG_OOB data can be recv()ed
      without the MSG_OOB flag, and ioctl(SIOCATMARK) behaves differently.
      
      This patch adds some test cases for SO_OOBINLINE.
      
      Note the new test cases found two bugs in TCP.
      
        1) After reading OOB data with non-inline mode, we can re-read
           the data by setting SO_OOBINLINE.
      
        #  RUN           msg_oob.no_peek.inline_oob_ahead_break ...
        # msg_oob.c:146:inline_oob_ahead_break:AF_UNIX :world
        # msg_oob.c:147:inline_oob_ahead_break:TCP     :oworld
        #            OK  msg_oob.no_peek.inline_oob_ahead_break
        ok 14 msg_oob.no_peek.inline_oob_ahead_break
      
        2) The head OOB data is dropped if SO_OOBINLINE is disabled
           if a new OOB data is queued.
      
        #  RUN           msg_oob.no_peek.inline_ex_oob_drop ...
        # msg_oob.c:171:inline_ex_oob_drop:AF_UNIX :x
        # msg_oob.c:172:inline_ex_oob_drop:TCP     :y
        # msg_oob.c:146:inline_ex_oob_drop:AF_UNIX :y
        # msg_oob.c:147:inline_ex_oob_drop:TCP     :Resource temporarily unavailable
        #            OK  msg_oob.no_peek.inline_ex_oob_drop
        ok 17 msg_oob.no_peek.inline_ex_oob_drop
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • Kuniyuki Iwashima's avatar
      af_unix: Don't stop recv() at consumed ex-OOB skb. · 36893ef0
      Kuniyuki Iwashima authored
      Currently, recv() stops at a consumed OOB skb even when a new OOB skb
      is queued and the old OOB skb can be ignored.
      
        >>> from socket import *
        >>> c1, c2 = socketpair(AF_UNIX, SOCK_STREAM)
        >>> c1.send(b'hellowor', MSG_OOB)
        8
        >>> c2.recv(1, MSG_OOB)  # consumed OOB data stays in the middle of recvq.
        b'r'
        >>> c1.send(b'ld', MSG_OOB)
        2
        >>> c2.recv(10)          # recv() stops at the old consumed OOB
        b'hellowo'               # should be 'hellowol'
      
      manage_oob() should not stop recv() at the old consumed OOB skb if
      new OOB data is queued.
      
      Note that TCP behaviour is apparently wrong in this test case because
      we can recv() the same OOB data twice.
      
      Without fix:
      
        #  RUN           msg_oob.no_peek.ex_oob_ahead_break ...
        # msg_oob.c:138:ex_oob_ahead_break:AF_UNIX :hellowo
        # msg_oob.c:139:ex_oob_ahead_break:Expected:hellowol
        # msg_oob.c:141:ex_oob_ahead_break:Expected ret[0] (7) == expected_len (8)
        # ex_oob_ahead_break: Test terminated by assertion
        #          FAIL  msg_oob.no_peek.ex_oob_ahead_break
        not ok 11 msg_oob.no_peek.ex_oob_ahead_break
      
      With fix:
      
        #  RUN           msg_oob.no_peek.ex_oob_ahead_break ...
        # msg_oob.c:146:ex_oob_ahead_break:AF_UNIX :hellowol
        # msg_oob.c:147:ex_oob_ahead_break:TCP     :helloworl
        #            OK  msg_oob.no_peek.ex_oob_ahead_break
        ok 11 msg_oob.no_peek.ex_oob_ahead_break
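
The rule above can be illustrated with a toy user-space model in Python. This is not the kernel's manage_oob(); the Skb class and toy_recv() are hypothetical names, and the model only covers consumed OOB markers (skipping a live OOB byte at the head of the queue is deliberately not modelled). It assumes the intended behaviour is: stop at a consumed OOB marker only when data has already been copied and no newer OOB data is queued behind it.

```python
from dataclasses import dataclass


@dataclass
class Skb:
    data: bytes
    oob: bool = False       # this buffer carries (or carried) the OOB byte
    consumed: bool = False  # the OOB byte was already read via recv(MSG_OOB)


def toy_recv(queue, size):
    """Toy stream read over a list of Skb objects (mutates the list)."""
    out = b""
    while queue and len(out) < size:
        skb = queue[0]
        if skb.oob and skb.consumed:
            newer_oob = any(s.oob and not s.consumed for s in queue[1:])
            if out and not newer_oob:
                break         # the old mark still matters: end this read here
            queue.pop(0)      # stale marker: drop it and fetch the next skb
            continue
        if skb.oob:
            break             # stop before a live OOB byte
        take = skb.data[: size - len(out)]
        out += take
        skb.data = skb.data[len(take):]
        if not skb.data:
            queue.pop(0)
    return out


# State after send(b'hellowor', MSG_OOB), recv(1, MSG_OOB), send(b'ld', MSG_OOB).
queue = [
    Skb(b"hellowo"),
    Skb(b"", oob=True, consumed=True),  # consumed b'r'
    Skb(b"l"),
    Skb(b"d", oob=True),                # new OOB byte
]
print(toy_recv(queue, 10))  # b'hellowol'
```

With no newer OOB data queued, the same model stops at the consumed marker once, then drops it on the next call and returns the data behind it.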
      
      Fixes: 314001f0 ("af_unix: Add OOB support")
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • Kuniyuki Iwashima's avatar
      selftest: af_unix: Add non-TCP-compliant test cases in msg_oob.c. · f5ea0768
      Kuniyuki Iwashima authored
      While testing, I found some weird behaviour on the TCP side as well.
      
      For example, TCP drops the preceding OOB data when new OOB data is
      queued if the old OOB data is at the head of recvq.
      
        #  RUN           msg_oob.no_peek.ex_oob_drop ...
        # msg_oob.c:146:ex_oob_drop:AF_UNIX :x
        # msg_oob.c:147:ex_oob_drop:TCP     :Resource temporarily unavailable
        # msg_oob.c:146:ex_oob_drop:AF_UNIX :y
        # msg_oob.c:147:ex_oob_drop:TCP     :Invalid argument
        #            OK  msg_oob.no_peek.ex_oob_drop
        ok 9 msg_oob.no_peek.ex_oob_drop
      
        #  RUN           msg_oob.no_peek.ex_oob_drop_2 ...
        # msg_oob.c:146:ex_oob_drop_2:AF_UNIX :x
        # msg_oob.c:147:ex_oob_drop_2:TCP     :Resource temporarily unavailable
        #            OK  msg_oob.no_peek.ex_oob_drop_2
        ok 10 msg_oob.no_peek.ex_oob_drop_2
      
      This patch allows AF_UNIX's MSG_OOB implementation to produce different
      results from TCP when operations are guarded with tcp_incompliant{}.
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • Kuniyuki Iwashima's avatar
      af_unix: Don't stop recv(MSG_DONTWAIT) if consumed OOB skb is at the head. · 93c99f21
      Kuniyuki Iwashima authored
      Let's say a socket send()s "hello" with MSG_OOB and "world" without flags,
      
        >>> from socket import *
        >>> c1, c2 = socketpair(AF_UNIX)
        >>> c1.send(b'hello', MSG_OOB)
        5
        >>> c1.send(b'world')
        5
      
      and its peer recv()s "hell" and "o".
      
        >>> c2.recv(10)
        b'hell'
        >>> c2.recv(1, MSG_OOB)
        b'o'
      
      Now the consumed OOB skb stays at the head of recvq to return a correct
      value for ioctl(SIOCATMARK), which is currently broken and is fixed by a
      later patch.
      
      Then, if the peer issues recv() with MSG_DONTWAIT, manage_oob() returns
      NULL, so recv() fails with -EAGAIN.
      
        >>> c2.setblocking(False)  # This causes -EAGAIN even with available data
        >>> c2.recv(5)
        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
        BlockingIOError: [Errno 11] Resource temporarily unavailable
      
      However, the next recv() returns the remaining data, "world".
      
        >>> c2.recv(5)
        b'world'
      
      When the consumed OOB skb is at the head of the queue, we need to fetch
      the next skb to fix the weird behaviour.
      
      Note that the issue does not happen without MSG_DONTWAIT because we can
      retry after manage_oob().
      
      This patch also adds a test case that covers the issue.
      
      Without fix:
      
        #  RUN           msg_oob.no_peek.ex_oob_break ...
        # msg_oob.c:134:ex_oob_break:AF_UNIX :Resource temporarily unavailable
        # msg_oob.c:135:ex_oob_break:Expected:ld
        # msg_oob.c:137:ex_oob_break:Expected ret[0] (-1) == expected_len (2)
        # ex_oob_break: Test terminated by assertion
        #          FAIL  msg_oob.no_peek.ex_oob_break
        not ok 8 msg_oob.no_peek.ex_oob_break
      
      With fix:
      
        #  RUN           msg_oob.no_peek.ex_oob_break ...
        #            OK  msg_oob.no_peek.ex_oob_break
        ok 8 msg_oob.no_peek.ex_oob_break
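
The interactive session above can be wrapped into a small probe, sketched here in Python under the assumption of a Linux kernel with AF_UNIX MSG_OOB support. The function name is hypothetical. Per the description above, it should report "EAGAIN" on a kernel without this fix and b'world' on a kernel with it.

```python
import socket


def recv_dontwait_after_consumed_oob():
    """Return the bytes read by recv(MSG_DONTWAIT), "EAGAIN" if it would
    block, or None when AF_UNIX MSG_OOB is unavailable."""
    c1, c2 = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        try:
            c1.send(b"hello", socket.MSG_OOB)
        except OSError:
            return None  # assumption: AF_UNIX MSG_OOB is not available here
        c1.send(b"world")
        c2.recv(10)                 # reads the data preceding the OOB byte
        c2.recv(1, socket.MSG_OOB)  # consumes the OOB byte; its skb stays queued
        try:
            return c2.recv(5, socket.MSG_DONTWAIT)
        except BlockingIOError:
            return "EAGAIN"
    finally:
        c1.close()
        c2.close()


print(recv_dontwait_after_consumed_oob())
```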
      
      Fixes: 314001f0 ("af_unix: Add OOB support")
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • Kuniyuki Iwashima's avatar
      af_unix: Stop recv(MSG_PEEK) at consumed OOB skb. · b94038d8
      Kuniyuki Iwashima authored
      After consuming OOB data, recv() reading the preceding data must break at
      the OOB skb regardless of MSG_PEEK.
      
      Currently, MSG_PEEK does not stop recv() for AF_UNIX, and the behaviour is
      not compliant with TCP.
      
        >>> from socket import *
        >>> c1, c2 = socketpair(AF_UNIX)
        >>> c1.send(b'hello', MSG_OOB)
        5
        >>> c1.send(b'world')
        5
        >>> c2.recv(1, MSG_OOB)
        b'o'
        >>> c2.recv(9, MSG_PEEK)  # This should return b'hell'
        b'hellworld'              # even with enough buffer.
      
      Let's fix it by returning NULL for the consumed skb and unlinking it
      only if MSG_PEEK is not specified.
      
      This patch also adds test cases that add recv(MSG_PEEK) before each recv().
      
      Without fix:
      
        #  RUN           msg_oob.peek.oob_ahead_break ...
        # msg_oob.c:134:oob_ahead_break:AF_UNIX :hellworld
        # msg_oob.c:135:oob_ahead_break:Expected:hell
        # msg_oob.c:137:oob_ahead_break:Expected ret[0] (9) == expected_len (4)
        # oob_ahead_break: Test terminated by assertion
        #          FAIL  msg_oob.peek.oob_ahead_break
        not ok 13 msg_oob.peek.oob_ahead_break
      
      With fix:
      
        #  RUN           msg_oob.peek.oob_ahead_break ...
        #            OK  msg_oob.peek.oob_ahead_break
        ok 13 msg_oob.peek.oob_ahead_break
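
The idea of issuing recv(MSG_PEEK) before each recv() can be sketched in Python as follows. The helper peek_then_recv is a hypothetical name and is not taken from msg_oob.c (which is C); it only shows the check that a peek and the following read should return the same bytes when no new data arrives in between.

```python
import socket


def peek_then_recv(sock, size):
    """recv(MSG_PEEK) followed by a normal recv(); both must agree."""
    peeked = sock.recv(size, socket.MSG_PEEK)  # leaves the data queued
    data = sock.recv(size)                     # actually consumes the data
    if peeked != data:
        raise AssertionError(f"peek {peeked!r} != recv {data!r}")
    return data


c1, c2 = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
c1.send(b"hello")
print(peek_then_recv(c2, 5))  # b'hello'
c1.close()
c2.close()
```

Using the same helper around a consumed OOB skb would flag the mismatch described above, where the peek returns b'hellworld' but a plain recv() returns b'hell'.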
      
      Fixes: 314001f0 ("af_unix: Add OOB support")
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>