Commits · 86c8fb4d228ed8dbe17b1abd664888bc7ee0052a · Kirill Smelkov / linux

26 May, 2022 2 commits

scsi: storvsc: Removing Pre Win8 related logic · 86c8fb4d

Saurabh Sengar authored May 25, 2022

The latest storvsc code has already removed the support for windows 7 and
earlier. There is still some code logic remaining which is there to support
pre Windows 8 OS. This patch removes these stale logic.
This patch majorly does three things :

1. Removes vmscsi_size_delta and its logic, as the vmscsi_request struct is
same for all the OS post windows 8 there is no need of delta.
2. Simplify sense_buffer_size logic, as there is single buffer size for
all the post windows 8 OS.
3. Embed the vmscsi_win8_extension structure inside the vmscsi_request,
as there is no separate handling required for different OS.
Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/1653478022-26621-1-git-send-email-ssengar@linux.microsoft.comSigned-off-by: Wei Liu <wei.liu@kernel.org>

86c8fb4d

Drivers: hv: vmbus: fix typo in comment · 1940f9f8

Julia Lawall authored May 21, 2022

Spelling mistake (triple letters) in comment.
Detected with the help of Coccinelle.
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Link: https://lore.kernel.org/r/20220521111145.81697-60-Julia.Lawall@inria.frSigned-off-by: Wei Liu <wei.liu@kernel.org>

1940f9f8

13 May, 2022 2 commits

PCI: hv: Fix synchronization between channel callback and hv_pci_bus_exit() · b4927bd2

Andrea Parri (Microsoft) authored May 12, 2022

[ Similarly to commit a765ed47 ("PCI: hv: Fix synchronization
  between channel callback and hv_compose_msi_msg()"): ]

The (on-stack) teardown packet becomes invalid once the completion
timeout in hv_pci_bus_exit() has expired and hv_pci_bus_exit() has
returned.  Prevent the channel callback from accessing the invalid
packet by removing the ID associated to such packet from the VMbus
requestor in hv_pci_bus_exit().
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Link: https://lore.kernel.org/r/20220511223207.3386-3-parri.andrea@gmail.comSigned-off-by: Wei Liu <wei.liu@kernel.org>

b4927bd2

PCI: hv: Add validation for untrusted Hyper-V values · 9937fa6d

Andrea Parri (Microsoft) authored May 12, 2022

While at it, remove a redundant validation in hv_pci_generic_compl():
hv_pci_onchannelcallback() already ensures that all processed incoming
packets are "at least as large as [in fact larger than] a response".
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Link: https://lore.kernel.org/r/20220511223207.3386-2-parri.andrea@gmail.comSigned-off-by: Wei Liu <wei.liu@kernel.org>

9937fa6d

11 May, 2022 7 commits

PCI: hv: Fix interrupt mapping for multi-MSI · a2bad844

Jeffrey Hugo authored May 11, 2022

According to Dexuan, the hypervisor folks beleive that multi-msi
allocations are not correct. compose_msi_msg() will allocate multi-msi
one by one. However, multi-msi is a block of related MSIs, with alignment
requirements. In order for the hypervisor to allocate properly aligned
and consecutive entries in the IOMMU Interrupt Remapping Table, there
should be a single mapping request that requests all of the multi-msi
vectors in one shot.

Dexuan suggests detecting the multi-msi case and composing a single
request related to the first MSI. Then for the other MSIs in the same
block, use the cached information. This appears to be viable, so do it.
Suggested-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/1652282599-21643-1-git-send-email-quic_jhugo@quicinc.comSigned-off-by: Wei Liu <wei.liu@kernel.org>

a2bad844

PCI: hv: Reuse existing IRTE allocation in compose_msi_msg() · b4b77778

Jeffrey Hugo authored May 11, 2022

Currently if compose_msi_msg() is called multiple times, it will free any
previous IRTE allocation, and generate a new allocation. While nothing
prevents this from occurring, it is extraneous when Linux could just reuse
the existing allocation and avoid a bunch of overhead.

However, when future IRTE allocations operate on blocks of MSIs instead of
a single line, freeing the allocation will impact all of the lines. This
could cause an issue where an allocation of N MSIs occurs, then some of
the lines are retargeted, and finally the allocation is freed/reallocated.
The freeing of the allocation removes all of the configuration for the
entire block, which requires all the lines to be retargeted, which might
not happen since some lines might already be unmasked/active.
Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Tested-by: Dexuan Cui <decui@microsoft.com>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/1652282582-21595-1-git-send-email-quic_jhugo@quicinc.comSigned-off-by: Wei Liu <wei.liu@kernel.org>

b4b77778

drm/hyperv: Remove support for Hyper-V 2008 and 2008R2/Win7 · ac6811a9

Michael Kelley authored May 02, 2022

The DRM Hyper-V driver has special case code for running on the first
released versions of Hyper-V: 2008 and 2008 R2/Windows 7. These versions
are now out of support (except for extended security updates) and lack
support for performance features that are needed for effective production
usage of Linux guests.

The negotiation of the VMbus protocol versions required by these old
Hyper-V versions has been removed from the VMbus driver. So now remove
the handling of these VMbus protocol versions from the DRM Hyper-V
driver.
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Deepak Rawat <drawat.floss@gmail.com>
Reviewed-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Link: https://lore.kernel.org/r/1651509391-2058-5-git-send-email-mikelley@microsoft.comSigned-off-by: Wei Liu <wei.liu@kernel.org>

ac6811a9

video: hyperv_fb: Remove support for Hyper-V 2008 and 2008R2/Win7 · b0cce4f6

Michael Kelley authored May 02, 2022

The hyperv_fb driver has special case code for running on the first
released versions of Hyper-V: 2008 and 2008 R2/Windows 7. These versions
are now out of support (except for extended security updates) and lack
support for performance features that are needed for effective production
usage of Linux guests.

The negotiation of the VMbus protocol versions required by these old
Hyper-V versions has been removed from the VMbus driver. So now remove
the handling of these VMbus protocol versions from the hyperv_fb driver.
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Link: https://lore.kernel.org/r/1651509391-2058-4-git-send-email-mikelley@microsoft.comSigned-off-by: Wei Liu <wei.liu@kernel.org>

b0cce4f6

scsi: storvsc: Remove support for Hyper-V 2008 and 2008R2/Win7 · 106b98a5

Michael Kelley authored May 02, 2022

The storvsc driver has special case code for running on the first released
versions of Hyper-V: 2008 and 2008 R2/Windows 7. These versions are now
out of support (except for extended security updates) and lack support
for performance features like multiple VMbus channels that are needed for
effective production usage of Linux guests.

The negotiation of the VMbus protocol versions required by these old
Hyper-V versions has been removed from the VMbus driver. So now remove
the handling of these VMbus protocol versions from the storvsc driver.
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Link: https://lore.kernel.org/r/1651509391-2058-3-git-send-email-mikelley@microsoft.comSigned-off-by: Wei Liu <wei.liu@kernel.org>

106b98a5

Drivers: hv: vmbus: Remove support for Hyper-V 2008 and Hyper-V 2008R2/Win7 · a6b94c6b

Michael Kelley authored May 02, 2022

The VMbus driver has special case code for running on the first released
versions of Hyper-V: 2008 and 2008 R2/Windows 7. These versions are now
out of support (except for extended security updates) and lack the
performance features needed for effective production usage of Linux
guests.

Simplify the code by removing the negotiation of the VMbus protocol
versions required for these releases of Hyper-V, and by removing the
special case code for handling these VMbus protocol versions.
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Link: https://lore.kernel.org/r/1651509391-2058-2-git-send-email-mikelley@microsoft.comSigned-off-by: Wei Liu <wei.liu@kernel.org>

a6b94c6b

x86/hyperv: Disable hardlockup detector by default in Hyper-V guests · f1f8288d

Michael Kelley authored May 09, 2022

In newer versions of Hyper-V, the x86/x64 PMU can be virtualized
into guest VMs by explicitly enabling it. Linux kernels are typically
built to automatically enable the hardlockup detector if the PMU is
found. To prevent the possibility of false positives due to the
vagaries of VM scheduling, disable the PMU-based hardlockup detector
by default in a VM on Hyper-V. The hardlockup detector can still be
enabled by overriding the default with the nmi_watchdog=1 option on
the kernel boot line or via sysctl at runtime.

This change mimics the approach taken with KVM guests in
commit 692297d8 ("watchdog: introduce the hardlockup_detector_disable()
function").

Linux on ARM64 does not provide a PMU-based hardlockup detector, so
there's no corresponding disable in the Hyper-V init code on ARM64.
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/1652111063-6535-1-git-send-email-mikelley@microsoft.comSigned-off-by: Wei Liu <wei.liu@kernel.org>

f1f8288d

03 May, 2022 2 commits

drm/hyperv: Add error message for fb size greater than allocated · 6733dd4a

Saurabh Sengar authored Apr 11, 2022

Add error message when the size of requested framebuffer is more than
the allocated size by vmbus mmio region for framebuffer.
Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Link: https://lore.kernel.org/r/1649737739-10113-1-git-send-email-ssengar@linux.microsoft.comSigned-off-by: Wei Liu <wei.liu@kernel.org>

6733dd4a

PCI: hv: Do not set PCI_COMMAND_MEMORY to reduce VM boot time · 23e118a4

Dexuan Cui authored May 02, 2022

Currently when the pci-hyperv driver finishes probing and initializing the
PCI device, it sets the PCI_COMMAND_MEMORY bit; later when the PCI device
is registered to the core PCI subsystem, the core PCI driver's BAR detection
and initialization code toggles the bit multiple times, and each toggling of
the bit causes the hypervisor to unmap/map the virtual BARs from/to the
physical BARs, which can be slow if the BAR sizes are huge, e.g., a Linux VM
with 14 GPU devices has to spend more than 3 minutes on BAR detection and
initialization, causing a long boot time.

Reduce the boot time by not setting the PCI_COMMAND_MEMORY bit when we
register the PCI device (there is no need to have it set in the first place).
The bit stays off till the PCI device driver calls pci_enable_device().
With this change, the boot time of such a 14-GPU VM is reduced by almost
3 minutes.

Link: https://lore.kernel.org/lkml/20220419220007.26550-1-decui@microsoft.com/Tested-by: Boqun Feng (Microsoft) <boqun.feng@gmail.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Jake Oshins <jakeo@microsoft.com>
Link: https://lore.kernel.org/r/20220502074255.16901-1-decui@microsoft.comSigned-off-by: Wei Liu <wei.liu@kernel.org>

23e118a4

28 Apr, 2022 6 commits

PCI: hv: Fix hv_arch_irq_unmask() for multi-MSI · 455880df

Jeffrey Hugo authored Apr 27, 2022

In the multi-MSI case, hv_arch_irq_unmask() will only operate on the first
MSI of the N allocated. This is because only the first msi_desc is cached
and it is shared by all the MSIs of the multi-MSI block. This means that
hv_arch_irq_unmask() gets the correct address, but the wrong data (always
0).

This can break MSIs.

Lets assume MSI0 is vector 34 on CPU0, and MSI1 is vector 33 on CPU0.

hv_arch_irq_unmask() is called on MSI0. It uses a hypercall to configure
the MSI address and data (0) to vector 34 of CPU0. This is correct. Then
hv_arch_irq_unmask is called on MSI1. It uses another hypercall to
configure the MSI address and data (0) to vector 33 of CPU0. This is
wrong, and results in both MSI0 and MSI1 being routed to vector 33. Linux
will observe extra instances of MSI1 and no instances of MSI0 despite the
endpoint device behaving correctly.

For the multi-MSI case, we need unique address and data info for each MSI,
but the cached msi_desc does not provide that. However, that information
can be gotten from the int_desc cached in the chip_data by
compose_msi_msg(). Fix the multi-MSI case to use that cached information
instead. Since hv_set_msi_entry_from_desc() is no longer applicable,
remove it.
Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/1651068453-29588-1-git-send-email-quic_jhugo@quicinc.comSigned-off-by: Wei Liu <wei.liu@kernel.org>

455880df

Drivers: hv: vmbus: Refactor the ring-buffer iterator functions · 1c9de08f

Andrea Parri (Microsoft) authored Apr 28, 2022

With no users of hv_pkt_iter_next_raw() and no "external" users of
hv_pkt_iter_first_raw(), the iterator functions can be refactored
and simplified to remove some indirection/code.
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20220428145107.7878-6-parri.andrea@gmail.comSigned-off-by: Wei Liu <wei.liu@kernel.org>

1c9de08f

Drivers: hv: vmbus: Accept hv_sock offers in isolated guests · da795eb2

Andrea Parri (Microsoft) authored Apr 28, 2022

So that isolated guests can communicate with the host via hv_sock
channels.
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20220428145107.7878-5-parri.andrea@gmail.comSigned-off-by: Wei Liu <wei.liu@kernel.org>

da795eb2

hv_sock: Add validation for untrusted Hyper-V values · dbde6d0c

Andrea Parri (Microsoft) authored Apr 28, 2022

For additional robustness in the face of Hyper-V errors or malicious
behavior, validate all values that originate from packets that Hyper-V
has sent to the guest in the host-to-guest ring buffer. Ensure that
invalid values cannot cause data being copied out of the bounds of the
source buffer in hvs_stream_dequeue().
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20220428145107.7878-4-parri.andrea@gmail.comSigned-off-by: Wei Liu <wei.liu@kernel.org>

dbde6d0c

hv_sock: Copy packets sent by Hyper-V out of the ring buffer · 066f3377

Andrea Parri (Microsoft) authored Apr 28, 2022

Pointers to VMbus packets sent by Hyper-V are used by the hv_sock driver
within the guest VM. Hyper-V can send packets with erroneous values or
modify packet fields after they are processed by the guest. To defend
against these scenarios, copy the incoming packet after validating its
length and offset fields using hv_pkt_iter_{first,next}(). Use
HVS_PKT_LEN(HVS_MTU_SIZE) to initialize the buffer which holds the
copies of the incoming packets. In this way, the packet can no longer
be modified by the host.
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20220428145107.7878-3-parri.andrea@gmail.comSigned-off-by: Wei Liu <wei.liu@kernel.org>

066f3377

hv_sock: Check hv_pkt_iter_first_raw()'s return value · 71abb94f

Andrea Parri (Microsoft) authored Apr 28, 2022

The function returns NULL if the ring buffer doesn't contain enough
readable bytes to constitute a packet descriptor. The ring buffer's
write_index is in memory which is shared with the Hyper-V host, an
erroneous or malicious host could thus change its value and overturn
the result of hvs_stream_has_data().
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20220428145107.7878-2-parri.andrea@gmail.comSigned-off-by: Wei Liu <wei.liu@kernel.org>

71abb94f

25 Apr, 2022 8 commits

PCI: hv: Fix synchronization between channel callback and hv_compose_msi_msg() · a765ed47

Andrea Parri (Microsoft) authored Apr 19, 2022

Dexuan wrote:

  "[...]  when we disable AccelNet, the host PCI VSP driver sends a
   PCI_EJECT message first, and the channel callback may set
   hpdev->state to hv_pcichild_ejecting on a different CPU.  This can
   cause hv_compose_msi_msg() to exit from the loop and 'return', and
   the on-stack variable 'ctxt' is invalid.  Now, if the response
   message from the host arrives, the channel callback will try to
   access the invalid 'ctxt' variable, and this may cause a crash."

Schematically:

  Hyper-V sends PCI_EJECT msg
    hv_pci_onchannelcallback()
      state = hv_pcichild_ejecting
                                       hv_compose_msi_msg()
                                         alloc and init comp_pkt
                                         state == hv_pcichild_ejecting
  Hyper-V sends VM_PKT_COMP msg
    hv_pci_onchannelcallback()
      retrieve address of comp_pkt
                                         'free' comp_pkt and return
      comp_pkt->completion_func()

Dexuan also showed how the crash can be triggered after introducing
suitable delays in the driver code, thus validating the 'assumption'
that the host can still normally respond to the guest's compose_msi
request after the host has started to eject the PCI device.

Fix the synchronization by leveraging the requestor lock as follows:

  - Before 'return'-ing in hv_compose_msi_msg(), remove the ID (while
    holding the requestor lock) associated to the completion packet.

  - Retrieve the address *and call ->completion_func() within a same
    (requestor) critical section in hv_pci_onchannelcallback().
Reported-by: Wei Hu <weh@microsoft.com>
Reported-by: Dexuan Cui <decui@microsoft.com>
Suggested-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20220419122325.10078-7-parri.andrea@gmail.comSigned-off-by: Wei Liu <wei.liu@kernel.org>

a765ed47

Drivers: hv: vmbus: Introduce {lock,unlock}_requestor() · b91eaf72

Andrea Parri (Microsoft) authored Apr 19, 2022

To abtract the lock and unlock operations on the requestor spin lock.
The helpers will come in handy in hv_pci.

No functional change.
Suggested-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20220419122325.10078-6-parri.andrea@gmail.comSigned-off-by: Wei Liu <wei.liu@kernel.org>

b91eaf72

Drivers: hv: vmbus: Introduce vmbus_request_addr_match() · 0aadb6a7

Andrea Parri (Microsoft) authored Apr 19, 2022

The function can be used to retrieve and clear/remove a transation ID
from a channel requestor, provided the memory address corresponding to
the ID equals a specified address. The function, and its 'lockless'
variant __vmbus_request_addr_match(), will be used by hv_pci.

Refactor vmbus_request_addr() to reuse the 'newly' introduced code.

No functional change.
Suggested-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20220419122325.10078-5-parri.andrea@gmail.comSigned-off-by: Wei Liu <wei.liu@kernel.org>

0aadb6a7

Drivers: hv: vmbus: Introduce vmbus_sendpacket_getid() · b03afa57

Andrea Parri (Microsoft) authored Apr 19, 2022

The function can be used to send a VMbus packet and retrieve the
corresponding transaction ID. It will be used by hv_pci.

No functional change.
Suggested-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20220419122325.10078-4-parri.andrea@gmail.comSigned-off-by: Wei Liu <wei.liu@kernel.org>

b03afa57

PCI: hv: Use vmbus_requestor to generate transaction IDs for VMbus hardening · de5ddb7d

Andrea Parri (Microsoft) authored Apr 19, 2022

Currently, pointers to guest memory are passed to Hyper-V as transaction
IDs in hv_pci. In the face of errors or malicious behavior in Hyper-V,
hv_pci should not expose or trust the transaction IDs returned by
Hyper-V to be valid guest memory addresses. Instead, use small integers
generated by vmbus_requestor as request (transaction) IDs.
Suggested-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20220419122325.10078-3-parri.andrea@gmail.comSigned-off-by: Wei Liu <wei.liu@kernel.org>

de5ddb7d

Drivers: hv: vmbus: Fix handling of messages with transaction ID of zero · 82cd4bac

Andrea Parri (Microsoft) authored Apr 19, 2022

vmbus_request_addr() returns 0 (zero) if the transaction ID passed
to as argument is 0. This is unfortunate for two reasons: first,
netvsc_send_completion() does not check for a NULL cmd_rqst (before
dereferencing the corresponding NVSP message); second, 0 is a *valid*
value of cmd_rqst in netvsc_send_tx_complete(), cf. the call of
vmbus_sendpacket() in netvsc_send_pkt().

vmbus_request_addr() has included the code in question since its
introduction with commit e8b7db38 ("Drivers: hv: vmbus: Add
vmbus_requestor data structure for VMBus hardening"); such code was
motivated by the early use of vmbus_requestor by hv_storvsc. Since
hv_storvsc moved to a tag-based mechanism to generate and retrieve
transaction IDs with commit bf5fd8ca ("scsi: storvsc: Use
blk_mq_unique_tag() to generate requestIDs"), vmbus_request_addr()
can be modified to return VMBUS_RQST_ERROR if the ID is 0. This
change solves the issues in hv_netvsc (and makes the handling of
messages with transaction ID of 0 consistent with the semantics
"the ID is not contained in the requestor/invalid ID").

vmbus_next_request_id(), vmbus_request_addr() should still reserve
the ID of 0 for Hyper-V, because Hyper-V will "ignore" (not respond
to) VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED packets/requests with
transaction ID of 0 from the guest.

Fixes: bf5fd8ca ("scsi: storvsc: Use blk_mq_unique_tag() to generate requestIDs")
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20220419122325.10078-2-parri.andrea@gmail.comSigned-off-by: Wei Liu <wei.liu@kernel.org>

82cd4bac

PCI: hv: Fix multi-MSI to allow more than one MSI vector · 08e61e86

Jeffrey Hugo authored Apr 13, 2022

If the allocation of multiple MSI vectors for multi-MSI fails in the core
PCI framework, the framework will retry the allocation as a single MSI
vector, assuming that meets the min_vecs specified by the requesting
driver.

Hyper-V advertises that multi-MSI is supported, but reuses the VECTOR
domain to implement that for x86. The VECTOR domain does not support
multi-MSI, so the alloc will always fail and fallback to a single MSI
allocation.

In short, Hyper-V advertises a capability it does not implement.

Hyper-V can support multi-MSI because it coordinates with the hypervisor
to map the MSIs in the IOMMU's interrupt remapper, which is something the
VECTOR domain does not have. Therefore the fix is simple - copy what the
x86 IOMMU drivers (AMD/Intel-IR) do by removing
X86_IRQ_ALLOC_CONTIGUOUS_VECTORS after calling the VECTOR domain's
pci_msi_prepare().

Fixes: 4daace0d ("PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V VMs")
Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Link: https://lore.kernel.org/r/1649856981-14649-1-git-send-email-quic_jhugo@quicinc.comSigned-off-by: Wei Liu <wei.liu@kernel.org>

08e61e86

Drivers: hv: vmbus: Add VMbus IMC device to unsupported list · 66200bbc

Michael Kelley authored Apr 12, 2022

Hyper-V may offer an Initial Machine Configuration (IMC) synthetic
device to guest VMs. The device may be used by Windows guests to get
specialization information, such as the hostname. But the device
is not used in Linux and there is no Linux driver, so it is
unsupported.

Currently, the IMC device GUID is not recognized by the VMbus driver,
which results in an "Unknown GUID" error message during boot. Add
the GUID to the list of known but unsupported devices so that the
error message is not generated. Other than avoiding the error message,
there is no change in guest behavior.
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/1649818140-100953-1-git-send-email-mikelley@microsoft.comSigned-off-by: Wei Liu <wei.liu@kernel.org>

66200bbc

24 Apr, 2022 8 commits

Linux 5.18-rc4 · af2d861d
Linus Torvalds authored Apr 24, 2022

af2d861d

Merge tag 'sched_urgent_for_v5.18_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 42740a2f

Linus Torvalds authored Apr 24, 2022

Pull scheduler fix from Borislav Petkov:

 - Fix a corner case when calculating sched runqueue variables

That fix also removes a check for a zero divisor in the code, without
mentioning it.  Vincent clarified that it's ok after I whined about it:

  https://lore.kernel.org/all/CAKfTPtD2QEyZ6ADd5WrwETMOX0XOwJGnVddt7VHgfURdqgOS-Q@mail.gmail.com/

* tag 'sched_urgent_for_v5.18_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/pelt: Fix attach_entity_load_avg() corner case

42740a2f

Merge tag 'powerpc-5.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 5206548f

Linus Torvalds authored Apr 24, 2022

Pull powerpc fixes from Michael Ellerman:

 - Partly revert a change to our timer_interrupt() that caused lockups
   with high res timers disabled.

 - Fix a bug in KVM TCE handling that could corrupt kernel memory.

 - Two commits fixing Power9/Power10 perf alternative event selection.

Thanks to Alexey Kardashevskiy, Athira Rajeev, David Gibson, Frederic
Barrat, Madhavan Srinivasan, Miguel Ojeda, and Nicholas Piggin.

* tag 'powerpc-5.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/perf: Fix 32bit compile
  powerpc/perf: Fix power10 event alternatives
  powerpc/perf: Fix power9 event alternatives
  KVM: PPC: Fix TCE handling for VFIO
  powerpc/time: Always set decrementer in timer_interrupt()

5206548f

Merge tag 'perf_urgent_for_v5.18_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · f48ffef1

Linus Torvalds authored Apr 24, 2022

Pull perf fixes from Borislav Petkov:

 - Add Sapphire Rapids CPU support

 - Fix a perf vmalloc-ed buffer mapping error (PERF_USE_VMALLOC in use)

* tag 'perf_urgent_for_v5.18_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/cstate: Add SAPPHIRERAPIDS_X CPU support
  perf/core: Fix perf_mmap fail when CONFIG_PERF_USE_VMALLOC enabled

f48ffef1

Merge tag 'edac_urgent_for_v5.18_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras · b877ca4d

Linus Torvalds authored Apr 24, 2022

Pull EDAC fix from Borislav Petkov:

 - Read the reported error count from the proper register on
   synopsys_edac

* tag 'edac_urgent_for_v5.18_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  EDAC/synopsys: Read the error count from the correct register

b877ca4d

kvmalloc: use vmalloc_huge for vmalloc allocations · 9becb688

Linus Torvalds authored Apr 22, 2022

Since commit 559089e0 ("vmalloc: replace VM_NO_HUGE_VMAP with
VM_ALLOW_HUGE_VMAP"), the use of hugepage mappings for vmalloc is an
opt-in strategy, because it caused a number of problems that weren't
noticed until x86 enabled it too.

One of the issues was fixed by Nick Piggin in commit 3b8000ae
("mm/vmalloc: huge vmalloc backing pages should be split rather than
compound"), but I'm still worried about page protection issues, and
VM_FLUSH_RESET_PERMS in particular.

However, like the hash table allocation case (commit f2edd118:
"page_alloc: use vmalloc_huge for large system hash"), the use of
kvmalloc() should be safe from any such games, since the returned
pointer might be a SLUB allocation, and as such no user should
reasonably be using it in any odd ways.

We also know that the allocations are fairly large, since it falls back
to the vmalloc case only when a kmalloc() fails.  So using a hugepage
mapping seems both safe and relevant.

This patch does show a weakness in the opt-in strategy: since the opt-in
flag is in the 'vm_flags', not the usual gfp_t allocation flags, very
few of the usual interfaces actually expose it.

That's not much of an issue in this case that already used one of the
fairly specialized low-level vmalloc interfaces for the allocation, but
for a lot of other vmalloc() users that might want to opt in, it's going
to be very inconvenient.

We'll either have to fix any compatibility problems, or expose it in the
gfp flags (__GFP_COMP would have made a lot of sense) to allow normal
vmalloc() users to use hugepage mappings.  That said, the cases that
really matter were probably already taken care of by the hash tabel
allocation.

Link: https://lore.kernel.org/all/20220415164413.2727220-1-song@kernel.org/
Link: https://lore.kernel.org/all/CAHk-=whao=iosX1s5Z4SF-ZGa-ebAukJoAdUJFk5SPwnofV+Vg@mail.gmail.com/
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Paul Menzel <pmenzel@molgen.mpg.de>
Cc: Song Liu <songliubraving@fb.com>
Cc: Rick Edgecombe <rick.p.edgecombe@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

9becb688

page_alloc: use vmalloc_huge for large system hash · f2edd118

Song Liu authored Apr 15, 2022

Use vmalloc_huge() in alloc_large_system_hash() so that large system
hash (>= PMD_SIZE) could benefit from huge pages.

Note that vmalloc_huge only allocates huge pages for systems with
HAVE_ARCH_HUGE_VMALLOC.
Signed-off-by: Song Liu <song@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

f2edd118

Merge tag '5.18-rc3-ksmbd-fixes' of git://git.samba.org/ksmbd · 22da5264

Linus Torvalds authored Apr 23, 2022

Pull ksmbd server fixes from Steve French:

 - cap maximum sector size reported to avoid mount problems

 - reference count fix

 - fix filename rename race

* tag '5.18-rc3-ksmbd-fixes' of git://git.samba.org/ksmbd:
  ksmbd: set fixed sector size to FS_SECTOR_SIZE_INFORMATION
  ksmbd: increment reference count of parent fp
  ksmbd: remove filename in ksmbd_file

22da5264

23 Apr, 2022 5 commits

Merge tag 'arc-5.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc · f3935926

Linus Torvalds authored Apr 23, 2022

Pull ARC fixes from Vineet Gupta:

 - Assorted fixes

* tag 'arc-5.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARC: remove redundant READ_ONCE() in cmpxchg loop
  ARC: atomic: cleanup atomic-llsc definitions
  arc: drop definitions of pgd_index() and pgd_offset{, _k}() entirely
  ARC: dts: align SPI NOR node name with dtschema
  ARC: Remove a redundant memset()
  ARC: fix typos in comments
  ARC: entry: fix syscall_trace_exit argument

f3935926

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 6fc2586d

Linus Torvalds authored Apr 23, 2022

Pull SCSI fix from James Bottomley:
 "One fix for an information leak caused by copying a buffer to
  userspace without checking for error first in the sr driver"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: sr: Do not leak information in ioctl

6fc2586d

Merge tag 'for-linus-5.18-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · b51bd23c

Linus Torvalds authored Apr 23, 2022

Pull xen fixes from Juergen Gross:
 "A simple cleanup patch and a refcount fix for Xen on Arm"

* tag 'for-linus-5.18-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  arm/xen: Fix some refcount leaks
  xen: Convert kmap() to kmap_local_page()

b51bd23c

Merge tag 'drm-fixes-2022-04-23' of git://anongit.freedesktop.org/drm/drm · 13bc32ba

Linus Torvalds authored Apr 23, 2022

Pull more drm fixes from Dave Airlie:
 "Maarten was away, so Maxine stepped up and sent me the drm-fixes
  merge, so no point leaving it for another week.

  The big change is an OF revert around bridge/panels, it may have some
  driver fallout, but hopefully this revert gets them shook out in the
  next week easier.

  Otherwise it's a bunch of locking/refcounts across drivers, a radeon
  dma_resv logic fix and some raspberry pi panel fixes.

  panel:
   - revert of patch that broke panel/bridge issues

  dma-buf:
   - remove unused header file.

  amdgpu:
   - partial revert of locking change

  radeon:
   - fix dma_resv logic inversion

  panel:
   - pi touchscreen panel init fixes

  vc4:
   - build fix
   - runtime pm refcount fix

  vmwgfx:
   - refcounting fix"

* tag 'drm-fixes-2022-04-23' of git://anongit.freedesktop.org/drm/drm:
  drm/amdgpu: partial revert "remove ctx->lock" v2
  Revert "drm: of: Lookup if child node has panel or bridge"
  Revert "drm: of: Properly try all possible cases for bridge/panel detection"
  drm/vc4: Use pm_runtime_resume_and_get to fix pm_runtime_get_sync() usage
  drm/vmwgfx: Fix gem refcounting and memory evictions
  drm/vc4: Fix build error when CONFIG_DRM_VC4=y && CONFIG_RASPBERRYPI_FIRMWARE=m
  drm/panel/raspberrypi-touchscreen: Initialise the bridge in prepare
  drm/panel/raspberrypi-touchscreen: Avoid NULL deref if not initialised
  dma-buf-map: remove renamed header file
  drm/radeon: fix logic inversion in radeon_sync_resv

13bc32ba

Merge tag 'input-for-v5.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · 0fe86b27

Linus Torvalds authored Apr 23, 2022

Pull input fixes from Dmitry Torokhov:

 - a new set of keycodes to be used by marine navigation systems

 - minor fixes to omap4-keypad and cypress-sf drivers

* tag 'input-for-v5.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: add Marine Navigation Keycodes
  Input: omap4-keypad - fix pm_runtime_get_sync() error checking
  Input: cypress-sf - register a callback to disable the regulators

0fe86b27