Commits · e49bad785e550fe26ca9416ffc0c85fef84be808 · Kirill Smelkov / linux

26 Jul, 2023 3 commits

RDMA/irdma: Add table based lookup for CQ pointer during an event · e49bad78

Krzysztof Czurylo authored Jul 25, 2023

Add a CQ table based loookup to allow quick search
for CQ pointer having CQ ID in case of CQ related
asynchrononous event. The table is implemented in a
similar fashion to QP table.

Also add a reference counters for CQ. This is to prevent
destroying CQ while an asynchronous event is being processed.

The memory resource table size is sized higher with this update,
and this table doesn't need to be physically contiguous, so use
a vzalloc vs kzalloc to allocate the table.
Signed-off-by: Krzysztof Czurylo <krzysztof.czurylo@intel.com>
Signed-off-by: Sindhu Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20230725155505.1069-4-shiraz.saleem@intel.comSigned-off-by: Leon Romanovsky <leon@kernel.org>

e49bad78

RDMA/irdma: Refactor error handling in create CQP · 133b1cba

Sindhu Devale authored Jul 25, 2023

In case of a failure in irdma_create_cqp, do not call
irdma_destroy_cqp, but cleanup all the allocated resources
in reverse order.

Drop the extra argument in irdma_destroy_cqp as its no longer needed.
Signed-off-by: Krzysztof Czurylo <krzysztof.czurylo@intel.com>
Signed-off-by: Sindhu Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20230725155505.1069-3-shiraz.saleem@intel.comSigned-off-by: Leon Romanovsky <leon@kernel.org>

133b1cba

RDMA/irdma: Drop a local in irdma_sc_get_next_aeqe · 8cfc99da

Sindhu Devale authored Jul 25, 2023

Drop the local wqe_idx in irdma_sc_get_next_aeqe and instead
store the wqe_idx in the info structure for all asynchronous events(AE)
received. There is no reason it should be tied to a specific AE source.
Signed-off-by: Sindhu Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20230725155505.1069-2-shiraz.saleem@intel.comSigned-off-by: Leon Romanovsky <leon@kernel.org>

8cfc99da

24 Jul, 2023 3 commits

IB/hfi1: Use struct_size() · 24b1b5d8

Christophe JAILLET authored Jul 22, 2023

Use struct_size() instead of hand-writing it, when allocating a structure
with a flex array.

This is less verbose, more robust and more informative.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/f4618a67d5ae0a30eb3f2b4558c8cc790feed79a.1690044376.git.christophe.jaillet@wanadoo.frSigned-off-by: Leon Romanovsky <leon@kernel.org>

24b1b5d8

RDMA/hns: Remove VF extend configuration · 0b5eed06

Junxian Huang authored Jul 21, 2023

Remove VF extend configuration since the relative registers are
configured in firmware currently.
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Link: https://lore.kernel.org/r/20230721025146.450831-3-huangjunxian6@hisilicon.comSigned-off-by: Leon Romanovsky <leon@kernel.org>

0b5eed06

RDMA/hns: Support get XRCD number from firmware · f5a61344

Luoyouming authored Jul 21, 2023

Support driver get the num of XRCD from firmware.
Signed-off-by: Luoyouming <luoyouming@huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Link: https://lore.kernel.org/r/20230721025146.450831-2-huangjunxian6@hisilicon.comSigned-off-by: Leon Romanovsky <leon@kernel.org>

f5a61344

23 Jul, 2023 2 commits

RDMA/qedr: Remove duplicate assignments of va · 44725a87

Minjie Du authored Jul 05, 2023

Avoid double assignment of iwqp->ietf_mem.va.
Signed-off-by: Minjie Du <duminjie@vivo.com>
Link: https://lore.kernel.org/r/20230705031849.2443-1-duminjie@vivo.comSigned-off-by: Leon Romanovsky <leon@kernel.org>

44725a87

RDMA/qedr: Remove a duplicate assignment in qedr_create_gsi_qp() · 2f5833ea

Minjie Du authored Jul 05, 2023

Delete a duplicate statement from this function implementation.
Signed-off-by: Minjie Du <duminjie@vivo.com>
Link: https://lore.kernel.org/r/20230705103950.15225-1-duminjie@vivo.comAcked-by: Alok Prasad <palok@marvell.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>

2f5833ea

21 Jul, 2023 14 commits

RDMA/bnxt_re: Add a new uapi for driver notification · 61a8118f

Chandramohan Akula authored Jul 18, 2023

Add driver notify uapi for application notifying
the driver about the doorbell FIFO congestion.

Link: https://lore.kernel.org/r/1689742977-9128-8-git-send-email-selvin.xavier@broadcom.comSigned-off-by: Chandramohan Akula <chandramohan.akula@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

61a8118f

RDMA/bnxt_re: Implement doorbell pacing algorithm · 2ad4e630

Chandramohan Akula authored Jul 18, 2023

User applications alert the driver when the Doorbell FIFO
reaches the alarm threshold. The driver updates the pacing
parameters in the shared page to do the maximum pacing
by the application till the DB FIFO congestion reduces to
pacing threshold. Driver keeps checking the DB FIFO depth
at the pacing interval and gradually adjusts the pacing level.
Once the pacing level reaches default values (no congestion in
the FIFO) pacing gets completed.

Link: https://lore.kernel.org/r/1689742977-9128-7-git-send-email-selvin.xavier@broadcom.comSigned-off-by: Chandramohan Akula <chandramohan.akula@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

2ad4e630

RDMA/bnxt_re: Update alloc_page uapi for pacing · ea222485

Chandramohan Akula authored Jul 18, 2023

Update the alloc_page uapi functionality for handling the
mapping of doorbell pacing shared page and bar address.

Link: https://lore.kernel.org/r/1689742977-9128-6-git-send-email-selvin.xavier@broadcom.comSigned-off-by: Chandramohan Akula <chandramohan.akula@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

ea222485

RDMA/bnxt_re: Enable pacing support for the user apps · fa8fad92

Chandramohan Akula authored Jul 18, 2023

Report the pacing capability to the user applications.

Link: https://lore.kernel.org/r/1689742977-9128-5-git-send-email-selvin.xavier@broadcom.comSigned-off-by: Chandramohan Akula <chandramohan.akula@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

fa8fad92

RDMA/bnxt_re: Initialize Doorbell pacing feature · 586e613d

Chandramohan Akula authored Jul 18, 2023

Checks for pacing feature capability and get the doorbell pacing
configuration using FW commands. Allocate a page and initialize
the pacing parameters for the applications. Cleanup the page and
de-initialize the pacing during device removal.

Link: https://lore.kernel.org/r/1689742977-9128-4-git-send-email-selvin.xavier@broadcom.comSigned-off-by: Chandramohan Akula <chandramohan.akula@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

586e613d

bnxt_en: Share the bar0 address with the RoCE driver · 61220e09

Chandramohan Akula authored Jul 18, 2023

Add a parameter in the bnxt_en_dev structure to share the bar0 address
with RoCE driver.

Link: https://lore.kernel.org/r/1689742977-9128-3-git-send-email-selvin.xavier@broadcom.com
CC: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Chandramohan Akula <chandramohan.akula@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

61220e09

bnxt_en: Update HW interface headers · cf1694f0

Chandramohan Akula authored Jul 18, 2023

Updating the HW structures for the doorbell pacing related
information. Newly added interface structures will be used in
the followup patches.

Link: https://lore.kernel.org/r/1689742977-9128-2-git-send-email-selvin.xavier@broadcom.com
CC: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Chandramohan Akula <chandramohan.akula@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

cf1694f0

RDMA/cma: Avoid GID lookups on iWARP devices · f8ef1be8

Chuck Lever authored Jul 17, 2023

We would like to enable the use of siw on top of a VPN that is
constructed and managed via a tun device. That hasn't worked up
until now because ARPHRD_NONE devices (such as tun devices) have
no GID for the RDMA/core to look up.

But it turns out that the egress device has already been picked for
us -- no GID is necessary. addr_handler() just has to do the right
thing with it.

Link: https://lore.kernel.org/r/168960675257.3007.4737911174148394395.stgit@manet.1015granger.netSuggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

f8ef1be8

RDMA/cma: Deduplicate error flow in cma_validate_port() · 700c9649

Chuck Lever authored Jul 17, 2023

Clean up to prepare for the addition of new logic.

Link: https://lore.kernel.org/r/168960674597.3007.6128252077812202526.stgit@manet.1015granger.netSigned-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

700c9649

RDMA/core: Set gid_attr.ndev for iWARP devices · 448d15aa

Chuck Lever authored Jul 17, 2023

Have the iwarp side properly set the ndev in the device's sgid_attrs
so that address resolution can treat it more like a RoCE device.

Link: https://lore.kernel.org/r/168960673933.3007.8043081822081877578.stgit@manet.1015granger.netSuggested-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Tom Talpey <tom@talpey.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

448d15aa

RDMA/siw: Fabricate a GID on tun and loopback devices · bad5b6e3

Chuck Lever authored Jul 17, 2023

LOOPBACK and NONE (tunnel) devices have all-zero MAC addresses.
Currently, siw_device_create() falls back to copying the IB device's
name in those cases, because an all-zero MAC address breaks the RDMA
core address resolution mechanism.

However, at the point when siw_device_create() constructs a GID, the
ib_device::name field is uninitialized, leaving the MAC address to
remain in an all-zero state.

Fabricate a random artificial GID for such devices, and ensure this
artificial GID is returned for all device query operations.

Link: https://lore.kernel.org/r/168960673260.3007.12378736853793339110.stgit@manet.1015granger.netReported-by: Tom Talpey <tom@talpey.com>
Fixes: a2d36b02 ("RDMA/siw: Enable siw on tunnel devices")
Reviewed-by: Bernard Metzler <bmt@zurich.ibm.com>
Reviewed-by: Tom Talpey <tom@talpey.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

bad5b6e3

RDMA/bnxt_re: use vmalloc_array and vcalloc · 666f526b

Julia Lawall authored Jun 27, 2023

Use vmalloc_array and vcalloc to protect against
multiplication overflows.

The changes were done using the following Coccinelle
semantic patch:

// <smpl>
@initialize:ocaml@
@@

let rename alloc =
  match alloc with
    "vmalloc" -> "vmalloc_array"
  | "vzalloc" -> "vcalloc"
  | _ -> failwith "unknown"

@@
    size_t e1,e2;
    constant C1, C2;
    expression E1, E2, COUNT, x1, x2, x3;
    typedef u8;
    typedef __u8;
    type t = {u8,__u8,char,unsigned char};
    identifier alloc = {vmalloc,vzalloc};
    fresh identifier realloc = script:ocaml(alloc) { rename alloc };
@@

(
      alloc(x1*x2*x3)
|
      alloc(C1 * C2)
|
      alloc((sizeof(t)) * (COUNT), ...)
|
-     alloc((e1) * (e2))
+     realloc(e1, e2)
|
-     alloc((e1) * (COUNT))
+     realloc(COUNT, e1)
|
-     alloc((E1) * (E2))
+     realloc(E1, E2)
)
// </smpl>

Link: https://lore.kernel.org/r/20230627144339.144478-20-Julia.Lawall@inria.frSigned-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

666f526b

RDMA/siw: use vmalloc_array and vcalloc · 9191df00

Julia Lawall authored Jun 27, 2023

Use vmalloc_array and vcalloc to protect against
multiplication overflows.

The changes were done using the following Coccinelle
semantic patch:

// <smpl>
@initialize:ocaml@
@@

let rename alloc =
  match alloc with
    "vmalloc" -> "vmalloc_array"
  | "vzalloc" -> "vcalloc"
  | _ -> failwith "unknown"

@@
    size_t e1,e2;
    constant C1, C2;
    expression E1, E2, COUNT, x1, x2, x3;
    typedef u8;
    typedef __u8;
    type t = {u8,__u8,char,unsigned char};
    identifier alloc = {vmalloc,vzalloc};
    fresh identifier realloc = script:ocaml(alloc) { rename alloc };
@@

(
      alloc(x1*x2*x3)
|
      alloc(C1 * C2)
|
      alloc((sizeof(t)) * (COUNT), ...)
|
-     alloc((e1) * (e2))
+     realloc(e1, e2)
|
-     alloc((e1) * (COUNT))
+     realloc(COUNT, e1)
|
-     alloc((E1) * (E2))
+     realloc(E1, E2)
)
// </smpl>

Link: https://lore.kernel.org/r/20230627144339.144478-15-Julia.Lawall@inria.frSigned-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

9191df00

RDMA/erdma: use vmalloc_array and vcalloc · c619af83

Julia Lawall authored Jun 27, 2023

Use vmalloc_array and vcalloc to protect against
multiplication overflows.

The changes were done using the following Coccinelle
semantic patch:

// <smpl>
@initialize:ocaml@
@@

let rename alloc =
  match alloc with
    "vmalloc" -> "vmalloc_array"
  | "vzalloc" -> "vcalloc"
  | _ -> failwith "unknown"

@@
    size_t e1,e2;
    constant C1, C2;
    expression E1, E2, COUNT, x1, x2, x3;
    typedef u8;
    typedef __u8;
    type t = {u8,__u8,char,unsigned char};
    identifier alloc = {vmalloc,vzalloc};
    fresh identifier realloc = script:ocaml(alloc) { rename alloc };
@@

(
      alloc(x1*x2*x3)
|
      alloc(C1 * C2)
|
      alloc((sizeof(t)) * (COUNT), ...)
|
-     alloc((e1) * (e2))
+     realloc(e1, e2)
|
-     alloc((e1) * (COUNT))
+     realloc(COUNT, e1)
|
-     alloc((E1) * (E2))
+     realloc(E1, E2)
)
// </smpl>

Link: https://lore.kernel.org/r/20230627144339.144478-6-Julia.Lawall@inria.frSigned-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

c619af83

19 Jul, 2023 1 commit

RDMA/irdma: Fix building without IPv6 · b3d2b014

Arnd Bergmann authored Jul 18, 2023

The new irdma_iw_get_vlan_prio() function requires IPv6 support to build:

x86_64-linux-ld: drivers/infiniband/hw/irdma/cm.o: in function `irdma_iw_get_vlan_prio':
cm.c:(.text+0x2832): undefined reference to `ipv6_chk_addr'

Add a compile-time check in the same way as elsewhere in this file to avoid
this by conditionally leaving out the ipv6 specific bits.

Fixes: f877f22a ("RDMA/irdma: Implement egress VLAN priority")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20230718193835.3546684-1-arnd@kernel.orgAcked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>

b3d2b014

12 Jul, 2023 4 commits

RDMA/irdma: Implement egress VLAN priority · f877f22a

Mustafa Ismail authored Jul 11, 2023

When a VLAN interface is in use, get and use the VLAN
egress mapping.
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20230711175318.1301-1-shiraz.saleem@intel.comSigned-off-by: Leon Romanovsky <leon@kernel.org>

f877f22a

RDMA/qedr: Remove a duplicate assignment in irdma_query_ah() · 65e02e84

Minjie Du authored Jul 06, 2023

Delete a duplicate statement from this function implementation.

Fixes: b48c24c2 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Minjie Du <duminjie@vivo.com>
Acked-by: Alok Prasad <palok@marvell.com>
Link: https://lore.kernel.org/r/20230706022704.1260-1-duminjie@vivo.comSigned-off-by: Leon Romanovsky <leon@kernel.org>

65e02e84

RDMA/efa: Add RDMA write HW statistics counters · 113383ef

Michael Margolin authored Jul 03, 2023

Update device API and request RDMA write counters if RDMA write is
supported by device. Expose newly added counters through ib core
counters mechanism.
Reviewed-by: Daniel Kranzdorf <dkkranzd@amazon.com>
Reviewed-by: Yonatan Nachum <ynachum@amazon.com>
Signed-off-by: Michael Margolin <mrgolin@amazon.com>
Link: https://lore.kernel.org/r/20230703153404.30877-1-mrgolin@amazon.comReviewed-by: Gal Pressman <gal.pressman@linux.dev>
Signed-off-by: Leon Romanovsky <leon@kernel.org>

113383ef

RDMA/mlx5: align MR mem allocation size to power-of-two · 52b4bdd2

Yuanyuan Zhong authored Jun 29, 2023

The MR memory allocation requests extra bytes to guarantee that there
is enough space to find the memory aligned to MLX5_UMR_ALIGN.

For power-of-two sizes, the alignment can be guaranteed by kmalloc()
according to commit 59bb4798 ("mm, sl[aou]b: guarantee natural
alignment for kmalloc(power-of-two)").

So if target alignment is power-of-two and adding the extra bytes
crosses a power-of-two boundary, use the next power-of-two as the
allocation size.
Signed-off-by: Yuanyuan Zhong <yzhong@purestorage.com>
Link: https://lore.kernel.org/r/20230629213248.3184245-2-yzhong@purestorage.comSigned-off-by: Leon Romanovsky <leon@kernel.org>

52b4bdd2

09 Jul, 2023 10 commits

Linux 6.5-rc1 · 06c2afb8
Linus Torvalds authored Jul 09, 2023

06c2afb8

MAINTAINERS 2: Electric Boogaloo · c192ac73

Linus Torvalds authored Jul 09, 2023

We just sorted the entries and fields last release, so just out of a
perverse sense of curiosity, I decided to see if we can keep things
ordered for even just one release.

The answer is "No. No we cannot".

I suggest that all kernel developers will need weekly training sessions,
involving a lot of Big Bird and Sesame Street.  And at the yearly
maintainer summit, we will all sing the alphabet song together.

I doubt I will keep doing this.  At some point "perverse sense of
curiosity" turns into just a cold dark place filled with sadness and
despair.

Repeats: 80e62bc8 ("MAINTAINERS: re-sort all entries and fields")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

c192ac73

Merge tag 'dma-mapping-6.5-2023-07-09' of git://git.infradead.org/users/hch/dma-mapping · f71f6421

Linus Torvalds authored Jul 09, 2023

Pull dma-mapping fixes from Christoph Hellwig:

 - swiotlb area sizing fixes (Petr Tesarik)

* tag 'dma-mapping-6.5-2023-07-09' of git://git.infradead.org/users/hch/dma-mapping:
  swiotlb: reduce the number of areas to match actual memory pool size
  swiotlb: always set the number of areas before allocating the pool

f71f6421

Merge tag 'irq_urgent_for_v6.5_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · a9943ad3

Linus Torvalds authored Jul 09, 2023

Pull irq update from Borislav Petkov:

 - Optimize IRQ domain's name assignment

* tag 'irq_urgent_for_v6.5_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqdomain: Use return value of strreplace()

a9943ad3

Merge tag 'x86_urgent_for_v6.5_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 51e3d7c2

Linus Torvalds authored Jul 09, 2023

Pull x86 fpu fix from Borislav Petkov:

 - Do FPU AP initialization on Xen PV too which got missed by the recent
   boot reordering work

* tag 'x86_urgent_for_v6.5_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/xen: Fix secondary processors' FPU initialization

51e3d7c2

Merge tag 'x86-core-2023-07-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e3da8db0

Linus Torvalds authored Jul 09, 2023

Pull x86 fix from Thomas Gleixner:
 "A single fix for the mechanism to park CPUs with an INIT IPI.

  On shutdown or kexec, the kernel tries to park the non-boot CPUs with
  an INIT IPI. But the same code path is also used by the crash utility.
  If the CPU which panics is not the boot CPU then it sends an INIT IPI
  to the boot CPU which resets the machine.

  Prevent this by validating that the CPU which runs the stop mechanism
  is the boot CPU. If not, leave the other CPUs in HLT"

* tag 'x86-core-2023-07-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/smp: Don't send INIT to boot CPU

e3da8db0

Merge tag 'mips_6.5_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux · 74099e20

Linus Torvalds authored Jul 09, 2023

Pull MIPS fixes from Thomas Bogendoerfer:

 - fixes for KVM

 - fix for loongson build and cpu probing

 - DT fixes

* tag 'mips_6.5_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
  MIPS: kvm: Fix build error with KVM_MIPS_DEBUG_COP0_COUNTERS enabled
  MIPS: dts: add missing space before {
  MIPS: Loongson: Fix build error when make modules_install
  MIPS: KVM: Fix NULL pointer dereference
  MIPS: Loongson: Fix cpu_probe_loongson() again

74099e20

Merge tag 'xfs-6.5-merge-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · 76487845

Linus Torvalds authored Jul 09, 2023

Pull xfs fix from Darrick Wong:
 "Nothing exciting here, just getting rid of a gcc warning that I got
  tired of seeing when I turn on gcov"

* tag 'xfs-6.5-merge-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: fix uninit warning in xfs_growfs_data

76487845

Merge tag '6.5-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6 · 4770353b

Linus Torvalds authored Jul 09, 2023

Pull more smb client updates from Steve French:

 - fix potential use after free in unmount

 - minor cleanup

 - add worker to cleanup stale directory leases

* tag '6.5-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: Add a laundromat thread for cached directories
  smb: client: remove redundant pointer 'server'
  cifs: fix session state transition to avoid use-after-free issue

4770353b

Merge tag 'ntb-6.5' of https://github.com/jonmason/ntb · cff06873

Linus Torvalds authored Jul 09, 2023

Pull NTB updates from Jon Mason:
 "Fixes for pci_clean_master, error handling in driver inits, and
  various other issues/bugs"

* tag 'ntb-6.5' of https://github.com/jonmason/ntb:
  ntb: hw: amd: Fix debugfs_create_dir error checking
  ntb.rst: Fix copy and paste error
  ntb_netdev: Fix module_init problem
  ntb: intel: Remove redundant pci_clear_master
  ntb: epf: Remove redundant pci_clear_master
  ntb_hw_amd: Remove redundant pci_clear_master
  ntb: idt: drop redundant pci_enable_pcie_error_reporting()
  MAINTAINERS: git://github -> https://github.com for jonmason
  NTB: EPF: fix possible memory leak in pci_vntb_probe()
  NTB: ntb_tool: Add check for devm_kcalloc
  NTB: ntb_transport: fix possible memory leak while device_register() fails
  ntb: intel: Fix error handling in intel_ntb_pci_driver_init()
  NTB: amd: Fix error handling in amd_ntb_pci_driver_init()
  ntb: idt: Fix error handling in idt_pci_driver_init()

cff06873

08 Jul, 2023 3 commits

mm: lock newly mapped VMA with corrected ordering · 1c7873e3

Hugh Dickins authored Jul 08, 2023

Lockdep is certainly right to complain about

  (&vma->vm_lock->lock){++++}-{3:3}, at: vma_start_write+0x2d/0x3f
                 but task is already holding lock:
  (&mapping->i_mmap_rwsem){+.+.}-{3:3}, at: mmap_region+0x4dc/0x6db

Invert those to the usual ordering.

Fixes: 33313a74 ("mm: lock newly mapped VMA which can be modified after it becomes visible")
Cc: stable@vger.kernel.org
Signed-off-by: Hugh Dickins <hughd@google.com>
Tested-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

1c7873e3

Merge tag 'mm-hotfixes-stable-2023-07-08-10-43' of... · 946c6b59

Linus Torvalds authored Jul 08, 2023

Merge tag 'mm-hotfixes-stable-2023-07-08-10-43' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull hotfixes from Andrew Morton:
 "16 hotfixes. Six are cc:stable and the remainder address post-6.4
  issues"

The merge undoes the disabling of the CONFIG_PER_VMA_LOCK feature, since
it was all hopefully fixed in mainline.

* tag 'mm-hotfixes-stable-2023-07-08-10-43' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  lib: dhry: fix sleeping allocations inside non-preemptable section
  kasan, slub: fix HW_TAGS zeroing with slub_debug
  kasan: fix type cast in memory_is_poisoned_n
  mailmap: add entries for Heiko Stuebner
  mailmap: update manpage link
  bootmem: remove the vmemmap pages from kmemleak in free_bootmem_page
  MAINTAINERS: add linux-next info
  mailmap: add Markus Schneider-Pargmann
  writeback: account the number of pages written back
  mm: call arch_swap_restore() from do_swap_page()
  squashfs: fix cache race with migration
  mm/hugetlb.c: fix a bug within a BUG(): inconsistent pte comparison
  docs: update ocfs2-devel mailing list address
  MAINTAINERS: update ocfs2-devel mailing list address
  mm: disable CONFIG_PER_VMA_LOCK until its fixed
  fork: lock VMAs of the parent process when forking

946c6b59

fork: lock VMAs of the parent process when forking · fb49c455

Suren Baghdasaryan authored Jul 08, 2023

When forking a child process, the parent write-protects anonymous pages
and COW-shares them with the child being forked using copy_present_pte().

We must not take any concurrent page faults on the source vma's as they
are being processed, as we expect both the vma and the pte's behind it
to be stable. For example, the anon_vma_fork() expects the parents
vma->anon_vma to not change during the vma copy.

A concurrent page fault on a page newly marked read-only by the page
copy might trigger wp_page_copy() and a anon_vma_prepare(vma) on the
source vma, defeating the anon_vma_clone() that wasn't done because the
parent vma originally didn't have an anon_vma, but we now might end up
copying a pte entry for a page that has one.

Before the per-vma lock based changes, the mmap_lock guaranteed
exclusion with concurrent page faults. But now we need to do a
vma_start_write() to make sure no concurrent faults happen on this vma
while it is being processed.

This fix can potentially regress some fork-heavy workloads. Kernel
build time did not show noticeable regression on a 56-core machine while
a stress test mapping 10000 VMAs and forking 5000 times in a tight loop
shows ~5% regression. If such fork time regression is unacceptable,
disabling CONFIG_PER_VMA_LOCK should restore its performance. Further
optimizations are possible if this regression proves to be problematic.
Suggested-by: David Hildenbrand <david@redhat.com>
Reported-by: Jiri Slaby <jirislaby@kernel.org>
Closes: https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf51b@kernel.org/Reported-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Closes: https://lore.kernel.org/all/b198d649-f4bf-b971-31d0-e8433ec2a34c@applied-asynchrony.com/Reported-by: Jacob Young <jacobly.alt@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217624
Fixes: 0bff0aae ("x86/mm: try VMA lock-based page fault handling first")
Cc: stable@vger.kernel.org
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fb49c455