Commit 242b2331 authored by Linus Torvalds

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "A more active cycle than most of the recent past, with a few large,
  long discussed works this time.

  The RNBD block driver has been posted for nearly two years now, and
  flowing through RDMA due to it also introducing a new ULP.

  The removal of FMR has been a recurring discussion theme for a long
  time.

  And the usual smattering of features and bug fixes.

  Summary:

   - Various small driver bug fixes in rxe, mlx5, hfi1, and efa

   - Continuing driver cleanups in bnxt_re, hns

   - Big cleanup of mlx5 QP creation flows

   - More consistent use of src port and flow label when LAG is used and
     a mlx5 implementation

   - Additional set of cleanups for IB CM

   - 'RNBD' network block driver and target. This is a network block
     RDMA device specific to ionos's cloud environment. It brings strong
     multipath and resiliency capabilities.

   - Accelerated IPoIB for HFI1

   - QP/WQ/SRQ ioctl migration for uverbs, and support for multiple
     async fds

   - Support for exchanging the new IBTA defined ECE data during RDMA CM
     exchanges

   - Removal of the very old and insecure FMR interface from all ULPs
     and drivers. FRWR has been preferred for at least a decade now"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (247 commits)
  RDMA/cm: Spurious WARNING triggered in cm_destroy_id()
  RDMA/mlx5: Return ECE DC support
  RDMA/mlx5: Don't rely on FW to set zeros in ECE response
  RDMA/mlx5: Return an error if copy_to_user fails
  IB/hfi1: Use free_netdev() in hfi1_netdev_free()
  RDMA/hns: Uninitialized variable in modify_qp_init_to_rtr()
  RDMA/core: Move and rename trace_cm_id_create()
  IB/hfi1: Fix hfi1_netdev_rx_init() error handling
  RDMA: Remove 'max_map_per_fmr'
  RDMA: Remove 'max_fmr'
  RDMA/core: Remove FMR device ops
  RDMA/rdmavt: Remove FMR memory registration
  RDMA/mthca: Remove FMR support for memory registration
  RDMA/mlx4: Remove FMR support for memory registration
  RDMA/i40iw: Remove FMR leftovers
  RDMA/bnxt_re: Remove FMR leftovers
  RDMA/mlx5: Remove FMR leftovers
  RDMA/core: Remove FMR pool API
  RDMA/rds: Remove FMR support for memory registration
  RDMA/srp: Remove support for FMR memory registration
  ...
parents 3f7e8237 fba97dc7
What: /sys/block/rnbd<N>/rnbd/unmap_device
Date: Feb 2020
KernelVersion: 5.7
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
Description: To unmap a volume, "normal" or "force" has to be written to:
/sys/block/rnbd<N>/rnbd/unmap_device
When "normal" is used, the operation will fail with EBUSY if any process
is using the device. When "force" is used, the device is unmapped even
when it is in use. All I/Os that are in progress will fail.
Example:
# echo "normal" > /sys/block/rnbd0/rnbd/unmap_device
What: /sys/block/rnbd<N>/rnbd/state
Date: Feb 2020
KernelVersion: 5.7
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
Description: The file contains the current state of the block device. The state file
returns "open" when the device is successfully mapped from the server
and accepting I/O requests. When the connection to the server gets
disconnected in case of an error (e.g. link failure), the state file
returns "closed" and all I/O requests submitted to it will fail with -EIO.
What: /sys/block/rnbd<N>/rnbd/session
Date: Feb 2020
KernelVersion: 5.7
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
Description: RNBD uses RTRS session to transport the data between client and
server. The entry "session" contains the name of the session, that
was used to establish the RTRS session. It's the same name that
was passed as server parameter to the map_device entry.
What: /sys/block/rnbd<N>/rnbd/mapping_path
Date: Feb 2020
KernelVersion: 5.7
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
Description: Contains the path that was passed as "device_path" to the map_device
operation.
What: /sys/block/rnbd<N>/rnbd/access_mode
Date: Feb 2020
KernelVersion: 5.7
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
Description: Contains the device access mode: ro, rw or migration.
What: /sys/class/rnbd-client
Date: Feb 2020
KernelVersion: 5.7
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
Description: Provide information about RNBD-client.
All sysfs files that are not read-only provide the usage information on read:
Example:
# cat /sys/class/rnbd-client/ctl/map_device
> Usage: echo "sessname=<name of the rtrs session> path=<[srcaddr,]dstaddr>
> [path=<[srcaddr,]dstaddr>] device_path=<full path on remote side>
> [access_mode=<ro|rw|migration>] > map_device
>
> addr ::= [ ip:<ipv4> | ip:<ipv6> | gid:<gid> ]
What: /sys/class/rnbd-client/ctl/map_device
Date: Feb 2020
KernelVersion: 5.7
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
Description: Expected format is the following:
sessname=<name of the rtrs session>
path=<[srcaddr,]dstaddr> [path=<[srcaddr,]dstaddr> ...]
device_path=<full path on remote side>
[access_mode=<ro|rw|migration>]
Where:
sessname: accepts a string not bigger than 256 chars, which identifies
a given session on the client and on the server.
I.e. "clt_hostname-srv_hostname" could be a natural choice.
path: describes a connection between the client and the server by
specifying destination and, when required, the source address.
The addresses are to be provided in the following format:
ip:<IPv6>
ip:<IPv4>
gid:<GID>
for example:
path=ip:10.0.0.66
The single addr is treated as the destination.
The connection will be established to this server from any client IP address.
path=ip:10.0.0.66,ip:10.0.1.66
First addr is the source address and the second is the destination.
If multiple "path=" options are specified, multiple connections
will be established and data will be sent according to
the selected multipath policy (see RTRS mp_policy sysfs entry description).
device_path: Path to the block device on the server side. Path is specified
relative to the directory on server side configured in the
'dev_search_path' module parameter of the rnbd_server.
The rnbd_server prepends the <device_path> received from client
with <dev_search_path> and tries to open the
<dev_search_path>/<device_path> block device. On success,
a /dev/rnbd<N> device file, a /sys/block/rnbd_client/rnbd<N>/
directory and an entry in /sys/class/rnbd-client/ctl/devices
will be created.
If 'dev_search_path' contains '%SESSNAME%', then each session can
have different devices namespace, e.g. server was configured with
the following parameter "dev_search_path=/run/rnbd-devs/%SESSNAME%",
client has this string "sessname=blya device_path=sda", then server
will try to open: /run/rnbd-devs/blya/sda.
access_mode: the access_mode parameter specifies if the device is to be
mapped as "ro" read-only or "rw" read-write. The server allows
a device to be exported in rw mode only once. The "migration"
access mode has to be specified if a second mapping in read-write
mode is desired.
By default "rw" is used.
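The access rule above can be modelled in a few lines of userspace C. This is
only a sketch, not the server's code: the type and function names are
hypothetical, and the assumption that a "migration" mapping counts as a writer
and that at most two writers may exist at once is an illustrative reading of
"a second mapping in read-write mode".

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the three documented access modes; names are local to this sketch. */
enum sketch_access_mode {
	SKETCH_ACCESS_RO,
	SKETCH_ACCESS_RW,
	SKETCH_ACCESS_MIGRATION,
};

/* Hypothetical per-device bookkeeping on the server side. */
struct sketch_export {
	unsigned int writers;	/* mappings currently holding write access */
};

/* Returns 0 if the mapping would be allowed, -1 if it would be refused. */
static int sketch_export_open(struct sketch_export *exp,
			      enum sketch_access_mode mode)
{
	assert(exp != NULL);

	switch (mode) {
	case SKETCH_ACCESS_RO:
		/* Read-only mappings are always allowed. */
		return 0;
	case SKETCH_ACCESS_RW:
		/* A plain "rw" mapping is allowed only for the first writer. */
		if (exp->writers != 0)
			return -1;
		exp->writers++;
		return 0;
	case SKETCH_ACCESS_MIGRATION:
		/* Assumption: "migration" permits one additional writer. */
		if (exp->writers >= 2)
			return -1;
		exp->writers++;
		return 0;
	}
	return -1;
}
```

With this model, a second "rw" request on an already writable export is
refused, while a "migration" request on the same export succeeds once.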
Exit Codes:
If the device is already mapped it will fail with EEXIST. If the input
has an invalid format it will return EINVAL. If the device path cannot
be found on the server, it will fail with ENOENT.
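As an illustration of the expected format, the following userspace C sketch
assembles a single-path request string that could then be written to
map_device. The helper name and its signature are hypothetical; only the
option names, their order, and the 256 character limit on sessname come from
the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format a single-path map_device request.
 * src_addr may be NULL (destination only). access_mode may be NULL, in
 * which case the option is omitted and the documented default applies.
 * Returns 0 on success, -1 on invalid input or a too-small buffer.
 */
static int sketch_format_map_request(char *out, size_t out_len,
				     const char *sessname,
				     const char *src_addr,
				     const char *dst_addr,
				     const char *device_path,
				     const char *access_mode)
{
	size_t used;
	int n;

	assert(out && sessname && dst_addr && device_path);

	/* sessname is documented as "not bigger than 256 chars". */
	if (strlen(sessname) > 256)
		return -1;

	if (src_addr)	/* source address first, then destination */
		n = snprintf(out, out_len,
			     "sessname=%s path=%s,%s device_path=%s",
			     sessname, src_addr, dst_addr, device_path);
	else
		n = snprintf(out, out_len,
			     "sessname=%s path=%s device_path=%s",
			     sessname, dst_addr, device_path);
	if (n < 0 || (size_t)n >= out_len)
		return -1;
	used = (size_t)n;

	if (access_mode) {
		n = snprintf(out + used, out_len - used,
			     " access_mode=%s", access_mode);
		if (n < 0 || (size_t)n >= out_len - used)
			return -1;
	}
	return 0;
}
```

For sessname "blya", destination "ip:10.0.0.66" and device_path "sda", the
helper produces "sessname=blya path=ip:10.0.0.66 device_path=sda".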
Finding device file after mapping
---------------------------------
After mapping, the device file can be found by:
o The symlink /sys/class/rnbd-client/ctl/devices/<device_id>
points to /sys/block/<dev-name>. The last part of the symlink destination
is the same as the device name. By extracting the last part of the
path, the path to the device /dev/<dev-name> can be built.
o /dev/block/$(cat /sys/class/rnbd-client/ctl/devices/<device_id>/dev)
How to find the <device_id> of the device is described in the next
section.
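The first method can be sketched in userspace C as follows. The function name
is hypothetical, and the sketch works on a symlink target that the caller has
already read (for example with readlink(2)); it only takes the last path
component and prefixes it with /dev/.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: given the destination of the
 * /sys/class/rnbd-client/ctl/devices/<device_id> symlink, build the
 * /dev/<dev-name> path from its last component.
 * Returns 0 on success, -1 on an empty name or a too-small buffer.
 */
static int sketch_dev_node_from_link(const char *link_target,
				     char *out, size_t out_len)
{
	const char *base;
	int n;

	assert(link_target != NULL && out != NULL);

	/* The device name is whatever follows the last '/'. */
	base = strrchr(link_target, '/');
	base = base ? base + 1 : link_target;
	if (*base == '\0')
		return -1;

	n = snprintf(out, out_len, "/dev/%s", base);
	if (n < 0 || (size_t)n >= out_len)
		return -1;
	return 0;
}
```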
What: /sys/class/rnbd-client/ctl/devices/
Date: Feb 2020
KernelVersion: 5.7
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
Description: For each device mapped on the client a new symbolic link is created as
/sys/class/rnbd-client/ctl/devices/<device_id>, which points
to the block device created by rnbd (/sys/block/rnbd<N>/).
The <device_id> of each device is created as follows:
- If the 'device_path' provided during mapping contains slashes ("/"),
they are replaced by exclamation marks ("!") and the result is used as the
<device_id>. Otherwise, the <device_id> will be the same as the
"device_path" provided.
What: /sys/class/rnbd-server
Date: Feb 2020
KernelVersion: 5.7
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
Description: provide information about RNBD-server.
What: /sys/class/rnbd-server/ctl/
Date: Feb 2020
KernelVersion: 5.7
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
Description: When a client maps a device, a directory entry with the name of the
block device is created under /sys/class/rnbd-server/ctl/devices/.
What: /sys/class/rnbd-server/ctl/devices/<device_name>/block_dev
Date: Feb 2020
KernelVersion: 5.7
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
Description: Is a symlink to the sysfs entry of the exported device.
Example:
block_dev -> ../../../../class/block/ram0
What: /sys/class/rnbd-server/ctl/devices/<device_name>/sessions/
Date: Feb 2020
KernelVersion: 5.7
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
Description: For each client a particular device is exported to, the following directory will be
created:
/sys/class/rnbd-server/ctl/devices/<device_name>/sessions/<session-name>/
When the device is unmapped by that client, the directory will be removed.
What: /sys/class/rnbd-server/ctl/devices/<device_name>/sessions/<session-name>/read_only
Date: Feb 2020
KernelVersion: 5.7
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
Description: Contains '1' if device is mapped read-only, otherwise '0'.
What: /sys/class/rnbd-server/ctl/devices/<device_name>/sessions/<session-name>/mapping_path
Date: Feb 2020
KernelVersion: 5.7
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
Description: Contains the relative device path provided by the user during mapping.
What: /sys/class/rnbd-server/ctl/devices/<device_name>/sessions/<session-name>/access_mode
Date: Feb 2020
KernelVersion: 5.7
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
Description: Contains the device access mode: ro, rw or migration.
What: /sys/class/rtrs-client
Date: Feb 2020
KernelVersion: 5.7
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
Description: When a user of RTRS API creates a new session, a directory entry with
the name of that session is created under /sys/class/rtrs-client/<session-name>/
What: /sys/class/rtrs-client/<session-name>/add_path
Date: Feb 2020
KernelVersion: 5.7
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
Description: RW, adds a new path (connection) to an existing session. Expected format is the
following:
<[source addr,]destination addr>
*addr ::= [ ip:<ipv4|ipv6> | gid:<gid> ]
What: /sys/class/rtrs-client/<session-name>/max_reconnect_attempts
Date: Feb 2020
KernelVersion: 5.7
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
Description: Maximum number of reconnect attempts the client should make before giving up
after connection breaks unexpectedly.
What: /sys/class/rtrs-client/<session-name>/mp_policy
Date: Feb 2020
KernelVersion: 5.7
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
Description: Multipath policy specifies which path should be selected on each IO:
round-robin (0):
select path in per CPU round-robin manner.
min-inflight (1):
select path with minimum inflights.
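The two policies can be illustrated with a small userspace C model. It is a
sketch under stated assumptions, not the RTRS implementation: the names are
hypothetical, each path is reduced to an inflight counter, and the single
cursor passed by the caller stands in for one CPU's round-robin position.

```c
#include <assert.h>
#include <stddef.h>

/* Policy numbers follow the sysfs description: 0 and 1. */
enum sketch_mp_policy {
	SKETCH_MP_ROUND_ROBIN = 0,
	SKETCH_MP_MIN_INFLIGHT = 1,
};

/* Hypothetical path model: only the number of inflight IOs matters here. */
struct sketch_path {
	unsigned int inflight;
};

/* Returns the index of the path to use for the next IO. */
static size_t sketch_select_path(enum sketch_mp_policy policy,
				 const struct sketch_path *paths, size_t n,
				 size_t *rr_cursor)
{
	size_t i, best = 0;

	assert(paths != NULL && rr_cursor != NULL && n > 0);

	if (policy == SKETCH_MP_ROUND_ROBIN) {
		/* Take the path under the cursor, then advance the cursor. */
		best = *rr_cursor % n;
		*rr_cursor = (best + 1) % n;
		return best;
	}

	/* min-inflight: first path with the smallest inflight count. */
	for (i = 1; i < n; i++)
		if (paths[i].inflight < paths[best].inflight)
			best = i;
	return best;
}
```

Round-robin spreads IOs evenly regardless of load, while min-inflight favours
whichever path currently has the fewest outstanding requests.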
What: /sys/class/rtrs-client/<session-name>/paths/
Date: Feb 2020
KernelVersion: 5.7
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
Description: Each path belonging to a given session is listed here by its source and
destination address. When a new path is added to a session by writing to
the "add_path" entry, a directory <src@dst> is created.
What: /sys/class/rtrs-client/<session-name>/paths/<src@dst>/state
Date: Feb 2020
KernelVersion: 5.7
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
Description: RO, Contains "connected" if the session is connected to the peer and fully
functional. Otherwise the file contains "disconnected"
What: /sys/class/rtrs-client/<session-name>/paths/<src@dst>/reconnect
Date: Feb 2020
KernelVersion: 5.7
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
Description: Write "1" to the file in order to reconnect the path.
Operation is blocking and returns 0 if reconnect was successful.
What: /sys/class/rtrs-client/<session-name>/paths/<src@dst>/disconnect
Date: Feb 2020
KernelVersion: 5.7
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
Description: Write "1" to the file in order to disconnect the path.
Operation blocks until RTRS path is disconnected.
What: /sys/class/rtrs-client/<session-name>/paths/<src@dst>/remove_path
Date: Feb 2020
KernelVersion: 5.7
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
Description: Write "1" to the file in order to disconnect and remove the path
from the session. Operation blocks until the path is disconnected
and removed from the session.
What: /sys/class/rtrs-client/<session-name>/paths/<src@dst>/hca_name
Date: Feb 2020
KernelVersion: 5.7
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
Description: RO, Contains the name of the HCA the connection was established on.
What: /sys/class/rtrs-client/<session-name>/paths/<src@dst>/hca_port
Date: Feb 2020
KernelVersion: 5.7
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
Description: RO, Contains the port number of the active port that traffic is going through.
What: /sys/class/rtrs-client/<session-name>/paths/<src@dst>/src_addr
Date: Feb 2020
KernelVersion: 5.7
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
Description: RO, Contains the source address of the path
What: /sys/class/rtrs-client/<session-name>/paths/<src@dst>/dst_addr
Date: Feb 2020
KernelVersion: 5.7
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
Description: RO, Contains the destination address of the path
What: /sys/class/rtrs-client/<session-name>/paths/<src@dst>/stats/reset_all
Date: Feb 2020
KernelVersion: 5.7
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
Description: RW, Read will return usage help, write 0 will clear all the statistics.
What: /sys/class/rtrs-client/<session-name>/paths/<src@dst>/stats/cpu_migration
Date: Feb 2020
KernelVersion: 5.7
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
Description: RTRS expects that each HCA IRQ is pinned to a separate CPU. If it's
not the case, an I/O response could be processed on a
different CPU than where it was originally submitted. This file shows
how many interrupts were generated on an unexpected CPU.
"from:" is the CPU on which the IRQ was expected, but not generated.
"to:" is the CPU on which the IRQ was generated, but not expected.
What: /sys/class/rtrs-client/<session-name>/paths/<src@dst>/stats/reconnects
Date: Feb 2020
KernelVersion: 5.7
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
Description: Contains 2 unsigned int values, the first one records number of successful
reconnects in the path lifetime, the second one records number of failed
reconnects in the path lifetime.
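A monitoring tool could parse the content of this file roughly as shown
below. This is a userspace sketch with hypothetical names; it assumes the two
values are separated by whitespace on a single line, which the description
above does not state explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical container for the two documented counters. */
struct sketch_reconnects {
	unsigned int successful;	/* first value in the file */
	unsigned int failed;		/* second value in the file */
};

/* Returns 0 if both counters were parsed, -1 otherwise. */
static int sketch_parse_reconnects(const char *text,
				   struct sketch_reconnects *out)
{
	assert(text != NULL && out != NULL);

	if (sscanf(text, "%u %u", &out->successful, &out->failed) != 2)
		return -1;
	return 0;
}
```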
What: /sys/class/rtrs-client/<session-name>/paths/<src@dst>/stats/rdma
Date: Feb 2020
KernelVersion: 5.7
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
Description: Contains statistics regarding rdma operations and inflight operations.
The output consists of 6 values:
<read-count> <read-total-size> <write-count> <write-total-size> \
<inflights> <failovered>
What: /sys/class/rtrs-server
Date: Feb 2020
KernelVersion: 5.7
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
Description: When a user of RTRS API creates a new session on a client side, a
directory entry with the name of that session is created in here.
What: /sys/class/rtrs-server/<session-name>/paths/
Date: Feb 2020
KernelVersion: 5.7
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
Description: When a new path is created by writing to "add_path" entry on client side,
a directory entry named as <source address>@<destination address> is created
on server.
What: /sys/class/rtrs-server/<session-name>/paths/<src@dst>/disconnect
Date: Feb 2020
KernelVersion: 5.7
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
Description: When "1" is written to the file, the RTRS session is disconnected.
The operation is non-blocking and returns control immediately to the caller.
What: /sys/class/rtrs-server/<session-name>/paths/<src@dst>/hca_name
Date: Feb 2020
KernelVersion: 5.7
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
Description: RO, Contains the name of the HCA the connection was established on.
What: /sys/class/rtrs-server/<session-name>/paths/<src@dst>/hca_port
Date: Feb 2020
KernelVersion: 5.7
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
Description: RO, Contains the port number of the active port that traffic is going through.
What: /sys/class/rtrs-server/<session-name>/paths/<src@dst>/src_addr
Date: Feb 2020
KernelVersion: 5.7
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
Description: RO, Contains the source address of the path
What: /sys/class/rtrs-server/<session-name>/paths/<src@dst>/dst_addr
Date: Feb 2020
KernelVersion: 5.7
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
Description: RO, Contains the destination address of the path
What: /sys/class/rtrs-server/<session-name>/paths/<src@dst>/stats/rdma
Date: Feb 2020
KernelVersion: 5.7
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
Description: Contains statistics regarding rdma operations and inflight operations.
The output consists of 5 values:
<read-count> <read-total-size> <write-count> <write-total-size> <inflights>
......@@ -37,9 +37,6 @@ InfiniBand core interfaces
.. kernel-doc:: drivers/infiniband/core/ud_header.c
:export:
.. kernel-doc:: drivers/infiniband/core/fmr_pool.c
:export:
.. kernel-doc:: drivers/infiniband/core/umem.c
:export:
......
......@@ -22,7 +22,6 @@ Sleeping and interrupt context
- post_recv
- poll_cq
- req_notify_cq
- map_phys_fmr
which may not sleep and must be callable from any context.
......@@ -36,7 +35,6 @@ Sleeping and interrupt context
- ib_post_send
- ib_post_recv
- ib_req_notify_cq
- ib_map_phys_fmr
are therefore safe to call from any context.
......
......@@ -14579,6 +14579,13 @@ F: arch/riscv/
N: riscv
K: riscv
RNBD BLOCK DRIVERS
M: Danil Kipnis <danil.kipnis@cloud.ionos.com>
M: Jack Wang <jinpu.wang@cloud.ionos.com>
L: linux-block@vger.kernel.org
S: Maintained
F: drivers/block/rnbd/
ROCCAT DRIVERS
M: Stefan Achatz <erazor_de@users.sourceforge.net>
S: Maintained
......@@ -14716,6 +14723,13 @@ S: Maintained
T: git git://git.kernel.org/pub/scm/linux/kernel/git/jes/linux.git rtl8xxxu-devel
F: drivers/net/wireless/realtek/rtl8xxxu/
RTRS TRANSPORT DRIVERS
M: Danil Kipnis <danil.kipnis@cloud.ionos.com>
M: Jack Wang <jinpu.wang@cloud.ionos.com>
L: linux-rdma@vger.kernel.org
S: Maintained
F: drivers/infiniband/ulp/rtrs/
RXRPC SOCKETS (AF_RXRPC)
M: David Howells <dhowells@redhat.com>
L: linux-afs@lists.infradead.org
......
......@@ -458,4 +458,6 @@ config BLK_DEV_RSXX
To compile this driver as a module, choose M here: the
module will be called rsxx.
source "drivers/block/rnbd/Kconfig"
endif # BLK_DEV
......@@ -39,6 +39,7 @@ obj-$(CONFIG_BLK_DEV_PCIESSD_MTIP32XX) += mtip32xx/
obj-$(CONFIG_BLK_DEV_RSXX) += rsxx/
obj-$(CONFIG_ZRAM) += zram/
obj-$(CONFIG_BLK_DEV_RNBD) += rnbd/
obj-$(CONFIG_BLK_DEV_NULL_BLK) += null_blk.o
null_blk-objs := null_blk_main.o
......
# SPDX-License-Identifier: GPL-2.0-or-later
config BLK_DEV_RNBD
bool
config BLK_DEV_RNBD_CLIENT
tristate "RDMA Network Block Device driver client"
depends on INFINIBAND_RTRS_CLIENT
select BLK_DEV_RNBD
help
RNBD client is a network block device driver using rdma transport.
RNBD client allows mapping of remote block devices over the
RTRS protocol from a target system where RNBD server is running.
If unsure, say N.
config BLK_DEV_RNBD_SERVER
tristate "RDMA Network Block Device driver server"
depends on INFINIBAND_RTRS_SERVER
select BLK_DEV_RNBD
help
RNBD server is the server side of RNBD using rdma transport.
RNBD server allows for exporting local block devices to a remote client
over RTRS protocol.
If unsure, say N.
# SPDX-License-Identifier: GPL-2.0-or-later
ccflags-y := -I$(srctree)/drivers/infiniband/ulp/rtrs
rnbd-client-y := rnbd-clt.o \
rnbd-clt-sysfs.o \
rnbd-common.o
rnbd-server-y := rnbd-common.o \
rnbd-srv.o \
rnbd-srv-dev.o \
rnbd-srv-sysfs.o
obj-$(CONFIG_BLK_DEV_RNBD_CLIENT) += rnbd-client.o
obj-$(CONFIG_BLK_DEV_RNBD_SERVER) += rnbd-server.o
********************************
RDMA Network Block Device (RNBD)
********************************
Introduction
------------
RNBD (RDMA Network Block Device) is a pair of kernel modules
(client and server) that allow for remote access of a block device on
the server over RTRS protocol using the RDMA (InfiniBand, RoCE, iWARP)
transport. After being mapped, the remote block devices can be accessed
on the client side as local block devices.
I/O is transferred between client and server by the RTRS transport
modules. The administration of RNBD and RTRS modules is done via
sysfs entries.
Requirements
------------
RTRS kernel modules
Quick Start
-----------
Server side:
# modprobe rnbd_server
Client side:
# modprobe rnbd_client
# echo "sessname=blya path=ip:10.50.100.66 device_path=/dev/ram0" > \
/sys/devices/virtual/rnbd-client/ctl/map_device
Where "sessname=" is a session name, a string to identify the session
on client and on server sides; "path=" is a destination IP address or
a pair of a source and a destination IPs, separated by comma. Multiple
"path=" options can be specified in order to use multipath (see RTRS
description for details); "device_path=" is the block device to be
mapped from the server side. After the session to the server machine is
established, the mapped device will appear on the client side under
/dev/rnbd<N>.
RNBD-Server Module Parameters
=============================
dev_search_path
---------------
When a device is mapped from the client, the server generates the path
to the block device on the server side by concatenating dev_search_path
and the "device_path" that was specified in the map_device operation.
The default dev_search_path is: "/".
dev_search_path option can also contain %SESSNAME% in order to provide
different device namespaces for different sessions. See "device_path"
option for details.
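The path construction described above can be modelled in userspace C as shown
below. This is a sketch, not the server's code: the function name is
hypothetical, only the first %SESSNAME% occurrence is substituted, and
repeated slashes are not normalized.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_SESSNAME_TOKEN "%SESSNAME%"

/*
 * Hypothetical model of the server-side lookup: substitute sessname for
 * the first "%SESSNAME%" in dev_search_path, then append "/" and
 * device_path. Returns 0 on success, -1 if the buffer is too small.
 */
static int sketch_resolve_path(char *out, size_t out_len,
			       const char *dev_search_path,
			       const char *sessname,
			       const char *device_path)
{
	const char *tok;
	int n;

	assert(out && dev_search_path && sessname && device_path);

	tok = strstr(dev_search_path, SKETCH_SESSNAME_TOKEN);
	if (tok)
		/* prefix + sessname + rest of the search path + "/" + device */
		n = snprintf(out, out_len, "%.*s%s%s/%s",
			     (int)(tok - dev_search_path), dev_search_path,
			     sessname, tok + strlen(SKETCH_SESSNAME_TOKEN),
			     device_path);
	else
		n = snprintf(out, out_len, "%s/%s",
			     dev_search_path, device_path);

	if (n < 0 || (size_t)n >= out_len)
		return -1;
	return 0;
}
```

With dev_search_path "/run/rnbd-devs/%SESSNAME%", sessname "blya" and
device_path "sda", the result is "/run/rnbd-devs/blya/sda".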
============================
Protocol (rnbd/rnbd-proto.h)
============================
1. Before mapping the first device from a given server, the client sends an
RNBD_MSG_SESS_INFO to the server. Server responds with
RNBD_MSG_SESS_INFO_RSP. Currently the messages only contain the protocol
version for backward compatibility.
2. Client requests to open a device by sending RNBD_MSG_OPEN message. This
contains the path to the device and access mode (read-only or writable).
Server responds to the message with RNBD_MSG_OPEN_RSP. This contains
a 32 bit device id to be used for IOs and device "geometry" related
information: size, max_hw_sectors, etc.
3. Client attaches RNBD_MSG_IO to each IO message sent to a device. This
message contains the device id provided by the server in its rnbd_msg_open_rsp,
sector to be accessed, read-write flags and bi_size.
4. Client closes a device by sending RNBD_MSG_CLOSE which contains only the
device id provided by the server.
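To make the wire format of the simplest message concrete, the userspace C
sketch below encodes a close request as little-endian bytes. The layout (a
16 bit type, 16 bits of padding, then a 32 bit device id) follows struct
rnbd_msg_hdr and struct rnbd_msg_close in rnbd-proto.h, and the type value
follows the order of enum rnbd_msg_type; the function and constant names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same order as enum rnbd_msg_type, so SKETCH_MSG_CLOSE is 5. */
enum sketch_msg_type {
	SKETCH_MSG_SESS_INFO,
	SKETCH_MSG_SESS_INFO_RSP,
	SKETCH_MSG_OPEN,
	SKETCH_MSG_OPEN_RSP,
	SKETCH_MSG_IO,
	SKETCH_MSG_CLOSE,
};

#define SKETCH_CLOSE_MSG_LEN 8	/* 4 byte header + 4 byte device id */

static void sketch_put_le16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v & 0xff);
	p[1] = (uint8_t)(v >> 8);
}

static void sketch_put_le32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v & 0xff);
	p[1] = (uint8_t)((v >> 8) & 0xff);
	p[2] = (uint8_t)((v >> 16) & 0xff);
	p[3] = (uint8_t)(v >> 24);
}

/* Returns the number of bytes written, or 0 if the buffer is too small. */
static size_t sketch_encode_close(uint8_t *buf, size_t len, uint32_t device_id)
{
	assert(buf != NULL);

	if (len < SKETCH_CLOSE_MSG_LEN)
		return 0;

	sketch_put_le16(buf, SKETCH_MSG_CLOSE);	/* hdr.type */
	sketch_put_le16(buf + 2, 0);		/* hdr.__padding */
	sketch_put_le32(buf + 4, device_id);	/* device_id */
	return SKETCH_CLOSE_MSG_LEN;
}
```

Encoding the fields byte by byte keeps the result independent of the host's
endianness, which is the purpose of the __le16 and __le32 annotations in the
header.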
=========================================
Contributors List(in alphabetical order)
=========================================
Danil Kipnis <danil.kipnis@profitbricks.com>
Fabian Holler <mail@fholler.de>
Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Jack Wang <jinpu.wang@profitbricks.com>
Kleber Souza <kleber.souza@profitbricks.com>
Lutz Pogrell <lutz.pogrell@cloud.ionos.com>
Milind Dumbare <Milind.dumbare@gmail.com>
Roman Penyaev <roman.penyaev@profitbricks.com>
/* SPDX-License-Identifier: GPL-2.0-or-later */
/*
* RDMA Network Block Driver
*
* Copyright (c) 2014 - 2018 ProfitBricks GmbH. All rights reserved.
* Copyright (c) 2018 - 2019 1&1 IONOS Cloud GmbH. All rights reserved.
* Copyright (c) 2019 - 2020 1&1 IONOS SE. All rights reserved.
*/
#ifndef RNBD_CLT_H
#define RNBD_CLT_H
#include <linux/wait.h>
#include <linux/in.h>
#include <linux/inet.h>
#include <linux/blk-mq.h>
#include <linux/refcount.h>
#include <rtrs.h>
#include "rnbd-proto.h"
#include "rnbd-log.h"
/* Max. number of segments per IO request, Mellanox Connect X ~ Connect X5,
* choose minimal 30 for all, minus 1 for internal protocol, so 29.
*/
#define BMAX_SEGMENTS 29
/* time in seconds between reconnect tries, default to 30 s */
#define RECONNECT_DELAY 30
/*
* Number of times to reconnect on error before giving up, 0 for disabled,
* -1 for forever
*/
#define MAX_RECONNECTS -1
enum rnbd_clt_dev_state {
DEV_STATE_INIT,
DEV_STATE_MAPPED,
DEV_STATE_MAPPED_DISCONNECTED,
DEV_STATE_UNMAPPED,
};
struct rnbd_iu_comp {
wait_queue_head_t wait;
int errno;
};
struct rnbd_iu {
union {
struct request *rq; /* for block io */
void *buf; /* for user messages */
};
struct rtrs_permit *permit;
union {
/* use to send msg associated with a dev */
struct rnbd_clt_dev *dev;
/* use to send msg associated with a sess */
struct rnbd_clt_session *sess;
};
struct scatterlist sglist[BMAX_SEGMENTS];
struct work_struct work;
int errno;
struct rnbd_iu_comp comp;
atomic_t refcount;
};
struct rnbd_cpu_qlist {
struct list_head requeue_list;
spinlock_t requeue_lock;
unsigned int cpu;
};
struct rnbd_clt_session {
struct list_head list;
struct rtrs_clt *rtrs;
wait_queue_head_t rtrs_waitq;
bool rtrs_ready;
struct rnbd_cpu_qlist __percpu
*cpu_queues;
DECLARE_BITMAP(cpu_queues_bm, NR_CPUS);
int __percpu *cpu_rr; /* per-cpu var for CPU round-robin */
atomic_t busy;
int queue_depth;
u32 max_io_size;
struct blk_mq_tag_set tag_set;
struct mutex lock; /* protects state and devs_list */
struct list_head devs_list; /* list of struct rnbd_clt_dev */
refcount_t refcount;
char sessname[NAME_MAX];
u8 ver; /* protocol version */
};
/**
* Submission queues.
*/
struct rnbd_queue {
struct list_head requeue_list;
unsigned long in_list;
struct rnbd_clt_dev *dev;
struct blk_mq_hw_ctx *hctx;
};
struct rnbd_clt_dev {
struct rnbd_clt_session *sess;
struct request_queue *queue;
struct rnbd_queue *hw_queues;
u32 device_id;
/* local Idr index - used to track minor number allocations. */
u32 clt_device_id;
struct mutex lock;
enum rnbd_clt_dev_state dev_state;
char pathname[NAME_MAX];
enum rnbd_access_mode access_mode;
bool read_only;
bool rotational;
u32 max_hw_sectors;
u32 max_write_same_sectors;
u32 max_discard_sectors;
u32 discard_granularity;
u32 discard_alignment;
u16 secure_discard;
u16 physical_block_size;
u16 logical_block_size;
u16 max_segments;
size_t nsectors;
u64 size; /* device size in bytes */
struct list_head list;
struct gendisk *gd;
struct kobject kobj;
char blk_symlink_name[NAME_MAX];
refcount_t refcount;
struct work_struct unmap_on_rmmod_work;
};
/* rnbd-clt.c */
struct rnbd_clt_dev *rnbd_clt_map_device(const char *sessname,
struct rtrs_addr *paths,
size_t path_cnt, u16 port_nr,
const char *pathname,
enum rnbd_access_mode access_mode);
int rnbd_clt_unmap_device(struct rnbd_clt_dev *dev, bool force,
const struct attribute *sysfs_self);
int rnbd_clt_remap_device(struct rnbd_clt_dev *dev);
int rnbd_clt_resize_disk(struct rnbd_clt_dev *dev, size_t newsize);
/* rnbd-clt-sysfs.c */
int rnbd_clt_create_sysfs_files(void);
void rnbd_clt_destroy_sysfs_files(void);
void rnbd_clt_destroy_default_group(void);
void rnbd_clt_remove_dev_symlink(struct rnbd_clt_dev *dev);
#endif /* RNBD_CLT_H */
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* RDMA Network Block Driver
*
* Copyright (c) 2014 - 2018 ProfitBricks GmbH. All rights reserved.
* Copyright (c) 2018 - 2019 1&1 IONOS Cloud GmbH. All rights reserved.
* Copyright (c) 2019 - 2020 1&1 IONOS SE. All rights reserved.
*/
#include "rnbd-proto.h"
const char *rnbd_access_mode_str(enum rnbd_access_mode mode)
{
switch (mode) {
case RNBD_ACCESS_RO:
return "ro";
case RNBD_ACCESS_RW:
return "rw";
case RNBD_ACCESS_MIGRATION:
return "migration";
default:
return "unknown";
}
}
/* SPDX-License-Identifier: GPL-2.0-or-later */
/*
* RDMA Network Block Driver
*
* Copyright (c) 2014 - 2018 ProfitBricks GmbH. All rights reserved.
* Copyright (c) 2018 - 2019 1&1 IONOS Cloud GmbH. All rights reserved.
* Copyright (c) 2019 - 2020 1&1 IONOS SE. All rights reserved.
*/
#ifndef RNBD_LOG_H
#define RNBD_LOG_H
#include "rnbd-clt.h"
#include "rnbd-srv.h"
#define rnbd_clt_log(fn, dev, fmt, ...) ( \
fn("<%s@%s> " fmt, (dev)->pathname, \
(dev)->sess->sessname, \
##__VA_ARGS__))
#define rnbd_srv_log(fn, dev, fmt, ...) ( \
fn("<%s@%s>: " fmt, (dev)->pathname, \
(dev)->sess->sessname, ##__VA_ARGS__))
#define rnbd_clt_err(dev, fmt, ...) \
rnbd_clt_log(pr_err, dev, fmt, ##__VA_ARGS__)
#define rnbd_clt_err_rl(dev, fmt, ...) \
rnbd_clt_log(pr_err_ratelimited, dev, fmt, ##__VA_ARGS__)
#define rnbd_clt_info(dev, fmt, ...) \
rnbd_clt_log(pr_info, dev, fmt, ##__VA_ARGS__)
#define rnbd_clt_info_rl(dev, fmt, ...) \
rnbd_clt_log(pr_info_ratelimited, dev, fmt, ##__VA_ARGS__)
#define rnbd_srv_err(dev, fmt, ...) \
rnbd_srv_log(pr_err, dev, fmt, ##__VA_ARGS__)
#define rnbd_srv_err_rl(dev, fmt, ...) \
rnbd_srv_log(pr_err_ratelimited, dev, fmt, ##__VA_ARGS__)
#define rnbd_srv_info(dev, fmt, ...) \
rnbd_srv_log(pr_info, dev, fmt, ##__VA_ARGS__)
#define rnbd_srv_info_rl(dev, fmt, ...) \
rnbd_srv_log(pr_info_ratelimited, dev, fmt, ##__VA_ARGS__)
#endif /* RNBD_LOG_H */
/* SPDX-License-Identifier: GPL-2.0-or-later */
/*
* RDMA Network Block Driver
*
* Copyright (c) 2014 - 2018 ProfitBricks GmbH. All rights reserved.
* Copyright (c) 2018 - 2019 1&1 IONOS Cloud GmbH. All rights reserved.
* Copyright (c) 2019 - 2020 1&1 IONOS SE. All rights reserved.
*/
#ifndef RNBD_PROTO_H
#define RNBD_PROTO_H
#include <linux/types.h>
#include <linux/blkdev.h>
#include <linux/limits.h>
#include <linux/inet.h>
#include <linux/in.h>
#include <linux/in6.h>
#include <rdma/ib.h>
#define RNBD_PROTO_VER_MAJOR 2
#define RNBD_PROTO_VER_MINOR 0
/* The default port number the RTRS server is listening on. */
#define RTRS_PORT 1234
/**
* enum rnbd_msg_type - RNBD message types
* @RNBD_MSG_SESS_INFO: initial session info from client to server
* @RNBD_MSG_SESS_INFO_RSP: initial session info from server to client
* @RNBD_MSG_OPEN: open (map) device request
* @RNBD_MSG_OPEN_RSP: response to an @RNBD_MSG_OPEN
* @RNBD_MSG_IO: block IO request operation
* @RNBD_MSG_CLOSE: close (unmap) device request
*/
enum rnbd_msg_type {
RNBD_MSG_SESS_INFO,
RNBD_MSG_SESS_INFO_RSP,
RNBD_MSG_OPEN,
RNBD_MSG_OPEN_RSP,
RNBD_MSG_IO,
RNBD_MSG_CLOSE,
};
/**
* struct rnbd_msg_hdr - header of RNBD messages
* @type: Message type, valid values see: enum rnbd_msg_type
*/
struct rnbd_msg_hdr {
__le16 type;
__le16 __padding;
};
/**
* A device may be mapped read-only (RO) many times, but read-write (RW) only
* once. An additional RW mapping is allowed if MIGRATION is requested (a
* second RW export can be required, for example, for VM migration).
*/
enum rnbd_access_mode {
RNBD_ACCESS_RO,
RNBD_ACCESS_RW,
RNBD_ACCESS_MIGRATION,
};
/**
* struct rnbd_msg_sess_info - initial session info from client to server
* @hdr: message header
* @ver: RNBD protocol version
*/
struct rnbd_msg_sess_info {
struct rnbd_msg_hdr hdr;
u8 ver;
u8 reserved[31];
};
/**
* struct rnbd_msg_sess_info_rsp - initial session info from server to client
* @hdr: message header
* @ver: RNBD protocol version
*/
struct rnbd_msg_sess_info_rsp {
struct rnbd_msg_hdr hdr;
u8 ver;
u8 reserved[31];
};
/**
* struct rnbd_msg_open - request to open a remote device.
* @hdr: message header
* @access_mode: the mode to open remote device, valid values see:
* enum rnbd_access_mode
* @dev_name: device path on remote side
*/
struct rnbd_msg_open {
struct rnbd_msg_hdr hdr;
u8 access_mode;
u8 resv1;
s8 dev_name[NAME_MAX];
u8 reserved[3];
};
/**
* struct rnbd_msg_close - request to close a remote device.
* @hdr: message header
* @device_id: device_id on server side to identify the device
*/
struct rnbd_msg_close {
struct rnbd_msg_hdr hdr;
__le32 device_id;
};
/**
* struct rnbd_msg_open_rsp - response message to RNBD_MSG_OPEN
* @hdr: message header
* @device_id: device_id on server side to identify the device
* @nsectors: number of sectors in the usual 512b unit
* @max_hw_sectors: max hardware sectors in the usual 512b unit
* @max_write_same_sectors: max sectors for WRITE SAME in the 512b unit
* @max_discard_sectors: max. sectors that can be discarded at once in 512b
* unit.
* @discard_granularity: size of the internal discard allocation unit in bytes
* @discard_alignment: offset from internal allocation assignment in bytes
* @physical_block_size: physical block size device supports in bytes
* @logical_block_size: logical block size device supports in bytes
* @max_segments: max segments the hardware supports in one transfer
* @secure_discard: supports secure discard
* @rotational: is a rotational disk?
*/
struct rnbd_msg_open_rsp {
struct rnbd_msg_hdr hdr;
__le32 device_id;
__le64 nsectors;
__le32 max_hw_sectors;
__le32 max_write_same_sectors;
__le32 max_discard_sectors;
__le32 discard_granularity;
__le32 discard_alignment;
__le16 physical_block_size;
__le16 logical_block_size;
__le16 max_segments;
__le16 secure_discard;
u8 rotational;
u8 reserved[11];
};
/**
* struct rnbd_msg_io - message for I/O read/write
* @hdr: message header
* @device_id: device_id on server side to find the right device
* @sector: bi_sector attribute from struct bio
* @rw: valid values are defined in enum rnbd_io_flags
* @bi_size: number of bytes for I/O read/write
* @prio: priority
*/
struct rnbd_msg_io {
struct rnbd_msg_hdr hdr;
__le32 device_id;
__le64 sector;
__le32 rw;
__le32 bi_size;
__le16 prio;
};
#define RNBD_OP_BITS 8
#define RNBD_OP_MASK ((1 << RNBD_OP_BITS) - 1)
/**
* enum rnbd_io_flags - RNBD request types from rq_flag_bits
* @RNBD_OP_READ: read sectors from the device
* @RNBD_OP_WRITE: write sectors to the device
* @RNBD_OP_FLUSH: flush the volatile write cache
* @RNBD_OP_DISCARD: discard sectors
* @RNBD_OP_SECURE_ERASE: securely erase sectors
* @RNBD_OP_WRITE_SAME: write the same data to multiple sectors
* @RNBD_F_SYNC: request is sync (sync write or read)
* @RNBD_F_FUA: forced unit access
*/
enum rnbd_io_flags {
/* Operations */
RNBD_OP_READ = 0,
RNBD_OP_WRITE = 1,
RNBD_OP_FLUSH = 2,
RNBD_OP_DISCARD = 3,
RNBD_OP_SECURE_ERASE = 4,
RNBD_OP_WRITE_SAME = 5,
RNBD_OP_LAST,
/* Flags */
RNBD_F_SYNC = 1<<(RNBD_OP_BITS + 0),
RNBD_F_FUA = 1<<(RNBD_OP_BITS + 1),
RNBD_F_ALL = (RNBD_F_SYNC | RNBD_F_FUA)
};
static inline u32 rnbd_op(u32 flags)
{
return flags & RNBD_OP_MASK;
}
static inline u32 rnbd_flags(u32 flags)
{
return flags & ~RNBD_OP_MASK;
}
static inline bool rnbd_flags_supported(u32 flags)
{
u32 op;
op = rnbd_op(flags);
flags = rnbd_flags(flags);
if (op >= RNBD_OP_LAST)
return false;
if (flags & ~RNBD_F_ALL)
return false;
return true;
}
static inline u32 rnbd_to_bio_flags(u32 rnbd_opf)
{
u32 bio_opf;
switch (rnbd_op(rnbd_opf)) {
case RNBD_OP_READ:
bio_opf = REQ_OP_READ;
break;
case RNBD_OP_WRITE:
bio_opf = REQ_OP_WRITE;
break;
case RNBD_OP_FLUSH:
bio_opf = REQ_OP_FLUSH | REQ_PREFLUSH;
break;
case RNBD_OP_DISCARD:
bio_opf = REQ_OP_DISCARD;
break;
case RNBD_OP_SECURE_ERASE:
bio_opf = REQ_OP_SECURE_ERASE;
break;
case RNBD_OP_WRITE_SAME:
bio_opf = REQ_OP_WRITE_SAME;
break;
default:
WARN(1, "Unknown RNBD type: %d (flags %d)\n",
rnbd_op(rnbd_opf), rnbd_opf);
bio_opf = 0;
}
if (rnbd_opf & RNBD_F_SYNC)
bio_opf |= REQ_SYNC;
if (rnbd_opf & RNBD_F_FUA)
bio_opf |= REQ_FUA;
return bio_opf;
}
static inline u32 rq_to_rnbd_flags(struct request *rq)
{
u32 rnbd_opf;
switch (req_op(rq)) {
case REQ_OP_READ:
rnbd_opf = RNBD_OP_READ;
break;
case REQ_OP_WRITE:
rnbd_opf = RNBD_OP_WRITE;
break;
case REQ_OP_DISCARD:
rnbd_opf = RNBD_OP_DISCARD;
break;
case REQ_OP_SECURE_ERASE:
rnbd_opf = RNBD_OP_SECURE_ERASE;
break;
case REQ_OP_WRITE_SAME:
rnbd_opf = RNBD_OP_WRITE_SAME;
break;
case REQ_OP_FLUSH:
rnbd_opf = RNBD_OP_FLUSH;
break;
default:
WARN(1, "Unknown request type %d (flags %llu)\n",
req_op(rq), (unsigned long long)rq->cmd_flags);
rnbd_opf = 0;
}
if (op_is_sync(rq->cmd_flags))
rnbd_opf |= RNBD_F_SYNC;
if (op_is_flush(rq->cmd_flags))
rnbd_opf |= RNBD_F_FUA;
return rnbd_opf;
}
const char *rnbd_access_mode_str(enum rnbd_access_mode mode);
#endif /* RNBD_PROTO_H */
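The `rw` field of `struct rnbd_msg_io` carries the operation in its low `RNBD_OP_BITS` bits and the flags in the bits above them. A minimal userspace sketch of that split follows, assuming the constant values shown in the header above. The `OP_*`/`F_*` names and the helpers `op_of`, `flags_of`, and `rw_supported` are shortened stand-ins for illustration, not the kernel's identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from rnbd-proto.h; the names are hypothetical short forms. */
#define OP_BITS 8
#define OP_MASK ((1u << OP_BITS) - 1)

enum {
	/* Operations occupy the low OP_BITS bits. */
	OP_READ = 0,
	OP_WRITE = 1,
	OP_FLUSH = 2,
	OP_DISCARD = 3,
	OP_SECURE_ERASE = 4,
	OP_WRITE_SAME = 5,
	OP_LAST,
	/* Flags start right above the operation bits. */
	F_SYNC = 1u << (OP_BITS + 0),
	F_FUA = 1u << (OP_BITS + 1),
	F_ALL = F_SYNC | F_FUA,
};

static_assert(OP_LAST <= OP_MASK, "operations must fit below the flag bits");

/* Extract the operation code from a combined value. */
static inline uint32_t op_of(uint32_t rw)
{
	return rw & OP_MASK;
}

/* Extract the flag bits from a combined value. */
static inline uint32_t flags_of(uint32_t rw)
{
	return rw & ~OP_MASK;
}

/* Accept only known operations combined with known flags. */
static inline bool rw_supported(uint32_t rw)
{
	if (op_of(rw) >= OP_LAST)
		return false;
	return (flags_of(rw) & ~(uint32_t)F_ALL) == 0;
}
```

With these definitions, `op_of(OP_WRITE | F_FUA)` yields `OP_WRITE`, and a value with a flag bit outside `F_ALL` is rejected by `rw_supported()`.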
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* RDMA Network Block Driver
*
* Copyright (c) 2014 - 2018 ProfitBricks GmbH. All rights reserved.
* Copyright (c) 2018 - 2019 1&1 IONOS Cloud GmbH. All rights reserved.
* Copyright (c) 2019 - 2020 1&1 IONOS SE. All rights reserved.
*/
#undef pr_fmt
#define pr_fmt(fmt) KBUILD_MODNAME " L" __stringify(__LINE__) ": " fmt
#include "rnbd-srv-dev.h"
#include "rnbd-log.h"
struct rnbd_dev *rnbd_dev_open(const char *path, fmode_t flags,
struct bio_set *bs)
{
struct rnbd_dev *dev;
int ret;
dev = kzalloc(sizeof(*dev), GFP_KERNEL);
if (!dev)
return ERR_PTR(-ENOMEM);
dev->blk_open_flags = flags;
dev->bdev = blkdev_get_by_path(path, flags, THIS_MODULE);
ret = PTR_ERR_OR_ZERO(dev->bdev);
if (ret)
goto err;
dev->blk_open_flags = flags;
bdevname(dev->bdev, dev->name);
dev->ibd_bio_set = bs;
return dev;
err:
kfree(dev);
return ERR_PTR(ret);
}
void rnbd_dev_close(struct rnbd_dev *dev)
{
blkdev_put(dev->bdev, dev->blk_open_flags);
kfree(dev);
}
static void rnbd_dev_bi_end_io(struct bio *bio)
{
struct rnbd_dev_blk_io *io = bio->bi_private;
rnbd_endio(io->priv, blk_status_to_errno(bio->bi_status));
bio_put(bio);
}
/**
* rnbd_bio_map_kern - map kernel address into bio
* @data: pointer to buffer to map
* @bs: bio_set to use.
* @len: length in bytes
* @gfp_mask: allocation flags for bio allocation
*
* Map the kernel address into a bio suitable for io to a block
* device. Returns an error pointer in case of error.
*/
static struct bio *rnbd_bio_map_kern(void *data, struct bio_set *bs,
unsigned int len, gfp_t gfp_mask)
{
unsigned long kaddr = (unsigned long)data;
unsigned long end = (kaddr + len + PAGE_SIZE - 1) >> PAGE_SHIFT;
unsigned long start = kaddr >> PAGE_SHIFT;
const int nr_pages = end - start;
int offset, i;
struct bio *bio;
bio = bio_alloc_bioset(gfp_mask, nr_pages, bs);
if (!bio)
return ERR_PTR(-ENOMEM);
offset = offset_in_page(kaddr);
for (i = 0; i < nr_pages; i++) {
unsigned int bytes = PAGE_SIZE - offset;
if (len <= 0)
break;
if (bytes > len)
bytes = len;
if (bio_add_page(bio, virt_to_page(data), bytes,
offset) < bytes) {
/* we don't support partial mappings */
bio_put(bio);
return ERR_PTR(-EINVAL);
}
data += bytes;
len -= bytes;
offset = 0;
}
bio->bi_end_io = bio_put;
return bio;
}
int rnbd_dev_submit_io(struct rnbd_dev *dev, sector_t sector, void *data,
size_t len, u32 bi_size, enum rnbd_io_flags flags,
short prio, void *priv)
{
struct rnbd_dev_blk_io *io;
struct bio *bio;
/* Generate bio with pages pointing to the rdma buffer */
bio = rnbd_bio_map_kern(data, dev->ibd_bio_set, len, GFP_KERNEL);
if (IS_ERR(bio))
return PTR_ERR(bio);
io = container_of(bio, struct rnbd_dev_blk_io, bio);
io->dev = dev;
io->priv = priv;
bio->bi_end_io = rnbd_dev_bi_end_io;
bio->bi_private = io;
bio->bi_opf = rnbd_to_bio_flags(flags);
bio->bi_iter.bi_sector = sector;
bio->bi_iter.bi_size = bi_size;
bio_set_prio(bio, prio);
bio_set_dev(bio, dev->bdev);
submit_bio(bio);
return 0;
}
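The `nr_pages` computation in `rnbd_bio_map_kern()` counts how many pages a buffer touches, which can be one more than `len / PAGE_SIZE` when the buffer does not start on a page boundary. A small userspace sketch of the same arithmetic, assuming 4 KiB pages (`SKETCH_PAGE_SHIFT` and `pages_spanned` are hypothetical names used only here):

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: 4 KiB pages. The kernel uses PAGE_SHIFT. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE (1ul << SKETCH_PAGE_SHIFT)

static_assert(SKETCH_PAGE_SIZE == 4096, "sketch assumes 4 KiB pages");

/*
 * Number of pages touched by the byte range [addr, addr + len).
 * "end" is the index of the first page past the range (rounded up),
 * "start" is the index of the page containing the first byte.
 */
static unsigned long pages_spanned(uintptr_t addr, unsigned int len)
{
	unsigned long end = (addr + len + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
	unsigned long start = addr >> SKETCH_PAGE_SHIFT;

	return end - start;
}
```

A page-aligned 4096-byte buffer spans one page, while the same length starting at offset 0x800 within a page spans two, which is why the loop in the driver adds pages with a nonzero `offset` only for the first iteration.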
/* SPDX-License-Identifier: GPL-2.0-or-later */
/*
* RDMA Network Block Driver
*
* Copyright (c) 2014 - 2018 ProfitBricks GmbH. All rights reserved.
* Copyright (c) 2018 - 2019 1&1 IONOS Cloud GmbH. All rights reserved.
* Copyright (c) 2019 - 2020 1&1 IONOS SE. All rights reserved.
*/
#ifndef RNBD_SRV_DEV_H
#define RNBD_SRV_DEV_H
#include <linux/fs.h>
#include "rnbd-proto.h"
struct rnbd_dev {
struct block_device *bdev;
struct bio_set *ibd_bio_set;
fmode_t blk_open_flags;
char name[BDEVNAME_SIZE];
};
struct rnbd_dev_blk_io {
struct rnbd_dev *dev;
void *priv;
/* has to be the last member for front_pad usage of bioset_init */
struct bio bio;
};
/**
* rnbd_dev_open() - Open a device
* @path: path of the block device to open
* @flags: open flags
* @bs: bio_set to use during block io
*/
struct rnbd_dev *rnbd_dev_open(const char *path, fmode_t flags,
struct bio_set *bs);
/**
* rnbd_dev_close() - Close a device
*/
void rnbd_dev_close(struct rnbd_dev *dev);
void rnbd_endio(void *priv, int error);
static inline int rnbd_dev_get_max_segs(const struct rnbd_dev *dev)
{
return queue_max_segments(bdev_get_queue(dev->bdev));
}
static inline int rnbd_dev_get_max_hw_sects(const struct rnbd_dev *dev)
{
return queue_max_hw_sectors(bdev_get_queue(dev->bdev));
}
static inline int rnbd_dev_get_secure_discard(const struct rnbd_dev *dev)
{
return blk_queue_secure_erase(bdev_get_queue(dev->bdev));
}
static inline int rnbd_dev_get_max_discard_sects(const struct rnbd_dev *dev)
{
if (!blk_queue_discard(bdev_get_queue(dev->bdev)))
return 0;
return blk_queue_get_max_sectors(bdev_get_queue(dev->bdev),
REQ_OP_DISCARD);
}
static inline int rnbd_dev_get_discard_granularity(const struct rnbd_dev *dev)
{
return bdev_get_queue(dev->bdev)->limits.discard_granularity;
}
static inline int rnbd_dev_get_discard_alignment(const struct rnbd_dev *dev)
{
return bdev_get_queue(dev->bdev)->limits.discard_alignment;
}
/**
* rnbd_dev_submit_io() - Submit an I/O to the disk
* @dev: device to which the I/O is submitted
* @sector: address to read/write data to
* @data: I/O data to write or buffer to read I/O data into
* @len: length of @data
* @bi_size: Amount of data that will be read/written
* @flags: operation and flags, see enum rnbd_io_flags
* @prio: IO priority
* @priv: private data passed to rnbd_endio() on completion
*/
int rnbd_dev_submit_io(struct rnbd_dev *dev, sector_t sector, void *data,
size_t len, u32 bi_size, enum rnbd_io_flags flags,
short prio, void *priv);
#endif /* RNBD_SRV_DEV_H */
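`rnbd_dev_submit_io()` recovers its per-I/O wrapper from the `struct bio` pointer with `container_of()`, which relies on the bio being embedded in `struct rnbd_dev_blk_io`. The pointer arithmetic behind that can be sketched in userspace as follows. `struct fake_bio`, `struct fake_blk_io`, `sketch_container_of`, and `io_from_bio` are hypothetical stand-ins, not kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified form of the kernel macro, without its type checking. */
#define sketch_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct bio. */
struct fake_bio {
	int status;
};

/* Mirrors the layout idea of rnbd_dev_blk_io: private fields, then the bio. */
struct fake_blk_io {
	void *dev;
	void *priv;
	struct fake_bio bio; /* kept last, as in the driver */
};

static_assert(offsetof(struct fake_blk_io, bio) > 0,
	      "private fields precede the embedded bio");

/* Step back from the embedded member to the enclosing wrapper. */
static struct fake_blk_io *io_from_bio(struct fake_bio *bio)
{
	return sketch_container_of(bio, struct fake_blk_io, bio);
}
```

In the driver, the space in front of the bio comes from the bio_set's front padding, which is why the comment in the header requires the bio to be the last member.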
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* RDMA Network Block Driver
*
* Copyright (c) 2014 - 2018 ProfitBricks GmbH. All rights reserved.
* Copyright (c) 2018 - 2019 1&1 IONOS Cloud GmbH. All rights reserved.
* Copyright (c) 2019 - 2020 1&1 IONOS SE. All rights reserved.
*/
#undef pr_fmt
#define pr_fmt(fmt) KBUILD_MODNAME " L" __stringify(__LINE__) ": " fmt
#include <uapi/linux/limits.h>
#include <linux/kobject.h>
#include <linux/sysfs.h>
#include <linux/stat.h>
#include <linux/genhd.h>
#include <linux/list.h>
#include <linux/moduleparam.h>
#include <linux/device.h>
#include "rnbd-srv.h"
static struct device *rnbd_dev;
static struct class *rnbd_dev_class;
static struct kobject *rnbd_devs_kobj;
static void rnbd_srv_dev_release(struct kobject *kobj)
{
struct rnbd_srv_dev *dev;
dev = container_of(kobj, struct rnbd_srv_dev, dev_kobj);
kfree(dev);
}
static struct kobj_type dev_ktype = {
.sysfs_ops = &kobj_sysfs_ops,
.release = rnbd_srv_dev_release
};
int rnbd_srv_create_dev_sysfs(struct rnbd_srv_dev *dev,
struct block_device *bdev,
const char *dev_name)
{
struct kobject *bdev_kobj;
int ret;
ret = kobject_init_and_add(&dev->dev_kobj, &dev_ktype,
rnbd_devs_kobj, dev_name);
if (ret)
return ret;
dev->dev_sessions_kobj = kobject_create_and_add("sessions",
&dev->dev_kobj);
if (!dev->dev_sessions_kobj) {
ret = -ENOMEM;
goto put_dev_kobj;
}
bdev_kobj = &disk_to_dev(bdev->bd_disk)->kobj;
ret = sysfs_create_link(&dev->dev_kobj, bdev_kobj, "block_dev");
if (ret)
goto put_sess_kobj;
return 0;
put_sess_kobj:
kobject_put(dev->dev_sessions_kobj);
put_dev_kobj:
kobject_put(&dev->dev_kobj);
return ret;
}
void rnbd_srv_destroy_dev_sysfs(struct rnbd_srv_dev *dev)
{
sysfs_remove_link(&dev->dev_kobj, "block_dev");
kobject_del(dev->dev_sessions_kobj);
kobject_put(dev->dev_sessions_kobj);
kobject_del(&dev->dev_kobj);
kobject_put(&dev->dev_kobj);
}
static ssize_t read_only_show(struct kobject *kobj, struct kobj_attribute *attr,
char *page)
{
struct rnbd_srv_sess_dev *sess_dev;
sess_dev = container_of(kobj, struct rnbd_srv_sess_dev, kobj);
return scnprintf(page, PAGE_SIZE, "%d\n",
!(sess_dev->open_flags & FMODE_WRITE));
}
static struct kobj_attribute rnbd_srv_dev_session_ro_attr =
__ATTR_RO(read_only);
static ssize_t access_mode_show(struct kobject *kobj,
struct kobj_attribute *attr,
char *page)
{
struct rnbd_srv_sess_dev *sess_dev;
sess_dev = container_of(kobj, struct rnbd_srv_sess_dev, kobj);
return scnprintf(page, PAGE_SIZE, "%s\n",
rnbd_access_mode_str(sess_dev->access_mode));
}
static struct kobj_attribute rnbd_srv_dev_session_access_mode_attr =
__ATTR_RO(access_mode);
static ssize_t mapping_path_show(struct kobject *kobj,
struct kobj_attribute *attr, char *page)
{
struct rnbd_srv_sess_dev *sess_dev;
sess_dev = container_of(kobj, struct rnbd_srv_sess_dev, kobj);
return scnprintf(page, PAGE_SIZE, "%s\n", sess_dev->pathname);
}
static struct kobj_attribute rnbd_srv_dev_session_mapping_path_attr =
__ATTR_RO(mapping_path);
static struct attribute *rnbd_srv_default_dev_sessions_attrs[] = {
&rnbd_srv_dev_session_access_mode_attr.attr,
&rnbd_srv_dev_session_ro_attr.attr,
&rnbd_srv_dev_session_mapping_path_attr.attr,
NULL,
};
static struct attribute_group rnbd_srv_default_dev_session_attr_group = {
.attrs = rnbd_srv_default_dev_sessions_attrs,
};
void rnbd_srv_destroy_dev_session_sysfs(struct rnbd_srv_sess_dev *sess_dev)
{
sysfs_remove_group(&sess_dev->kobj,
&rnbd_srv_default_dev_session_attr_group);
kobject_del(&sess_dev->kobj);
kobject_put(&sess_dev->kobj);
}
static void rnbd_srv_sess_dev_release(struct kobject *kobj)
{
struct rnbd_srv_sess_dev *sess_dev;
sess_dev = container_of(kobj, struct rnbd_srv_sess_dev, kobj);
rnbd_destroy_sess_dev(sess_dev);
}
static struct kobj_type rnbd_srv_sess_dev_ktype = {
.sysfs_ops = &kobj_sysfs_ops,
.release = rnbd_srv_sess_dev_release,
};
int rnbd_srv_create_dev_session_sysfs(struct rnbd_srv_sess_dev *sess_dev)
{
int ret;
ret = kobject_init_and_add(&sess_dev->kobj, &rnbd_srv_sess_dev_ktype,
sess_dev->dev->dev_sessions_kobj, "%s",
sess_dev->sess->sessname);
if (ret)
return ret;
ret = sysfs_create_group(&sess_dev->kobj,
&rnbd_srv_default_dev_session_attr_group);
if (ret)
goto err;
return 0;
err:
kobject_put(&sess_dev->kobj);
return ret;
}
int rnbd_srv_create_sysfs_files(void)
{
int err;
rnbd_dev_class = class_create(THIS_MODULE, "rnbd-server");
if (IS_ERR(rnbd_dev_class))
return PTR_ERR(rnbd_dev_class);
rnbd_dev = device_create(rnbd_dev_class, NULL,
MKDEV(0, 0), NULL, "ctl");
if (IS_ERR(rnbd_dev)) {
err = PTR_ERR(rnbd_dev);
goto cls_destroy;
}
rnbd_devs_kobj = kobject_create_and_add("devices", &rnbd_dev->kobj);
if (!rnbd_devs_kobj) {
err = -ENOMEM;
goto dev_destroy;
}
return 0;
dev_destroy:
device_destroy(rnbd_dev_class, MKDEV(0, 0));
cls_destroy:
class_destroy(rnbd_dev_class);
return err;
}
void rnbd_srv_destroy_sysfs_files(void)
{
kobject_del(rnbd_devs_kobj);
kobject_put(rnbd_devs_kobj);
device_destroy(rnbd_dev_class, MKDEV(0, 0));
class_destroy(rnbd_dev_class);
}
This diff is collapsed.
/* SPDX-License-Identifier: GPL-2.0-or-later */
/*
* RDMA Network Block Driver
*
* Copyright (c) 2014 - 2018 ProfitBricks GmbH. All rights reserved.
* Copyright (c) 2018 - 2019 1&1 IONOS Cloud GmbH. All rights reserved.
* Copyright (c) 2019 - 2020 1&1 IONOS SE. All rights reserved.
*/
#ifndef RNBD_SRV_H
#define RNBD_SRV_H
#include <linux/types.h>
#include <linux/idr.h>
#include <linux/kref.h>
#include <rtrs.h>
#include "rnbd-proto.h"
#include "rnbd-log.h"
struct rnbd_srv_session {
/* Entry inside global sess_list */
struct list_head list;
struct rtrs_srv *rtrs;
char sessname[NAME_MAX];
int queue_depth;
struct bio_set sess_bio_set;
struct xarray index_idr;
/* List of struct rnbd_srv_sess_dev */
struct list_head sess_dev_list;
struct mutex lock;
u8 ver;
};
struct rnbd_srv_dev {
/* Entry inside global dev_list */
struct list_head list;
struct kobject dev_kobj;
struct kobject *dev_sessions_kobj;
struct kref kref;
char id[NAME_MAX];
/* List of rnbd_srv_sess_dev structs */
struct list_head sess_dev_list;
struct mutex lock;
int open_write_cnt;
};
/* Structure which binds N devices and N sessions */
struct rnbd_srv_sess_dev {
/* Entry inside rnbd_srv_dev struct */
struct list_head dev_list;
/* Entry inside rnbd_srv_session struct */
struct list_head sess_list;
struct rnbd_dev *rnbd_dev;
struct rnbd_srv_session *sess;
struct rnbd_srv_dev *dev;
struct kobject kobj;
u32 device_id;
fmode_t open_flags;
struct kref kref;
struct completion *destroy_comp;
char pathname[NAME_MAX];
enum rnbd_access_mode access_mode;
};
/* rnbd-srv-sysfs.c */
int rnbd_srv_create_dev_sysfs(struct rnbd_srv_dev *dev,
struct block_device *bdev,
const char *dir_name);
void rnbd_srv_destroy_dev_sysfs(struct rnbd_srv_dev *dev);
int rnbd_srv_create_dev_session_sysfs(struct rnbd_srv_sess_dev *sess_dev);
void rnbd_srv_destroy_dev_session_sysfs(struct rnbd_srv_sess_dev *sess_dev);
int rnbd_srv_create_sysfs_files(void);
void rnbd_srv_destroy_sysfs_files(void);
void rnbd_destroy_sess_dev(struct rnbd_srv_sess_dev *sess_dev);
#endif /* RNBD_SRV_H */
......@@ -107,6 +107,7 @@ source "drivers/infiniband/ulp/srpt/Kconfig"
source "drivers/infiniband/ulp/iser/Kconfig"
source "drivers/infiniband/ulp/isert/Kconfig"
source "drivers/infiniband/ulp/rtrs/Kconfig"
source "drivers/infiniband/ulp/opa_vnic/Kconfig"
......
......@@ -8,11 +8,11 @@ obj-$(CONFIG_INFINIBAND_USER_MAD) += ib_umad.o
obj-$(CONFIG_INFINIBAND_USER_ACCESS) += ib_uverbs.o $(user_access-y)
ib_core-y := packer.o ud_header.o verbs.o cq.o rw.o sysfs.o \
-device.o fmr_pool.o cache.o netlink.o \
+device.o cache.o netlink.o \
roce_gid_mgmt.o mr_pool.o addr.o sa_query.o \
multicast.o mad.o smi.o agent.o mad_rmpp.o \
nldev.o restrack.o counters.o ib_core_uverbs.o \
-trace.o
+trace.o lag.o
ib_core-$(CONFIG_SECURITY_INFINIBAND) += security.o
ib_core-$(CONFIG_CGROUP_RDMA) += cgroup.o
......@@ -36,6 +36,9 @@ ib_uverbs-y := uverbs_main.o uverbs_cmd.o uverbs_marshall.o \
uverbs_std_types_flow_action.o uverbs_std_types_dm.o \
uverbs_std_types_mr.o uverbs_std_types_counters.o \
uverbs_uapi.o uverbs_std_types_device.o \
-uverbs_std_types_async_fd.o
+uverbs_std_types_async_fd.o \
+uverbs_std_types_srq.o \
+uverbs_std_types_wq.o \
+uverbs_std_types_qp.o
ib_uverbs-$(CONFIG_INFINIBAND_USER_MEM) += umem.o
ib_uverbs-$(CONFIG_INFINIBAND_ON_DEMAND_PAGING) += umem_odp.o
......@@ -371,6 +371,8 @@ static int fetch_ha(const struct dst_entry *dst, struct rdma_dev_addr *dev_addr,
(const void *)&dst_in6->sin6_addr;
sa_family_t family = dst_in->sa_family;
might_sleep();
/* If we have a gateway in IB mode then it must be an IB network */
if (has_gateway(dst, family) && dev_addr->network == RDMA_NETWORK_IB)
return ib_nl_fetch_ha(dev_addr, daddr, seq, family);
......@@ -727,6 +729,8 @@ int roce_resolve_route_from_path(struct sa_path_rec *rec,
struct rdma_dev_addr dev_addr = {};
int ret;
might_sleep();
if (rec->roce.route_resolved)
return 0;
......
This diff is collapsed.
......@@ -322,8 +322,21 @@ static struct config_group *make_cma_dev(struct config_group *group,
return ERR_PTR(err);
}
static void drop_cma_dev(struct config_group *cgroup, struct config_item *item)
{
struct config_group *group =
container_of(item, struct config_group, cg_item);
struct cma_dev_group *cma_dev_group =
container_of(group, struct cma_dev_group, device_group);
configfs_remove_default_groups(&cma_dev_group->ports_group);
configfs_remove_default_groups(&cma_dev_group->device_group);
config_item_put(item);
}
static struct configfs_group_operations cma_subsys_group_ops = {
.make_group = make_cma_dev,
.drop_item = drop_cma_dev,
};
static const struct config_item_type cma_subsys_type = {
......
......@@ -95,6 +95,7 @@ struct rdma_id_private {
* Internal to RDMA/core, don't use in the drivers
*/
struct rdma_restrack_entry res;
struct rdma_ucm_ece ece;
};
#if IS_ENABLED(CONFIG_INFINIBAND_ADDR_TRANS_CONFIGFS)
......
......@@ -103,23 +103,33 @@ DEFINE_CMA_FSM_EVENT(sent_drep);
DEFINE_CMA_FSM_EVENT(sent_dreq);
DEFINE_CMA_FSM_EVENT(id_destroy);
-TRACE_EVENT(cm_id_create,
+TRACE_EVENT(cm_id_attach,
TP_PROTO(
-const struct rdma_id_private *id_priv
+const struct rdma_id_private *id_priv,
+const struct ib_device *device
),
-TP_ARGS(id_priv),
+TP_ARGS(id_priv, device),
TP_STRUCT__entry(
__field(u32, cm_id)
+__array(unsigned char, srcaddr, sizeof(struct sockaddr_in6))
+__array(unsigned char, dstaddr, sizeof(struct sockaddr_in6))
+__string(devname, device->name)
),
TP_fast_assign(
__entry->cm_id = id_priv->res.id;
+memcpy(__entry->srcaddr, &id_priv->id.route.addr.src_addr,
+sizeof(struct sockaddr_in6));
+memcpy(__entry->dstaddr, &id_priv->id.route.addr.dst_addr,
+sizeof(struct sockaddr_in6));
+__assign_str(devname, device->name);
),
-TP_printk("cm.id=%u",
-__entry->cm_id
+TP_printk("cm.id=%u src=%pISpc dst=%pISpc device=%s",
+__entry->cm_id, __entry->srcaddr, __entry->dstaddr,
+__get_str(devname)
)
);
......
......@@ -414,4 +414,7 @@ void rdma_umap_priv_init(struct rdma_umap_priv *priv,
struct vm_area_struct *vma,
struct rdma_user_mmap_entry *entry);
void ib_cq_pool_init(struct ib_device *dev);
void ib_cq_pool_destroy(struct ib_device *dev);
#endif /* _CORE_PRIV_H */
This diff is collapsed.
......@@ -64,8 +64,8 @@ uverbs_get_uobject_from_file(u16 object_id, enum uverbs_obj_access access,
s64 id, struct uverbs_attr_bundle *attrs);
void uverbs_finalize_object(struct ib_uobject *uobj,
-enum uverbs_obj_access access, bool commit,
-struct uverbs_attr_bundle *attrs);
+enum uverbs_obj_access access, bool hw_obj_valid,
+bool commit, struct uverbs_attr_bundle *attrs);
int uverbs_output_written(const struct uverbs_attr_bundle *bundle, size_t idx);
......@@ -159,6 +159,9 @@ extern const struct uapi_definition uverbs_def_obj_dm[];
extern const struct uapi_definition uverbs_def_obj_flow_action[];
extern const struct uapi_definition uverbs_def_obj_intf[];
extern const struct uapi_definition uverbs_def_obj_mr[];
extern const struct uapi_definition uverbs_def_obj_qp[];
extern const struct uapi_definition uverbs_def_obj_srq[];
extern const struct uapi_definition uverbs_def_obj_wq[];
extern const struct uapi_definition uverbs_def_write_intf[];
static inline const struct uverbs_api_write_method *
......
......@@ -129,7 +129,7 @@ static int rdma_rw_init_mr_wrs(struct rdma_rw_ctx *ctx, struct ib_qp *qp,
qp->integrity_en);
int i, j, ret = 0, count = 0;
-ctx->nr_ops = (sg_cnt + pages_per_mr - 1) / pages_per_mr;
+ctx->nr_ops = DIV_ROUND_UP(sg_cnt, pages_per_mr);
ctx->reg = kcalloc(ctx->nr_ops, sizeof(*ctx->reg), GFP_KERNEL);
if (!ctx->reg) {
ret = -ENOMEM;
......
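The hunk above swaps an open-coded ceiling division for `DIV_ROUND_UP()` without changing the result. A userspace sketch of the equivalence, using a local copy of the macro (`SKETCH_DIV_ROUND_UP` and `nr_ops_for` are illustrative names, and the inputs are made up):

```c
#include <assert.h>

/* Local copy of the rounding-up division idiom used by the kernel macro. */
#define SKETCH_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

static_assert(SKETCH_DIV_ROUND_UP(10, 4) == 3, "10 entries in groups of 4 need 3 groups");

/* Number of registrations needed to cover sg_cnt entries, pages_per_mr at a time. */
static unsigned int nr_ops_for(unsigned int sg_cnt, unsigned int pages_per_mr)
{
	return SKETCH_DIV_ROUND_UP(sg_cnt, pages_per_mr);
}
```

An exact multiple gives no extra operation (8 entries at 4 per MR need 2), while any remainder adds one (9 entries need 3).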
This diff is collapsed.
......@@ -41,7 +41,7 @@
#define STRUCT_FIELD(header, field) \
.struct_offset_bytes = offsetof(struct ib_unpacked_ ## header, field), \
-.struct_size_bytes = sizeof ((struct ib_unpacked_ ## header *) 0)->field, \
+.struct_size_bytes = sizeof_field(struct ib_unpacked_ ## header, field), \
.field_name = #header ":" #field
static const struct ib_field lrh_table[] = {
......
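The `STRUCT_FIELD` change replaces an open-coded null-pointer `sizeof` expression with the `sizeof_field()` helper. A userspace sketch of what such a helper computes, assuming a made-up header layout (`struct sketch_hdr`, `sketch_sizeof_field`, and `payload_bytes` are hypothetical and exist only for this example):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Size of a struct member without needing an instance. The operand of
 * sizeof is not evaluated, so the null pointer is never dereferenced.
 */
#define sketch_sizeof_field(type, member) sizeof(((type *)0)->member)

/* Hypothetical unpacked header used only to exercise the macro. */
struct sketch_hdr {
	uint8_t vl;
	uint16_t dlid;
	uint32_t payload[4];
};

static_assert(sketch_sizeof_field(struct sketch_hdr, dlid) == 2,
	      "a uint16_t member is two bytes");

/* Array members report the size of the whole array. */
static size_t payload_bytes(void)
{
	return sketch_sizeof_field(struct sketch_hdr, payload);
}
```

Both spellings in the hunk produce the same compile-time constant; the helper form mainly reads more clearly in a table initializer.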
This diff is collapsed.