Commits · 453924de6212ac159f946b75c6b59918e2e30944 · Kirill Smelkov / linux

01 Jan, 2024 16 commits

afs: Overhaul invalidation handling to better support RO volumes · 453924de

David Howells authored Nov 08, 2023

Overhaul the third party-induced invalidation handling, making use of the
previously added volume-level event counters (cb_scrub and cb_ro_snapshot)
that are now being parsed out of the VolSync record returned by the
fileserver in many of its replies.

This allows better handling of RO (and Backup) volumes.  Since these are
snapshot of a RW volume that are updated atomically simultantanously across
all servers that host them, they only require a single callback promise for
the entire volume.  The currently upstream code assumes that RO volumes
operate in the same manner as RW volumes, and that each file has its own
individual callback - which means that it does a status fetch for *every*
file in a RO volume, whether or not the volume got "released" (volume
callback breaks can occur for other reasons too, such as the volumeserver
taking ownership of a volume from a fileserver).

To this end, make the following changes:

 (1) Change the meaning of the volume's cb_v_break counter so that it is
     now a hint that we need to issue a status fetch to work out the state
     of a volume.  cb_v_break is incremented by volume break callbacks and
     by server initialisation callbacks.

 (2) Add a second counter, cb_v_check, to the afs_volume struct such that
     if this differs from cb_v_break, we need to do a check.  When the
     check is complete, cb_v_check is advanced to what cb_v_break was at
     the start of the status fetch.

 (3) Move the list of mmap'd vnodes to the volume and trigger removal of
     PTEs that map to files on a volume break rather than on a server
     break.

 (4) When a server reinitialisation callback comes in, use the
     server-to-volume reverse mapping added in a preceding patch to iterate
     over all the volumes using that server and clear the volume callback
     promises for that server and the general volume promise as a whole to
     trigger reanalysis.

 (5) Replace the AFS_VNODE_CB_PROMISED flag with an AFS_NO_CB_PROMISE
     (TIME64_MIN) value in the cb_expires_at field, reducing the number of
     checks we need to make.

 (6) Change afs_check_validity() to quickly see if various event counters
     have been incremented or if the vnode or volume callback promise is
     due to expire/has expired without making any changes to the state.
     That is now left to afs_validate() as this may get more complicated in
     future as we may have to examine server records too.

 (7) Overhaul afs_validate() so that it does a single status fetch if we
     need to check the state of either the vnode or the volume - and do so
     under appropriate locking.  The function does the following steps:

     (A) If the vnode/volume is no longer seen as valid, then we take the
     vnode validation lock and, if the volume promise has expired, the
     volume check lock also.  The latter prevents redundant checks being
     made to find out if a new version of the volume got released.

     (B) If a previous RPC call found that the volsync changed unexpectedly
     or that a RO volume was updated, then we unmap all PTEs pointing to
     the file to stop mmap being used for access.

     (C) If the vnode is still seen to be of uncertain validity, then we
     perform an FS.FetchStatus RPC op to jointly update the volume status
     and the vnode status.  This assessment is done as part of parsing the
     reply:

	If the RO volume creation timestamp advances, cb_ro_snapshot is
	incremented; if either the creation or update timestamps changes in
	an unexpected way, the cb_scrub counter is incremented

	If the Data Version returned doesn't match the copy we have
	locally, then we ask for the pagecache to be zapped.  This takes
	care of handling RO update.

     (D) If cb_scrub differs between volume and vnode, the vnode's
     pagecache is zapped and the vnode's cb_scrub is updated unless the
     file is marked as having been deleted.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org

453924de

afs: Parse the VolSync record in the reply of a number of RPC ops · 16069e13

David Howells authored Nov 05, 2023

A number of fileserver RPC operations return a VolSync record as part of
their reply that gives some information about the state of the volume being
accessed, including:

 (1) A volume Creation timestamp.  For an RW volume, this is the time at
     which the volume was created; if it changes, the RW volume was
     presumably restored from a backup and all cached data should be
     scrubbed as Data Version numbers could regress on the files in the
     volume.

     For an RO volume, this is the time it was last snapshotted from the RW
     volume.  It is expected to advance each time this happens; if it
     regresses, cached data should be scrubbed.

 (2) A volume Update timestamp (Auristor only).  For an RW volume, this is
     updated any time any change is made to a volume or its contents.  If
     it regresses, all cached data must be scrubbed.

     For an RO volume, this is a copy of the RW volume's Update timestamp
     at the point of snapshotting.  It can be used as a version number when
     checking to see if a callback on a RO volume was due to a snapshot.
     If it regresses, all cached data must be scrubbed.

but this is currently not made use of by the in-kernel afs filesystem.

Make the afs filesystem use this by:

 (1) Add an update time field to the afs_volsync struct and use a value of
     TIME64_MIN in both that and the creation time to indicate that they
     are unset.

 (2) Add creation and update time fields to the afs_volume struct and use
     this to track the two timestamps.

 (3) Add a volsync_lock mutex to the afs_volume struct to control
     modification access for when we detect a change in these values.

 (3) Add a 'pre-op volsync' struct to the afs_operation struct to record
     the state of the volume tracking before the op.

 (4) Add a new counter, cb_scrub, to the afs_volume struct to count events
     that require all data to be scrubbed.  A copy is placed in the
     afs_vnode struct (inode) and if they no longer match, a scrub takes
     place.

 (5) When the result of an operation is being parsed, parse the VolSync
     data too, if it is provided.  Note that the two timestamps are handled
     separately, since they don't work in quite the same way.

     - If the afs_volume tracking is unset, just set it and do nothing
       else.

     - If the result timestamps are the same as the ones in afs_volume, do
       nothing.

     - If the timestamps regress, increment cb_scrub if not already done
       so.

     - If the creation timestamp on a RW volume changes, increment cb_scrub
       if not already done so.

     - If the creation timestamp on a RO volume advances, update the server
       list and see if the current server has been excluded, if so reissue
       the op.  Once over half of the replication sites have been updated,
       increment cb_ro_snapshot to indicate updates may be required and
       switch over to excluding unupdated replication sites.

     - If the creation timestamp on a Backup volume advances, just
       increment cb_ro_snapshot to trigger updates.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org

16069e13

afs: Don't leave DONTUSE/NEWREPSITE servers out of server list · d3acd81e

David Howells authored Nov 14, 2023

Don't leave servers that are marked VLSF_DONTUSE or VLSF_NEWREPSITE out of
the server list for a volume; rather, mark DONTUSE ones excluded and mark
either NEWREPSITE excluded if the number of updated servers is <50% of the
usable servers or mark !NEWREPSITE excluded otherwise.

Mark the server list as a whole with a 3-state flag to indicate whether we
think the RW volume is being replicated to the RO volume, and, if so,
whether we should switch to using updated replication sites
(VLSF_NEWREPSITE) or stick with the old for now.

This processing is pushed up from the VLDB RPC reply parser to the code
that generates the server list from that information.

Doing this allows the old list to be kept with just the exclusion flags
replaced and to keep the server records pinned and maintained.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org

d3acd81e

afs: Fix comment in afs_do_lookup() · dd948889

David Howells authored Nov 14, 2023

Fix the comment in afs_do_lookup() that says that slot 0 is used for the
fid being looked up and slot 1 is used for the directory.  It's actually
done the other way round.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org

dd948889

afs: Apply server breaks to mmap'd files in the call processor · 32222f09

David Howells authored Nov 07, 2023

Apply server breaks to mmap'd files that are being used from that server
from the call processor work function rather than punting it off to a
workqueue. The work item, afs_server_init_callback(), then bumps each
individual inode off to its own work item introducing a potentially lengthy
delay. This reduces that delay at the cost of extending the amount of time
we delay replying to the CB.InitCallBack3 notification RPC from the server.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org

32222f09

afs: Move the vnode/volume validity checking code into its own file · dfa0a449

David Howells authored Nov 07, 2023

Move the code that does validity checking of vnodes and volumes with
respect to third-party changes into its own file.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org

dfa0a449

afs: Defer volume record destruction to a workqueue · 445f9b69

David Howells authored Nov 08, 2023

Defer volume record destruction to a workqueue so that afs_put_volume()
isn't going to run the destruction process in the callback workqueue whilst
the server is holding up other clients whilst waiting for us to reply to a
CB.CallBack notification RPC.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org

445f9b69

afs: Make it possible to find the volumes that are using a server · ca0e79a4

David Howells authored Nov 02, 2023

Make it possible to find the afs_volume structs that are using an
afs_server struct to aid in breaking volume callbacks.

The way this is done is that each afs_volume already has an array of
afs_server_entry records that point to the servers where that volume might
be found.  An afs_volume backpointer and a list node is added to each entry
and each entry is then added to an RCU-traversable list on the afs_server
to which it points.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org

ca0e79a4

afs: Combine the endpoint state bools into a bitmask · 21c1f410

David Howells authored Oct 31, 2023

Combine the endpoint state bool-type members into a bitmask so that some of
them can be waited upon more easily.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org

21c1f410

afs: Keep a record of the current fileserver endpoint state · f49b594d

David Howells authored Oct 31, 2023

Keep a record of the current fileserver endpoint state, including the probe
state, and replace it when a new probe is started rather than just
squelching the old state and overwriting it.  Clearance of the old state
can cause a race if there's another thread also currently trying to
communicate with that server.

It appears that this race might be the culprit for some occasions where
kafs complains about invalid data in the RPC reply because the rotation
algorithm fell all the way through without actually issuing an RPC call and
the error return got filled in from the probe state (which has a zero error
recorded).  Whatever happens to be in the caller's reply buffer is then
taken as the response.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org

f49b594d

afs: Dispatch vlserver probes in priority order · e6a7d7f7

David Howells authored Oct 30, 2023

When probing all the addresses for a volume location server, dispatch them
in order of descending priority to try and get back highest priority one
first.

Also add a tracepoint to show the transmission and completion of the
probes.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org

e6a7d7f7

afs: Dispatch fileserver probes in priority order · 92f091cd

David Howells authored Oct 30, 2023

When probing all the addresses for a fileserver, dispatch them in order of
descending priority to try and get back highest priority one first.

Also add a tracepoint to show the transmission and completion of the
probes.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org

92f091cd

afs: Mark address lists with configured priorities · d14cf8ed

David Howells authored Oct 30, 2023

Add a field to each address in an address list (afs_addr_list struct) that
records the current priority for that address according to the address
preference table.  We don't want to do this every time we use an address
list, so the version number of the address preference table is recorded in
the address list too and we only re-mark the list when we see the version
change.

These numbers are then displayed through /proc/net/afs/servers.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org

d14cf8ed

afs: Provide a way to configure address priorities · f94f70d3

David Howells authored Oct 27, 2023

AFS servers may have multiple addresses, but the client can't easily judge
between them as to which one is best.  For instance, an address that has a
larger RTT might actually have a better bandwidth because it goes through a
switch rather than being directly connected - but we can't work this out
dynamically unless we push through sufficient data that we can measure it.

To allow the administrator to configure this, add a list of preference
weightings for server addresses by IPv4/IPv6 address or subnet and allow
this to be viewed through a procfile and altered by writing text commands
to that same file.  Preference rules can be added/updated by:

	echo "add <proto> <addr>[/<subnet>] <prior>" >/proc/fs/afs/addr_prefs
	echo "add udp 1.2.3.4 1000" >/proc/fs/afs/addr_prefs
	echo "add udp 192.168.0.0/16 3000" >/proc/fs/afs/addr_prefs
	echo "add udp 1001:2002:0:6::/64 4000" >/proc/fs/afs/addr_prefs

and removed by:

	echo "del <proto> <addr>[/<subnet>]" >/proc/fs/afs/addr_prefs
	echo "del udp 1.2.3.4" >/proc/fs/afs/addr_prefs

where the priority is a number between 0 and 65535.

The list is split between IPv4 and IPv6 addresses and each sublist is kept
in numerical order, with rules that would otherwise match but have
different subnet masking being ordered with the most specific submatch
first.

A subsequent patch will apply these rules.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org

f94f70d3

afs: Remove the unimplemented afs_cmp_addr_list() · b605ee42

David Howells authored Oct 30, 2023

Remove afs_cmp_addr_list() as it was never implemented.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org

b605ee42

afs: Add some more info to /proc/net/afs/servers · af9a5b49

David Howells authored Oct 27, 2023

In /proc/net/afs/servers, show the cell name and the last error for each
address in the server's list.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org

af9a5b49

24 Dec, 2023 22 commits

rxrpc: Create a procfile to display outstanding client conn bundles · d2ce4a84

David Howells authored Oct 26, 2023

Create /proc/net/rxrpc/bundles to display outstanding rxrpc client
connection bundles.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org

d2ce4a84

afs: Fold the afs_addr_cursor struct in · 98f9fda2

David Howells authored Oct 20, 2023

Fold the afs_addr_cursor struct into the afs_operation struct and the
afs_vl_cursor struct and fold its operations into their callers also.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org

98f9fda2

afs: Use peer + service_id as call address · e38f299e

David Howells authored Oct 26, 2023

Use the rxrpc_peer plus the service ID as the call address instead of
passing in a sockaddr_srx down to rxrpc.  The peer record is obtained by
using rxrpc_kernel_get_peer().  This avoids the need to repeatedly look up
the peer and allows rxrpc to hold on to resources for it.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org

e38f299e

afs: Rename some fields · 905b8615

David Howells authored Oct 26, 2023

Rename the ->index and ->untried fields of the afs_vl_cursor and
afs_operation struct to ->server_index and ->untried_servers to avoid
confusion with address iteration fields when those get folded in.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org

905b8615

afs: Add a tracepoint for struct afs_addr_list · 1e5d8493

David Howells authored Oct 19, 2023

Add a tracepoint to track the lifetime of the afs_addr_list struct.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org

1e5d8493

afs: Simplify error handling · aa453bec

David Howells authored Oct 25, 2023

Simplify error handling a bit by moving it from the afs_addr_cursor struct
to the afs_operation and afs_vl_cursor structs and using the error
prioritisation function for accumulating errors from multiple sources (AFS
tries to rotate between multiple fileservers, some of which may be
inaccessible or in some state of offlinedness).
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org

aa453bec

afs: Don't put afs_call in afs_wait_for_call_to_complete() · 6f2ff7e8

David Howells authored Oct 26, 2023

Don't put the afs_call struct in afs_wait_for_call_to_complete() but rather
have the caller do it.  This will allow the caller to fish stuff out of the
afs_call struct rather than the afs_addr_cursor struct, thereby allowing a
subsequent patch to subsume it.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org

6f2ff7e8

afs: Wrap most op->error accesses with inline funcs · 2de5599f

David Howells authored Oct 26, 2023

Wrap most op->error accesses with inline funcs which will make it easier
for a subsequent patch to replace op->error with something else.  Two
functions are added to this end:

 (1) afs_op_error() - Get the error code.

 (2) afs_op_set_error() - Set the error code.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org

2de5599f

afs: Use op->nr_iterations=-1 to indicate to begin fileserver iteration · 075171fd

David Howells authored Oct 20, 2023

Set op->nr_iterations to -1 to indicate that we need to begin fileserver
iteration rather than setting error to SHRT_MAX.  This makes it easier to
eliminate the address cursor.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org

075171fd

afs: Handle the VIO and UAEIO aborts explicitly · eb8eae65

David Howells authored Oct 20, 2023

When processing the result of a call, handle the VIO and UAEIO abort
specifically rather than leaving it to a default case.  Rather than
erroring out unconditionally, see if there's another server if the volume
has more than one server available, otherwise return -EREMOTEIO.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org

eb8eae65

afs: Rename addr_list::failed to probe_failed · aa4917d6

David Howells authored Oct 20, 2023

Rename the failed member of struct addr_list to probe_failed as it's
specifically related to probe failures.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org

aa4917d6

afs: Don't skip server addresses for which we didn't get an RTT reading · a2aff7b5

David Howells authored Oct 30, 2023

In the rotation algorithms for iterating over volume location servers and
file servers, don't skip servers from which we got a valid response to a
probe (either a reply DATA packet or an ABORT) even if we didn't manage to
get an RTT reading.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org

a2aff7b5

rxrpc, afs: Allow afs to pin rxrpc_peer objects · 72904d7b

David Howells authored Oct 19, 2023

Change rxrpc's API such that:

 (1) A new function, rxrpc_kernel_lookup_peer(), is provided to look up an
     rxrpc_peer record for a remote address and a corresponding function,
     rxrpc_kernel_put_peer(), is provided to dispose of it again.

 (2) When setting up a call, the rxrpc_peer object used during a call is
     now passed in rather than being set up by rxrpc_connect_call().  For
     afs, this meenat passing it to rxrpc_kernel_begin_call() rather than
     the full address (the service ID then has to be passed in as a
     separate parameter).

 (3) A new function, rxrpc_kernel_remote_addr(), is added so that afs can
     get a pointer to the transport address for display purposed, and
     another, rxrpc_kernel_remote_srx(), to gain a pointer to the full
     rxrpc address.

 (4) The function to retrieve the RTT from a call, rxrpc_kernel_get_srtt(),
     is then altered to take a peer.  This now returns the RTT or -1 if
     there are insufficient samples.

 (5) Rename rxrpc_kernel_get_peer() to rxrpc_kernel_call_get_peer().

 (6) Provide a new function, rxrpc_kernel_get_peer(), to get a ref on a
     peer the caller already has.

This allows the afs filesystem to pin the rxrpc_peer records that it is
using, allowing faster lookups and pointer comparisons rather than
comparing sockaddr_rxrpc contents.  It also makes it easier to get hold of
the RTT.  The following changes are made to afs:

 (1) The addr_list struct's addrs[] elements now hold a peer struct pointer
     and a service ID rather than a sockaddr_rxrpc.

 (2) When displaying the transport address, rxrpc_kernel_remote_addr() is
     used.

 (3) The port arg is removed from afs_alloc_addrlist() since it's always
     overridden.

 (4) afs_merge_fs_addr4() and afs_merge_fs_addr6() do peer lookup and may
     now return an error that must be handled.

 (5) afs_find_server() now takes a peer pointer to specify the address.

 (6) afs_find_server(), afs_compare_fs_alists() and afs_merge_fs_addr[46]{}
     now do peer pointer comparison rather than address comparison.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org

72904d7b

afs: Turn the afs_addr_list address array into an array of structs · 07f3502b

David Howells authored Oct 18, 2023

Turn the afs_addr_list address array into an array of structs, thereby
allowing per-address (such as RTT) info to be added.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org

07f3502b

afs: Add comments on abort handling · fe245c8f

David Howells authored Oct 18, 2023

Add some comments on AFS abort code handling in the rotation algorithm and
adjust the errors produced to match.
Reported-by: Jeffrey E Altman <jaltman@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeffrey Altman <jaltman@auristor.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org

fe245c8f

rxrpc_find_service_conn_rcu: fix the usage of read_seqbegin_or_lock() · bad1a11c

Oleg Nesterov authored Nov 17, 2023

rxrpc_find_service_conn_rcu() should make the "seq" counter odd on the
second pass, otherwise read_seqbegin_or_lock() never takes the lock.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/20231117164846.GA10410@redhat.com/

bad1a11c

afs: use read_seqbegin() in afs_check_validity() and afs_getattr() · df91b9df

Oleg Nesterov authored Nov 30, 2023

David Howells says:

 (3) afs_check_validity().
 (4) afs_getattr().

     These are both pretty short, so your solution is probably good for them.
     That said, afs_vnode_commit_status() can spend a long time under the
     write lock - and pretty much every file RPC op returns a status update.

Change these functions to use read_seqbegin(). This simplifies the code
and doesn't change the current behaviour, the "seq" counter is always even
so read_seqbegin_or_lock() can never take the lock.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/20231130115617.GA21584@redhat.com/

df91b9df

afs: fix the usage of read_seqbegin_or_lock() in afs_find_server*() · 1702e065

Oleg Nesterov authored Nov 30, 2023

David Howells says:

 (5) afs_find_server().

     There could be a lot of servers in the list and each server can have
     multiple addresses, so I think this would be better with an exclusive
     second pass.

     The server list isn't likely to change all that often, but when it does
     change, there's a good chance several servers are going to be
     added/removed one after the other.  Further, this is only going to be
     used for incoming cache management/callback requests from the server,
     which hopefully aren't going to happen too often - but it is remotely
     drivable.

 (6) afs_find_server_by_uuid().

     Similarly to (5), there could be a lot of servers to search through, but
     they are in a tree not a flat list, so it should be faster to process.
     Again, it's not likely to change that often and, again, when it does
     change it's likely to involve multiple changes.  This can be driven
     remotely by an incoming cache management request but is mostly going to
     be driven by setting up or reconfiguring a volume's server list -
     something that also isn't likely to happen often.

Make the "seq" counter odd on the 2nd pass, otherwise read_seqbegin_or_lock()
never takes the lock.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/20231130115614.GA21581@redhat.com/

1702e065

afs: fix the usage of read_seqbegin_or_lock() in afs_lookup_volume_rcu() · 4121b433

Oleg Nesterov authored Nov 30, 2023

David Howells says:

 (2) afs_lookup_volume_rcu().

     There can be a lot of volumes known by a system.  A thousand would
     require a 10-step walk and this is drivable by remote operation, so I
     think this should probably take a lock on the second pass too.

Make the "seq" counter odd on the 2nd pass, otherwise read_seqbegin_or_lock()
never takes the lock.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/20231130115606.GA21571@redhat.com/

4121b433

afs: Automatically generate trace tag enums · 2daa6404

David Howells authored Feb 23, 2023

Automatically generate trace tag enums from the symbol -> string mapping
tables rather than having the enums as well, thereby reducing duplicated
data.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-afs@lists.infradead.org
cc: linux-fsdevel@vger.kernel.org

2daa6404

afs: Remove whitespace before most ')' from the trace header · a790c258

David Howells authored Feb 23, 2023

checkpatch objects to whitespace before ')', so remove most of it from the
afs trace header.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-afs@lists.infradead.org
cc: linux-fsdevel@vger.kernel.org

a790c258

Linux 6.7-rc7 · 861deac3
Linus Torvalds authored Dec 23, 2023

861deac3

23 Dec, 2023 2 commits

Merge tag 'x86-urgent-2023-12-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 3f82f1c3

Linus Torvalds authored Dec 23, 2023

Pull x86 fixes from Ingo Molnar:

 - Fix a secondary CPUs enumeration regression caused by creative MADT
   APIC table entries on certain systems.

 - Fix a race in the NOP-patcher that can spuriously trigger crashes on
   bootup.

 - Fix a bootup failure regression caused by the parallel bringup code,
   caused by firmware inconsistency between the APIC initialization
   states of the boot and secondary CPUs, on certain systems.

* tag 'x86-urgent-2023-12-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/acpi: Handle bogus MADT APIC tables gracefully
  x86/alternatives: Disable interrupts and sync when optimizing NOPs in place
  x86/alternatives: Sync core before enabling interrupts
  x86/smpboot/64: Handle X2APIC BIOS inconsistency gracefully

3f82f1c3

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · f969c914

Linus Torvalds authored Dec 23, 2023

Pull SCSI fixes from James Bottomley:
 "Four small fixes, three in drivers with the core one adding a batch
  indicator (for drivers which use it) to the error handler"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: ufs: core: Let the sq_lock protect sq_tail_slot access
  scsi: ufs: qcom: Return ufs_qcom_clk_scale_*() errors in ufs_qcom_clk_scale_notify()
  scsi: core: Always send batch on reset or error handling command
  scsi: bnx2fc: Fix skb double free in bnx2fc_rcv()

f969c914