1. 01 Jan, 2024 16 commits
    • afs: Overhaul invalidation handling to better support RO volumes · 453924de
      David Howells authored
      Overhaul the third party-induced invalidation handling, making use of the
      previously added volume-level event counters (cb_scrub and cb_ro_snapshot)
      that are now being parsed out of the VolSync record returned by the
      fileserver in many of its replies.
      
      This allows better handling of RO (and Backup) volumes.  Since these are
      snapshots of a RW volume that are updated atomically and simultaneously across
      all servers that host them, they only require a single callback promise for
      the entire volume.  The currently upstream code assumes that RO volumes
      operate in the same manner as RW volumes, and that each file has its own
      individual callback - which means that it does a status fetch for *every*
      file in a RO volume, whether or not the volume got "released" (volume
      callback breaks can occur for other reasons too, such as the volumeserver
      taking ownership of a volume from a fileserver).
      
      To this end, make the following changes:
      
       (1) Change the meaning of the volume's cb_v_break counter so that it is
           now a hint that we need to issue a status fetch to work out the state
           of a volume.  cb_v_break is incremented by volume break callbacks and
           by server initialisation callbacks.
      
       (2) Add a second counter, cb_v_check, to the afs_volume struct such that
           if this differs from cb_v_break, we need to do a check.  When the
           check is complete, cb_v_check is advanced to what cb_v_break was at
           the start of the status fetch.
      
       (3) Move the list of mmap'd vnodes to the volume and trigger removal of
           PTEs that map to files on a volume break rather than on a server
           break.
      
       (4) When a server reinitialisation callback comes in, use the
           server-to-volume reverse mapping added in a preceding patch to iterate
           over all the volumes using that server and clear the volume callback
           promises for that server and the general volume promise as a whole to
           trigger reanalysis.
      
       (5) Replace the AFS_VNODE_CB_PROMISED flag with an AFS_NO_CB_PROMISE
           (TIME64_MIN) value in the cb_expires_at field, reducing the number of
           checks we need to make.
      
       (6) Change afs_check_validity() to quickly see if various event counters
           have been incremented or if the vnode or volume callback promise is
           due to expire/has expired without making any changes to the state.
           That is now left to afs_validate() as this may get more complicated in
           future as we may have to examine server records too.
      
       (7) Overhaul afs_validate() so that it does a single status fetch if we
           need to check the state of either the vnode or the volume - and do so
           under appropriate locking.  The function does the following steps:
      
           (A) If the vnode/volume is no longer seen as valid, then we take the
           vnode validation lock and, if the volume promise has expired, the
           volume check lock also.  The latter prevents redundant checks being
           made to find out if a new version of the volume got released.
      
           (B) If a previous RPC call found that the volsync changed unexpectedly
           or that a RO volume was updated, then we unmap all PTEs pointing to
           the file to stop mmap being used for access.
      
           (C) If the vnode is still seen to be of uncertain validity, then we
           perform an FS.FetchStatus RPC op to jointly update the volume status
           and the vnode status.  This assessment is done as part of parsing the
           reply:
      
      	If the RO volume creation timestamp advances, cb_ro_snapshot is
      	incremented; if either the creation or update timestamp changes in
      	an unexpected way, the cb_scrub counter is incremented.
      
      	If the Data Version returned doesn't match the copy we have
      	locally, then we ask for the pagecache to be zapped.  This takes
      	care of handling RO update.
      
           (D) If cb_scrub differs between volume and vnode, the vnode's
           pagecache is zapped and the vnode's cb_scrub is updated unless the
           file is marked as having been deleted.
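
The interplay of the two counters described in (1) and (2) can be sketched in userspace C as follows. This is only an illustration under assumptions: the names `vol_sketch`, `vol_break()`, `vol_needs_check()`, `vol_begin_check()` and `vol_end_check()` are hypothetical, and the atomics and locking that the kernel code would need are omitted.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two counters on the afs_volume struct. */
struct vol_sketch {
	unsigned int cb_v_break;	/* Bumped by volume break and server init callbacks */
	unsigned int cb_v_check;	/* Value of cb_v_break covered by the last check */
};

/* A callback from the server only records that a check is now needed. */
static void vol_break(struct vol_sketch *v)
{
	v->cb_v_break++;
}

/* A status fetch is needed whenever the two counters differ. */
static bool vol_needs_check(const struct vol_sketch *v)
{
	return v->cb_v_check != v->cb_v_break;
}

/* Sample the break counter before issuing the status fetch... */
static unsigned int vol_begin_check(const struct vol_sketch *v)
{
	return v->cb_v_break;
}

/* ...and advance cb_v_check only to that sample, so that a break arriving
 * while the fetch is in flight still forces another check. */
static void vol_end_check(struct vol_sketch *v, unsigned int break_at_start)
{
	v->cb_v_check = break_at_start;
}
```

Advancing cb_v_check to the sampled value rather than to the current cb_v_break is what keeps a concurrent break from being lost.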
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • afs: Parse the VolSync record in the reply of a number of RPC ops · 16069e13
      David Howells authored
      A number of fileserver RPC operations return a VolSync record as part of
      their reply that gives some information about the state of the volume being
      accessed, including:
      
       (1) A volume Creation timestamp.  For an RW volume, this is the time at
           which the volume was created; if it changes, the RW volume was
           presumably restored from a backup and all cached data should be
           scrubbed as Data Version numbers could regress on the files in the
           volume.
      
           For an RO volume, this is the time it was last snapshotted from the RW
           volume.  It is expected to advance each time this happens; if it
           regresses, cached data should be scrubbed.
      
       (2) A volume Update timestamp (Auristor only).  For an RW volume, this is
           updated any time any change is made to a volume or its contents.  If
           it regresses, all cached data must be scrubbed.
      
           For an RO volume, this is a copy of the RW volume's Update timestamp
           at the point of snapshotting.  It can be used as a version number when
           checking to see if a callback on a RO volume was due to a snapshot.
           If it regresses, all cached data must be scrubbed.
      
      but this is currently not made use of by the in-kernel afs filesystem.
      
      Make the afs filesystem use this by:
      
       (1) Add an update time field to the afs_volsync struct and use a value of
           TIME64_MIN in both that and the creation time to indicate that they
           are unset.
      
       (2) Add creation and update time fields to the afs_volume struct and use
           these to track the two timestamps.
      
       (3) Add a volsync_lock mutex to the afs_volume struct to control
           modification access for when we detect a change in these values.
      
       (4) Add a 'pre-op volsync' struct to the afs_operation struct to record
           the state of the volume tracking before the op.
      
       (5) Add a new counter, cb_scrub, to the afs_volume struct to count events
           that require all data to be scrubbed.  A copy is placed in the
           afs_vnode struct (inode) and if they no longer match, a scrub takes
           place.
      
       (6) When the result of an operation is being parsed, parse the VolSync
           data too, if it is provided.  Note that the two timestamps are handled
           separately, since they don't work in quite the same way.
      
           - If the afs_volume tracking is unset, just set it and do nothing
             else.
      
           - If the result timestamps are the same as the ones in afs_volume, do
             nothing.
      
           - If the timestamps regress, increment cb_scrub if that has not
             already been done.
      
           - If the creation timestamp on a RW volume changes, increment cb_scrub
             if that has not already been done.
      
           - If the creation timestamp on a RO volume advances, update the server
             list and see if the current server has been excluded, if so reissue
             the op.  Once over half of the replication sites have been updated,
             increment cb_ro_snapshot to indicate updates may be required and
             switch over to excluding unupdated replication sites.
      
           - If the creation timestamp on a Backup volume advances, just
             increment cb_ro_snapshot to trigger updates.
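
The creation-timestamp rules in the final list item above can be summarised as a small classification function. This is a sketch under assumptions, not the kernel code: every name here (`classify_creation()`, `TIME_UNSET`, the enums) is hypothetical, `INT64_MIN` stands in for TIME64_MIN, and the server-list update and replication-site exclusion steps for RO volumes are left out.

```c
#include <stdint.h>

#define TIME_UNSET INT64_MIN	/* Stand-in for the kernel's TIME64_MIN */

enum vol_type { VOL_RW, VOL_RO, VOL_BACKUP };

enum volsync_action {
	VS_NOTHING,		/* First sighting, no data or no change */
	VS_SCRUB,		/* Would increment cb_scrub */
	VS_RO_SNAPSHOT,		/* Would (eventually) increment cb_ro_snapshot */
};

/* Classify a creation timestamp from a reply against the tracked value. */
static enum volsync_action classify_creation(enum vol_type type,
					     int64_t tracked, int64_t reply)
{
	if (reply == TIME_UNSET)
		return VS_NOTHING;	/* No VolSync data was provided */
	if (tracked == TIME_UNSET || reply == tracked)
		return VS_NOTHING;	/* Caller just records 'reply' if unset */
	if (reply < tracked)
		return VS_SCRUB;	/* Regression on any volume type */
	if (type == VOL_RW)
		return VS_SCRUB;	/* Any change on an RW volume */
	return VS_RO_SNAPSHOT;		/* RO or Backup volume advanced */
}
```

For an RO volume the text says cb_ro_snapshot is only incremented once over half of the replication sites have been updated, which is why the last case is marked "eventually".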
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • afs: Don't leave DONTUSE/NEWREPSITE servers out of server list · d3acd81e
      David Howells authored
      Don't leave servers that are marked VLSF_DONTUSE or VLSF_NEWREPSITE out of
      the server list for a volume.  Rather, mark the DONTUSE ones excluded and
      then mark either the NEWREPSITE ones excluded, if fewer than 50% of the
      usable servers have been updated, or the !NEWREPSITE ones excluded
      otherwise.
      
      Mark the server list as a whole with a 3-state flag to indicate whether we
      think the RW volume is being replicated to the RO volume, and, if so,
      whether we should switch to using updated replication sites
      (VLSF_NEWREPSITE) or stick with the old for now.
      
      This processing is pushed up from the VLDB RPC reply parser to the code
      that generates the server list from that information.
      
      Doing this allows the old list to be kept with just the exclusion flags
      replaced and to keep the server records pinned and maintained.
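
The exclusion rule can be illustrated with a short userspace sketch. The names `site_sketch`, `mark_exclusions()`, `SK_NEWREPSITE` and `SK_DONTUSE` are hypothetical, the flag values are assumptions, and the 3-state list-level flag is not modelled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag bits; the numeric values here are assumptions. */
#define SK_NEWREPSITE	0x01
#define SK_DONTUSE	0x20

struct site_sketch {
	unsigned int flags;
	bool excluded;
};

static void mark_exclusions(struct site_sketch *sites, size_t n)
{
	size_t usable = 0, updated = 0, i;

	/* Count the usable sites and how many of them are already updated. */
	for (i = 0; i < n; i++) {
		if (sites[i].flags & SK_DONTUSE)
			continue;
		usable++;
		if (sites[i].flags & SK_NEWREPSITE)
			updated++;
	}

	/* Fewer than half updated: keep using the old sites for now. */
	bool exclude_new = updated * 2 < usable;

	for (i = 0; i < n; i++) {
		bool is_new = sites[i].flags & SK_NEWREPSITE;

		if (sites[i].flags & SK_DONTUSE)
			sites[i].excluded = true;
		else
			sites[i].excluded = (is_new == exclude_new);
	}
}
```

Because every site stays in the array and only the `excluded` flag changes, the list itself can be kept across updates, which is the property the message relies on.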
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • afs: Fix comment in afs_do_lookup() · dd948889
      David Howells authored
      Fix the comment in afs_do_lookup() that says that slot 0 is used for the
      fid being looked up and slot 1 is used for the directory.  It's actually
      done the other way round.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • afs: Apply server breaks to mmap'd files in the call processor · 32222f09
      David Howells authored
      Apply server breaks to mmap'd files that are being used from that server
      from the call processor work function rather than punting it off to a
      workqueue.  The work item, afs_server_init_callback(), then bumps each
      individual inode off to its own work item introducing a potentially lengthy
      delay.  This reduces that delay at the cost of extending the amount of time
      we delay replying to the CB.InitCallBack3 notification RPC from the server.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • afs: Move the vnode/volume validity checking code into its own file · dfa0a449
      David Howells authored
      Move the code that does validity checking of vnodes and volumes with
      respect to third-party changes into its own file.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • afs: Defer volume record destruction to a workqueue · 445f9b69
      David Howells authored
      Defer volume record destruction to a workqueue so that afs_put_volume()
      isn't going to run the destruction process in the callback workqueue whilst
      the server is holding up other clients whilst waiting for us to reply to a
      CB.CallBack notification RPC.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • afs: Make it possible to find the volumes that are using a server · ca0e79a4
      David Howells authored
      Make it possible to find the afs_volume structs that are using an
      afs_server struct to aid in breaking volume callbacks.
      
      The way this is done is that each afs_volume already has an array of
      afs_server_entry records that point to the servers where that volume might
      be found.  An afs_volume backpointer and a list node is added to each entry
      and each entry is then added to an RCU-traversable list on the afs_server
      to which it points.
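
A minimal userspace sketch of this reverse mapping is given below. It assumes a plain singly linked list in place of the RCU-traversable list, and all of the names (`entry_sketch`, `volume_sketch`, `server_sketch`, `attach_entry()`, `break_volumes_on()`) are hypothetical.

```c
#include <stddef.h>

struct volume_sketch;
struct server_sketch;

struct entry_sketch {
	struct server_sketch *server;	/* Forward pointer: where the volume lives */
	struct volume_sketch *volume;	/* Added backpointer to the owning volume */
	struct entry_sketch *next;	/* Added link in the server's list */
};

struct volume_sketch {
	unsigned int cb_v_break;
	struct entry_sketch entries[2];
};

struct server_sketch {
	struct entry_sketch *volumes;	/* Head of the list of entries pointing here */
};

/* Fill in one of the volume's entries and hook it onto the server. */
static void attach_entry(struct volume_sketch *vol, size_t slot,
			 struct server_sketch *server)
{
	struct entry_sketch *e = &vol->entries[slot];

	e->server = server;
	e->volume = vol;
	e->next = server->volumes;
	server->volumes = e;
}

/* Walk server -> entries -> volumes, e.g. to break every volume callback. */
static unsigned int break_volumes_on(struct server_sketch *server)
{
	unsigned int n = 0;

	for (struct entry_sketch *e = server->volumes; e; e = e->next) {
		e->volume->cb_v_break++;
		n++;
	}
	return n;
}
```

Embedding the list node in the existing per-volume entry means no extra allocation is needed to maintain the reverse mapping.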
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • afs: Combine the endpoint state bools into a bitmask · 21c1f410
      David Howells authored
      Combine the endpoint state bool-type members into a bitmask so that some of
      them can be waited upon more easily.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • afs: Keep a record of the current fileserver endpoint state · f49b594d
      David Howells authored
      Keep a record of the current fileserver endpoint state, including the probe
      state, and replace it when a new probe is started rather than just
      squelching the old state and overwriting it.  Clearance of the old state
      can cause a race if there's another thread also currently trying to
      communicate with that server.
      
      It appears that this race might be the culprit for some occasions where
      kafs complains about invalid data in the RPC reply because the rotation
      algorithm fell all the way through without actually issuing an RPC call and
      the error return got filled in from the probe state (which has a zero error
      recorded).  Whatever happens to be in the caller's reply buffer is then
      taken as the response.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • afs: Dispatch vlserver probes in priority order · e6a7d7f7
      David Howells authored
      When probing all the addresses for a volume location server, dispatch them
      in order of descending priority to try to get the highest-priority one
      back first.
      
      Also add a tracepoint to show the transmission and completion of the
      probes.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • afs: Dispatch fileserver probes in priority order · 92f091cd
      David Howells authored
      When probing all the addresses for a fileserver, dispatch them in order of
      descending priority to try to get the highest-priority one back first.
      
      Also add a tracepoint to show the transmission and completion of the
      probes.
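
Dispatching in descending priority order could look roughly like the sketch below. It assumes that a larger number means a higher priority, uses a simple selection loop that is adequate for short address lists, and records the dispatch order instead of sending anything. The names `addr_sketch` and `dispatch_probes()` are hypothetical.

```c
#include <stddef.h>

struct addr_sketch {
	unsigned short prio;	/* Larger value = preferred (assumption) */
	int probe_sent_seq;	/* Order in which the probe was dispatched */
};

/* Dispatch probes highest-priority first by repeatedly picking the best
 * address that has not been probed yet. */
static void dispatch_probes(struct addr_sketch *addrs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		addrs[i].probe_sent_seq = -1;

	for (size_t seq = 0; seq < n; seq++) {
		size_t best = n;

		for (size_t i = 0; i < n; i++) {
			if (addrs[i].probe_sent_seq >= 0)
				continue;	/* Already dispatched */
			if (best == n || addrs[i].prio > addrs[best].prio)
				best = i;
		}
		addrs[best].probe_sent_seq = (int)seq;	/* "Send" the probe */
	}
}
```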
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • afs: Mark address lists with configured priorities · d14cf8ed
      David Howells authored
      Add a field to each address in an address list (afs_addr_list struct) that
      records the current priority for that address according to the address
      preference table.  We don't want to do this every time we use an address
      list, so the version number of the address preference table is recorded in
      the address list too and we only re-mark the list when we see the version
      change.
      
      These numbers are then displayed through /proc/net/afs/servers.
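
The version check that avoids re-marking on every use can be sketched as follows. The names `prefs_sketch`, `alist_sketch` and `alist_update_prios()` are hypothetical, and a single default priority stands in for the per-address lookup in the preference table.

```c
#include <stddef.h>

struct prefs_sketch {
	unsigned int version;		/* Bumped whenever the preference table changes */
	unsigned short default_prio;	/* Stand-in for a real per-address lookup */
};

struct alist_sketch {
	unsigned int prefs_version;	/* Table version the priorities came from */
	size_t nr_addrs;
	unsigned short prio[4];
	unsigned int nr_remarks;	/* Only here to make the effect visible */
};

/* Re-mark the list only if the table has changed since it was last marked. */
static void alist_update_prios(struct alist_sketch *al,
			       const struct prefs_sketch *prefs)
{
	if (al->prefs_version == prefs->version)
		return;
	for (size_t i = 0; i < al->nr_addrs; i++)
		al->prio[i] = prefs->default_prio;
	al->prefs_version = prefs->version;
	al->nr_remarks++;
}
```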
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • afs: Provide a way to configure address priorities · f94f70d3
      David Howells authored
      AFS servers may have multiple addresses, but the client can't easily judge
      between them as to which one is best.  For instance, an address that has a
      larger RTT might actually have a better bandwidth because it goes through a
      switch rather than being directly connected - but we can't work this out
      dynamically unless we push through sufficient data that we can measure it.
      
      To allow the administrator to configure this, add a list of preference
      weightings for server addresses by IPv4/IPv6 address or subnet and allow
      this to be viewed through a procfile and altered by writing text commands
      to that same file.  Preference rules can be added/updated by:
      
      	echo "add <proto> <addr>[/<subnet>] <prior>" >/proc/fs/afs/addr_prefs
      	echo "add udp 1.2.3.4 1000" >/proc/fs/afs/addr_prefs
      	echo "add udp 192.168.0.0/16 3000" >/proc/fs/afs/addr_prefs
      	echo "add udp 1001:2002:0:6::/64 4000" >/proc/fs/afs/addr_prefs
      
      and removed by:
      
      	echo "del <proto> <addr>[/<subnet>]" >/proc/fs/afs/addr_prefs
      	echo "del udp 1.2.3.4" >/proc/fs/afs/addr_prefs
      
      where the priority is a number between 0 and 65535.
      
      The list is split between IPv4 and IPv6 addresses and each sublist is kept
      in numerical order, with rules that would otherwise match but have
      different subnet masking being ordered with the most specific submatch
      first.
      
      A subsequent patch will apply these rules.
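
How such rules might be matched against an IPv4 address is sketched below. Note that this swaps in a different technique: rather than keeping the list sorted with the most specific rule first, it scans all rules and picks the matching one with the longest prefix, which selects the same rule. The names `pref4_sketch`, `mask4()` and `lookup_prio4()` are hypothetical, and addresses are assumed to be in host byte order with prefix lengths of 0 to 32.

```c
#include <stddef.h>
#include <stdint.h>

struct pref4_sketch {
	uint32_t addr;		/* IPv4 address in host byte order */
	uint8_t  prefix_len;	/* 0..32 */
	uint16_t prio;		/* 0..65535 */
};

/* Build a netmask from a prefix length; a zero-length prefix matches all. */
static uint32_t mask4(uint8_t len)
{
	return len == 0 ? 0 : UINT32_MAX << (32 - len);
}

/* Return 1 and set *prio if a rule matches, preferring the most specific
 * (longest-prefix) match; return 0 if no rule matches. */
static int lookup_prio4(const struct pref4_sketch *rules, size_t n,
			uint32_t addr, uint16_t *prio)
{
	const struct pref4_sketch *best = NULL;

	for (size_t i = 0; i < n; i++) {
		uint32_t m = mask4(rules[i].prefix_len);

		if ((addr & m) != (rules[i].addr & m))
			continue;
		if (!best || rules[i].prefix_len > best->prefix_len)
			best = &rules[i];
	}
	if (!best)
		return 0;
	*prio = best->prio;
	return 1;
}
```

With rules equivalent to the `1.2.3.4` and `192.168.0.0/16` examples above, an address such as 192.168.1.5 would pick up the /16 rule unless a narrower rule covering it were also present.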
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • afs: Remove the unimplemented afs_cmp_addr_list() · b605ee42
      David Howells authored
      Remove afs_cmp_addr_list() as it was never implemented.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • afs: Add some more info to /proc/net/afs/servers · af9a5b49
      David Howells authored
      In /proc/net/afs/servers, show the cell name and the last error for each
      address in the server's list.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
  2. 24 Dec, 2023 22 commits
    • rxrpc: Create a procfile to display outstanding client conn bundles · d2ce4a84
      David Howells authored
      Create /proc/net/rxrpc/bundles to display outstanding rxrpc client
      connection bundles.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • afs: Fold the afs_addr_cursor struct in · 98f9fda2
      David Howells authored
      Fold the afs_addr_cursor struct into the afs_operation struct and the
      afs_vl_cursor struct and fold its operations into their callers also.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • afs: Use peer + service_id as call address · e38f299e
      David Howells authored
      Use the rxrpc_peer plus the service ID as the call address instead of
      passing in a sockaddr_srx down to rxrpc.  The peer record is obtained by
      using rxrpc_kernel_get_peer().  This avoids the need to repeatedly look up
      the peer and allows rxrpc to hold on to resources for it.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • afs: Rename some fields · 905b8615
      David Howells authored
      Rename the ->index and ->untried fields of the afs_vl_cursor and
      afs_operation struct to ->server_index and ->untried_servers to avoid
      confusion with address iteration fields when those get folded in.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • afs: Add a tracepoint for struct afs_addr_list · 1e5d8493
      David Howells authored
      Add a tracepoint to track the lifetime of the afs_addr_list struct.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • afs: Simplify error handling · aa453bec
      David Howells authored
      Simplify error handling a bit by moving it from the afs_addr_cursor struct
      to the afs_operation and afs_vl_cursor structs and using the error
      prioritisation function for accumulating errors from multiple sources (AFS
      tries to rotate between multiple fileservers, some of which may be
      inaccessible or in some state of offlinedness).
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • afs: Don't put afs_call in afs_wait_for_call_to_complete() · 6f2ff7e8
      David Howells authored
      Don't put the afs_call struct in afs_wait_for_call_to_complete() but rather
      have the caller do it.  This will allow the caller to fish stuff out of the
      afs_call struct rather than the afs_addr_cursor struct, thereby allowing a
      subsequent patch to subsume it.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • afs: Wrap most op->error accesses with inline funcs · 2de5599f
      David Howells authored
      Wrap most op->error accesses with inline funcs which will make it easier
      for a subsequent patch to replace op->error with something else.  Two
      functions are added to this end:
      
       (1) afs_op_error() - Get the error code.
      
       (2) afs_op_set_error() - Set the error code.
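
The shape of such accessors is sketched below on a simplified stand-in struct. The names `op_sketch`, `op_error()`, `op_set_error()` and `finish_op()` are hypothetical and only mirror the two functions listed above; the real afs_operation struct has many more members.

```c
/* Simplified stand-in for the afs_operation struct; only the error field. */
struct op_sketch {
	int error;
};

/* Get the error code. */
static inline int op_error(const struct op_sketch *op)
{
	return op->error;
}

/* Set the error code. */
static inline void op_set_error(struct op_sketch *op, int error)
{
	op->error = error;
}

/* A caller written against the accessors does not depend on how the error
 * is actually stored, so the storage can be changed later. */
static int finish_op(struct op_sketch *op, int rpc_result)
{
	if (rpc_result < 0)
		op_set_error(op, rpc_result);
	return op_error(op);
}
```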
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • afs: Use op->nr_iterations=-1 to indicate to begin fileserver iteration · 075171fd
      David Howells authored
      Set op->nr_iterations to -1 to indicate that we need to begin fileserver
      iteration rather than setting error to SHRT_MAX.  This makes it easier to
      eliminate the address cursor.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • afs: Handle the VIO and UAEIO aborts explicitly · eb8eae65
      David Howells authored
      When processing the result of a call, handle the VIO and UAEIO aborts
      explicitly rather than leaving them to a default case.  Rather than
      erroring out unconditionally, try another server if the volume has more
      than one available; otherwise return -EREMOTEIO.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • afs: Rename addr_list::failed to probe_failed · aa4917d6
      David Howells authored
      Rename the failed member of struct addr_list to probe_failed as it's
      specifically related to probe failures.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • afs: Don't skip server addresses for which we didn't get an RTT reading · a2aff7b5
      David Howells authored
      In the rotation algorithms for iterating over volume location servers and
      file servers, don't skip servers from which we got a valid response to a
      probe (either a reply DATA packet or an ABORT) even if we didn't manage to
      get an RTT reading.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • rxrpc, afs: Allow afs to pin rxrpc_peer objects · 72904d7b
      David Howells authored
      Change rxrpc's API such that:
      
       (1) A new function, rxrpc_kernel_lookup_peer(), is provided to look up an
           rxrpc_peer record for a remote address and a corresponding function,
           rxrpc_kernel_put_peer(), is provided to dispose of it again.
      
       (2) When setting up a call, the rxrpc_peer object used during a call is
           now passed in rather than being set up by rxrpc_connect_call().  For
           afs, this meant passing it to rxrpc_kernel_begin_call() rather than
           the full address (the service ID then has to be passed in as a
           separate parameter).
      
       (3) A new function, rxrpc_kernel_remote_addr(), is added so that afs can
           get a pointer to the transport address for display purposes, and
           another, rxrpc_kernel_remote_srx(), to gain a pointer to the full
           rxrpc address.
      
       (4) The function to retrieve the RTT from a call, rxrpc_kernel_get_srtt(),
           is then altered to take a peer.  This now returns the RTT or -1 if
           there are insufficient samples.
      
       (5) Rename rxrpc_kernel_get_peer() to rxrpc_kernel_call_get_peer().
      
       (6) Provide a new function, rxrpc_kernel_get_peer(), to get a ref on a
           peer the caller already has.
      
      This allows the afs filesystem to pin the rxrpc_peer records that it is
      using, allowing faster lookups and pointer comparisons rather than
      comparing sockaddr_rxrpc contents.  It also makes it easier to get hold of
      the RTT.  The following changes are made to afs:
      
       (1) The addr_list struct's addrs[] elements now hold a peer struct pointer
           and a service ID rather than a sockaddr_rxrpc.
      
       (2) When displaying the transport address, rxrpc_kernel_remote_addr() is
           used.
      
       (3) The port arg is removed from afs_alloc_addrlist() since it's always
           overridden.
      
       (4) afs_merge_fs_addr4() and afs_merge_fs_addr6() do peer lookup and may
           now return an error that must be handled.
      
       (5) afs_find_server() now takes a peer pointer to specify the address.
      
       (6) afs_find_server(), afs_compare_fs_alists() and afs_merge_fs_addr[46]()
           now do peer pointer comparison rather than address comparison.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • afs: Turn the afs_addr_list address array into an array of structs · 07f3502b
      David Howells authored
      Turn the afs_addr_list address array into an array of structs, thereby
      allowing per-address (such as RTT) info to be added.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • afs: Add comments on abort handling · fe245c8f
      David Howells authored
      Add some comments on AFS abort code handling in the rotation algorithm and
      adjust the errors produced to match.
      Reported-by: Jeffrey E Altman <jaltman@auristor.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeffrey Altman <jaltman@auristor.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • rxrpc_find_service_conn_rcu: fix the usage of read_seqbegin_or_lock() · bad1a11c
      Oleg Nesterov authored
      rxrpc_find_service_conn_rcu() should make the "seq" counter odd on the
      second pass, otherwise read_seqbegin_or_lock() never takes the lock.
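
The effect of the counter's parity can be shown with a small userspace simulation. This is not the kernel implementation: `seqlock_sketch`, `begin_or_lock()` and `two_pass_lookup()` are hypothetical names that only mimic the convention that an even "seq" selects a lockless pass and an odd "seq" selects a locked one, and the first pass is simply assumed to need a retry.

```c
#include <stdbool.h>

struct seqlock_sketch {
	unsigned int lock_count;	/* Times the exclusive lock was taken */
};

/* Mimics the convention described above: an even 'seq' means a lockless
 * pass, an odd 'seq' means take the lock. */
static void begin_or_lock(struct seqlock_sketch *sl, int seq)
{
	if (seq & 1)
		sl->lock_count++;
}

/* Run a two-pass lookup in which the first (lockless) pass always has to
 * be retried.  'fixed' selects whether seq is made odd for the retry. */
static unsigned int two_pass_lookup(struct seqlock_sketch *sl, bool fixed)
{
	int seq = 0;

	for (int pass = 0; pass < 2; pass++) {
		begin_or_lock(sl, seq);
		/* ... the search would happen here; assume pass 0 raced ... */
		if (fixed)
			seq |= 1;	/* Odd: the next pass takes the lock */
	}
	return sl->lock_count;
}
```

Without the fix the counter stays even, so both passes run locklessly and the lock is never taken; with it, the retry pass runs under the lock.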
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
      Link: https://lore.kernel.org/r/20231117164846.GA10410@redhat.com/
    • afs: use read_seqbegin() in afs_check_validity() and afs_getattr() · df91b9df
      Oleg Nesterov authored
      David Howells says:
      
       (3) afs_check_validity().
       (4) afs_getattr().
      
           These are both pretty short, so your solution is probably good for them.
           That said, afs_vnode_commit_status() can spend a long time under the
           write lock - and pretty much every file RPC op returns a status update.
      
      Change these functions to use read_seqbegin().  This simplifies the code
      and doesn't change the current behaviour: the "seq" counter is always even,
      so read_seqbegin_or_lock() can never take the lock.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
      Link: https://lore.kernel.org/r/20231130115617.GA21584@redhat.com/
    • afs: fix the usage of read_seqbegin_or_lock() in afs_find_server*() · 1702e065
      Oleg Nesterov authored
      David Howells says:
      
       (5) afs_find_server().
      
           There could be a lot of servers in the list and each server can have
           multiple addresses, so I think this would be better with an exclusive
           second pass.
      
           The server list isn't likely to change all that often, but when it does
           change, there's a good chance several servers are going to be
           added/removed one after the other.  Further, this is only going to be
           used for incoming cache management/callback requests from the server,
           which hopefully aren't going to happen too often - but it is remotely
           drivable.
      
       (6) afs_find_server_by_uuid().
      
           Similarly to (5), there could be a lot of servers to search through, but
           they are in a tree not a flat list, so it should be faster to process.
           Again, it's not likely to change that often and, again, when it does
           change it's likely to involve multiple changes.  This can be driven
           remotely by an incoming cache management request but is mostly going to
           be driven by setting up or reconfiguring a volume's server list -
           something that also isn't likely to happen often.
      
      Make the "seq" counter odd on the 2nd pass, otherwise read_seqbegin_or_lock()
      never takes the lock.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
      Link: https://lore.kernel.org/r/20231130115614.GA21581@redhat.com/
      1702e065
    • Oleg Nesterov's avatar
      afs: fix the usage of read_seqbegin_or_lock() in afs_lookup_volume_rcu() · 4121b433
      Oleg Nesterov authored
      David Howells says:
      
       (2) afs_lookup_volume_rcu().
      
           There can be a lot of volumes known by a system.  A thousand would
           require a 10-step walk and this is drivable by remote operation, so I
           think this should probably take a lock on the second pass too.
      
      Make the "seq" counter odd on the 2nd pass, otherwise read_seqbegin_or_lock()
      never takes the lock.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
      Link: https://lore.kernel.org/r/20231130115606.GA21571@redhat.com/
      4121b433
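The read_seqbegin_or_lock() fixes above can be illustrated with a small userspace model. This is only a sketch: the `toy_*` helpers, `lookup_old()`, `lookup_fixed()` and the simulated writer are hypothetical stand-ins written for this example, not the kernel's seqlock API. It assumes the convention the commit messages describe, namely that an even "seq" means a lockless pass and an odd "seq" means a locked pass.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy, single-threaded stand-in for a seqlock (hypothetical, not the kernel API). */
struct toy_seqlock {
	unsigned int sequence;	/* even while no writer is mid-update */
	bool locked;		/* models the exclusive lock */
	int lock_count;		/* how many times a reader took the lock */
};

/* Even *seq: sample the counter locklessly.  Odd *seq: take the lock. */
static void toy_read_seqbegin_or_lock(struct toy_seqlock *sl, int *seq)
{
	if (!(*seq & 1)) {
		*seq = (int)sl->sequence;	/* the sampled value is even */
	} else {
		assert(!sl->locked);
		sl->locked = true;
		sl->lock_count++;
	}
}

/* Only a lockless (even) pass can need a retry. */
static bool toy_need_seqretry(const struct toy_seqlock *sl, int seq)
{
	return !(seq & 1) && sl->sequence != (unsigned int)seq;
}

/* Drop the lock if the final pass took it. */
static void toy_done_seqretry(struct toy_seqlock *sl, int seq)
{
	if (seq & 1)
		sl->locked = false;
}

/* Simulate a complete write landing during a lockless reader pass. */
static void toy_writer_interferes(struct toy_seqlock *sl)
{
	sl->sequence += 2;
}

/* Old pattern: "seq" stays even, so every retry is lockless again.
 * Returns the number of passes made. */
int lookup_old(struct toy_seqlock *sl, int interfere)
{
	int seq = 0, passes = 0;

	do {
		toy_read_seqbegin_or_lock(sl, &seq);
		passes++;
		if (!sl->locked && interfere > 0) {
			interfere--;
			toy_writer_interferes(sl);
		}
	} while (toy_need_seqretry(sl, seq));
	toy_done_seqretry(sl, seq);
	return passes;
}

/* Fixed pattern: bump "seq" before each pass so that it is 2 (even) on
 * the first pass and odd on any retry, which makes the retry take the lock. */
int lookup_fixed(struct toy_seqlock *sl, int interfere)
{
	int seq = 1, passes = 0;

	do {
		seq++;
		toy_read_seqbegin_or_lock(sl, &seq);
		passes++;
		if (!sl->locked && interfere > 0) {
			interfere--;
			toy_writer_interferes(sl);
		}
	} while (toy_need_seqretry(sl, seq));
	toy_done_seqretry(sl, seq);
	return passes;
}
```

In this model, a writer that interferes with three consecutive lockless passes forces `lookup_old()` to make four passes without ever locking, whereas `lookup_fixed()` locks on its second pass and finishes there.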
    • David Howells's avatar
      afs: Automatically generate trace tag enums · 2daa6404
      David Howells authored
      Automatically generate trace tag enums from the symbol -> string mapping
      tables rather than having the enums as well, thereby reducing duplicated
      data.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: Jeff Layton <jlayton@kernel.org>
      cc: linux-afs@lists.infradead.org
      cc: linux-fsdevel@vger.kernel.org
      2daa6404
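Generating an enum from a symbol -> string table, as the commit above describes, is commonly done with an X-macro. The following is a minimal generic sketch of that technique, not the afs trace header itself: the `demo_*` names and tag strings are invented for illustration, and the `EM()`/`E_()` macro names are assumed here only as a plausible convention for "entry" and "last entry".

```c
#include <assert.h>
#include <string.h>

/* Hypothetical tag table: the single source of truth (symbol -> string). */
#define demo_trace_tags \
	EM(demo_tag_alloc,	"ALLOC") \
	EM(demo_tag_free,	"FREE") \
	E_(demo_tag_use,	"USE")

/* Pass 1: expand the table into enum members. */
#define EM(a, b) a,
#define E_(a, b) a
enum demo_trace_tag { demo_trace_tags };
#undef EM
#undef E_

/* Pass 2: expand the same table into a string array indexed by the enum. */
#define EM(a, b) [a] = b,
#define E_(a, b) [a] = b
static const char *const demo_trace_tag_names[] = { demo_trace_tags };
#undef EM
#undef E_

#define DEMO_NR_TAGS \
	(sizeof(demo_trace_tag_names) / sizeof(demo_trace_tag_names[0]))

/* Map a tag to its display string. */
const char *demo_tag_name(enum demo_trace_tag tag)
{
	assert((size_t)tag < DEMO_NR_TAGS);
	return demo_trace_tag_names[tag];
}

/* Map a display string back to its tag value, or -1 if unknown. */
int demo_tag_from_name(const char *name)
{
	for (size_t i = 0; i < DEMO_NR_TAGS; i++)
		if (strcmp(demo_trace_tag_names[i], name) == 0)
			return (int)i;
	return -1;
}
```

Because both the enum and the string array expand from one list, adding a tag means editing one line, and the two can no longer drift apart.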
    • David Howells's avatar
      afs: Remove whitespace before most ')' from the trace header · a790c258
      David Howells authored
      checkpatch objects to whitespace before ')', so remove most of it from the
      afs trace header.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: Jeff Layton <jlayton@kernel.org>
      cc: linux-afs@lists.infradead.org
      cc: linux-fsdevel@vger.kernel.org
      a790c258
    • Linus Torvalds's avatar
      Linux 6.7-rc7 · 861deac3
      Linus Torvalds authored
      861deac3
  3. 23 Dec, 2023 2 commits
    • Linus Torvalds's avatar
      Merge tag 'x86-urgent-2023-12-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 3f82f1c3
      Linus Torvalds authored
      Pull x86 fixes from Ingo Molnar:
      
       - Fix a secondary CPUs enumeration regression caused by creative MADT
         APIC table entries on certain systems.
      
       - Fix a race in the NOP-patcher that can spuriously trigger crashes on
         bootup.
      
       - Fix a bootup failure regression in the parallel bringup code,
         triggered by firmware inconsistency between the APIC initialization
         states of the boot and secondary CPUs, on certain systems.
      
      * tag 'x86-urgent-2023-12-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/acpi: Handle bogus MADT APIC tables gracefully
        x86/alternatives: Disable interrupts and sync when optimizing NOPs in place
        x86/alternatives: Sync core before enabling interrupts
        x86/smpboot/64: Handle X2APIC BIOS inconsistency gracefully
      3f82f1c3
    • Linus Torvalds's avatar
      Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · f969c914
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "Four small fixes, three in drivers with the core one adding a batch
        indicator (for drivers which use it) to the error handler"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: ufs: core: Let the sq_lock protect sq_tail_slot access
        scsi: ufs: qcom: Return ufs_qcom_clk_scale_*() errors in ufs_qcom_clk_scale_notify()
        scsi: core: Always send batch on reset or error handling command
        scsi: bnx2fc: Fix skb double free in bnx2fc_rcv()
      f969c914