  1. 27 Mar, 2020 1 commit
    • gfs2: Change inode qa_data to allow multiple users · 2fba46a0
      Bob Peterson authored
      Before this patch, multiple users called gfs2_qa_alloc, which allocated
      a qadata structure for the inode if quotas are turned on. Later, on
      file close or evict, the structure was deleted with gfs2_qa_delete.
      But several competing processes may need access to the structure, and
      there were races between file close (release) and those other users.
      Thus, a release could delete the structure out from under a process
      that relied upon its existence, for example, chown.
      
      This patch changes the management of the qadata structures to a
      get/put scheme. Function gfs2_qa_alloc has been changed to gfs2_qa_get,
      and if the structure is allocated, the count starts out at 1.
      Function gfs2_qa_delete has been renamed to gfs2_qa_put, and the
      last user to decrement the count to 0 frees the memory.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
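      
      A rough user-space sketch of the get/put scheme described above; the
      struct and function names here are illustrative stand-ins, not the
      actual gfs2 symbols:
      
      #include <stdio.h>
      #include <stdlib.h>
      
      /* Illustrative stand-in for the per-inode quota data. */
      struct qadata {
              int refcount;
      };
      
      struct inode_like {
              struct qadata *qa;
      };
      
      /* Allocate on first use, otherwise just bump the count. */
      static struct qadata *qa_get(struct inode_like *ip)
      {
              if (!ip->qa) {
                      ip->qa = calloc(1, sizeof(*ip->qa));
                      if (!ip->qa)
                              return NULL;
                      ip->qa->refcount = 1;   /* count starts at 1 */
              } else {
                      ip->qa->refcount++;
              }
              return ip->qa;
      }
      
      /* The last user to drop the count to 0 frees the memory. */
      static void qa_put(struct inode_like *ip)
      {
              if (ip->qa && --ip->qa->refcount == 0) {
                      free(ip->qa);
                      ip->qa = NULL;
              }
      }
      
      int main(void)
      {
              struct inode_like ip = { 0 };
      
              qa_get(&ip);            /* e.g. open */
              qa_get(&ip);            /* e.g. chown in progress */
              qa_put(&ip);            /* release: structure survives */
              printf("still allocated: %s\n", ip.qa ? "yes" : "no");
              qa_put(&ip);            /* last user frees it */
              return 0;
      }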
  2. 27 Feb, 2020 2 commits
    • gfs2: Do proper error checking for go_sync family of glops functions · 1c634f94
      Bob Peterson authored
      Before this patch, function do_xmote would try to sync out the glock's
      dirty data by calling the appropriate glops function XXX_go_sync(),
      but it did not check for a good return code. If the sync was not
      possible due to an io error or the like, do_xmote would continue on,
      call go_inval, and release the glock to other cluster nodes.
      When those nodes go to replay the journal, they may already be holding
      glocks for the journal records that should have been synced, but were
      not due to the ignored error.
      
      This patch introduces proper error code checking to the go_sync
      family of glops functions.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
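      
      A minimal sketch of the error-checking pattern this introduces; the
      function names below are made-up stand-ins for the real glops
      callbacks:
      
      #include <stdio.h>
      #include <errno.h>
      
      /* Illustrative stand-in for a glops sync operation that can fail. */
      static int go_sync(int simulate_io_error)
      {
              return simulate_io_error ? -EIO : 0;
      }
      
      static void go_inval(void)
      {
              puts("invalidate and hand the glock to other nodes");
      }
      
      /* The pattern the patch introduces: stop if the sync failed. */
      static int do_xmote_like(int simulate_io_error)
      {
              int ret = go_sync(simulate_io_error);
      
              if (ret) {
                      fprintf(stderr, "sync failed (%d); keep the glock\n", ret);
                      return ret;   /* before the patch this was ignored */
              }
              go_inval();
              return 0;
      }
      
      int main(void)
      {
              do_xmote_like(0);
              do_xmote_like(1);
              return 0;
      }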
    • gfs2: Force withdraw to replay journals and wait for it to finish · 601ef0d5
      Bob Peterson authored
      When a node withdraws from a file system, it often leaves its journal
      in an incomplete state. This is especially true when the withdraw is
      caused by io errors writing to the journal. Before this patch, a
      withdraw would try to write a "shutdown" record to the journal and
      tell dlm it was done with the file system, but none of the other
      nodes knew about the problem. Later, when the problem was fixed and
      the withdrawn node was rebooted, it would discover that its own
      journal was incomplete and replay it. However, replaying it at this
      point is almost guaranteed to introduce corruption because the other
      nodes are likely to have used affected resource groups that appeared
      in the journal since the time of the withdraw. Replaying the journal
      later will overwrite those changes, through no fault of dlm, which
      was instructed during the withdraw to release those resources.
      
      This patch makes file system withdraws visible to the entire cluster.
      Withdrawing nodes dequeue their journal glock to allow recovery.
      
      The remaining nodes check all the journals to see if they are
      clean or in need of replay. They try to replay dirty journals, but
      only the journals of withdrawn nodes will be "not busy" and
      therefore available for replay.
      
      Until the journal replay is complete, no i/o related glocks may be
      given out, to ensure that the replay does not cause the
      aforementioned corruption: We cannot allow any journal replay to
      overwrite blocks associated with a glock once it is held.
      
      The "live" glock which is now used to signal when a withdraw
      occurs. When a withdraw occurs, the node signals its withdraw by
      dequeueing the "live" glock and trying to enqueue it in EX mode,
      thus forcing the other nodes to all see a demote request, by way
      of a "1CB" (one callback) try lock. The "live" glock is not
      granted in EX; the callback is only just used to indicate a
      withdraw has occurred.
      
      Note that all nodes in the cluster must wait for the recovering
      node to finish replaying the withdrawing node's journal before
      continuing. To this end, each node checks that the journals are
      clean multiple times in a retry loop.
      
      Also note that the withdraw function may be called in a wide
      variety of situations, and therefore we need to take extra
      precautions to make sure pointers are valid before using them in
      many circumstances.
      
      We also need to take care when the glock code itself decides to
      withdraw, since the withdraw code now uses glocks.
      
      Also, before this patch, if a process encountered an error and
      decided to withdraw while another process was already withdrawing,
      the second withdraw was silently ignored, leaving the second process
      free to unlock its glocks. That is correct behavior if the original
      withdrawer encounters further errors down the road. But if the
      secondary waiters don't wait for the journal replay, unlocking their
      glocks allows other nodes to use them, despite the fact that the
      journal containing those blocks is being replayed. The replay needs
      to finish before our glocks are released to other nodes. In other
      words, secondary withdraws need to wait for the first withdraw to
      finish.
      
      For example, if an rgrp glock is unlocked by a process that didn't
      wait for the first withdraw, a journal replay could introduce file
      system corruption by replaying a rgrp block that has already been
      granted to a different cluster node.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
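      
      A rough user-space model of the wait-for-replay idea: no i/o-related
      glocks are handed out until every journal checks out clean. All names
      and states below are made up for illustration:
      
      #include <stdio.h>
      #include <stdbool.h>
      
      #define NUM_JOURNALS 3
      
      /* Illustrative journal states; in this model, journal 1 starts dirty. */
      static bool journal_clean[NUM_JOURNALS] = { true, false, true };
      
      static bool all_journals_clean(void)
      {
              for (int i = 0; i < NUM_JOURNALS; i++)
                      if (!journal_clean[i])
                              return false;
              return true;
      }
      
      /* Stand-in for another node replaying the withdrawn node's journal. */
      static void replay_dirty_journals(void)
      {
              for (int i = 0; i < NUM_JOURNALS; i++)
                      if (!journal_clean[i]) {
                              printf("replaying journal %d\n", i);
                              journal_clean[i] = true;
                      }
      }
      
      int main(void)
      {
              /* No i/o-related glocks may be handed out until this loop ends. */
              while (!all_journals_clean())
                      replay_dirty_journals();   /* retry until clean */
              puts("all journals clean; normal glock traffic may resume");
              return 0;
      }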
  3. 20 Feb, 2020 1 commit
    • gfs2: Allow some glocks to be used during withdraw · a72d2401
      Bob Peterson authored
      We need to allow some glocks to be enqueued, dequeued, promoted, and
      demoted when we're withdrawn. For example, to maintain metadata
      integrity, we should disallow the use of inode and rgrp glocks when
      withdrawn. Other glocks, like the iopen or transaction glocks, may be
      safely used because none of their metadata goes through the journal.
      So, in general, we should disallow all glocks with an address space
      and allow all the others. The one exception is that we need to allow
      our active journal glock to be demoted so others may recover it.
      
      Allowing glocks after withdraw gives us the ability to take appropriate
      action (in a following patch) to have our journal properly replayed by
      another node rather than just abandoning the current transactions and
      pretending nothing bad happened, leaving the other nodes free to modify
      the blocks we had in our journal, which may result in file system
      corruption.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
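      
      A small sketch of the rule described above, assuming a simplified
      glock model (the real structures and checks are more involved):
      
      #include <stdio.h>
      #include <stdbool.h>
      
      /* Illustrative glock model, not the real struct gfs2_glock. */
      struct glock {
              const char *name;
              bool has_address_space;   /* inode and rgrp glocks carry one */
              bool is_our_journal;      /* our own journal glock */
      };
      
      /* Rule sketched in the commit message: while withdrawn, disallow
       * glocks with an address space, except our own journal glock (so
       * that another node can recover it). */
      static bool glock_usable_while_withdrawn(const struct glock *gl)
      {
              if (!gl->has_address_space)
                      return true;
              return gl->is_our_journal;
      }
      
      int main(void)
      {
              struct glock inode_gl   = { "inode",   true,  false };
              struct glock trans_gl   = { "trans",   false, false };
              struct glock journal_gl = { "journal", true,  true };
      
              printf("%s: %d\n", inode_gl.name,
                     glock_usable_while_withdrawn(&inode_gl));
              printf("%s: %d\n", trans_gl.name,
                     glock_usable_while_withdrawn(&trans_gl));
              printf("%s: %d\n", journal_gl.name,
                     glock_usable_while_withdrawn(&journal_gl));
              return 0;
      }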
  4. 10 Feb, 2020 3 commits
    • gfs2: log error reform · 036330c9
      Bob Peterson authored
      Before this patch, gfs2 kept track of journal io errors in two
      places: sd_log_error and the SDF_AIL1_IO_ERROR flag in sd_flags.
      This patch consolidates the two into sd_log_error so that it
      reflects the first error encountered while writing to the journal.
      In future patches, we will take advantage of this by checking
      this one value rather than having to check both when reacting to
      io errors.
      
      In addition, this fixes a tight loop in unmount: if buffers
      got onto the ail1 list and an io error occurred elsewhere, the
      ail1 list would never be cleared because its buffers were always
      busy, so unmount would hang waiting for the ail1 list to empty.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
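      
      A tiny sketch of the "remember only the first error" behavior, using
      a plain variable as an illustrative stand-in for sd_log_error:
      
      #include <stdio.h>
      #include <errno.h>
      
      static int log_error;   /* stand-in for a single consolidated field */
      
      static void record_log_error(int err)
      {
              if (!log_error)         /* keep the first error, ignore later ones */
                      log_error = err;
      }
      
      int main(void)
      {
              record_log_error(-EIO);     /* first failure wins */
              record_log_error(-ENOSPC);  /* ignored */
              printf("first error: %d\n", log_error);
              return 0;
      }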
    • gfs2: Rework how rgrp buffer_heads are managed · b3422cac
      Bob Peterson authored
      Before this patch, the rgrp code had a serious problem related to
      how it managed buffer_heads for resource groups. The problem caused
      file system corruption, especially in cases of journal replay.
      
      When an rgrp glock was demoted to transfer ownership to a
      different cluster node, do_xmote() first called rgrp_go_sync and then
      rgrp_go_inval, as expected. rgrp_go_sync in turn called
      gfs2_rgrp_brelse(), which dropped the buffer_head reference count.
      In most cases, the reference count went to zero, which is right.
      However, there were other places where the buffers were handled
      differently.
      
      After rgrp_go_sync, do_xmote called rgrp_go_inval, which called
      gfs2_rgrp_brelse a second time; then rgrp_go_inval's call to
      truncate_inode_pages_range would get rid of the pages in memory,
      but only if the reference count had dropped to 0.
      
      Unfortunately, gfs2_rgrp_brelse was setting bi->bi_bh = NULL.
      So when rgrp_go_sync called gfs2_rgrp_brelse, it lost the pointer
      to the buffer_heads in cases where the reference count was still 1.
      Therefore, when rgrp_go_inval called gfs2_rgrp_brelse a second time,
      it failed the check for "if (bi->bi_bh)" and thus failed to call
      brelse a second time. Because of that, the reference count on those
      buffers sometimes failed to drop from 1 to 0. And that caused
      function truncate_inode_pages_range to keep the pages in page cache
      rather than freeing them.
      
      The next time the rgrp glock was acquired, the metadata read of
      the rgrp buffers re-used the pages in memory, which were now
      wrong because they were likely modified by the other node who
      acquired the glock in EX (which is why we demoted the glock).
      This re-use of the page cache caused corruption because changes
      made by the other nodes were never seen, so the bitmaps were
      inaccurate.
      
      For some reason, the problem became most apparent when journal
      replay forced the replay of rgrps in memory, which caused newer
      rgrp data to be overwritten by the older in-core pages.
      
      A big part of the problem was that the rgrp buffers were released
      in multiple places: the go_unlock function would release them when
      the glock was released rather than when the glock was demoted,
      which is clearly wrong because our intent was to cache them until
      the glock is demoted from SH or EX.
      
      This patch attempts to clean up the mess and make one consistent
      and centralized mechanism for managing the rgrp buffer_heads by
      implementing several changes:
      
      1. It eliminates the call to gfs2_rgrp_brelse() from rgrp_go_sync.
         We don't want to release the buffers or zero the pointers when
         syncing for the reasons stated above. It only makes sense to
         release them when the glock is actually invalidated (go_inval).
         And when we do, then we set the bh pointers to NULL.
      2. The go_unlock function (which was only used for rgrps) is
         eliminated, as we've talked about doing many times before.
         The go_unlock function was called too early in the glock dq
         process, and should not happen until the glock is invalidated.
      3. It also eliminates the call to rgrp_brelse in gfs2_clear_rgrpd.
         That will now happen automatically when the rgrp glocks are
         demoted, and shouldn't happen any sooner or later than that.
         Instead, function gfs2_clear_rgrpd has been modified to demote
         the rgrp glocks, and therefore, free those pages, before the
         remaining glocks are culled by gfs2_gl_hash_clear. This
         prevents the gl_object from hanging around when the glocks are
         culled.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
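      
      A crude user-space model of the centralized release described in
      item 1 above: syncing keeps the buffer cached, and only invalidation
      releases it and clears the pointer. Names are illustrative only:
      
      #include <stdio.h>
      #include <stdlib.h>
      
      /* bi_bh stands in for the cached rgrp buffer_head pointer. */
      struct rgrp_bits {
              int *bi_bh;     /* non-NULL while cached */
      };
      
      static void go_sync(struct rgrp_bits *bi)
      {
              /* After the patch: syncing writes data back but keeps the
               * buffer cached; it no longer releases or NULLs the pointer. */
              if (bi->bi_bh)
                      printf("write back cached data %d\n", *bi->bi_bh);
      }
      
      static void go_inval(struct rgrp_bits *bi)
      {
              /* Release the buffer and clear the pointer only on invalidate,
               * i.e. when the glock is actually being demoted away. */
              free(bi->bi_bh);
              bi->bi_bh = NULL;
      }
      
      int main(void)
      {
              struct rgrp_bits bi = { malloc(sizeof(int)) };
      
              if (!bi.bi_bh)
                      return 1;
              *bi.bi_bh = 42;
              go_sync(&bi);           /* buffer stays cached */
              go_inval(&bi);          /* buffer dropped exactly once */
              printf("cached: %s\n", bi.bi_bh ? "yes" : "no");
              return 0;
      }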
    • gfs2: Introduce concept of a pending withdraw · 69511080
      Bob Peterson authored
      File system withdraws can be delayed when inconsistencies are
      discovered at a point where we cannot withdraw immediately, for
      example, while critical spin_locks are held. But delaying the
      withdraw can cause gfs2 to ignore the error and keep running for a
      short period of time. For example, an rgrp glock may be dequeued and
      demoted while there are still buffers that haven't been properly
      revoked, due to io errors writing to the journal.
      
      This patch introduces the new concept of a pending withdraw, which
      means an inconsistency has been discovered and we need to withdraw
      at the earliest possible opportunity. In these cases, we aren't
      quite withdrawn yet, but we still must not dequeue glocks or do
      other critical things. If we dequeue the glocks and the withdraw
      results in our journal being replayed, the replay could overwrite
      data that has been modified by a different node that acquired the
      glock in the meantime.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
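      
      A toy sketch of the pending-withdraw idea: a flag is set where we
      cannot withdraw on the spot, and it is checked before doing anything
      (like dequeuing a glock) that would be unsafe. Purely illustrative:
      
      #include <stdio.h>
      #include <stdbool.h>
      
      /* Set when an inconsistency is found somewhere we cannot withdraw
       * immediately (e.g. under a spin_lock). */
      static bool withdraw_pending;
      
      static void note_inconsistency(void)
      {
              withdraw_pending = true;   /* withdraw at the next safe point */
      }
      
      static void dequeue_rgrp_glock(void)
      {
              if (withdraw_pending) {
                      puts("withdraw pending: refusing to dequeue the glock");
                      return;            /* keep the glock so replay stays safe */
              }
              puts("glock dequeued");
      }
      
      int main(void)
      {
              dequeue_rgrp_glock();      /* normal case */
              note_inconsistency();
              dequeue_rgrp_glock();      /* held back until the withdraw runs */
              return 0;
      }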
  5. 28 Jan, 2020 1 commit
    • Revert "gfs2: eliminate tr_num_revoke_rm" · a31b4ec5
      Bob Peterson authored
      This reverts commit e955537e.
      
      Before patch e955537e, tr_num_revoke tracked the number of revokes
      added to the transaction, and tr_num_revoke_rm tracked how many
      revokes were removed. But since revokes are queued off the sdp
      (superblock) pointer, some transactions could remove more revokes
      than they added (e.g., revokes added by a different process).
      Commit e955537e eliminated transaction variable tr_num_revoke_rm,
      but in order to do so, it changed the accounting to always use
      tr_num_revoke for its math. Since you can remove more revokes than
      you add, tr_num_revoke could now become negative. This negative
      value broke the assert in function gfs2_trans_end:
      
      	if (gfs2_assert_withdraw(sdp, (nbuf <= tr->tr_blocks) &&
      			       (tr->tr_num_revoke <= tr->tr_revokes)))
      
      One way to fix this is to simply remove the tr_num_revoke clause
      from the assert and allow the value to become negative. Andreas
      didn't like that idea, so instead, we decided to revert e955537e.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
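      
      A small worked example of why a single counter breaks: with separate
      add/remove counters the assert holds, but folding removals into
      tr_num_revoke can drive it negative. The numbers are made up:
      
      #include <stdio.h>
      #include <assert.h>
      
      /* Toy model: a transaction adds 2 revokes but removes 5 (3 of them
       * were queued by another process on the shared superblock list). */
      int main(void)
      {
              int tr_num_revoke = 0, tr_num_revoke_rm = 0, tr_revokes = 2;
      
              tr_num_revoke += 2;        /* revokes this transaction added */
              tr_num_revoke_rm += 5;     /* revokes removed, some not ours */
      
              /* With separate counters the sanity check still holds ... */
              assert(tr_num_revoke <= tr_revokes);
      
              /* ... but folding removals into one counter drives it negative,
               * which is what tripped the assert in gfs2_trans_end. */
              int folded = tr_num_revoke - tr_num_revoke_rm;
              printf("folded counter would be %d\n", folded);
              return 0;
      }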
  6. 20 Jan, 2020 2 commits
  7. 07 Jan, 2020 1 commit
  8. 19 Sep, 2019 1 commit
  9. 04 Sep, 2019 1 commit
    • gfs2: Use async glocks for rename · ad26967b
      Bob Peterson authored
      Because s_vfs_rename_mutex is not cluster-wide, multiple nodes can
      reverse the roles of which directories are "old" and which are "new" for
      the purposes of rename. This can cause deadlocks where two nodes end up
      waiting for each other.
      
      There can be several layers of directory dependencies across many nodes.
      
      This patch fixes the problem by acquiring all of gfs2_rename's inode
      glocks asynchronously and waiting for all glocks to be acquired.
      That way all inodes are locked regardless of the order.
      
      The timeout value for multiple asynchronous glocks is calculated to be
      the total of the individual wait times for each glock times two.
      
      Since gfs2_exchange is very similar to gfs2_rename, both functions are
      patched in the same way.
      
      A new async glock wait queue, sd_async_glock_wait, keeps a list of
      waiters for these events. If gfs2's holder_wake function detects an
      async holder, it wakes up any waiters for the event. The waiter only
      tests whether any of its requests are still pending.
      
      Since the glocks are sent to dlm asynchronously, the wait function
      needs to check which glocks, if any, were granted.
      
      If a glock is granted by dlm (and therefore held), its minimum hold
      time is checked and adjusted as necessary, as is done for other
      glock grants.
      
      If the event times out, all glocks held thus far must be dequeued to
      resolve any existing deadlocks.  Then, if there are any outstanding
      locking requests, we need to loop around and wait for dlm to respond to
      those requests too.  After we release all requests, we return -ESTALE to
      the caller (vfs rename) which loops around and retries the request.
      
          Node1           Node2
          ---------       ---------
      1.  Enqueue A       Enqueue B
      2.  Enqueue B       Enqueue A
      3.  A granted
      6.                  B granted
      7.  Wait for B
      8.                  Wait for A
      9.                  A times out (since Node 1 holds A)
      10.                 Dequeue B (since it was granted)
      11.                 Wait for all requests from DLM
      12. B Granted (since Node2 released it in step 10)
      13. Rename
      14. Dequeue A
      15.                 DLM Grants A
      16.                 Dequeue A (due to the timeout and since we
                          no longer have B held for our task).
      17. Dequeue B
      18.                 Return -ESTALE to vfs
      19.                 VFS retries the operation, goto step 1.
      
      This release-all-locks / acquire-all-locks may slow rename / exchange
      down as both nodes struggle in the same way and do the same thing.
      However, this will only happen when there is contention for the same
      inodes, which ought to be rare.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
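      
      A loose user-space analogy of the acquire-all-or-back-off-and-retry
      behavior, using pthread mutexes in place of dlm glocks; the -ESTALE
      retry convention mirrors the description above, everything else is
      simplified and the names are made up:
      
      #include <pthread.h>
      #include <stdio.h>
      #include <errno.h>
      
      #define NUM_LOCKS 2
      
      static pthread_mutex_t locks[NUM_LOCKS] = {
              PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER
      };
      
      /* Try to take every lock; on any failure release what we hold and
       * report -ESTALE so the caller can loop around, as vfs rename does. */
      static int acquire_all(void)
      {
              int held[NUM_LOCKS] = { 0 };
      
              for (int i = 0; i < NUM_LOCKS; i++) {
                      if (pthread_mutex_trylock(&locks[i]) == 0) {
                              held[i] = 1;
                              continue;
                      }
                      /* Contended: drop everything we got so the other
                       * side can finish, then ask for a retry. */
                      for (int j = 0; j < NUM_LOCKS; j++)
                              if (held[j])
                                      pthread_mutex_unlock(&locks[j]);
                      return -ESTALE;
              }
              return 0;
      }
      
      int main(void)
      {
              while (acquire_all() == -ESTALE)
                      puts("contention: released all locks, retrying");
              puts("all locks held; rename can proceed");
              for (int i = 0; i < NUM_LOCKS; i++)
                      pthread_mutex_unlock(&locks[i]);
              return 0;
      }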
  10. 27 Jun, 2019 3 commits
    • gfs2: dump fsid when dumping glock problems · 3792ce97
      Bob Peterson authored
      Before this patch, if a glock error was encountered, the glock with
      the problem was dumped. But sometimes you may have lots of file systems
      mounted, and that doesn't tell you which file system it was for.
      
      This patch adds a new boolean parameter fsid to the dump_glock family
      of functions. For non-error cases, such as dumping the glocks debugfs
      file, the fsid is not dumped in order to keep lock dumps and glocktop
      as clean as possible. For all error cases, such as GLOCK_BUG_ON, the
      file system id is now printed. This will make it easier to debug.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: Rename SDF_SHUTDOWN to SDF_WITHDRAWN · 04aea0ca
      Bob Peterson authored
      Before this patch, the superblock flag indicating when a file system
      is withdrawn was called SDF_SHUTDOWN. This patch simply renames it to
      the more obvious SDF_WITHDRAWN.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: eliminate tr_num_revoke_rm · e955537e
      Bob Peterson authored
      For its journal processing, gfs2 kept track of the number of buffers
      added and removed on a per-transaction basis. These values are used
      to calculate space needed in the journal. But while these calculations
      make sense for the number of buffers, they make no sense for revokes.
      Revokes are managed in their own list, linked from the superblock.
      So it's entirely unnecessary to keep separate per-transaction counts
      for revokes added and removed. A single count will do the same job.
      Therefore, this patch combines the two per-transaction revoke counts
      into a single count.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
  11. 06 Jun, 2019 1 commit
  12. 05 Jun, 2019 1 commit
  13. 07 May, 2019 4 commits
    • gfs2: fix race between gfs2_freeze_func and unmount · 8f918219
      Abhi Das authored
      As part of the freeze operation, gfs2_freeze_func() is left blocking
      on a request to hold the sd_freeze_gl in SH. This glock is held in EX
      by the gfs2_freeze() code.
      
      A subsequent call to gfs2_unfreeze() releases the EXclusively held
      sd_freeze_gl, which allows gfs2_freeze_func() to acquire it in SH and
      resume its operation.
      
      gfs2_unfreeze(), however, doesn't wait for gfs2_freeze_func() to complete.
      If a umount is issued right after unfreeze, it could result in an
      inconsistent filesystem because some journal data (statfs update) isn't
      written out.
      
      Refer to commit 24972557 for a more detailed explanation of how
      freeze/unfreeze work.
      
      This patch causes gfs2_unfreeze() to wait for gfs2_freeze_func() to
      complete before returning to the user.
      Signed-off-by: Abhi Das <adas@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
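      
      A rough model of the fix using a condition variable as a stand-in for
      the completion that gfs2_unfreeze() now waits on; the function names
      are only illustrative:
      
      #include <pthread.h>
      #include <stdio.h>
      
      static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
      static pthread_cond_t  done = PTHREAD_COND_INITIALIZER;
      static int freeze_func_finished;
      
      /* Stand-in for the freeze worker reacquiring the freeze glock. */
      static void *freeze_func(void *arg)
      {
              (void)arg;
              puts("freeze worker: reacquired glock, wrote statfs data");
              pthread_mutex_lock(&lock);
              freeze_func_finished = 1;
              pthread_cond_signal(&done);
              pthread_mutex_unlock(&lock);
              return NULL;
      }
      
      static void unfreeze(void)
      {
              /* After the patch: do not return to the user (or allow a
               * umount) until the worker has completed its journal writes. */
              pthread_mutex_lock(&lock);
              while (!freeze_func_finished)
                      pthread_cond_wait(&done, &lock);
              pthread_mutex_unlock(&lock);
              puts("unfreeze: worker finished, safe to unmount");
      }
      
      int main(void)
      {
              pthread_t t;
      
              pthread_create(&t, NULL, freeze_func, NULL);
              unfreeze();
              pthread_join(&t, NULL);
              return 0;
      }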
    • gfs2: Rename sd_log_le_{revoke,ordered} · a5b1d3fc
      Andreas Gruenbacher authored
      Rename sd_log_le_revoke to sd_log_revokes and sd_log_le_ordered to
      sd_log_ordered: not sure what le stands for here, but it doesn't add
      clarity, and if it stands for list entry, it's actually confusing as
      those are both list heads but not list entries.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: Replace gl_revokes with a GLF flag · 73118ca8
      Bob Peterson authored
      The gl_revokes value records how many outstanding revokes a glock has
      on the superblock revokes list; this is used to avoid unnecessary log
      flushes.  However, gl_revokes is only ever tested for being zero, and
      it's only decremented in revoke_lo_after_commit, which removes all
      revokes from the list, so we know that the gl_revokes values of all
      the glocks on the list will reach zero.  Therefore, we can replace
      gl_revokes with a bit flag, saving an atomic counter in struct
      gfs2_glock.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
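      
      A before/after sketch of replacing the counter with a flag bit; the
      flag name below is made up for the illustration:
      
      #include <stdio.h>
      #include <stdbool.h>
      
      /* Before: a per-glock revoke counter. */
      struct glock_before { int gl_revokes; };
      
      /* After: one bit in the glock flags word. */
      #define GLF_HAS_REVOKES (1u << 0)   /* illustrative flag name */
      struct glock_after { unsigned long gl_flags; };
      
      static bool needs_log_flush_before(const struct glock_before *gl)
      {
              return gl->gl_revokes != 0;       /* only ever compared to zero */
      }
      
      static bool needs_log_flush_after(const struct glock_after *gl)
      {
              return gl->gl_flags & GLF_HAS_REVOKES;   /* same question, one bit */
      }
      
      int main(void)
      {
              struct glock_before b = { 3 };
              struct glock_after  a = { GLF_HAS_REVOKES };
      
              printf("%d %d\n", needs_log_flush_before(&b),
                     needs_log_flush_after(&a));
              return 0;
      }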
    • gfs2: clean_journal improperly set sd_log_flush_head · 7c70b896
      Bob Peterson authored
      This patch fixes regressions introduced by 588bff95.
      Due to that patch, function clean_journal was setting the value of
      sd_log_flush_head, but that's only valid when it is replaying the
      node's own journal. When replaying another node's journal, that's
      completely wrong and will lead to multiple problems. This patch
      cleans up the mess by passing the value of the logical journal block
      number into gfs2_write_log_header so the function can treat non-owned
      journals generically. For the local journal, the journal extent map
      is used for best performance. For journals belonging to other nodes,
      new function gfs2_lblk_to_dblk is called to figure out the mapping
      using gfs2_iomap_get.
      
      This patch also tries to establish more consistency when passing journal
      block parameters by changing several unsigned int types to a consistent
      u32.
      
      Fixes: 588bff95 ("GFS2: Reduce code redundancy writing log headers")
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
  14. 23 Jan, 2019 1 commit
  15. 12 Dec, 2018 1 commit
  16. 11 Dec, 2018 1 commit
  17. 12 Oct, 2018 2 commits
  18. 05 Oct, 2018 1 commit
    • gfs2: slow the deluge of io error messages · b524abcc
      Bob Peterson authored
      When an io error is hit, gfs2 calls gfs2_io_error_bh_i for every
      journal buffer it can't write. Since we recently changed
      gfs2_io_error_bh_i to withdraw later in the cycle, it sends a flood
      of errors to the console. This patch checks whether the file system
      has already been withdrawn, and if so, doesn't send more messages.
      It doesn't stop the flood of messages entirely, but it slows it down
      and keeps it more reasonable.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
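      
      A toy model of the suppression check; in this sketch the withdraw
      simply happens on the first error, which is a simplification of what
      the real code does:
      
      #include <stdio.h>
      #include <stdbool.h>
      
      static bool withdrawn;
      
      static void io_error_on_buffer(int bufno)
      {
              if (withdrawn)
                      return;     /* already withdrawn; stay quiet */
              fprintf(stderr, "io error writing journal buffer %d\n", bufno);
              withdrawn = true;   /* simplification: withdraw on first error */
      }
      
      int main(void)
      {
              for (int i = 0; i < 1000; i++)   /* flood reduced to one message */
                      io_error_on_buffer(i);
              return 0;
      }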
  19. 07 Aug, 2018 1 commit
    • gfs2: Fix gfs2_testbit to use clone bitmaps · dffe12a8
      Bob Peterson authored
      Function gfs2_testbit is called in three places. Two of those places,
      gfs2_alloc_extent and gfs2_unaligned_extlen, should be using the clone
      bitmaps, not the "real" bitmaps. Function gfs2_unaligned_extlen is used
      by the block reservations scheme to determine the length of an extent of
      free blocks. Before this patch, it wasn't using the clone bitmap, which
      means recently-freed blocks were treated as free blocks for the purposes
      of an allocation.
      
      This patch adds a new parameter to gfs2_testbit to indicate whether or
      not the clone bitmaps should be used (if available).
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
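      
      A sketch of what such a parameter does, assuming a toy two-bitmap
      layout; the real gfs2 bitmap encoding uses two bits per block and is
      not shown here:
      
      #include <stdio.h>
      #include <stdbool.h>
      
      /* Toy bitmap pair: "real" reflects committed state, the clone still
       * remembers blocks freed in the current transaction as in-use. */
      struct bitmap_pair {
              unsigned char real[1];
              unsigned char clone[1];
              bool has_clone;
      };
      
      /* Pick the clone bitmap when asked for it (and when one exists), so
       * recently-freed blocks still look allocated. */
      static bool testbit(const struct bitmap_pair *bm, int bit, bool use_clone)
      {
              const unsigned char *map =
                      (use_clone && bm->has_clone) ? bm->clone : bm->real;
              return map[bit / 8] & (1u << (bit % 8));
      }
      
      int main(void)
      {
              /* Block 0 was freed in this transaction: clear in "real",
               * still set in the clone. */
              struct bitmap_pair bm = { { 0x00 }, { 0x01 }, true };
      
              printf("real: %d, clone: %d\n",
                     testbit(&bm, 0, false), testbit(&bm, 0, true));
              return 0;
      }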
  20. 05 Jul, 2018 1 commit
  21. 21 Jun, 2018 1 commit
  22. 04 Jun, 2018 1 commit
    • GFS2: gfs2_free_extlen can return an extent that is too long · dc8fbb03
      Bob Peterson authored
      Function gfs2_free_extlen calculates the length of an extent of
      free blocks that may be reserved. The end pointer was calculated as
      end = start + bh->b_size, but using b_size is incorrect because the
      bitmap usually stops prior to the end of the buffer data on
      the last bitmap.
      
      What this means is that when you do a write, you can reserve a
      chunk of blocks that runs off the end of the last bitmap. For
      example, I've got a file system where there is only one bitmap
      for each rgrp, so ri_length==1. I saw cases in which iozone
      tried to do a big write, grabbed a large block reservation,
      chose rgrp 5464152, which has ri_data0 5464153 and ri_data 8188.
      So 5464153 + 8188 = 5472341 which is the end of the rgrp.
      
      When it grabbed a reservation it got back: 5470936, length 7229.
      But 5470936 + 7229 = 5478165. So the reservation starts inside
      the rgrp but runs 5824 blocks past the end of the bitmap.
      
      This patch fixes the calculation so it won't exceed the last
      bitmap. It also adds a BUG_ON to guard against overflows in the
      future.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
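      
      The arithmetic from the example above, plus the clamping idea, as a
      quick self-contained check:
      
      #include <stdio.h>
      
      /* Numbers from the example: rgrp 5464152 with ri_data0 5464153,
       * ri_data 8188, and a reservation starting at 5470936, length 7229. */
      int main(void)
      {
              unsigned long long ri_data0 = 5464153, ri_data = 8188;
              unsigned long long rgrp_end = ri_data0 + ri_data;   /* 5472341 */
              unsigned long long start = 5470936, len = 7229;
              unsigned long long end = start + len;               /* 5478165 */
      
              printf("runs %llu blocks past the rgrp end\n",
                     end - rgrp_end);                             /* 5824 */
      
              /* The fix, in spirit: never let the extent run past the rgrp. */
              if (end > rgrp_end)
                      len = rgrp_end - start;
              printf("clamped length: %llu\n", len);
              return 0;
      }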
  23. 16 Apr, 2018 1 commit
    • gfs2: Remove sdp->sd_jheightsize · 9a38662b
      Andreas Gruenbacher authored
      GFS2 keeps two arrays in the superblock that define the maximum size
      of an inode depending on the inode's height: sdp->sd_heightsize defines
      the heights in units of sb->s_blocksize; sdp->sd_jheightsize defines
      them in units of sb->s_blocksize - sizeof(struct gfs2_meta_header).
      These arrays are used to determine when additional layers of indirect
      blocks are needed.  The second array is used for directories, which
      have an additional gfs2_meta_header at the beginning of each block.
      
      Distinguishing between these two cases makes no sense: the height
      required for representing N blocks will come out the same no matter if
      the calculation is done in gross (sb->s_blocksize) or net
      (sb->s_blocksize - sizeof(struct gfs2_meta_header)) units.
      
      Stuffed directories don't have an additional gfs2_meta_header, but the
      stuffed case is handled separately for both files and directories,
      anyway.
      
      Remove the unnecessary sdp->sd_jheightsize array.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
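      
      A quick numeric illustration of that claim, assuming a 4K block size,
      8-byte block pointers, and a 24-byte header purely for illustration:
      the same block count needs the same height whether capacities are
      measured in gross or net units.
      
      #include <stdio.h>
      
      /* Smallest height whose capacity (unit * ptrs^h) covers "size". */
      static int height_for(unsigned long long size, unsigned long long unit,
                            unsigned long long ptrs_per_block)
      {
              int h = 0;
              unsigned long long cap = unit;
      
              while (cap < size) {
                      cap *= ptrs_per_block;
                      h++;
              }
              return h;
      }
      
      int main(void)
      {
              unsigned long long bsize = 4096, hdr = 24, ptrs = 509;
              unsigned long long nblocks = 300000;   /* arbitrary block count */
      
              /* Same block count expressed in gross and in net bytes. */
              int gross = height_for(nblocks * bsize, bsize, ptrs);
              int net   = height_for(nblocks * (bsize - hdr), bsize - hdr, ptrs);
      
              printf("gross height %d, net height %d\n", gross, net);
              return 0;
      }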
  24. 29 Mar, 2018 1 commit
  25. 22 Jan, 2018 1 commit
  26. 18 Jan, 2018 1 commit
  27. 25 Aug, 2017 2 commits
    • gfs2: Silence gcc format-truncation warning · 561b7969
      Andreas Gruenbacher authored
      Enlarge sd_fsname to be big enough for the longest lock table name
      and an arbitrary journal number.  This silences two
      -Wformat-truncation warnings with gcc 7.1.1.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
    • GFS2: Withdraw for IO errors writing to the journal or statfs · 942b0cdd
      Bob Peterson authored
      Before this patch, if GFS2 encountered IO errors while writing to
      the journal, it would not report the problem, so the errors would go
      unnoticed, sometimes for many hours. Sometimes the problem was only
      noticed later, when recovery tried to do journal replay and failed
      due to invalid metadata at the blocks that had the IO errors.
      
      This patch makes GFS2's log daemon check for IO errors. If it
      encounters one, it withdraws from the file system and reports
      why in dmesg. A similar action is taken when IO errors occur when
      writing to the system statfs file.
      
      These errors are also reported back to any callers of fsync, since
      that requires the journal to be flushed. Therefore, any IO errors
      that would previously go unnoticed are now noticed and the file
      system is withdrawn as early as possible, thus preventing further
      file system damage.
      
      Also note that this reintroduces superblock variable sd_log_error,
      which Christoph removed with commit f729b66f.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
  28. 10 Aug, 2017 1 commit
    • gfs2: forcibly flush ail to relieve memory pressure · b066a4ee
      Abhi Das authored
      On systems with low memory, it is possible for gfs2 to infinitely
      loop in balance_dirty_pages() under heavy IO (creating sparse files).
      
      balance_dirty_pages() attempts to write out the dirty pages via
      gfs2_writepages(), but none are found because these dirty pages are
      being used by the journaling code in the ail. Normally, the journal
      has an upper threshold which, when hit, triggers an automatic flush
      of the ail. But this threshold can be higher than the number of
      allowable dirty pages, with the result that the ail is never flushed.
      
      This patch forces an ail flush when gfs2_writepages() fails to write
      anything. This is a good indication that the ail might be holding
      some dirty pages.
      Signed-off-by: Abhi Das <adas@redhat.com>
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
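      
      A toy model of the heuristic: when gfs2_writepages()-style writeback
      makes no progress, force an ail flush. Names and numbers are made up
      for illustration:
      
      #include <stdio.h>
      
      static int pages_in_ail = 128;
      
      /* Stand-in for writeback that finds nothing to write because every
       * dirty page is pinned in the ail. */
      static int writepages(void)
      {
              return 0;
      }
      
      static void forced_ail_flush(void)
      {
              printf("flushing %d ail pages to relieve memory pressure\n",
                     pages_in_ail);
              pages_in_ail = 0;
      }
      
      int main(void)
      {
              if (writepages() == 0 && pages_in_ail > 0)
                      forced_ail_flush();
              return 0;
      }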
  29. 07 Jul, 2017 1 commit