Commit 111c1aa8 authored by Linus Torvalds's avatar Linus Torvalds

Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 updates from Ted Ts'o:
 "In addition to some ext4 bug fixes and cleanups, this cycle we add the
  orphan_file feature, which eliminates bottlenecks when doing a large
  number of parallel truncates and file deletions, and move the discard
  operation out of the jbd2 commit thread when using the discard mount
  option, to better support devices with slow discard operations"

* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (23 commits)
  ext4: make the updating inode data procedure atomic
  ext4: remove an unnecessary if statement in __ext4_get_inode_loc()
  ext4: move inode eio simulation behind io completeion
  ext4: Improve scalability of ext4 orphan file handling
  ext4: Orphan file documentation
  ext4: Speedup ext4 orphan inode handling
  ext4: Move orphan inode handling into a separate file
  ext4: Support for checksumming from journal triggers
  ext4: fix race writing to an inline_data file while its xattrs are changing
  jbd2: add sparse annotations for add_transaction_credits()
  ext4: fix sparse warnings
  ext4: Make sure quota files are not grabbed accidentally
  ext4: fix e2fsprogs checksum failure for mounted filesystem
  ext4: if zeroout fails fall back to splitting the extent node
  ext4: reduce arguments of ext4_fc_add_dentry_tlv
  ext4: flush background discard kwork when retry allocation
  ext4: get discard out of jbd2 commit kthread contex
  ext4: remove the repeated comment of ext4_trim_all_free
  ext4: add new helper interface ext4_try_to_trim_range()
  ext4: remove the 'group' parameter of ext4_trim_extent
  ...
parents 815409a1 baaae979
......@@ -11,3 +11,4 @@ have static metadata at fixed locations.
.. include:: bitmaps.rst
.. include:: mmp.rst
.. include:: journal.rst
.. include:: orphan.rst
......@@ -498,11 +498,11 @@ structure -- inode change time (ctime), access time (atime), data
modification time (mtime), and deletion time (dtime). The four fields
are 32-bit signed integers that represent seconds since the Unix epoch
(1970-01-01 00:00:00 GMT), which means that the fields will overflow in
January 2038. For inodes that are not linked from any directory but are
still open (orphan inodes), the dtime field is overloaded for use with
the orphan list. The superblock field ``s_last_orphan`` points to the
first inode in the orphan list; dtime is then the number of the next
orphaned inode, or zero if there are no more orphans.
January 2038. If the filesystem does not have orphan_file feature, inodes
that are not linked from any directory but are still open (orphan inodes) have
the dtime field overloaded for use with the orphan list. The superblock field
``s_last_orphan`` points to the first inode in the orphan list; dtime is then
the number of the next orphaned inode, or zero if there are no more orphans.
If the inode structure size ``sb->s_inode_size`` is larger than 128
bytes and the ``i_inode_extra`` field is large enough to encompass the
......
.. SPDX-License-Identifier: GPL-2.0
Orphan file
-----------
In unix there can inodes that are unlinked from directory hierarchy but that
are still alive because they are open. In case of crash the filesystem has to
clean up these inodes as otherwise they (and the blocks referenced from them)
would leak. Similarly if we truncate or extend the file, we need not be able
to perform the operation in a single journalling transaction. In such case we
track the inode as orphan so that in case of crash extra blocks allocated to
the file get truncated.
Traditionally ext4 tracks orphan inodes in a form of single linked list where
superblock contains the inode number of the last orphan inode (s\_last\_orphan
field) and then each inode contains inode number of the previously orphaned
inode (we overload i\_dtime inode field for this). However this filesystem
global single linked list is a scalability bottleneck for workloads that result
in heavy creation of orphan inodes. When orphan file feature
(COMPAT\_ORPHAN\_FILE) is enabled, the filesystem has a special inode
(referenced from the superblock through s\_orphan_file_inum) with several
blocks. Each of these blocks has a structure:
.. list-table::
:widths: 8 8 24 40
:header-rows: 1
* - Offset
- Type
- Name
- Description
* - 0x0
- Array of \_\_le32 entries
- Orphan inode entries
- Each \_\_le32 entry is either empty (0) or it contains inode number of
an orphan inode.
* - blocksize - 8
- \_\_le32
- ob\_magic
- Magic value stored in orphan block tail (0x0b10ca04)
* - blocksize - 4
- \_\_le32
- ob\_checksum
- Checksum of the orphan block.
When a filesystem with orphan file feature is writeably mounted, we set
RO\_COMPAT\_ORPHAN\_PRESENT feature in the superblock to indicate there may
be valid orphan entries. In case we see this feature when mounting the
filesystem, we read the whole orphan file and process all orphan inodes found
there as usual. When cleanly unmounting the filesystem we remove the
RO\_COMPAT\_ORPHAN\_PRESENT feature to avoid unnecessary scanning of the orphan
file and also make the filesystem fully compatible with older kernels.
......@@ -36,3 +36,20 @@ ext4 reserves some inode for special features, as follows:
* - 11
- Traditional first non-reserved inode. Usually this is the lost+found directory. See s\_first\_ino in the superblock.
Note that there are also some inodes allocated from non-reserved inode numbers
for other filesystem features which are not referenced from standard directory
hierarchy. These are generally reference from the superblock. They are:
.. list-table::
:widths: 20 50
:header-rows: 1
* - Superblock field
- Description
* - s\_lpf\_ino
- Inode number of lost+found directory.
* - s\_prj\_quota\_inum
- Inode number of quota file tracking project quotas
* - s\_orphan\_file\_inum
- Inode number of file tracking orphan inodes.
......@@ -479,7 +479,11 @@ The ext4 superblock is laid out as follows in
- Filename charset encoding flags.
* - 0x280
- \_\_le32
- s\_reserved[95]
- s\_orphan\_file\_inum
- Orphan file inode number.
* - 0x284
- \_\_le32
- s\_reserved[94]
- Padding to the end of the block.
* - 0x3FC
- \_\_le32
......@@ -603,6 +607,11 @@ following:
the journal, JBD2 incompat feature
(JBD2\_FEATURE\_INCOMPAT\_FAST\_COMMIT) gets
set (COMPAT\_FAST\_COMMIT).
* - 0x1000
- Orphan file allocated. This is the special file for more efficient
tracking of unlinked but still open inodes. When there may be any
entries in the file, we additionally set proper rocompat feature
(RO\_COMPAT\_ORPHAN\_PRESENT).
.. _super_incompat:
......@@ -713,6 +722,10 @@ the following:
- Filesystem tracks project quotas. (RO\_COMPAT\_PROJECT)
* - 0x8000
- Verity inodes may be present on the filesystem. (RO\_COMPAT\_VERITY)
* - 0x10000
- Indicates orphan file may have valid orphan entries and thus we need
to clean them up when mounting the filesystem
(RO\_COMPAT\_ORPHAN\_PRESENT).
.. _super_def_hash:
......
......@@ -10,7 +10,7 @@ ext4-y := balloc.o bitmap.o block_validity.o dir.o ext4_jbd2.o extents.o \
indirect.o inline.o inode.o ioctl.o mballoc.o migrate.o \
mmp.o move_extent.o namei.o page-io.o readpage.o resize.o \
super.o symlink.o sysfs.o xattr.o xattr_hurd.o xattr_trusted.o \
xattr_user.o fast_commit.o
xattr_user.o fast_commit.o orphan.o
ext4-$(CONFIG_EXT4_FS_POSIX_ACL) += acl.o
ext4-$(CONFIG_EXT4_FS_SECURITY) += xattr_security.o
......
......@@ -652,8 +652,14 @@ int ext4_should_retry_alloc(struct super_block *sb, int *retries)
* possible we just missed a transaction commit that did so
*/
smp_mb();
if (sbi->s_mb_free_pending == 0)
if (sbi->s_mb_free_pending == 0) {
if (test_opt(sb, DISCARD)) {
atomic_inc(&sbi->s_retry_alloc_pending);
flush_work(&sbi->s_discard_work);
atomic_dec(&sbi->s_retry_alloc_pending);
}
return ext4_has_free_clusters(sbi, 1, 0);
}
/*
* it's possible we've just missed a transaction commit here,
......
......@@ -1034,7 +1034,14 @@ struct ext4_inode_info {
*/
struct rw_semaphore xattr_sem;
struct list_head i_orphan; /* unlinked but open inodes */
/*
* Inodes with EXT4_STATE_ORPHAN_FILE use i_orphan_idx. Otherwise
* i_orphan is used.
*/
union {
struct list_head i_orphan; /* unlinked but open inodes */
unsigned int i_orphan_idx; /* Index in orphan file */
};
/* Fast commit related info */
......@@ -1419,7 +1426,8 @@ struct ext4_super_block {
__u8 s_last_error_errcode;
__le16 s_encoding; /* Filename charset encoding */
__le16 s_encoding_flags; /* Filename charset encoding flags */
__le32 s_reserved[95]; /* Padding to the end of the block */
__le32 s_orphan_file_inum; /* Inode for tracking orphan inodes */
__le32 s_reserved[94]; /* Padding to the end of the block */
__le32 s_checksum; /* crc32c(superblock) */
};
......@@ -1438,6 +1446,54 @@ struct ext4_super_block {
#define EXT4_ENC_UTF8_12_1 1
/* Types of ext4 journal triggers */
enum ext4_journal_trigger_type {
EXT4_JTR_ORPHAN_FILE,
EXT4_JTR_NONE /* This must be the last entry for indexing to work! */
};
#define EXT4_JOURNAL_TRIGGER_COUNT EXT4_JTR_NONE
struct ext4_journal_trigger {
struct jbd2_buffer_trigger_type tr_triggers;
struct super_block *sb;
};
static inline struct ext4_journal_trigger *EXT4_TRIGGER(
struct jbd2_buffer_trigger_type *trigger)
{
return container_of(trigger, struct ext4_journal_trigger, tr_triggers);
}
#define EXT4_ORPHAN_BLOCK_MAGIC 0x0b10ca04
/* Structure at the tail of orphan block */
struct ext4_orphan_block_tail {
__le32 ob_magic;
__le32 ob_checksum;
};
static inline int ext4_inodes_per_orphan_block(struct super_block *sb)
{
return (sb->s_blocksize - sizeof(struct ext4_orphan_block_tail)) /
sizeof(u32);
}
struct ext4_orphan_block {
atomic_t ob_free_entries; /* Number of free orphan entries in block */
struct buffer_head *ob_bh; /* Buffer for orphan block */
};
/*
* Info about orphan file.
*/
struct ext4_orphan_info {
int of_blocks; /* Number of orphan blocks in a file */
__u32 of_csum_seed; /* Checksum seed for orphan file */
struct ext4_orphan_block *of_binfo; /* Array with info about orphan
* file blocks */
};
/*
* fourth extended-fs super-block data in memory
*/
......@@ -1492,9 +1548,11 @@ struct ext4_sb_info {
/* Journaling */
struct journal_s *s_journal;
struct list_head s_orphan;
struct mutex s_orphan_lock;
unsigned long s_ext4_flags; /* Ext4 superblock flags */
struct mutex s_orphan_lock; /* Protects on disk list changes */
struct list_head s_orphan; /* List of orphaned inodes in on disk
list */
struct ext4_orphan_info s_orphan_info;
unsigned long s_commit_interval;
u32 s_max_batch_time;
u32 s_min_batch_time;
......@@ -1527,6 +1585,9 @@ struct ext4_sb_info {
unsigned int s_mb_free_pending;
struct list_head s_freed_data_list; /* List of blocks to be freed
after commit completed */
struct list_head s_discard_list;
struct work_struct s_discard_work;
atomic_t s_retry_alloc_pending;
struct rb_root s_mb_avg_fragment_size_root;
rwlock_t s_mb_rb_lock;
struct list_head *s_mb_largest_free_orders;
......@@ -1616,6 +1677,9 @@ struct ext4_sb_info {
struct mb_cache *s_ea_inode_cache;
spinlock_t s_es_lock ____cacheline_aligned_in_smp;
/* Journal triggers for checksum computation */
struct ext4_journal_trigger s_journal_triggers[EXT4_JOURNAL_TRIGGER_COUNT];
/* Ratelimit ext4 messages. */
struct ratelimit_state s_err_ratelimit_state;
struct ratelimit_state s_warning_ratelimit_state;
......@@ -1826,6 +1890,7 @@ enum {
EXT4_STATE_LUSTRE_EA_INODE, /* Lustre-style ea_inode */
EXT4_STATE_VERITY_IN_PROGRESS, /* building fs-verity Merkle tree */
EXT4_STATE_FC_COMMITTING, /* Fast commit ongoing */
EXT4_STATE_ORPHAN_FILE, /* Inode orphaned in orphan file */
};
#define EXT4_INODE_BIT_FNS(name, field, offset) \
......@@ -1927,6 +1992,7 @@ static inline bool ext4_verity_in_progress(struct inode *inode)
*/
#define EXT4_FEATURE_COMPAT_FAST_COMMIT 0x0400
#define EXT4_FEATURE_COMPAT_STABLE_INODES 0x0800
#define EXT4_FEATURE_COMPAT_ORPHAN_FILE 0x1000 /* Orphan file exists */
#define EXT4_FEATURE_RO_COMPAT_SPARSE_SUPER 0x0001
#define EXT4_FEATURE_RO_COMPAT_LARGE_FILE 0x0002
......@@ -1947,6 +2013,8 @@ static inline bool ext4_verity_in_progress(struct inode *inode)
#define EXT4_FEATURE_RO_COMPAT_READONLY 0x1000
#define EXT4_FEATURE_RO_COMPAT_PROJECT 0x2000
#define EXT4_FEATURE_RO_COMPAT_VERITY 0x8000
#define EXT4_FEATURE_RO_COMPAT_ORPHAN_PRESENT 0x10000 /* Orphan file may be
non-empty */
#define EXT4_FEATURE_INCOMPAT_COMPRESSION 0x0001
#define EXT4_FEATURE_INCOMPAT_FILETYPE 0x0002
......@@ -2030,6 +2098,7 @@ EXT4_FEATURE_COMPAT_FUNCS(dir_index, DIR_INDEX)
EXT4_FEATURE_COMPAT_FUNCS(sparse_super2, SPARSE_SUPER2)
EXT4_FEATURE_COMPAT_FUNCS(fast_commit, FAST_COMMIT)
EXT4_FEATURE_COMPAT_FUNCS(stable_inodes, STABLE_INODES)
EXT4_FEATURE_COMPAT_FUNCS(orphan_file, ORPHAN_FILE)
EXT4_FEATURE_RO_COMPAT_FUNCS(sparse_super, SPARSE_SUPER)
EXT4_FEATURE_RO_COMPAT_FUNCS(large_file, LARGE_FILE)
......@@ -2044,6 +2113,7 @@ EXT4_FEATURE_RO_COMPAT_FUNCS(metadata_csum, METADATA_CSUM)
EXT4_FEATURE_RO_COMPAT_FUNCS(readonly, READONLY)
EXT4_FEATURE_RO_COMPAT_FUNCS(project, PROJECT)
EXT4_FEATURE_RO_COMPAT_FUNCS(verity, VERITY)
EXT4_FEATURE_RO_COMPAT_FUNCS(orphan_present, ORPHAN_PRESENT)
EXT4_FEATURE_INCOMPAT_FUNCS(compression, COMPRESSION)
EXT4_FEATURE_INCOMPAT_FUNCS(filetype, FILETYPE)
......@@ -2077,7 +2147,8 @@ EXT4_FEATURE_INCOMPAT_FUNCS(casefold, CASEFOLD)
EXT4_FEATURE_RO_COMPAT_LARGE_FILE| \
EXT4_FEATURE_RO_COMPAT_BTREE_DIR)
#define EXT4_FEATURE_COMPAT_SUPP EXT4_FEATURE_COMPAT_EXT_ATTR
#define EXT4_FEATURE_COMPAT_SUPP (EXT4_FEATURE_COMPAT_EXT_ATTR| \
EXT4_FEATURE_COMPAT_ORPHAN_FILE)
#define EXT4_FEATURE_INCOMPAT_SUPP (EXT4_FEATURE_INCOMPAT_FILETYPE| \
EXT4_FEATURE_INCOMPAT_RECOVER| \
EXT4_FEATURE_INCOMPAT_META_BG| \
......@@ -2102,7 +2173,8 @@ EXT4_FEATURE_INCOMPAT_FUNCS(casefold, CASEFOLD)
EXT4_FEATURE_RO_COMPAT_METADATA_CSUM|\
EXT4_FEATURE_RO_COMPAT_QUOTA |\
EXT4_FEATURE_RO_COMPAT_PROJECT |\
EXT4_FEATURE_RO_COMPAT_VERITY)
EXT4_FEATURE_RO_COMPAT_VERITY |\
EXT4_FEATURE_RO_COMPAT_ORPHAN_PRESENT)
#define EXTN_FEATURE_FUNCS(ver) \
static inline bool ext4_has_unknown_ext##ver##_compat_features(struct super_block *sb) \
......@@ -2138,6 +2210,8 @@ static inline bool ext4_has_incompat_features(struct super_block *sb)
return (EXT4_SB(sb)->s_es->s_feature_incompat != 0);
}
extern int ext4_feature_set_ok(struct super_block *sb, int readonly);
/*
* Superblock flags
*/
......@@ -2150,7 +2224,6 @@ static inline int ext4_forced_shutdown(struct ext4_sb_info *sbi)
return test_bit(EXT4_FLAGS_SHUTDOWN, &sbi->s_ext4_flags);
}
/*
* Default values for user and/or group using reserved blocks
*/
......@@ -2911,13 +2984,14 @@ int ext4_get_block(struct inode *inode, sector_t iblock,
int ext4_da_get_block_prep(struct inode *inode, sector_t iblock,
struct buffer_head *bh, int create);
int ext4_walk_page_buffers(handle_t *handle,
struct inode *inode,
struct buffer_head *head,
unsigned from,
unsigned to,
int *partial,
int (*fn)(handle_t *handle,
int (*fn)(handle_t *handle, struct inode *inode,
struct buffer_head *bh));
int do_journal_get_write_access(handle_t *handle,
int do_journal_get_write_access(handle_t *handle, struct inode *inode,
struct buffer_head *bh);
#define FALL_BACK_TO_NONDELALLOC 1
#define CONVERT_INLINE_DATA 2
......@@ -2996,8 +3070,6 @@ extern int ext4_init_new_dir(handle_t *handle, struct inode *dir,
struct inode *inode);
extern int ext4_dirblock_csum_verify(struct inode *inode,
struct buffer_head *bh);
extern int ext4_orphan_add(handle_t *, struct inode *);
extern int ext4_orphan_del(handle_t *, struct inode *);
extern int ext4_htree_fill_tree(struct file *dir_file, __u32 start_hash,
__u32 start_minor_hash, __u32 *next_hash);
extern int ext4_search_dir(struct buffer_head *bh,
......@@ -3466,6 +3538,7 @@ static inline bool ext4_is_quota_journalled(struct super_block *sb)
return (ext4_has_feature_quota(sb) ||
sbi->s_qf_names[USRQUOTA] || sbi->s_qf_names[GRPQUOTA]);
}
int ext4_enable_quotas(struct super_block *sb);
#endif
/*
......@@ -3727,6 +3800,19 @@ extern void ext4_stop_mmpd(struct ext4_sb_info *sbi);
/* verity.c */
extern const struct fsverity_operations ext4_verityops;
/* orphan.c */
extern int ext4_orphan_add(handle_t *, struct inode *);
extern int ext4_orphan_del(handle_t *, struct inode *);
extern void ext4_orphan_cleanup(struct super_block *sb,
struct ext4_super_block *es);
extern void ext4_release_orphan_info(struct super_block *sb);
extern int ext4_init_orphan_info(struct super_block *sb);
extern int ext4_orphan_file_empty(struct super_block *sb);
extern void ext4_orphan_file_block_trigger(
struct jbd2_buffer_trigger_type *triggers,
struct buffer_head *bh,
void *data, size_t size);
/*
* Add new method to test whether block and inode bitmaps are properly
* initialized. With uninit_bg reading the block from disk is not enough
......
......@@ -173,10 +173,11 @@ struct partial_cluster {
#define EXT_MAX_EXTENT(__hdr__) \
((le16_to_cpu((__hdr__)->eh_max)) ? \
((EXT_FIRST_EXTENT((__hdr__)) + le16_to_cpu((__hdr__)->eh_max) - 1)) \
: 0)
: NULL)
#define EXT_MAX_INDEX(__hdr__) \
((le16_to_cpu((__hdr__)->eh_max)) ? \
((EXT_FIRST_INDEX((__hdr__)) + le16_to_cpu((__hdr__)->eh_max) - 1)) : 0)
((EXT_FIRST_INDEX((__hdr__)) + le16_to_cpu((__hdr__)->eh_max) - 1)) \
: NULL)
static inline struct ext4_extent_header *ext_inode_hdr(struct inode *inode)
{
......
......@@ -218,9 +218,11 @@ static void ext4_check_bdev_write_error(struct super_block *sb)
}
int __ext4_journal_get_write_access(const char *where, unsigned int line,
handle_t *handle, struct buffer_head *bh)
handle_t *handle, struct super_block *sb,
struct buffer_head *bh,
enum ext4_journal_trigger_type trigger_type)
{
int err = 0;
int err;
might_sleep();
......@@ -229,11 +231,18 @@ int __ext4_journal_get_write_access(const char *where, unsigned int line,
if (ext4_handle_valid(handle)) {
err = jbd2_journal_get_write_access(handle, bh);
if (err)
if (err) {
ext4_journal_abort_handle(where, line, __func__, bh,
handle, err);
return err;
}
}
return err;
if (trigger_type == EXT4_JTR_NONE || !ext4_has_metadata_csum(sb))
return 0;
BUG_ON(trigger_type >= EXT4_JOURNAL_TRIGGER_COUNT);
jbd2_journal_set_triggers(bh,
&EXT4_SB(sb)->s_journal_triggers[trigger_type].tr_triggers);
return 0;
}
/*
......@@ -301,17 +310,27 @@ int __ext4_forget(const char *where, unsigned int line, handle_t *handle,
}
int __ext4_journal_get_create_access(const char *where, unsigned int line,
handle_t *handle, struct buffer_head *bh)
handle_t *handle, struct super_block *sb,
struct buffer_head *bh,
enum ext4_journal_trigger_type trigger_type)
{
int err = 0;
int err;
if (ext4_handle_valid(handle)) {
err = jbd2_journal_get_create_access(handle, bh);
if (err)
ext4_journal_abort_handle(where, line, __func__,
bh, handle, err);
if (!ext4_handle_valid(handle))
return 0;
err = jbd2_journal_get_create_access(handle, bh);
if (err) {
ext4_journal_abort_handle(where, line, __func__, bh, handle,
err);
return err;
}
return err;
if (trigger_type == EXT4_JTR_NONE || !ext4_has_metadata_csum(sb))
return 0;
BUG_ON(trigger_type >= EXT4_JOURNAL_TRIGGER_COUNT);
jbd2_journal_set_triggers(bh,
&EXT4_SB(sb)->s_journal_triggers[trigger_type].tr_triggers);
return 0;
}
int __ext4_handle_dirty_metadata(const char *where, unsigned int line,
......
......@@ -231,26 +231,32 @@ int ext4_expand_extra_isize(struct inode *inode,
* Wrapper functions with which ext4 calls into JBD.
*/
int __ext4_journal_get_write_access(const char *where, unsigned int line,
handle_t *handle, struct buffer_head *bh);
handle_t *handle, struct super_block *sb,
struct buffer_head *bh,
enum ext4_journal_trigger_type trigger_type);
int __ext4_forget(const char *where, unsigned int line, handle_t *handle,
int is_metadata, struct inode *inode,
struct buffer_head *bh, ext4_fsblk_t blocknr);
int __ext4_journal_get_create_access(const char *where, unsigned int line,
handle_t *handle, struct buffer_head *bh);
handle_t *handle, struct super_block *sb,
struct buffer_head *bh,
enum ext4_journal_trigger_type trigger_type);
int __ext4_handle_dirty_metadata(const char *where, unsigned int line,
handle_t *handle, struct inode *inode,
struct buffer_head *bh);
#define ext4_journal_get_write_access(handle, bh) \
__ext4_journal_get_write_access(__func__, __LINE__, (handle), (bh))
#define ext4_journal_get_write_access(handle, sb, bh, trigger_type) \
__ext4_journal_get_write_access(__func__, __LINE__, (handle), (sb), \
(bh), (trigger_type))
#define ext4_forget(handle, is_metadata, inode, bh, block_nr) \
__ext4_forget(__func__, __LINE__, (handle), (is_metadata), (inode), \
(bh), (block_nr))
#define ext4_journal_get_create_access(handle, bh) \
__ext4_journal_get_create_access(__func__, __LINE__, (handle), (bh))
#define ext4_journal_get_create_access(handle, sb, bh, trigger_type) \
__ext4_journal_get_create_access(__func__, __LINE__, (handle), (sb), \
(bh), (trigger_type))
#define ext4_handle_dirty_metadata(handle, inode, bh) \
__ext4_handle_dirty_metadata(__func__, __LINE__, (handle), (inode), \
(bh))
......
......@@ -139,7 +139,8 @@ static int ext4_ext_get_access(handle_t *handle, struct inode *inode,
if (path->p_bh) {
/* path points to block */
BUFFER_TRACE(path->p_bh, "get_write_access");
return ext4_journal_get_write_access(handle, path->p_bh);
return ext4_journal_get_write_access(handle, inode->i_sb,
path->p_bh, EXT4_JTR_NONE);
}
/* path points to leaf/index in inode body */
/* we use in-core data, no need to protect them */
......@@ -1082,7 +1083,8 @@ static int ext4_ext_split(handle_t *handle, struct inode *inode,
}
lock_buffer(bh);
err = ext4_journal_get_create_access(handle, bh);
err = ext4_journal_get_create_access(handle, inode->i_sb, bh,
EXT4_JTR_NONE);
if (err)
goto cleanup;
......@@ -1160,7 +1162,8 @@ static int ext4_ext_split(handle_t *handle, struct inode *inode,
}
lock_buffer(bh);
err = ext4_journal_get_create_access(handle, bh);
err = ext4_journal_get_create_access(handle, inode->i_sb, bh,
EXT4_JTR_NONE);
if (err)
goto cleanup;
......@@ -1286,7 +1289,8 @@ static int ext4_ext_grow_indepth(handle_t *handle, struct inode *inode,
return -ENOMEM;
lock_buffer(bh);
err = ext4_journal_get_create_access(handle, bh);
err = ext4_journal_get_create_access(handle, inode->i_sb, bh,
EXT4_JTR_NONE);
if (err) {
unlock_buffer(bh);
goto out;
......@@ -3569,7 +3573,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle,
split_map.m_len - ee_block);
err = ext4_ext_zeroout(inode, &zero_ex1);
if (err)
goto out;
goto fallback;
split_map.m_len = allocated;
}
if (split_map.m_lblk - ee_block + split_map.m_len <
......@@ -3583,7 +3587,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle,
ext4_ext_pblock(ex));
err = ext4_ext_zeroout(inode, &zero_ex2);
if (err)
goto out;
goto fallback;
}
split_map.m_len += split_map.m_lblk - ee_block;
......@@ -3592,6 +3596,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle,
}
}
fallback:
err = ext4_split_extent(handle, inode, ppath, &split_map, split_flag,
flags);
if (err > 0)
......
......@@ -775,28 +775,27 @@ static bool ext4_fc_add_tlv(struct super_block *sb, u16 tag, u16 len, u8 *val,
}
/* Same as above, but adds dentry tlv. */
static bool ext4_fc_add_dentry_tlv(struct super_block *sb, u16 tag,
int parent_ino, int ino, int dlen,
const unsigned char *dname,
u32 *crc)
static bool ext4_fc_add_dentry_tlv(struct super_block *sb, u32 *crc,
struct ext4_fc_dentry_update *fc_dentry)
{
struct ext4_fc_dentry_info fcd;
struct ext4_fc_tl tl;
int dlen = fc_dentry->fcd_name.len;
u8 *dst = ext4_fc_reserve_space(sb, sizeof(tl) + sizeof(fcd) + dlen,
crc);
if (!dst)
return false;
fcd.fc_parent_ino = cpu_to_le32(parent_ino);
fcd.fc_ino = cpu_to_le32(ino);
tl.fc_tag = cpu_to_le16(tag);
fcd.fc_parent_ino = cpu_to_le32(fc_dentry->fcd_parent);
fcd.fc_ino = cpu_to_le32(fc_dentry->fcd_ino);
tl.fc_tag = cpu_to_le16(fc_dentry->fcd_op);
tl.fc_len = cpu_to_le16(sizeof(fcd) + dlen);
ext4_fc_memcpy(sb, dst, &tl, sizeof(tl), crc);
dst += sizeof(tl);
ext4_fc_memcpy(sb, dst, &fcd, sizeof(fcd), crc);
dst += sizeof(fcd);
ext4_fc_memcpy(sb, dst, dname, dlen, crc);
ext4_fc_memcpy(sb, dst, fc_dentry->fcd_name.name, dlen, crc);
dst += dlen;
return true;
......@@ -992,11 +991,7 @@ __releases(&sbi->s_fc_lock)
&sbi->s_fc_dentry_q[FC_Q_MAIN], fcd_list) {
if (fc_dentry->fcd_op != EXT4_FC_TAG_CREAT) {
spin_unlock(&sbi->s_fc_lock);
if (!ext4_fc_add_dentry_tlv(
sb, fc_dentry->fcd_op,
fc_dentry->fcd_parent, fc_dentry->fcd_ino,
fc_dentry->fcd_name.len,
fc_dentry->fcd_name.name, crc)) {
if (!ext4_fc_add_dentry_tlv(sb, crc, fc_dentry)) {
ret = -ENOSPC;
goto lock_and_exit;
}
......@@ -1035,11 +1030,7 @@ __releases(&sbi->s_fc_lock)
if (ret)
goto lock_and_exit;
if (!ext4_fc_add_dentry_tlv(
sb, fc_dentry->fcd_op,
fc_dentry->fcd_parent, fc_dentry->fcd_ino,
fc_dentry->fcd_name.len,
fc_dentry->fcd_name.name, crc)) {
if (!ext4_fc_add_dentry_tlv(sb, crc, fc_dentry)) {
ret = -ENOSPC;
goto lock_and_exit;
}
......
......@@ -823,7 +823,8 @@ static int ext4_sample_last_mounted(struct super_block *sb,
if (IS_ERR(handle))
goto out;
BUFFER_TRACE(sbi->s_sbh, "get_write_access");
err = ext4_journal_get_write_access(handle, sbi->s_sbh);
err = ext4_journal_get_write_access(handle, sb, sbi->s_sbh,
EXT4_JTR_NONE);
if (err)
goto out_journal;
lock_buffer(sbi->s_sbh);
......
......@@ -300,7 +300,8 @@ void ext4_free_inode(handle_t *handle, struct inode *inode)
}
BUFFER_TRACE(bitmap_bh, "get_write_access");
fatal = ext4_journal_get_write_access(handle, bitmap_bh);
fatal = ext4_journal_get_write_access(handle, sb, bitmap_bh,
EXT4_JTR_NONE);
if (fatal)
goto error_return;
......@@ -308,7 +309,8 @@ void ext4_free_inode(handle_t *handle, struct inode *inode)
gdp = ext4_get_group_desc(sb, block_group, &bh2);
if (gdp) {
BUFFER_TRACE(bh2, "get_write_access");
fatal = ext4_journal_get_write_access(handle, bh2);
fatal = ext4_journal_get_write_access(handle, sb, bh2,
EXT4_JTR_NONE);
}
ext4_lock_group(sb, block_group);
cleared = ext4_test_and_clear_bit(bit, bitmap_bh->b_data);
......@@ -1085,7 +1087,8 @@ struct inode *__ext4_new_inode(struct user_namespace *mnt_userns,
}
}
BUFFER_TRACE(inode_bitmap_bh, "get_write_access");
err = ext4_journal_get_write_access(handle, inode_bitmap_bh);
err = ext4_journal_get_write_access(handle, sb, inode_bitmap_bh,
EXT4_JTR_NONE);
if (err) {
ext4_std_error(sb, err);
goto out;
......@@ -1127,7 +1130,8 @@ struct inode *__ext4_new_inode(struct user_namespace *mnt_userns,
}
BUFFER_TRACE(group_desc_bh, "get_write_access");
err = ext4_journal_get_write_access(handle, group_desc_bh);
err = ext4_journal_get_write_access(handle, sb, group_desc_bh,
EXT4_JTR_NONE);
if (err) {
ext4_std_error(sb, err);
goto out;
......@@ -1144,7 +1148,8 @@ struct inode *__ext4_new_inode(struct user_namespace *mnt_userns,
goto out;
}
BUFFER_TRACE(block_bitmap_bh, "get block bitmap access");
err = ext4_journal_get_write_access(handle, block_bitmap_bh);
err = ext4_journal_get_write_access(handle, sb, block_bitmap_bh,
EXT4_JTR_NONE);
if (err) {
brelse(block_bitmap_bh);
ext4_std_error(sb, err);
......@@ -1583,8 +1588,8 @@ int ext4_init_inode_table(struct super_block *sb, ext4_group_t group,
num = sbi->s_itb_per_group - used_blks;
BUFFER_TRACE(group_desc_bh, "get_write_access");
ret = ext4_journal_get_write_access(handle,
group_desc_bh);
ret = ext4_journal_get_write_access(handle, sb, group_desc_bh,
EXT4_JTR_NONE);
if (ret)
goto err_out;
......
......@@ -354,7 +354,8 @@ static int ext4_alloc_branch(handle_t *handle,
}
lock_buffer(bh);
BUFFER_TRACE(bh, "call get_create_access");
err = ext4_journal_get_create_access(handle, bh);
err = ext4_journal_get_create_access(handle, ar->inode->i_sb,
bh, EXT4_JTR_NONE);
if (err) {
unlock_buffer(bh);
goto failed;
......@@ -429,7 +430,8 @@ static int ext4_splice_branch(handle_t *handle,
*/
if (where->bh) {
BUFFER_TRACE(where->bh, "get_write_access");
err = ext4_journal_get_write_access(handle, where->bh);
err = ext4_journal_get_write_access(handle, ar->inode->i_sb,
where->bh, EXT4_JTR_NONE);
if (err)
goto err_out;
}
......@@ -728,7 +730,8 @@ static int ext4_ind_truncate_ensure_credits(handle_t *handle,
return ret;
if (bh) {
BUFFER_TRACE(bh, "retaking write access");
ret = ext4_journal_get_write_access(handle, bh);
ret = ext4_journal_get_write_access(handle, inode->i_sb, bh,
EXT4_JTR_NONE);
if (unlikely(ret))
return ret;
}
......@@ -916,7 +919,8 @@ static void ext4_free_data(handle_t *handle, struct inode *inode,
if (this_bh) { /* For indirect block */
BUFFER_TRACE(this_bh, "get_write_access");
err = ext4_journal_get_write_access(handle, this_bh);
err = ext4_journal_get_write_access(handle, inode->i_sb,
this_bh, EXT4_JTR_NONE);
/* Important: if we can't update the indirect pointers
* to the blocks, we can't free them. */
if (err)
......@@ -1079,7 +1083,8 @@ static void ext4_free_branches(handle_t *handle, struct inode *inode,
*/
BUFFER_TRACE(parent_bh, "get_write_access");
if (!ext4_journal_get_write_access(handle,
parent_bh)){
inode->i_sb, parent_bh,
EXT4_JTR_NONE)) {
*p = 0;
BUFFER_TRACE(parent_bh,
"call ext4_handle_dirty_metadata");
......
......@@ -264,7 +264,8 @@ static int ext4_create_inline_data(handle_t *handle,
return error;
BUFFER_TRACE(is.iloc.bh, "get_write_access");
error = ext4_journal_get_write_access(handle, is.iloc.bh);
error = ext4_journal_get_write_access(handle, inode->i_sb, is.iloc.bh,
EXT4_JTR_NONE);
if (error)
goto out;
......@@ -350,7 +351,8 @@ static int ext4_update_inline_data(handle_t *handle, struct inode *inode,
goto out;
BUFFER_TRACE(is.iloc.bh, "get_write_access");
error = ext4_journal_get_write_access(handle, is.iloc.bh);
error = ext4_journal_get_write_access(handle, inode->i_sb, is.iloc.bh,
EXT4_JTR_NONE);
if (error)
goto out;
......@@ -427,7 +429,8 @@ static int ext4_destroy_inline_data_nolock(handle_t *handle,
goto out;
BUFFER_TRACE(is.iloc.bh, "get_write_access");
error = ext4_journal_get_write_access(handle, is.iloc.bh);
error = ext4_journal_get_write_access(handle, inode->i_sb, is.iloc.bh,
EXT4_JTR_NONE);
if (error)
goto out;
......@@ -593,7 +596,7 @@ static int ext4_convert_inline_data_to_extent(struct address_space *mapping,
ret = __block_write_begin(page, from, to, ext4_get_block);
if (!ret && ext4_should_journal_data(inode)) {
ret = ext4_walk_page_buffers(handle, page_buffers(page),
ret = ext4_walk_page_buffers(handle, inode, page_buffers(page),
from, to, NULL,
do_journal_get_write_access);
}
......@@ -682,7 +685,8 @@ int ext4_try_to_write_inline_data(struct address_space *mapping,
goto convert;
}
ret = ext4_journal_get_write_access(handle, iloc.bh);
ret = ext4_journal_get_write_access(handle, inode->i_sb, iloc.bh,
EXT4_JTR_NONE);
if (ret)
goto out;
......@@ -750,6 +754,12 @@ int ext4_write_inline_data_end(struct inode *inode, loff_t pos, unsigned len,
ext4_write_lock_xattr(inode, &no_expand);
BUG_ON(!ext4_has_inline_data(inode));
/*
* ei->i_inline_off may have changed since ext4_write_begin()
* called ext4_try_to_write_inline_data()
*/
(void) ext4_find_inline_data_nolock(inode);
kaddr = kmap_atomic(page);
ext4_write_inline_data(inode, &iloc, kaddr, pos, len);
kunmap_atomic(kaddr);
......@@ -923,7 +933,8 @@ int ext4_da_write_inline_data_begin(struct address_space *mapping,
if (ret < 0)
goto out_release_page;
}
ret = ext4_journal_get_write_access(handle, iloc.bh);
ret = ext4_journal_get_write_access(handle, inode->i_sb, iloc.bh,
EXT4_JTR_NONE);
if (ret)
goto out_release_page;
......@@ -1028,7 +1039,8 @@ static int ext4_add_dirent_to_inline(handle_t *handle,
return err;
BUFFER_TRACE(iloc->bh, "get_write_access");
err = ext4_journal_get_write_access(handle, iloc->bh);
err = ext4_journal_get_write_access(handle, dir->i_sb, iloc->bh,
EXT4_JTR_NONE);
if (err)
return err;
ext4_insert_dentry(dir, inode, de, inline_size, fname);
......@@ -1223,7 +1235,8 @@ static int ext4_convert_inline_data_nolock(handle_t *handle,
}
lock_buffer(data_bh);
error = ext4_journal_get_create_access(handle, data_bh);
error = ext4_journal_get_create_access(handle, inode->i_sb, data_bh,
EXT4_JTR_NONE);
if (error) {
unlock_buffer(data_bh);
error = -EIO;
......@@ -1707,7 +1720,8 @@ int ext4_delete_inline_entry(handle_t *handle,
}
BUFFER_TRACE(bh, "get_write_access");
err = ext4_journal_get_write_access(handle, bh);
err = ext4_journal_get_write_access(handle, dir->i_sb, bh,
EXT4_JTR_NONE);
if (err)
goto out;
......
This diff is collapsed.
......@@ -1154,7 +1154,9 @@ static long __ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
err = PTR_ERR(handle);
goto pwsalt_err_exit;
}
err = ext4_journal_get_write_access(handle, sbi->s_sbh);
err = ext4_journal_get_write_access(handle, sb,
sbi->s_sbh,
EXT4_JTR_NONE);
if (err)
goto pwsalt_err_journal;
lock_buffer(sbi->s_sbh);
......
This diff is collapsed.
......@@ -70,7 +70,8 @@ static struct buffer_head *ext4_append(handle_t *handle,
inode->i_size += inode->i_sb->s_blocksize;
EXT4_I(inode)->i_disksize = inode->i_size;
BUFFER_TRACE(bh, "get_write_access");
err = ext4_journal_get_write_access(handle, bh);
err = ext4_journal_get_write_access(handle, inode->i_sb, bh,
EXT4_JTR_NONE);
if (err) {
brelse(bh);
ext4_std_error(inode->i_sb, err);
......@@ -1927,12 +1928,14 @@ static struct ext4_dir_entry_2 *do_split(handle_t *handle, struct inode *dir,
}
BUFFER_TRACE(*bh, "get_write_access");
err = ext4_journal_get_write_access(handle, *bh);
err = ext4_journal_get_write_access(handle, dir->i_sb, *bh,
EXT4_JTR_NONE);
if (err)
goto journal_error;
BUFFER_TRACE(frame->bh, "get_write_access");
err = ext4_journal_get_write_access(handle, frame->bh);
err = ext4_journal_get_write_access(handle, dir->i_sb, frame->bh,
EXT4_JTR_NONE);
if (err)
goto journal_error;
......@@ -2109,7 +2112,8 @@ static int add_dirent_to_buf(handle_t *handle, struct ext4_filename *fname,
return err;
}
BUFFER_TRACE(bh, "get_write_access");
err = ext4_journal_get_write_access(handle, bh);
err = ext4_journal_get_write_access(handle, dir->i_sb, bh,
EXT4_JTR_NONE);
if (err) {
ext4_std_error(dir->i_sb, err);
return err;
......@@ -2167,7 +2171,8 @@ static int make_indexed_dir(handle_t *handle, struct ext4_filename *fname,
blocksize = dir->i_sb->s_blocksize;
dxtrace(printk(KERN_DEBUG "Creating index: inode %lu\n", dir->i_ino));
BUFFER_TRACE(bh, "get_write_access");
retval = ext4_journal_get_write_access(handle, bh);
retval = ext4_journal_get_write_access(handle, dir->i_sb, bh,
EXT4_JTR_NONE);
if (retval) {
ext4_std_error(dir->i_sb, retval);
brelse(bh);
......@@ -2419,7 +2424,7 @@ static int ext4_dx_add_entry(handle_t *handle, struct ext4_filename *fname,
}
BUFFER_TRACE(bh, "get_write_access");
err = ext4_journal_get_write_access(handle, bh);
err = ext4_journal_get_write_access(handle, sb, bh, EXT4_JTR_NONE);
if (err)
goto journal_error;
......@@ -2476,7 +2481,8 @@ static int ext4_dx_add_entry(handle_t *handle, struct ext4_filename *fname,
node2->fake.rec_len = ext4_rec_len_to_disk(sb->s_blocksize,
sb->s_blocksize);
BUFFER_TRACE(frame->bh, "get_write_access");
err = ext4_journal_get_write_access(handle, frame->bh);
err = ext4_journal_get_write_access(handle, sb, frame->bh,
EXT4_JTR_NONE);
if (err)
goto journal_error;
if (!add_level) {
......@@ -2486,8 +2492,9 @@ static int ext4_dx_add_entry(handle_t *handle, struct ext4_filename *fname,
icount1, icount2));
BUFFER_TRACE(frame->bh, "get_write_access"); /* index root */
err = ext4_journal_get_write_access(handle,
(frame - 1)->bh);
err = ext4_journal_get_write_access(handle, sb,
(frame - 1)->bh,
EXT4_JTR_NONE);
if (err)
goto journal_error;
......@@ -2636,7 +2643,8 @@ static int ext4_delete_entry(handle_t *handle,
csum_size = sizeof(struct ext4_dir_entry_tail);
BUFFER_TRACE(bh, "get_write_access");
err = ext4_journal_get_write_access(handle, bh);
err = ext4_journal_get_write_access(handle, dir->i_sb, bh,
EXT4_JTR_NONE);
if (unlikely(err))
goto out;
......@@ -3046,186 +3054,6 @@ bool ext4_empty_dir(struct inode *inode)
return true;
}
/*
* ext4_orphan_add() links an unlinked or truncated inode into a list of
* such inodes, starting at the superblock, in case we crash before the
* file is closed/deleted, or in case the inode truncate spans multiple
* transactions and the last transaction is not recovered after a crash.
*
* At filesystem recovery time, we walk this list deleting unlinked
* inodes and truncating linked inodes in ext4_orphan_cleanup().
*
* Orphan list manipulation functions must be called under i_mutex unless
* we are just creating the inode or deleting it.
*/
int ext4_orphan_add(handle_t *handle, struct inode *inode)
{
struct super_block *sb = inode->i_sb;
struct ext4_sb_info *sbi = EXT4_SB(sb);
struct ext4_iloc iloc;
int err = 0, rc;
bool dirty = false;
if (!sbi->s_journal || is_bad_inode(inode))
return 0;
WARN_ON_ONCE(!(inode->i_state & (I_NEW | I_FREEING)) &&
!inode_is_locked(inode));
/*
* Exit early if inode already is on orphan list. This is a big speedup
* since we don't have to contend on the global s_orphan_lock.
*/
if (!list_empty(&EXT4_I(inode)->i_orphan))
return 0;
/*
* Orphan handling is only valid for files with data blocks
* being truncated, or files being unlinked. Note that we either
* hold i_mutex, or the inode can not be referenced from outside,
* so i_nlink should not be bumped due to race
*/
ASSERT((S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode) ||
S_ISLNK(inode->i_mode)) || inode->i_nlink == 0);
BUFFER_TRACE(sbi->s_sbh, "get_write_access");
err = ext4_journal_get_write_access(handle, sbi->s_sbh);
if (err)
goto out;
err = ext4_reserve_inode_write(handle, inode, &iloc);
if (err)
goto out;
mutex_lock(&sbi->s_orphan_lock);
/*
* Due to previous errors inode may be already a part of on-disk
* orphan list. If so skip on-disk list modification.
*/
if (!NEXT_ORPHAN(inode) || NEXT_ORPHAN(inode) >
(le32_to_cpu(sbi->s_es->s_inodes_count))) {
/* Insert this inode at the head of the on-disk orphan list */
NEXT_ORPHAN(inode) = le32_to_cpu(sbi->s_es->s_last_orphan);
lock_buffer(sbi->s_sbh);
sbi->s_es->s_last_orphan = cpu_to_le32(inode->i_ino);
ext4_superblock_csum_set(sb);
unlock_buffer(sbi->s_sbh);
dirty = true;
}
list_add(&EXT4_I(inode)->i_orphan, &sbi->s_orphan);
mutex_unlock(&sbi->s_orphan_lock);
if (dirty) {
err = ext4_handle_dirty_metadata(handle, NULL, sbi->s_sbh);
rc = ext4_mark_iloc_dirty(handle, inode, &iloc);
if (!err)
err = rc;
if (err) {
/*
* We have to remove inode from in-memory list if
* addition to on disk orphan list failed. Stray orphan
* list entries can cause panics at unmount time.
*/
mutex_lock(&sbi->s_orphan_lock);
list_del_init(&EXT4_I(inode)->i_orphan);
mutex_unlock(&sbi->s_orphan_lock);
}
} else
brelse(iloc.bh);
jbd_debug(4, "superblock will point to %lu\n", inode->i_ino);
jbd_debug(4, "orphan inode %lu will point to %d\n",
inode->i_ino, NEXT_ORPHAN(inode));
out:
ext4_std_error(sb, err);
return err;
}
/*
* ext4_orphan_del() removes an unlinked or truncated inode from the list
* of such inodes stored on disk, because it is finally being cleaned up.
*/
int ext4_orphan_del(handle_t *handle, struct inode *inode)
{
struct list_head *prev;
struct ext4_inode_info *ei = EXT4_I(inode);
struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
__u32 ino_next;
struct ext4_iloc iloc;
int err = 0;
if (!sbi->s_journal && !(sbi->s_mount_state & EXT4_ORPHAN_FS))
return 0;
WARN_ON_ONCE(!(inode->i_state & (I_NEW | I_FREEING)) &&
!inode_is_locked(inode));
/* Do this quick check before taking global s_orphan_lock. */
if (list_empty(&ei->i_orphan))
return 0;
if (handle) {
/* Grab inode buffer early before taking global s_orphan_lock */
err = ext4_reserve_inode_write(handle, inode, &iloc);
}
mutex_lock(&sbi->s_orphan_lock);
jbd_debug(4, "remove inode %lu from orphan list\n", inode->i_ino);
prev = ei->i_orphan.prev;
list_del_init(&ei->i_orphan);
/* If we're on an error path, we may not have a valid
* transaction handle with which to update the orphan list on
* disk, but we still need to remove the inode from the linked
* list in memory. */
if (!handle || err) {
mutex_unlock(&sbi->s_orphan_lock);
goto out_err;
}
ino_next = NEXT_ORPHAN(inode);
if (prev == &sbi->s_orphan) {
jbd_debug(4, "superblock will point to %u\n", ino_next);
BUFFER_TRACE(sbi->s_sbh, "get_write_access");
err = ext4_journal_get_write_access(handle, sbi->s_sbh);
if (err) {
mutex_unlock(&sbi->s_orphan_lock);
goto out_brelse;
}
lock_buffer(sbi->s_sbh);
sbi->s_es->s_last_orphan = cpu_to_le32(ino_next);
ext4_superblock_csum_set(inode->i_sb);
unlock_buffer(sbi->s_sbh);
mutex_unlock(&sbi->s_orphan_lock);
err = ext4_handle_dirty_metadata(handle, NULL, sbi->s_sbh);
} else {
struct ext4_iloc iloc2;
struct inode *i_prev =
&list_entry(prev, struct ext4_inode_info, i_orphan)->vfs_inode;
jbd_debug(4, "orphan inode %lu will point to %u\n",
i_prev->i_ino, ino_next);
err = ext4_reserve_inode_write(handle, i_prev, &iloc2);
if (err) {
mutex_unlock(&sbi->s_orphan_lock);
goto out_brelse;
}
NEXT_ORPHAN(i_prev) = ino_next;
err = ext4_mark_iloc_dirty(handle, i_prev, &iloc2);
mutex_unlock(&sbi->s_orphan_lock);
}
if (err)
goto out_brelse;
NEXT_ORPHAN(inode) = 0;
err = ext4_mark_iloc_dirty(handle, inode, &iloc);
out_err:
ext4_std_error(inode->i_sb, err);
return err;
out_brelse:
brelse(iloc.bh);
goto out_err;
}
static int ext4_rmdir(struct inode *dir, struct dentry *dentry)
{
int retval;
......@@ -3675,7 +3503,8 @@ static int ext4_rename_dir_prepare(handle_t *handle, struct ext4_renament *ent)
if (le32_to_cpu(ent->parent_de->inode) != ent->dir->i_ino)
return -EFSCORRUPTED;
BUFFER_TRACE(ent->dir_bh, "get_write_access");
return ext4_journal_get_write_access(handle, ent->dir_bh);
return ext4_journal_get_write_access(handle, ent->dir->i_sb,
ent->dir_bh, EXT4_JTR_NONE);
}
static int ext4_rename_dir_finish(handle_t *handle, struct ext4_renament *ent,
......@@ -3710,7 +3539,8 @@ static int ext4_setent(handle_t *handle, struct ext4_renament *ent,
int retval, retval2;
BUFFER_TRACE(ent->bh, "get write access");
retval = ext4_journal_get_write_access(handle, ent->bh);
retval = ext4_journal_get_write_access(handle, ent->dir->i_sb, ent->bh,
EXT4_JTR_NONE);
if (retval)
return retval;
ent->de->inode = cpu_to_le32(ino);
......
This diff is collapsed.
......@@ -409,7 +409,8 @@ static struct buffer_head *bclean(handle_t *handle, struct super_block *sb,
if (unlikely(!bh))
return ERR_PTR(-ENOMEM);
BUFFER_TRACE(bh, "get_write_access");
if ((err = ext4_journal_get_write_access(handle, bh))) {
err = ext4_journal_get_write_access(handle, sb, bh, EXT4_JTR_NONE);
if (err) {
brelse(bh);
bh = ERR_PTR(err);
} else {
......@@ -474,7 +475,8 @@ static int set_flexbg_block_bitmap(struct super_block *sb, handle_t *handle,
return -ENOMEM;
BUFFER_TRACE(bh, "get_write_access");
err = ext4_journal_get_write_access(handle, bh);
err = ext4_journal_get_write_access(handle, sb, bh,
EXT4_JTR_NONE);
if (err) {
brelse(bh);
return err;
......@@ -569,7 +571,8 @@ static int setup_new_flex_group_blocks(struct super_block *sb,
}
BUFFER_TRACE(gdb, "get_write_access");
err = ext4_journal_get_write_access(handle, gdb);
err = ext4_journal_get_write_access(handle, sb, gdb,
EXT4_JTR_NONE);
if (err) {
brelse(gdb);
goto out;
......@@ -837,17 +840,18 @@ static int add_new_gdb(handle_t *handle, struct inode *inode,
}
BUFFER_TRACE(EXT4_SB(sb)->s_sbh, "get_write_access");
err = ext4_journal_get_write_access(handle, EXT4_SB(sb)->s_sbh);
err = ext4_journal_get_write_access(handle, sb, EXT4_SB(sb)->s_sbh,
EXT4_JTR_NONE);
if (unlikely(err))
goto errout;
BUFFER_TRACE(gdb_bh, "get_write_access");
err = ext4_journal_get_write_access(handle, gdb_bh);
err = ext4_journal_get_write_access(handle, sb, gdb_bh, EXT4_JTR_NONE);
if (unlikely(err))
goto errout;
BUFFER_TRACE(dind, "get_write_access");
err = ext4_journal_get_write_access(handle, dind);
err = ext4_journal_get_write_access(handle, sb, dind, EXT4_JTR_NONE);
if (unlikely(err)) {
ext4_std_error(sb, err);
goto errout;
......@@ -956,7 +960,7 @@ static int add_new_gdb_meta_bg(struct super_block *sb,
n_group_desc[gdb_num] = gdb_bh;
BUFFER_TRACE(gdb_bh, "get_write_access");
err = ext4_journal_get_write_access(handle, gdb_bh);
err = ext4_journal_get_write_access(handle, sb, gdb_bh, EXT4_JTR_NONE);
if (err) {
kvfree(n_group_desc);
brelse(gdb_bh);
......@@ -1042,7 +1046,8 @@ static int reserve_backup_gdb(handle_t *handle, struct inode *inode,
for (i = 0; i < reserved_gdb; i++) {
BUFFER_TRACE(primary[i], "get_write_access");
if ((err = ext4_journal_get_write_access(handle, primary[i])))
if ((err = ext4_journal_get_write_access(handle, sb, primary[i],
EXT4_JTR_NONE)))
goto exit_bh;
}
......@@ -1149,10 +1154,9 @@ static void update_backups(struct super_block *sb, sector_t blk_off, char *data,
backup_block, backup_block -
ext4_group_first_block_no(sb, group));
BUFFER_TRACE(bh, "get_write_access");
if ((err = ext4_journal_get_write_access(handle, bh))) {
brelse(bh);
if ((err = ext4_journal_get_write_access(handle, sb, bh,
EXT4_JTR_NONE)))
break;
}
lock_buffer(bh);
memcpy(bh->b_data, data, size);
if (rest)
......@@ -1232,7 +1236,8 @@ static int ext4_add_new_descs(handle_t *handle, struct super_block *sb,
gdb_bh = sbi_array_rcu_deref(sbi, s_group_desc,
gdb_num);
BUFFER_TRACE(gdb_bh, "get_write_access");
err = ext4_journal_get_write_access(handle, gdb_bh);
err = ext4_journal_get_write_access(handle, sb, gdb_bh,
EXT4_JTR_NONE);
if (!err && reserved_gdb && ext4_bg_num_gdb(sb, group))
err = reserve_backup_gdb(handle, resize_inode, group);
......@@ -1509,7 +1514,8 @@ static int ext4_flex_group_add(struct super_block *sb,
}
BUFFER_TRACE(sbi->s_sbh, "get_write_access");
err = ext4_journal_get_write_access(handle, sbi->s_sbh);
err = ext4_journal_get_write_access(handle, sb, sbi->s_sbh,
EXT4_JTR_NONE);
if (err)
goto exit_journal;
......@@ -1722,7 +1728,8 @@ static int ext4_group_extend_no_check(struct super_block *sb,
}
BUFFER_TRACE(EXT4_SB(sb)->s_sbh, "get_write_access");
err = ext4_journal_get_write_access(handle, EXT4_SB(sb)->s_sbh);
err = ext4_journal_get_write_access(handle, sb, EXT4_SB(sb)->s_sbh,
EXT4_JTR_NONE);
if (err) {
ext4_warning(sb, "error %d on journal write access", err);
goto errout;
......@@ -1884,7 +1891,8 @@ static int ext4_convert_meta_bg(struct super_block *sb, struct inode *inode)
return PTR_ERR(handle);
BUFFER_TRACE(sbi->s_sbh, "get_write_access");
err = ext4_journal_get_write_access(handle, sbi->s_sbh);
err = ext4_journal_get_write_access(handle, sb, sbi->s_sbh,
EXT4_JTR_NONE);
if (err)
goto errout;
......
This diff is collapsed.
......@@ -791,7 +791,8 @@ static void ext4_xattr_update_super_block(handle_t *handle,
return;
BUFFER_TRACE(EXT4_SB(sb)->s_sbh, "get_write_access");
if (ext4_journal_get_write_access(handle, EXT4_SB(sb)->s_sbh) == 0) {
if (ext4_journal_get_write_access(handle, sb, EXT4_SB(sb)->s_sbh,
EXT4_JTR_NONE) == 0) {
lock_buffer(EXT4_SB(sb)->s_sbh);
ext4_set_feature_xattr(sb);
ext4_superblock_csum_set(sb);
......@@ -1169,7 +1170,8 @@ ext4_xattr_inode_dec_ref_all(handle_t *handle, struct inode *parent,
continue;
}
if (err > 0) {
err = ext4_journal_get_write_access(handle, bh);
err = ext4_journal_get_write_access(handle,
parent->i_sb, bh, EXT4_JTR_NONE);
if (err) {
ext4_warning_inode(ea_inode,
"Re-get write access err=%d",
......@@ -1230,7 +1232,8 @@ ext4_xattr_release_block(handle_t *handle, struct inode *inode,
int error = 0;
BUFFER_TRACE(bh, "get_write_access");
error = ext4_journal_get_write_access(handle, bh);
error = ext4_journal_get_write_access(handle, inode->i_sb, bh,
EXT4_JTR_NONE);
if (error)
goto out;
......@@ -1371,7 +1374,8 @@ static int ext4_xattr_inode_write(handle_t *handle, struct inode *ea_inode,
"ext4_getblk() return bh = NULL");
return -EFSCORRUPTED;
}
ret = ext4_journal_get_write_access(handle, bh);
ret = ext4_journal_get_write_access(handle, ea_inode->i_sb, bh,
EXT4_JTR_NONE);
if (ret)
goto out;
......@@ -1855,7 +1859,8 @@ ext4_xattr_block_set(handle_t *handle, struct inode *inode,
if (s->base) {
BUFFER_TRACE(bs->bh, "get_write_access");
error = ext4_journal_get_write_access(handle, bs->bh);
error = ext4_journal_get_write_access(handle, sb, bs->bh,
EXT4_JTR_NONE);
if (error)
goto cleanup;
lock_buffer(bs->bh);
......@@ -1987,8 +1992,9 @@ ext4_xattr_block_set(handle_t *handle, struct inode *inode,
if (error)
goto cleanup;
BUFFER_TRACE(new_bh, "get_write_access");
error = ext4_journal_get_write_access(handle,
new_bh);
error = ext4_journal_get_write_access(
handle, sb, new_bh,
EXT4_JTR_NONE);
if (error)
goto cleanup_dquot;
lock_buffer(new_bh);
......@@ -2092,7 +2098,8 @@ ext4_xattr_block_set(handle_t *handle, struct inode *inode,
}
lock_buffer(new_bh);
error = ext4_journal_get_create_access(handle, new_bh);
error = ext4_journal_get_create_access(handle, sb,
new_bh, EXT4_JTR_NONE);
if (error) {
unlock_buffer(new_bh);
error = -EIO;
......@@ -2848,7 +2855,8 @@ int ext4_xattr_delete_inode(handle_t *handle, struct inode *inode,
goto cleanup;
}
error = ext4_journal_get_write_access(handle, iloc.bh);
error = ext4_journal_get_write_access(handle, inode->i_sb,
iloc.bh, EXT4_JTR_NONE);
if (error) {
EXT4_ERROR_INODE(inode, "write access (error %d)",
error);
......
......@@ -179,8 +179,8 @@ static int jbd2_descriptor_block_csum_verify(journal_t *j, void *buf)
if (!jbd2_journal_has_csum_v2or3(j))
return 1;
tail = (struct jbd2_journal_block_tail *)(buf + j->j_blocksize -
sizeof(struct jbd2_journal_block_tail));
tail = (struct jbd2_journal_block_tail *)((char *)buf +
j->j_blocksize - sizeof(struct jbd2_journal_block_tail));
provided = tail->t_checksum;
tail->t_checksum = 0;
calculated = jbd2_chksum(j, j->j_csum_seed, buf, j->j_blocksize);
......@@ -196,7 +196,7 @@ static int jbd2_descriptor_block_csum_verify(journal_t *j, void *buf)
static int count_tags(journal_t *journal, struct buffer_head *bh)
{
char * tagp;
journal_block_tag_t * tag;
journal_block_tag_t tag;
int nr = 0, size = journal->j_blocksize;
int tag_bytes = journal_tag_bytes(journal);
......@@ -206,14 +206,14 @@ static int count_tags(journal_t *journal, struct buffer_head *bh)
tagp = &bh->b_data[sizeof(journal_header_t)];
while ((tagp - bh->b_data + tag_bytes) <= size) {
tag = (journal_block_tag_t *) tagp;
memcpy(&tag, tagp, sizeof(tag));
nr++;
tagp += tag_bytes;
if (!(tag->t_flags & cpu_to_be16(JBD2_FLAG_SAME_UUID)))
if (!(tag.t_flags & cpu_to_be16(JBD2_FLAG_SAME_UUID)))
tagp += 16;
if (tag->t_flags & cpu_to_be16(JBD2_FLAG_LAST_TAG))
if (tag.t_flags & cpu_to_be16(JBD2_FLAG_LAST_TAG))
break;
}
......@@ -433,9 +433,9 @@ static int jbd2_commit_block_csum_verify(journal_t *j, void *buf)
}
static int jbd2_block_tag_csum_verify(journal_t *j, journal_block_tag_t *tag,
journal_block_tag3_t *tag3,
void *buf, __u32 sequence)
{
journal_block_tag3_t *tag3 = (journal_block_tag3_t *)tag;
__u32 csum32;
__be32 seq;
......@@ -496,7 +496,7 @@ static int do_one_pass(journal_t *journal,
while (1) {
int flags;
char * tagp;
journal_block_tag_t * tag;
journal_block_tag_t tag;
struct buffer_head * obh;
struct buffer_head * nbh;
......@@ -613,8 +613,8 @@ static int do_one_pass(journal_t *journal,
<= journal->j_blocksize - descr_csum_size) {
unsigned long io_block;
tag = (journal_block_tag_t *) tagp;
flags = be16_to_cpu(tag->t_flags);
memcpy(&tag, tagp, sizeof(tag));
flags = be16_to_cpu(tag.t_flags);
io_block = next_log_block++;
wrap(journal, next_log_block);
......@@ -632,7 +632,7 @@ static int do_one_pass(journal_t *journal,
J_ASSERT(obh != NULL);
blocknr = read_tag_block(journal,
tag);
&tag);
/* If the block has been
* revoked, then we're all done
......@@ -647,8 +647,8 @@ static int do_one_pass(journal_t *journal,
/* Look for block corruption */
if (!jbd2_block_tag_csum_verify(
journal, tag, obh->b_data,
be32_to_cpu(tmp->h_sequence))) {
journal, &tag, (journal_block_tag3_t *)tagp,
obh->b_data, be32_to_cpu(tmp->h_sequence))) {
brelse(obh);
success = -EFSBADCRC;
printk(KERN_ERR "JBD2: Invalid "
......@@ -760,7 +760,6 @@ static int do_one_pass(journal_t *journal,
*/
jbd_debug(1, "JBD2: Invalid checksum ignored in transaction %u, likely stale data\n",
next_commit_ID);
err = 0;
brelse(bh);
goto done;
}
......@@ -897,7 +896,7 @@ static int scan_revoke_records(journal_t *journal, struct buffer_head *bh,
{
jbd2_journal_revoke_header_t *header;
int offset, max;
int csum_size = 0;
unsigned csum_size = 0;
__u32 rcount;
int record_len = 4;
......
......@@ -223,9 +223,15 @@ static void sub_reserved_credits(journal_t *journal, int blocks)
* with j_state_lock held for reading. Returns 0 if handle joined the running
* transaction. Returns 1 if we had to wait, j_state_lock is dropped, and
* caller must retry.
*
* Note: because j_state_lock may be dropped depending on the return
* value, we need to fake out sparse so ti doesn't complain about a
* locking imbalance. Callers of add_transaction_credits will need to
* make a similar accomodation.
*/
static int add_transaction_credits(journal_t *journal, int blocks,
int rsv_blocks)
__must_hold(&journal->j_state_lock)
{
transaction_t *t = journal->j_running_transaction;
int needed;
......@@ -238,6 +244,7 @@ static int add_transaction_credits(journal_t *journal, int blocks,
if (t->t_state != T_RUNNING) {
WARN_ON_ONCE(t->t_state >= T_FLUSH);
wait_transaction_locked(journal);
__acquire(&journal->j_state_lock); /* fake out sparse */
return 1;
}
......@@ -266,10 +273,12 @@ static int add_transaction_credits(journal_t *journal, int blocks,
wait_event(journal->j_wait_reserved,
atomic_read(&journal->j_reserved_credits) + total <=
journal->j_max_transaction_buffers);
__acquire(&journal->j_state_lock); /* fake out sparse */
return 1;
}
wait_transaction_locked(journal);
__acquire(&journal->j_state_lock); /* fake out sparse */
return 1;
}
......@@ -293,6 +302,7 @@ static int add_transaction_credits(journal_t *journal, int blocks,
journal->j_max_transaction_buffers)
__jbd2_log_wait_for_space(journal);
write_unlock(&journal->j_state_lock);
__acquire(&journal->j_state_lock); /* fake out sparse */
return 1;
}
......@@ -310,6 +320,7 @@ static int add_transaction_credits(journal_t *journal, int blocks,
wait_event(journal->j_wait_reserved,
atomic_read(&journal->j_reserved_credits) + rsv_blocks
<= journal->j_max_transaction_buffers / 2);
__acquire(&journal->j_state_lock); /* fake out sparse */
return 1;
}
return 0;
......@@ -413,8 +424,14 @@ static int start_this_handle(journal_t *journal, handle_t *handle,
if (!handle->h_reserved) {
/* We may have dropped j_state_lock - restart in that case */
if (add_transaction_credits(journal, blocks, rsv_blocks))
if (add_transaction_credits(journal, blocks, rsv_blocks)) {
/*
* add_transaction_credits releases
* j_state_lock on a non-zero return
*/
__release(&journal->j_state_lock);
goto repeat;
}
} else {
/*
* We have handle reserved so we are allowed to join T_LOCKED
......@@ -1404,7 +1421,7 @@ void jbd2_journal_set_triggers(struct buffer_head *bh,
{
struct journal_head *jh = jbd2_journal_grab_journal_head(bh);
if (WARN_ON(!jh))
if (WARN_ON_ONCE(!jh))
return;
jh->b_triggers = type;
jbd2_journal_put_journal_head(jh);
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment