Commit dd05e42f authored by Linus Torvalds
parents c1d96203 1f04c0a2
...@@ -50,9 +50,14 @@ userspace utilities, etc. ...@@ -50,9 +50,14 @@ userspace utilities, etc.
Features Features
======== ========
- This is a complete rewrite of the NTFS driver that used to be in the kernel. - This is a complete rewrite of the NTFS driver that used to be in the 2.4 and
This new driver implements NTFS read support and is functionally equivalent earlier kernels. This new driver implements NTFS read support and is
to the old ntfs driver. functionally equivalent to the old ntfs driver and it also implements limited
write support. The biggest limitation at present is that files/directories
cannot be created or deleted. See below for the list of write features that
are so far supported. Another limitation is that writing to compressed files
is not implemented at all. Also, neither read nor write access to encrypted
files is so far implemented.
- The new driver has full support for sparse files on NTFS 3.x volumes which - The new driver has full support for sparse files on NTFS 3.x volumes which
the old driver isn't happy with. the old driver isn't happy with.
- The new driver supports execution of binaries due to mmap() now being - The new driver supports execution of binaries due to mmap() now being
...@@ -78,7 +83,20 @@ Features ...@@ -78,7 +83,20 @@ Features
- The new driver supports fsync(2), fdatasync(2), and msync(2). - The new driver supports fsync(2), fdatasync(2), and msync(2).
- The new driver supports readv(2) and writev(2). - The new driver supports readv(2) and writev(2).
- The new driver supports access time updates (including mtime and ctime). - The new driver supports access time updates (including mtime and ctime).
- The new driver supports truncate(2) and open(2) with O_TRUNC. But at present
only very limited support for highly fragmented files, i.e. ones which have
their data attribute split across multiple extents, is included. Another
limitation is that at present truncate(2) will never create sparse files,
since to mark a file sparse we need to modify the directory entry for the
file and we do not implement directory modifications yet.
- The new driver supports write(2) which can both overwrite existing data and
extend the file size so that you can write beyond the existing data. Also,
writing into sparse regions is supported and the holes are filled in with
clusters. But at present only limited support for highly fragmented files,
i.e. ones which have their data attribute split across multiple extents, is
included. Another limitation is that write(2) will never create sparse
files, since to mark a file sparse we need to modify the directory entry for
the file and we do not implement directory modifications yet.
Supported mount options Supported mount options
======================= =======================
...@@ -439,6 +457,22 @@ ChangeLog ...@@ -439,6 +457,22 @@ ChangeLog
Note, a technical ChangeLog aimed at kernel hackers is in fs/ntfs/ChangeLog. Note, a technical ChangeLog aimed at kernel hackers is in fs/ntfs/ChangeLog.
2.1.25:
- Write support is now extended with write(2) being able to both
overwrite existing file data and to extend files. Also, if a write
to a sparse region occurs, write(2) will fill in the hole. Note,
mmap(2) based writes still do not support writing into holes or
writing beyond the initialized size.
- Write support has a new feature and that is that truncate(2) and
open(2) with O_TRUNC are now implemented thus files can be both made
smaller and larger.
- Note: Both write(2) and truncate(2)/open(2) with O_TRUNC still have
limitations in that they
- only provide limited support for highly fragmented files.
- only work on regular, i.e. uncompressed and unencrypted files.
- never create sparse files although this will change once directory
operations are implemented.
- Lots of bug fixes and enhancements across the board.
2.1.24: 2.1.24:
- Support journals ($LogFile) which have been modified by chkdsk. This - Support journals ($LogFile) which have been modified by chkdsk. This
means users can boot into Windows after we marked the volume dirty. means users can boot into Windows after we marked the volume dirty.
......
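The write(2) and truncate(2) behaviour described in the 2.1.25 notes above can be exercised from userspace with ordinary POSIX calls. The following is a minimal sketch, not part of the commit: `resize_and_write()` and `file_size()` are hypothetical helper names, and the file is assumed to exist already, since the driver cannot create files yet. Per the ChangeLog, EOPNOTSUPP is the error to expect when a highly fragmented file is hit.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: resize the existing file at @path to @new_size bytes
 * with ftruncate(2), then write @len bytes from @buf at offset @ofs, which
 * may lie beyond the end of the file and thus extend it.
 *
 * Returns 0 on success or a negative errno value.
 */
int resize_and_write(const char *path, off_t new_size, const void *buf,
		size_t len, off_t ofs)
{
	ssize_t written;
	int fd, err = 0;

	assert(path && buf);
	/* No O_CREAT: the file must already exist. */
	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;
	if (ftruncate(fd, new_size) < 0) {
		err = -errno;
		goto out;
	}
	written = pwrite(fd, buf, len, ofs);
	if (written < 0)
		err = -errno;
	else if ((size_t)written != len)
		err = -EIO;	/* Treat a short write as an error here. */
out:
	close(fd);
	return err;
}

/* Hypothetical helper: return the size of @path or a negative errno value. */
long long file_size(const char *path)
{
	struct stat st;

	if (stat(path, &st) < 0)
		return -errno;
	return (long long)st.st_size;
}
```

A caller on an NTFS volume would typically compare the return value against -EOPNOTSUPP to tell the documented fragmentation limitation apart from real I/O errors.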
ToDo/Notes: ToDo/Notes:
- Find and fix bugs. - Find and fix bugs.
- In between ntfs_prepare/commit_write, need exclusion between - The only places in the kernel where a file is resized are
simultaneous file extensions. This is given to us by holding i_sem ntfs_file_write*() and ntfs_truncate() for both of which i_sem is
on the inode. The only places in the kernel when a file is resized held. Just have to be careful in read-/writepage and other helpers
are prepare/commit write and truncate for both of which i_sem is not running under i_sem that we play nice... Also need to be careful
held. Just have to be careful in readpage/writepage and all other with initialized_size extension in ntfs_file_write*() and writepage.
helpers not running under i_sem that we play nice... UPDATE: The only things that need to be checked are the compressed
Also need to be careful with initialized_size extension in write and the other attribute resize/write cases like index
ntfs_prepare_write. Basically, just be _very_ careful in this code... attributes, etc. For now none of these are implemented so are safe.
UPDATE: The only things that need to be checked are read/writepage - Implement filling in of holes in aops.c::ntfs_writepage() and its
which do not hold i_sem. Note writepage cannot change i_size but it helpers.
needs to cope with a concurrent i_size change, just like readpage.
Also both need to cope with concurrent changes to the other sizes,
i.e. initialized/allocated/compressed size, as well.
- Implement mft.c::sync_mft_mirror_umount(). We currently will just - Implement mft.c::sync_mft_mirror_umount(). We currently will just
leave the volume dirty on umount if the final iput(vol->mft_ino) leave the volume dirty on umount if the final iput(vol->mft_ino)
causes a write of any mirrored mft records due to the mft mirror causes a write of any mirrored mft records due to the mft mirror
...@@ -22,6 +19,68 @@ ToDo/Notes: ...@@ -22,6 +19,68 @@ ToDo/Notes:
- Enable the code for setting the NT4 compatibility flag when we start - Enable the code for setting the NT4 compatibility flag when we start
making NTFS 1.2 specific modifications. making NTFS 1.2 specific modifications.
2.1.25 - (Almost) fully implement write(2) and truncate(2).
- Change ntfs_map_runlist_nolock(), ntfs_attr_find_vcn_nolock() and
{__,}ntfs_cluster_free() to also take an optional attribute search
context as argument. This allows calling these functions with the
mft record mapped. Update all callers.
- Fix potential deadlock in ntfs_mft_data_extend_allocation_nolock()
error handling by passing in the active search context when calling
ntfs_cluster_free().
- Change ntfs_cluster_alloc() to take an extra boolean parameter
specifying whether the clusters are being allocated to extend an
attribute or to fill a hole.
- Change ntfs_attr_make_non_resident() to call ntfs_cluster_alloc()
with @is_extension set to TRUE and remove the runlist terminator
fixup code as this is now done by ntfs_cluster_alloc().
- Change ntfs_attr_make_non_resident to take the attribute value size
as an extra parameter. This is needed since we need to know the size
before we can map the mft record and our callers always know it. The
reason we cannot simply read the size from the vfs inode i_size is
that this is not necessarily uptodate. This happens when
ntfs_attr_make_non_resident() is called in the ->truncate call path.
- Fix ntfs_attr_make_non_resident() to update the vfs inode i_blocks
which is zero for a resident attribute but should no longer be zero
once the attribute is non-resident as it then has real clusters
allocated.
- Add fs/ntfs/attrib.[hc]::ntfs_attr_extend_allocation(), a function to
extend the allocation of an attribute. Optionally, the data size,
but not the initialized size can be extended, too.
- Implement fs/ntfs/inode.[hc]::ntfs_truncate(). It only supports
uncompressed and unencrypted files and it never creates sparse files
at least for the moment (making a file sparse requires us to modify
its directory entries and we do not support directory operations at
the moment). Also, support for highly fragmented files, i.e. ones
whose data attribute is split across multiple extents, is severely
limited. When such a case is encountered, EOPNOTSUPP is returned.
- Enable ATTR_SIZE attribute changes in ntfs_setattr(). This completes
the initial implementation of file truncation. Now both open(2)ing
a file with the O_TRUNC flag and the {,f}truncate(2) system calls
will resize a file appropriately. The limitations are that only
uncompressed and unencrypted files are supported. Also, there is
only very limited support for highly fragmented files (the ones whose
$DATA attribute is split into multiple attribute extents).
- In attrib.c::ntfs_attr_set() call balance_dirty_pages_ratelimited()
and cond_resched() in the main loop as we could be dirtying a lot of
pages and this ensures we play nice with the VM and the system as a
whole.
- Implement file operations ->write, ->aio_write, ->writev for regular
files. This replaces the old use of generic_file_write(), et al and
the address space operations ->prepare_write and ->commit_write.
This means that both sparse and non-sparse (unencrypted and
uncompressed) files can now be extended using the normal write(2)
code path. There are two limitations at present and these are that
we never create sparse files and that we only have limited support
for highly fragmented files, i.e. ones whose data attribute is split
across multiple extents. When such a case is encountered,
EOPNOTSUPP is returned.
- $EA attributes can be both resident and non-resident.
- Use %z for size_t to fix compilation warnings. (Andrew Morton)
- Fix compilation warnings with gcc-4.0.2 on SUSE 10.0.
- Document extended attribute ($EA) NEED_EA flag. (Based on libntfs
patch by Yura Pakhuchiy.)
2.1.24 - Lots of bug fixes and support more clean journal states. 2.1.24 - Lots of bug fixes and support more clean journal states.
- Support journals ($LogFile) which have been modified by chkdsk. This - Support journals ($LogFile) which have been modified by chkdsk. This
......
...@@ -6,7 +6,7 @@ ntfs-objs := aops.o attrib.o collate.o compress.o debug.o dir.o file.o \ ...@@ -6,7 +6,7 @@ ntfs-objs := aops.o attrib.o collate.o compress.o debug.o dir.o file.o \
index.o inode.o mft.o mst.o namei.o runlist.o super.o sysctl.o \ index.o inode.o mft.o mst.o namei.o runlist.o super.o sysctl.o \
unistr.o upcase.o unistr.o upcase.o
EXTRA_CFLAGS = -DNTFS_VERSION=\"2.1.24\" EXTRA_CFLAGS = -DNTFS_VERSION=\"2.1.25\"
ifeq ($(CONFIG_NTFS_DEBUG),y) ifeq ($(CONFIG_NTFS_DEBUG),y)
EXTRA_CFLAGS += -DDEBUG EXTRA_CFLAGS += -DDEBUG
......
...@@ -60,14 +60,15 @@ typedef struct { ...@@ -60,14 +60,15 @@ typedef struct {
ATTR_RECORD *base_attr; ATTR_RECORD *base_attr;
} ntfs_attr_search_ctx; } ntfs_attr_search_ctx;
extern int ntfs_map_runlist_nolock(ntfs_inode *ni, VCN vcn); extern int ntfs_map_runlist_nolock(ntfs_inode *ni, VCN vcn,
ntfs_attr_search_ctx *ctx);
extern int ntfs_map_runlist(ntfs_inode *ni, VCN vcn); extern int ntfs_map_runlist(ntfs_inode *ni, VCN vcn);
extern LCN ntfs_attr_vcn_to_lcn_nolock(ntfs_inode *ni, const VCN vcn, extern LCN ntfs_attr_vcn_to_lcn_nolock(ntfs_inode *ni, const VCN vcn,
const BOOL write_locked); const BOOL write_locked);
extern runlist_element *ntfs_attr_find_vcn_nolock(ntfs_inode *ni, extern runlist_element *ntfs_attr_find_vcn_nolock(ntfs_inode *ni,
const VCN vcn, const BOOL write_locked); const VCN vcn, ntfs_attr_search_ctx *ctx);
int ntfs_attr_lookup(const ATTR_TYPE type, const ntfschar *name, int ntfs_attr_lookup(const ATTR_TYPE type, const ntfschar *name,
const u32 name_len, const IGNORE_CASE_BOOL ic, const u32 name_len, const IGNORE_CASE_BOOL ic,
...@@ -102,7 +103,10 @@ extern int ntfs_attr_record_resize(MFT_RECORD *m, ATTR_RECORD *a, u32 new_size); ...@@ -102,7 +103,10 @@ extern int ntfs_attr_record_resize(MFT_RECORD *m, ATTR_RECORD *a, u32 new_size);
extern int ntfs_resident_attr_value_resize(MFT_RECORD *m, ATTR_RECORD *a, extern int ntfs_resident_attr_value_resize(MFT_RECORD *m, ATTR_RECORD *a,
const u32 new_size); const u32 new_size);
extern int ntfs_attr_make_non_resident(ntfs_inode *ni); extern int ntfs_attr_make_non_resident(ntfs_inode *ni, const u32 data_size);
extern s64 ntfs_attr_extend_allocation(ntfs_inode *ni, s64 new_alloc_size,
const s64 new_data_size, const s64 data_start);
extern int ntfs_attr_set(ntfs_inode *ni, const s64 ofs, const s64 cnt, extern int ntfs_attr_set(ntfs_inode *ni, const s64 ofs, const s64 cnt,
const u8 val); const u8 val);
......
...@@ -1021,10 +1021,17 @@ enum { ...@@ -1021,10 +1021,17 @@ enum {
FILE_NAME_POSIX = 0x00, FILE_NAME_POSIX = 0x00,
/* This is the largest namespace. It is case sensitive and allows all /* This is the largest namespace. It is case sensitive and allows all
Unicode characters except for: '\0' and '/'. Beware that in Unicode characters except for: '\0' and '/'. Beware that in
WinNT/2k files which eg have the same name except for their case WinNT/2k/2003 by default files which eg have the same name except
will not be distinguished by the standard utilities and thus a "del for their case will not be distinguished by the standard utilities
filename" will delete both "filename" and "fileName" without and thus a "del filename" will delete both "filename" and "fileName"
warning. */ without warning. However if for example Services For Unix (SFU) are
installed and the case sensitive option was enabled at installation
time, then you can create/access/delete such files.
Note that even SFU places restrictions on the filenames beyond the
'\0' and '/' and in particular the following set of characters is
not allowed: '"', '/', '<', '>', '\'. All other characters,
including the ones not allowed in WIN32 namespace are allowed.
Tested with SFU 3.5 (this is now free) running on Windows XP. */
FILE_NAME_WIN32 = 0x01, FILE_NAME_WIN32 = 0x01,
/* The standard WinNT/2k NTFS long filenames. Case insensitive. All /* The standard WinNT/2k NTFS long filenames. Case insensitive. All
Unicode chars except: '\0', '"', '*', '/', ':', '<', '>', '?', '\', Unicode chars except: '\0', '"', '*', '/', ':', '<', '>', '?', '\',
...@@ -2367,7 +2374,9 @@ typedef struct { ...@@ -2367,7 +2374,9 @@ typedef struct {
* Extended attribute flags (8-bit). * Extended attribute flags (8-bit).
*/ */
enum { enum {
NEED_EA = 0x80 NEED_EA = 0x80 /* If set the file to which the EA belongs
cannot be interpreted without understanding
the associated extended attributes. */
} __attribute__ ((__packed__)); } __attribute__ ((__packed__));
typedef u8 EA_FLAGS; typedef u8 EA_FLAGS;
...@@ -2375,20 +2384,20 @@ typedef u8 EA_FLAGS; ...@@ -2375,20 +2384,20 @@ typedef u8 EA_FLAGS;
/* /*
* Attribute: Extended attribute (EA) (0xe0). * Attribute: Extended attribute (EA) (0xe0).
* *
* NOTE: Always non-resident. (Is this true?) * NOTE: Can be resident or non-resident.
* *
* Like the attribute list and the index buffer list, the EA attribute value is * Like the attribute list and the index buffer list, the EA attribute value is
* a sequence of EA_ATTR variable length records. * a sequence of EA_ATTR variable length records.
*
* FIXME: It appears weird that the EA name is not unicode. Is it true?
*/ */
typedef struct { typedef struct {
le32 next_entry_offset; /* Offset to the next EA_ATTR. */ le32 next_entry_offset; /* Offset to the next EA_ATTR. */
EA_FLAGS flags; /* Flags describing the EA. */ EA_FLAGS flags; /* Flags describing the EA. */
u8 ea_name_length; /* Length of the name of the EA in bytes. */ u8 ea_name_length; /* Length of the name of the EA in bytes
excluding the '\0' byte terminator. */
le16 ea_value_length; /* Byte size of the EA's value. */ le16 ea_value_length; /* Byte size of the EA's value. */
u8 ea_name[0]; /* Name of the EA. */ u8 ea_name[0]; /* Name of the EA. Note this is ASCII, not
u8 ea_value[0]; /* The value of the EA. Immediately follows Unicode and it is zero terminated. */
u8 ea_value[0]; /* The value of the EA. Immediately follows
the name. */ the name. */
} __attribute__ ((__packed__)) EA_ATTR; } __attribute__ ((__packed__)) EA_ATTR;
......
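The EA_ATTR layout documented in the hunk above (fixed header, zero-terminated ASCII name, value immediately after the name) can be walked from a raw byte buffer. The following is a userspace sketch under stated assumptions, not driver code: `sketch_count_eas()` and the `SKETCH_*` constants are hypothetical names, the 8-byte header size is derived from the field widths shown above, and treating a `next_entry_offset` of zero as the end of the list is an assumption made only to keep the loop bounded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NEED_EA		0x80	/* Mirrors NEED_EA from the diff. */
#define SKETCH_EA_HDR_SIZE	8	/* le32 + u8 + u8 + le16. */

/* Read little endian integers independently of the host byte order. */
static uint32_t sketch_rd_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
			((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static uint16_t sketch_rd_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Count the EA records in @buf of @len bytes and store in @need_ea how many
 * of them have the NEED_EA flag set.  Returns the record count or -1 if the
 * buffer is malformed.
 */
int sketch_count_eas(const uint8_t *buf, size_t len, int *need_ea)
{
	size_t ofs = 0;
	int count = 0;

	assert(buf && need_ea);
	*need_ea = 0;
	while (ofs < len) {
		const uint8_t *ea = buf + ofs;
		size_t name_len, value_len, used;
		uint32_t next;

		if (len - ofs < SKETCH_EA_HDR_SIZE)
			return -1;
		next = sketch_rd_le32(ea);
		name_len = ea[5];
		value_len = sketch_rd_le16(ea + 6);
		/* Header, name, '\0' terminator, then the value. */
		used = SKETCH_EA_HDR_SIZE + name_len + 1 + value_len;
		if (used > len - ofs || ea[SKETCH_EA_HDR_SIZE + name_len])
			return -1;
		if (ea[4] & SKETCH_NEED_EA)
			(*need_ea)++;
		count++;
		if (!next)	/* Assumption: zero marks the last record. */
			break;
		if (next < used || next > len - ofs)
			return -1;
		ofs += next;
	}
	return count;
}
```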
...@@ -76,6 +76,7 @@ int ntfs_cluster_free_from_rl_nolock(ntfs_volume *vol, ...@@ -76,6 +76,7 @@ int ntfs_cluster_free_from_rl_nolock(ntfs_volume *vol,
* @count: number of clusters to allocate * @count: number of clusters to allocate
* @start_lcn: starting lcn at which to allocate the clusters (or -1 if none) * @start_lcn: starting lcn at which to allocate the clusters (or -1 if none)
* @zone: zone from which to allocate the clusters * @zone: zone from which to allocate the clusters
* @is_extension: if TRUE, this is an attribute extension
* *
* Allocate @count clusters preferably starting at cluster @start_lcn or at the * Allocate @count clusters preferably starting at cluster @start_lcn or at the
* current allocator position if @start_lcn is -1, on the mounted ntfs volume * current allocator position if @start_lcn is -1, on the mounted ntfs volume
...@@ -86,6 +87,13 @@ int ntfs_cluster_free_from_rl_nolock(ntfs_volume *vol, ...@@ -86,6 +87,13 @@ int ntfs_cluster_free_from_rl_nolock(ntfs_volume *vol,
* @start_vcn specifies the vcn of the first allocated cluster. This makes * @start_vcn specifies the vcn of the first allocated cluster. This makes
* merging the resulting runlist with the old runlist easier. * merging the resulting runlist with the old runlist easier.
* *
* If @is_extension is TRUE, the caller is allocating clusters to extend an
* attribute and if it is FALSE, the caller is allocating clusters to fill a
* hole in an attribute. Practically the difference is that if @is_extension
* is TRUE the returned runlist will be terminated with LCN_ENOENT and if
* @is_extension is FALSE the runlist will be terminated with
* LCN_RL_NOT_MAPPED.
*
* You need to check the return value with IS_ERR(). If this is false, the * You need to check the return value with IS_ERR(). If this is false, the
* function was successful and the return value is a runlist describing the * function was successful and the return value is a runlist describing the
* allocated cluster(s). If IS_ERR() is true, the function failed and * allocated cluster(s). If IS_ERR() is true, the function failed and
...@@ -137,7 +145,8 @@ int ntfs_cluster_free_from_rl_nolock(ntfs_volume *vol, ...@@ -137,7 +145,8 @@ int ntfs_cluster_free_from_rl_nolock(ntfs_volume *vol,
*/ */
runlist_element *ntfs_cluster_alloc(ntfs_volume *vol, const VCN start_vcn, runlist_element *ntfs_cluster_alloc(ntfs_volume *vol, const VCN start_vcn,
const s64 count, const LCN start_lcn, const s64 count, const LCN start_lcn,
const NTFS_CLUSTER_ALLOCATION_ZONES zone) const NTFS_CLUSTER_ALLOCATION_ZONES zone,
const BOOL is_extension)
{ {
LCN zone_start, zone_end, bmp_pos, bmp_initial_pos, last_read_pos, lcn; LCN zone_start, zone_end, bmp_pos, bmp_initial_pos, last_read_pos, lcn;
LCN prev_lcn = 0, prev_run_len = 0, mft_zone_size; LCN prev_lcn = 0, prev_run_len = 0, mft_zone_size;
...@@ -310,7 +319,7 @@ runlist_element *ntfs_cluster_alloc(ntfs_volume *vol, const VCN start_vcn, ...@@ -310,7 +319,7 @@ runlist_element *ntfs_cluster_alloc(ntfs_volume *vol, const VCN start_vcn,
continue; continue;
} }
bit = 1 << (lcn & 7); bit = 1 << (lcn & 7);
ntfs_debug("bit %i.", bit); ntfs_debug("bit 0x%x.", bit);
/* If the bit is already set, go onto the next one. */ /* If the bit is already set, go onto the next one. */
if (*byte & bit) { if (*byte & bit) {
lcn++; lcn++;
...@@ -729,7 +738,7 @@ switch_to_data1_zone: search_zone = 2; ...@@ -729,7 +738,7 @@ switch_to_data1_zone: search_zone = 2;
/* Add runlist terminator element. */ /* Add runlist terminator element. */
if (likely(rl)) { if (likely(rl)) {
rl[rlpos].vcn = rl[rlpos - 1].vcn + rl[rlpos - 1].length; rl[rlpos].vcn = rl[rlpos - 1].vcn + rl[rlpos - 1].length;
rl[rlpos].lcn = LCN_RL_NOT_MAPPED; rl[rlpos].lcn = is_extension ? LCN_ENOENT : LCN_RL_NOT_MAPPED;
rl[rlpos].length = 0; rl[rlpos].length = 0;
} }
if (likely(page && !IS_ERR(page))) { if (likely(page && !IS_ERR(page))) {
...@@ -782,6 +791,7 @@ switch_to_data1_zone: search_zone = 2; ...@@ -782,6 +791,7 @@ switch_to_data1_zone: search_zone = 2;
* @ni: ntfs inode whose runlist describes the clusters to free * @ni: ntfs inode whose runlist describes the clusters to free
* @start_vcn: vcn in the runlist of @ni at which to start freeing clusters * @start_vcn: vcn in the runlist of @ni at which to start freeing clusters
* @count: number of clusters to free or -1 for all clusters * @count: number of clusters to free or -1 for all clusters
* @ctx: active attribute search context if present or NULL if not
* @is_rollback: true if this is a rollback operation * @is_rollback: true if this is a rollback operation
* *
* Free @count clusters starting at the cluster @start_vcn in the runlist * Free @count clusters starting at the cluster @start_vcn in the runlist
...@@ -791,15 +801,39 @@ switch_to_data1_zone: search_zone = 2; ...@@ -791,15 +801,39 @@ switch_to_data1_zone: search_zone = 2;
* deallocated. Thus, to completely free all clusters in a runlist, use * deallocated. Thus, to completely free all clusters in a runlist, use
* @start_vcn = 0 and @count = -1. * @start_vcn = 0 and @count = -1.
* *
* If @ctx is specified, it is an active search context of @ni and its base mft
* record. This is needed when __ntfs_cluster_free() encounters unmapped
* runlist fragments and allows their mapping. If you do not have the mft
* record mapped, you can specify @ctx as NULL and __ntfs_cluster_free() will
* perform the necessary mapping and unmapping.
*
* Note, __ntfs_cluster_free() saves the state of @ctx on entry and restores it
* before returning. Thus, @ctx will be left pointing to the same attribute on
* return as on entry. However, the actual pointers in @ctx may point to
* different memory locations on return, so you must remember to reset any
* cached pointers from the @ctx, i.e. after the call to __ntfs_cluster_free(),
* you will probably want to do:
* m = ctx->mrec;
* a = ctx->attr;
* Assuming you cache ctx->attr in a variable @a of type ATTR_RECORD * and that
* you cache ctx->mrec in a variable @m of type MFT_RECORD *.
*
* @is_rollback should always be FALSE, it is for internal use to rollback * @is_rollback should always be FALSE, it is for internal use to rollback
* errors. You probably want to use ntfs_cluster_free() instead. * errors. You probably want to use ntfs_cluster_free() instead.
* *
* Note, ntfs_cluster_free() does not modify the runlist at all, so the caller * Note, __ntfs_cluster_free() does not modify the runlist, so you have to
* has to deal with it later. * remove from the runlist or mark sparse the freed runs later.
* *
* Return the number of deallocated clusters (not counting sparse ones) on * Return the number of deallocated clusters (not counting sparse ones) on
* success and -errno on error. * success and -errno on error.
* *
* WARNING: If @ctx is supplied, regardless of whether success or failure is
* returned, you need to check IS_ERR(@ctx->mrec) and if TRUE the @ctx
* is no longer valid, i.e. you need to either call
* ntfs_attr_reinit_search_ctx() or ntfs_attr_put_search_ctx() on it.
* In that case PTR_ERR(@ctx->mrec) will give you the error code for
* why the mapping of the old inode failed.
*
* Locking: - The runlist described by @ni must be locked for writing on entry * Locking: - The runlist described by @ni must be locked for writing on entry
* and is locked on return. Note the runlist may be modified when * and is locked on return. Note the runlist may be modified when
* needed runlist fragments need to be mapped. * needed runlist fragments need to be mapped.
...@@ -807,9 +841,13 @@ switch_to_data1_zone: search_zone = 2; ...@@ -807,9 +841,13 @@ switch_to_data1_zone: search_zone = 2;
* on return. * on return.
* - This function takes the volume lcn bitmap lock for writing and * - This function takes the volume lcn bitmap lock for writing and
* modifies the bitmap contents. * modifies the bitmap contents.
* - If @ctx is NULL, the base mft record of @ni must not be mapped on
* entry and it will be left unmapped on return.
* - If @ctx is not NULL, the base mft record must be mapped on entry
* and it will be left mapped on return.
*/ */
s64 __ntfs_cluster_free(ntfs_inode *ni, const VCN start_vcn, s64 count, s64 __ntfs_cluster_free(ntfs_inode *ni, const VCN start_vcn, s64 count,
const BOOL is_rollback) ntfs_attr_search_ctx *ctx, const BOOL is_rollback)
{ {
s64 delta, to_free, total_freed, real_freed; s64 delta, to_free, total_freed, real_freed;
ntfs_volume *vol; ntfs_volume *vol;
...@@ -839,7 +877,7 @@ s64 __ntfs_cluster_free(ntfs_inode *ni, const VCN start_vcn, s64 count, ...@@ -839,7 +877,7 @@ s64 __ntfs_cluster_free(ntfs_inode *ni, const VCN start_vcn, s64 count,
total_freed = real_freed = 0; total_freed = real_freed = 0;
rl = ntfs_attr_find_vcn_nolock(ni, start_vcn, TRUE); rl = ntfs_attr_find_vcn_nolock(ni, start_vcn, ctx);
if (IS_ERR(rl)) { if (IS_ERR(rl)) {
if (!is_rollback) if (!is_rollback)
ntfs_error(vol->sb, "Failed to find first runlist " ntfs_error(vol->sb, "Failed to find first runlist "
...@@ -893,7 +931,7 @@ s64 __ntfs_cluster_free(ntfs_inode *ni, const VCN start_vcn, s64 count, ...@@ -893,7 +931,7 @@ s64 __ntfs_cluster_free(ntfs_inode *ni, const VCN start_vcn, s64 count,
/* Attempt to map runlist. */ /* Attempt to map runlist. */
vcn = rl->vcn; vcn = rl->vcn;
rl = ntfs_attr_find_vcn_nolock(ni, vcn, TRUE); rl = ntfs_attr_find_vcn_nolock(ni, vcn, ctx);
if (IS_ERR(rl)) { if (IS_ERR(rl)) {
err = PTR_ERR(rl); err = PTR_ERR(rl);
if (!is_rollback) if (!is_rollback)
...@@ -961,7 +999,7 @@ s64 __ntfs_cluster_free(ntfs_inode *ni, const VCN start_vcn, s64 count, ...@@ -961,7 +999,7 @@ s64 __ntfs_cluster_free(ntfs_inode *ni, const VCN start_vcn, s64 count,
* If rollback fails, set the volume errors flag, emit an error * If rollback fails, set the volume errors flag, emit an error
* message, and return the error code. * message, and return the error code.
*/ */
delta = __ntfs_cluster_free(ni, start_vcn, total_freed, TRUE); delta = __ntfs_cluster_free(ni, start_vcn, total_freed, ctx, TRUE);
if (delta < 0) { if (delta < 0) {
ntfs_error(vol->sb, "Failed to rollback (error %i). Leaving " ntfs_error(vol->sb, "Failed to rollback (error %i). Leaving "
"inconsistent metadata! Unmount and run " "inconsistent metadata! Unmount and run "
......
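The effect of the new @is_extension parameter on the runlist terminator, as changed in the hunk above, can be modelled in isolation. This is a simplified userspace sketch: the `sketch_*` type, function, and sentinel names are hypothetical stand-ins, and the sentinel values are chosen for illustration only (the real LCN_ENOENT and LCN_RL_NOT_MAPPED are defined in the driver's own headers).

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-ins for the driver's special LCN values. */
enum {
	SKETCH_LCN_RL_NOT_MAPPED = -2,
	SKETCH_LCN_ENOENT = -3
};

/* Simplified stand-in for runlist_element. */
typedef struct {
	int64_t vcn;	/* First virtual cluster number of the run. */
	int64_t lcn;	/* First logical cluster number or a sentinel. */
	int64_t length;	/* Run length in clusters, 0 for the terminator. */
} sketch_runlist_element;

/*
 * Write the terminator element at index @rlpos of @rl.  The previous element
 * must be valid.  An extension ends the attribute, so the terminator says
 * "nothing beyond here" (ENOENT); a hole fill sits inside an attribute, so
 * the terminator says "more runs exist but are not mapped".
 */
void sketch_terminate_runlist(sketch_runlist_element *rl, int rlpos,
		int is_extension)
{
	assert(rl && rlpos > 0);
	rl[rlpos].vcn = rl[rlpos - 1].vcn + rl[rlpos - 1].length;
	rl[rlpos].lcn = is_extension ? SKETCH_LCN_ENOENT :
			SKETCH_LCN_RL_NOT_MAPPED;
	rl[rlpos].length = 0;
}
```

Choosing the terminator inside the allocator is what lets ntfs_attr_make_non_resident() drop its own fixup code, as the ChangeLog entry for 2.1.25 notes.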
...@@ -27,6 +27,7 @@ ...@@ -27,6 +27,7 @@
#include <linux/fs.h> #include <linux/fs.h>
#include "attrib.h"
#include "types.h" #include "types.h"
#include "inode.h" #include "inode.h"
#include "runlist.h" #include "runlist.h"
...@@ -41,16 +42,18 @@ typedef enum { ...@@ -41,16 +42,18 @@ typedef enum {
extern runlist_element *ntfs_cluster_alloc(ntfs_volume *vol, extern runlist_element *ntfs_cluster_alloc(ntfs_volume *vol,
const VCN start_vcn, const s64 count, const LCN start_lcn, const VCN start_vcn, const s64 count, const LCN start_lcn,
const NTFS_CLUSTER_ALLOCATION_ZONES zone); const NTFS_CLUSTER_ALLOCATION_ZONES zone,
const BOOL is_extension);
extern s64 __ntfs_cluster_free(ntfs_inode *ni, const VCN start_vcn, extern s64 __ntfs_cluster_free(ntfs_inode *ni, const VCN start_vcn,
s64 count, const BOOL is_rollback); s64 count, ntfs_attr_search_ctx *ctx, const BOOL is_rollback);
/** /**
* ntfs_cluster_free - free clusters on an ntfs volume * ntfs_cluster_free - free clusters on an ntfs volume
* @ni: ntfs inode whose runlist describes the clusters to free * @ni: ntfs inode whose runlist describes the clusters to free
* @start_vcn: vcn in the runlist of @ni at which to start freeing clusters * @start_vcn: vcn in the runlist of @ni at which to start freeing clusters
* @count: number of clusters to free or -1 for all clusters * @count: number of clusters to free or -1 for all clusters
* @ctx: active attribute search context if present or NULL if not
* *
* Free @count clusters starting at the cluster @start_vcn in the runlist * Free @count clusters starting at the cluster @start_vcn in the runlist
* described by the ntfs inode @ni. * described by the ntfs inode @ni.
...@@ -59,12 +62,36 @@ extern s64 __ntfs_cluster_free(ntfs_inode *ni, const VCN start_vcn, ...@@ -59,12 +62,36 @@ extern s64 __ntfs_cluster_free(ntfs_inode *ni, const VCN start_vcn,
* deallocated. Thus, to completely free all clusters in a runlist, use * deallocated. Thus, to completely free all clusters in a runlist, use
* @start_vcn = 0 and @count = -1. * @start_vcn = 0 and @count = -1.
* *
* Note, ntfs_cluster_free() does not modify the runlist at all, so the caller * If @ctx is specified, it is an active search context of @ni and its base mft
* has to deal with it later. * record. This is needed when ntfs_cluster_free() encounters unmapped runlist
* fragments and allows their mapping. If you do not have the mft record
* mapped, you can specify @ctx as NULL and ntfs_cluster_free() will perform
* the necessary mapping and unmapping.
*
* Note, ntfs_cluster_free() saves the state of @ctx on entry and restores it
* before returning. Thus, @ctx will be left pointing to the same attribute on
* return as on entry. However, the actual pointers in @ctx may point to
* different memory locations on return, so you must remember to reset any
* cached pointers from the @ctx, i.e. after the call to ntfs_cluster_free(),
* you will probably want to do:
* m = ctx->mrec;
* a = ctx->attr;
* Assuming you cache ctx->attr in a variable @a of type ATTR_RECORD * and that
* you cache ctx->mrec in a variable @m of type MFT_RECORD *.
*
* Note, ntfs_cluster_free() does not modify the runlist, so you have to remove
* from the runlist or mark sparse the freed runs later.
* *
* Return the number of deallocated clusters (not counting sparse ones) on * Return the number of deallocated clusters (not counting sparse ones) on
* success and -errno on error. * success and -errno on error.
* *
* WARNING: If @ctx is supplied, regardless of whether success or failure is
* returned, you need to check IS_ERR(@ctx->mrec) and if TRUE the @ctx
* is no longer valid, i.e. you need to either call
* ntfs_attr_reinit_search_ctx() or ntfs_attr_put_search_ctx() on it.
* In that case PTR_ERR(@ctx->mrec) will give you the error code for
* why the mapping of the old inode failed.
*
* Locking: - The runlist described by @ni must be locked for writing on entry * Locking: - The runlist described by @ni must be locked for writing on entry
* and is locked on return. Note the runlist may be modified when * and is locked on return. Note the runlist may be modified when
* needed runlist fragments need to be mapped. * needed runlist fragments need to be mapped.
...@@ -72,11 +99,15 @@ extern s64 __ntfs_cluster_free(ntfs_inode *ni, const VCN start_vcn, ...@@ -72,11 +99,15 @@ extern s64 __ntfs_cluster_free(ntfs_inode *ni, const VCN start_vcn,
* on return. * on return.
* - This function takes the volume lcn bitmap lock for writing and * - This function takes the volume lcn bitmap lock for writing and
* modifies the bitmap contents. * modifies the bitmap contents.
* - If @ctx is NULL, the base mft record of @ni must not be mapped on
* entry and it will be left unmapped on return.
* - If @ctx is not NULL, the base mft record must be mapped on entry
* and it will be left mapped on return.
*/ */
static inline s64 ntfs_cluster_free(ntfs_inode *ni, const VCN start_vcn, static inline s64 ntfs_cluster_free(ntfs_inode *ni, const VCN start_vcn,
s64 count) s64 count, ntfs_attr_search_ctx *ctx)
{ {
return __ntfs_cluster_free(ni, start_vcn, count, FALSE); return __ntfs_cluster_free(ni, start_vcn, count, ctx, FALSE);
} }
extern int ntfs_cluster_free_from_rl_nolock(ntfs_volume *vol, extern int ntfs_cluster_free_from_rl_nolock(ntfs_volume *vol,
......
@@ -39,8 +39,7 @@
  * If there was insufficient memory to complete the request, return NULL.
  * Depending on @gfp_mask the allocation may be guaranteed to succeed.
  */
-static inline void *__ntfs_malloc(unsigned long size,
-		gfp_t gfp_mask)
+static inline void *__ntfs_malloc(unsigned long size, gfp_t gfp_mask)
 {
 	if (likely(size <= PAGE_SIZE)) {
 		BUG_ON(!size);
......
@@ -49,7 +49,8 @@ static inline MFT_RECORD *map_mft_record_page(ntfs_inode *ni)
 	ntfs_volume *vol = ni->vol;
 	struct inode *mft_vi = vol->mft_ino;
 	struct page *page;
-	unsigned long index, ofs, end_index;
+	unsigned long index, end_index;
+	unsigned ofs;
 	BUG_ON(ni->page);
 	/*
@@ -1308,7 +1309,7 @@ static int ntfs_mft_bitmap_extend_allocation_nolock(ntfs_volume *vol)
 	ll = mftbmp_ni->allocated_size;
 	read_unlock_irqrestore(&mftbmp_ni->size_lock, flags);
 	rl = ntfs_attr_find_vcn_nolock(mftbmp_ni,
-			(ll - 1) >> vol->cluster_size_bits, TRUE);
+			(ll - 1) >> vol->cluster_size_bits, NULL);
 	if (unlikely(IS_ERR(rl) || !rl->length || rl->lcn < 0)) {
 		up_write(&mftbmp_ni->runlist.lock);
 		ntfs_error(vol->sb, "Failed to determine last allocated "
@@ -1354,7 +1355,8 @@ static int ntfs_mft_bitmap_extend_allocation_nolock(ntfs_volume *vol)
 	up_write(&vol->lcnbmp_lock);
 	ntfs_unmap_page(page);
 	/* Allocate a cluster from the DATA_ZONE. */
-	rl2 = ntfs_cluster_alloc(vol, rl[1].vcn, 1, lcn, DATA_ZONE);
+	rl2 = ntfs_cluster_alloc(vol, rl[1].vcn, 1, lcn, DATA_ZONE,
+			TRUE);
 	if (IS_ERR(rl2)) {
 		up_write(&mftbmp_ni->runlist.lock);
 		ntfs_error(vol->sb, "Failed to allocate a cluster for "
@@ -1738,7 +1740,7 @@ static int ntfs_mft_data_extend_allocation_nolock(ntfs_volume *vol)
 	ll = mft_ni->allocated_size;
 	read_unlock_irqrestore(&mft_ni->size_lock, flags);
 	rl = ntfs_attr_find_vcn_nolock(mft_ni,
-			(ll - 1) >> vol->cluster_size_bits, TRUE);
+			(ll - 1) >> vol->cluster_size_bits, NULL);
 	if (unlikely(IS_ERR(rl) || !rl->length || rl->lcn < 0)) {
 		up_write(&mft_ni->runlist.lock);
 		ntfs_error(vol->sb, "Failed to determine last allocated "
@@ -1779,7 +1781,8 @@ static int ntfs_mft_data_extend_allocation_nolock(ntfs_volume *vol)
 			nr > min_nr ? "default" : "minimal", (long long)nr);
 	old_last_vcn = rl[1].vcn;
 	do {
-		rl2 = ntfs_cluster_alloc(vol, old_last_vcn, nr, lcn, MFT_ZONE);
+		rl2 = ntfs_cluster_alloc(vol, old_last_vcn, nr, lcn, MFT_ZONE,
+				TRUE);
 		if (likely(!IS_ERR(rl2)))
 			break;
 		if (PTR_ERR(rl2) != -ENOSPC || nr == min_nr) {
@@ -1951,20 +1954,21 @@ static int ntfs_mft_data_extend_allocation_nolock(ntfs_volume *vol)
 		NVolSetErrors(vol);
 		return ret;
 	}
-	a = ctx->attr;
-	a->data.non_resident.highest_vcn = cpu_to_sle64(old_last_vcn - 1);
+	ctx->attr->data.non_resident.highest_vcn =
+			cpu_to_sle64(old_last_vcn - 1);
 undo_alloc:
-	if (ntfs_cluster_free(mft_ni, old_last_vcn, -1) < 0) {
+	if (ntfs_cluster_free(mft_ni, old_last_vcn, -1, ctx) < 0) {
 		ntfs_error(vol->sb, "Failed to free clusters from mft data "
 				"attribute.%s", es);
 		NVolSetErrors(vol);
 	}
+	a = ctx->attr;
 	if (ntfs_rl_truncate_nolock(vol, &mft_ni->runlist, old_last_vcn)) {
 		ntfs_error(vol->sb, "Failed to truncate mft data attribute "
 				"runlist.%s", es);
 		NVolSetErrors(vol);
 	}
-	if (mp_rebuilt) {
+	if (mp_rebuilt && !IS_ERR(ctx->mrec)) {
 		if (ntfs_mapping_pairs_build(vol, (u8*)a + le16_to_cpu(
 				a->data.non_resident.mapping_pairs_offset),
 				old_alen - le16_to_cpu(
@@ -1981,6 +1985,10 @@ static int ntfs_mft_data_extend_allocation_nolock(ntfs_volume *vol)
 		}
 		flush_dcache_mft_record_page(ctx->ntfs_ino);
 		mark_mft_record_dirty(ctx->ntfs_ino);
+	} else if (IS_ERR(ctx->mrec)) {
+		ntfs_error(vol->sb, "Failed to restore attribute search "
+				"context.%s", es);
+		NVolSetErrors(vol);
 	}
 	if (ctx)
 		ntfs_attr_put_search_ctx(ctx);
......
@@ -1447,7 +1447,7 @@ static BOOL load_and_init_usnjrnl(ntfs_volume *vol)
 	if (unlikely(i_size_read(tmp_ino) < sizeof(USN_HEADER))) {
 		ntfs_error(vol->sb, "Found corrupt $UsnJrnl/$DATA/$Max "
 				"attribute (size is 0x%llx but should be at "
-				"least 0x%x bytes).", i_size_read(tmp_ino),
+				"least 0x%zx bytes).", i_size_read(tmp_ino),
 				sizeof(USN_HEADER));
 		return FALSE;
 	}
......