Merge flatcap.org:/home/flatcap/backup/bk/ntfs-2.6
into flatcap.org:/home/flatcap/backup/bk/ntfs-2.6-devel

aae109de · Richard Russon · 3beb711b · bf71a676 · aae109de · aae109de
Commit aae109de authored Nov 10, 2004 by Richard Russon
22 changed files
--- a/Documentation/filesystems/Locking
+++ b/Documentation/filesystems/Locking
@@ -317,8 +317,8 @@ prototypes:
 locking rules:
 	called from interrupts. In other words, extreme care is needed here.
 bh is locked, but that's all warranties we have here. Currently only RAID1,
-highmem and fs/buffer.c are providing these. Block devices call this method
-upon the IO completion.
+highmem, fs/buffer.c, and fs/ntfs/aops.c are providing these. Block devices
+call this method upon the IO completion.

 --------------------------- block_device_operations -----------------------
 prototypes:

--- a/Documentation/filesystems/ntfs.txt
+++ b/Documentation/filesystems/ntfs.txt
@@ -258,10 +258,10 @@ Then you would use ldminfo in dump mode to obtain the necessary information:
 $ ./ldminfo --dump /dev/hda

 This would dump the LDM database found on /dev/hda which describes all of your
-dinamic disks and all the volumes on them.  At the bottom you will see the
+dynamic disks and all the volumes on them.  At the bottom you will see the
 VOLUME DEFINITIONS section which is all you really need.  You may need to look
 further above to determine which of the disks in the volume definitions is
-which device in Linux.  Hint: Run ldminfo on each of your dinamic disks and
+which device in Linux.  Hint: Run ldminfo on each of your dynamic disks and
 look at the Disk Id close to the top of the output for each (the PRIVATE HEADER
 section).  You can then find these Disk Ids in the VBLK DATABASE section in the
 <Disk> components where you will get the LDM Name for the disk that is found in

--- a/fs/ntfs/ChangeLog
+++ b/fs/ntfs/ChangeLog
@@ -12,6 +12,10 @@ ToDo/Notes:
 	  OTOH, perhaps i_sem, which is held accross generic_file_write is
 	  sufficient for synchronisation here. We then just need to make sure
 	  ntfs_readpage/writepage/truncate interoperate properly with us.
+	  UPDATE: The above is all ok as it is due to i_sem held.  The only
+	  thing that needs to be checked is ntfs_writepage() which does not
+	  hold i_sem.  It cannot change i_size but it needs to cope with a
+	  concurrent i_size change.
 	- Implement mft.c::sync_mft_mirror_umount().  We currently will just
 	  leave the volume dirty on umount if the final iput(vol->mft_ino)
 	  causes a write of any mirrored mft records due to the mft mirror
@@ -21,6 +25,67 @@ ToDo/Notes:
 	- Enable the code for setting the NT4 compatibility flag when we start
 	  making NTFS 1.2 specific modifications.

+2.1.22-WIP
+
+	- Improve error handling in fs/ntfs/inode.c::ntfs_truncate().
+	- Change fs/ntfs/inode.c::ntfs_truncate() to return an error code
+	  instead of void and provide a helper ntfs_truncate_vfs() for the
+	  vfs ->truncate method.
+	- Add a new ntfs inode flag NInoTruncateFailed() and modify
+	  fs/ntfs/inode.c::ntfs_truncate() to set and clear it appropriately.
+	- Fix min_size and max_size definitions in ATTR_DEF structure in
+	  fs/ntfs/layout.h to be signed.
+	- Add attribute definition handling helpers to fs/ntfs/attrib.[hc]:
+	  ntfs_attr_size_bounds_check(), ntfs_attr_can_be_non_resident(), and
+	  ntfs_attr_can_be_resident(), which in turn use the new private helper
+	  ntfs_attr_find_in_attrdef().
+	- In fs/ntfs/aops.c::mark_ntfs_record_dirty(), take the
+	  mapping->private_lock around the dirtying of the buffer heads
+	  analagous to the way it is done in __set_page_dirty_buffers().
+	- Ensure the mft record size does not exceed the PAGE_CACHE_SIZE at
+	  mount time as this cannot work with the current implementation.
+	- Check for location of attribute name and improve error handling in
+	  general in fs/ntfs/inode.c::ntfs_read_locked_inode() and friends.
+	- In fs/ntfs/aops.c::ntfs_writepage(), if the page is fully outside
+	  i_size, i.e. race with truncate, invalidate the buffers on the page
+	  so that they become freeable and hence the page does not leak.
+	- Remove unused function fs/ntfs/runlist.c::ntfs_rl_merge().  (Adrian
+	  Bunk)
+	- Fix stupid bug in fs/ntfs/attrib.c::ntfs_attr_find() that resulted in
+	  a NULL pointer dereference in the error code path when a corrupt
+	  attribute was found.  (Thanks to Domen Puncer for the bug report.)
+	- Add MODULE_VERSION() to fs/ntfs/super.c.
+	- Make several functions and variables static.  (Adrian Bunk)
+	- Modify fs/ntfs/aops.c::mark_ntfs_record_dirty() so it allocates
+	  buffers for the page if they are not present and then marks the
+	  buffers belonging to the ntfs record dirty.  This causes the buffers
+	  to become busy and hence they are safe from removal until the page
+	  has been written out.
+	- Fix stupid bug in fs/ntfs/attrib.c::ntfs_external_attr_find() in the
+	  error handling code path that resulted in a BUG() due to trying to
+	  unmap an extent mft record when the mapping of it had failed and it
+	  thus was not mapped.  (Thanks to Ken MacFerrin for the bug report.)
+	- Drop the runlist lock after the vcn has been read in
+	  fs/ntfs/lcnalloc.c::__ntfs_cluster_free().
+	- Rewrite handling of multi sector transfer errors.  We now do not set
+	  PageError() when such errors are detected in the async i/o handler
+	  fs/ntfs/aops.c::ntfs_end_buffer_async_read().  All users of mst
+	  protected attributes now check the magic of each ntfs record as they
+	  use it and act appropriately.  This has the effect of making errors
+	  granular per ntfs record rather than per page which solves the case
+	  where we cannot access any of the ntfs records in a page when a
+	  single one of them had an mst error.  (Thanks to Ken MacFerrin for
+	  the bug report.)
+	- Fix error handling in fs/ntfs/quota.c::ntfs_mark_quotas_out_of_date()
+	  where we failed to release i_sem on the $Quota/$Q attribute inode.
+	- Fix bug in handling of bad inodes in fs/ntfs/namei.c::ntfs_lookup().
+	- Add mapping of unmapped buffers to all remaining code paths, i.e.
+	  fs/ntfs/aops.c::ntfs_write_mst_block(), mft.c::ntfs_sync_mft_mirror(),
+	  and write_mft_record_nolock().  From now on we require that the
+	  complete runlist for the mft mirror is always mapped into memory.
+	- Add creation of buffers to fs/ntfs/mft.c::ntfs_sync_mft_mirror().
+	- Improve error handling in fs/ntfs/aops.c::ntfs_{read,write}_block().
+
 2.1.21 - Fix some races and bugs, rewrite mft write code, add mft allocator.

 	- Implement extent mft record deallocation
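
The 2.1.22-WIP ChangeLog entry above says multi sector transfer errors are now handled per ntfs record rather than per page, with each user checking the magic of the record it is about to use. The following is a minimal userspace sketch of that idea, not the driver's code. The helper names record_is_usable() and count_usable_records() are hypothetical, and the page is modelled as a plain byte buffer. "FILE" is the magic of a valid mft record; "BAAD" marks a record that failed mst protection.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: does the record start with the expected 4-byte magic? */
static int record_is_usable(const unsigned char *rec, const char magic[4])
{
	return memcmp(rec, magic, 4) == 0;
}

/*
 * Hypothetical helper: walk every record in a page-sized buffer and count
 * the usable ones.  A record with the wrong magic is skipped on its own;
 * it does not make its neighbours in the same page inaccessible.
 */
static size_t count_usable_records(const unsigned char *page, size_t page_size,
		size_t rec_size, const char magic[4])
{
	size_t i, recs, usable = 0;

	/* Mirrors the mount time check that a record fits in a page. */
	assert(rec_size != 0 && rec_size <= page_size);
	recs = page_size / rec_size;
	for (i = 0; i < recs; i++) {
		if (record_is_usable(page + i * rec_size, magic))
			usable++;
	}
	return usable;
}
```

With four 1024-byte records in a 4096-byte page, one damaged record leaves the other three usable, where a page-wide error bit would have hidden all four.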

--- a/fs/ntfs/Makefile
+++ b/fs/ntfs/Makefile
@@ -6,7 +6,7 @@ ntfs-objs := aops.o attrib.o collate.o compress.o debug.o dir.o file.o \
 	     index.o inode.o mft.o mst.o namei.o runlist.o super.o sysctl.o \
 	     unistr.o upcase.o

-EXTRA_CFLAGS = -DNTFS_VERSION=\"2.1.21\"
+EXTRA_CFLAGS = -DNTFS_VERSION=\"2.1.22-WIP\"

 ifeq ($(CONFIG_NTFS_DEBUG),y)
 EXTRA_CFLAGS += -DDEBUG
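
The fs/ntfs/aops.c changes in this commit rewrite the conversion of a block number into a vcn and an offset within the cluster: the byte position is computed once, masked to get the offset, then shifted to get the vcn. The sketch below compares the old and new forms in userspace. It is an illustration under assumptions, not the kernel code: struct vol_geom, struct vcn_pos, make_geom(), split_old(), and split_new() are hypothetical names, and the fixed-width integer types stand in for the driver's VCN and sector_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two volume fields the conversion reads. */
struct vol_geom {
	unsigned cluster_size_bits;	/* log2 of the cluster size in bytes */
	int64_t cluster_size_mask;	/* cluster size in bytes, minus one */
};

/* Hypothetical result pair: cluster number and byte offset inside it. */
struct vcn_pos {
	int64_t vcn;
	unsigned ofs;
};

static struct vol_geom make_geom(unsigned cluster_size_bits)
{
	struct vol_geom g;

	g.cluster_size_bits = cluster_size_bits;
	g.cluster_size_mask = ((int64_t)1 << cluster_size_bits) - 1;
	return g;
}

/* Old form: the shift to a byte position is written out twice. */
static struct vcn_pos split_old(const struct vol_geom *g, int64_t block,
		unsigned blocksize_bits)
{
	struct vcn_pos p;

	assert(block >= 0);
	p.vcn = block << blocksize_bits >> g->cluster_size_bits;
	p.ofs = (unsigned)((block << blocksize_bits) & g->cluster_size_mask);
	return p;
}

/* New form: compute the byte position once, then mask and shift it. */
static struct vcn_pos split_new(const struct vol_geom *g, int64_t block,
		unsigned blocksize_bits)
{
	struct vcn_pos p;
	int64_t pos;

	assert(block >= 0);
	pos = block << blocksize_bits;
	p.ofs = (unsigned)(pos & g->cluster_size_mask);
	p.vcn = pos >> g->cluster_size_bits;
	return p;
}
```

For 512-byte blocks on 4096-byte clusters, block 9 sits at byte 4608, which is cluster 1 at offset 512 in both forms. The mask must be applied before the final shift, which is why the new code sets vcn_ofs ahead of `vcn >>= vol->cluster_size_bits`.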

--- a/fs/ntfs/aops.c
+++ b/fs/ntfs/aops.c
@@ -48,8 +48,8 @@
 *
 * If NInoMstProtected(), perform the post read mst fixups when all IO on the
 * page has been completed and mark the page uptodate or set the error bit on
- * the page. To determine the size of the records that need fixing up, we cheat
- * a little bit by setting the index_block_size in ntfs_inode to the ntfs
+ * the page.  To determine the size of the records that need fixing up, we
+ * cheat a little bit by setting the index_block_size in ntfs_inode to the ntfs
 * record size, and index_block_size_bits, to the log(base 2) of the ntfs
 * record size.
 */
@@ -90,7 +90,6 @@ static void ntfs_end_buffer_async_read(struct buffer_head *bh, int uptodate)
 				(unsigned long long)bh->b_blocknr);
 		SetPageError(page);
 	}
-
 	spin_lock_irqsave(&page_uptodate_lock, flags);
 	clear_buffer_async_read(bh);
 	unlock_buffer(bh);
@@ -111,42 +110,30 @@ static void ntfs_end_buffer_async_read(struct buffer_head *bh, int uptodate)
 	 * If none of the buffers had errors then we can set the page uptodate,
 	 * but we first have to perform the post read mst fixups, if the
 	 * attribute is mst protected, i.e. if NInoMstProteced(ni) is true.
+	 * Note we ignore fixup errors as those are detected when
+	 * map_mft_record() is called which gives us per record granularity
+	 * rather than per page granularity.
 	 */
 	if (!NInoMstProtected(ni)) {
 		if (likely(page_uptodate && !PageError(page)))
 			SetPageUptodate(page);
 	} else {
 		char *addr;
-		unsigned int i, recs, nr_err;
+		unsigned int i, recs;
 		u32 rec_size;

 		rec_size = ni->itype.index.block_size;
 		recs = PAGE_CACHE_SIZE / rec_size;
+		/* Should have been verified before we got here... */
+		BUG_ON(!recs);
 		addr = kmap_atomic(page, KM_BIO_SRC_IRQ);
-		for (i = nr_err = 0; i < recs; i++) {
-			if (likely(!post_read_mst_fixup((NTFS_RECORD*)(addr +
-					i * rec_size), rec_size)))
-				continue;
-			nr_err++;
-			ntfs_error(ni->vol->sb, "post_read_mst_fixup() failed, "
-					"corrupt %s record 0x%llx. Run chkdsk.",
-					ni->mft_no ? "index" : "mft",
-					(unsigned long long)(((s64)page->index
-					<< PAGE_CACHE_SHIFT >>
-					ni->itype.index.block_size_bits) + i));
-		}
+		for (i = 0; i < recs; i++)
+			post_read_mst_fixup((NTFS_RECORD*)(addr +
+					i * rec_size), rec_size);
 		flush_dcache_page(page);
 		kunmap_atomic(addr, KM_BIO_SRC_IRQ);
-		if (likely(!PageError(page))) {
-			if (likely(!nr_err && recs)) {
-				if (likely(page_uptodate))
+		if (likely(!PageError(page) && page_uptodate))
 			SetPageUptodate(page);
-			} else {
-				ntfs_error(ni->vol->sb, "Setting page error, "
-						"index 0x%lx.", page->index);
-				SetPageError(page);
-			}
-		}
 	}
 	unlock_page(page);
 	return;
@@ -188,6 +175,9 @@ static int ntfs_read_block(struct page *page)
 	ni = NTFS_I(page->mapping->host);
 	vol = ni->vol;

+	/* $MFT/$DATA must have its complete runlist in memory at all times. */
+	BUG_ON(!ni->runlist.rl && !ni->mft_no && !NInoAttr(ni));
+
 	blocksize_bits = VFS_I(ni)->i_blkbits;
 	blocksize = 1 << blocksize_bits;

@@ -203,12 +193,6 @@ static int ntfs_read_block(struct page *page)
 	lblock = (ni->allocated_size + blocksize - 1) >> blocksize_bits;
 	zblock = (ni->initialized_size + blocksize - 1) >> blocksize_bits;

-#ifdef DEBUG
-	if (unlikely(!ni->runlist.rl && !ni->mft_no && !NInoAttr(ni)))
-		panic("NTFS: $MFT/$DATA runlist has been unmapped! This is a "
-				"very serious bug! Cannot continue...");
-#endif
-
 	/* Loop through all the buffers in the page. */
 	rl = NULL;
 	nr = i = 0;
@@ -262,24 +246,30 @@ static int ntfs_read_block(struct page *page)
 				goto handle_hole;
 			/* If first try and runlist unmapped, map and retry. */
 			if (!is_retry && lcn == LCN_RL_NOT_MAPPED) {
+				int err;
 				is_retry = TRUE;
 				/*
 				 * Attempt to map runlist, dropping lock for
 				 * the duration.
 				 */
 				up_read(&ni->runlist.lock);
-				if (!ntfs_map_runlist(ni, vcn))
+				err = ntfs_map_runlist(ni, vcn);
+				if (likely(!err))
 					goto lock_retry_remap;
 				rl = NULL;
+				lcn = err;
 			}
 			/* Hard error, zero out region. */
+			bh->b_blocknr = -1;
 			SetPageError(page);
-			ntfs_error(vol->sb, "ntfs_rl_vcn_to_lcn(vcn = 0x%llx) "
-					"failed with error code 0x%llx%s.",
-					(unsigned long long)vcn,
-					(unsigned long long)-lcn,
-					is_retry ? " even after retrying" : "");
-			// FIXME: Depending on vol->on_errors, do something.
+			ntfs_error(vol->sb, "Failed to read from inode 0x%lx, "
+					"attribute type 0x%x, vcn 0x%llx, "
+					"offset 0x%x because its location on "
+					"disk could not be determined%s "
+					"(error code %lli).", ni->mft_no,
+					ni->type, (unsigned long long)vcn,
+					vcn_ofs, is_retry ? " even after "
+					"retrying" : "", (long long)lcn);
 		}
 		/*
 		 * Either iblock was outside lblock limits or
@@ -348,10 +338,8 @@ static int ntfs_read_block(struct page *page)
 * for it to be read in before we can do the copy.
 *
 * Return 0 on success and -errno on error.
- *
- * WARNING: Do not make this function static! It is used by mft.c!
 */
-int ntfs_readpage(struct file *file, struct page *page)
+static int ntfs_readpage(struct file *file, struct page *page)
 {
 	s64 attr_pos;
 	ntfs_inode *ni, *base_ni;
@@ -452,8 +440,8 @@ int ntfs_readpage(struct file *file, struct page *page)

 /**
 * ntfs_write_block - write a @page to the backing store
- * @wbc:	writeback control structure
 * @page:	page cache page to write out
+ * @wbc:	writeback control structure
 *
 * This function is for writing pages belonging to non-resident, non-mst
 * protected attributes to their backing store.
@@ -472,7 +460,7 @@ int ntfs_readpage(struct file *file, struct page *page)
 *
 * Based on ntfs_read_block() and __block_write_full_page().
 */
-static int ntfs_write_block(struct writeback_control *wbc, struct page *page)
+static int ntfs_write_block(struct page *page, struct writeback_control *wbc)
 {
 	VCN vcn;
 	LCN lcn;
@@ -492,7 +480,7 @@ static int ntfs_write_block(struct writeback_control *wbc, struct page *page)
 	vol = ni->vol;

 	ntfs_debug("Entering for inode 0x%lx, attribute type 0x%x, page index "
-			"0x%lx.", vi->i_ino, ni->type, page->index);
+			"0x%lx.", ni->mft_no, ni->type, page->index);

 	BUG_ON(!NInoNonResident(ni));
 	BUG_ON(NInoMstProtected(ni));
@@ -633,9 +621,9 @@ static int ntfs_write_block(struct writeback_control *wbc, struct page *page)
 		bh->b_bdev = vol->sb->s_bdev;

 		/* Convert block into corresponding vcn and offset. */
-		vcn = (VCN)block << blocksize_bits >> vol->cluster_size_bits;
-		vcn_ofs = ((VCN)block << blocksize_bits) &
-				vol->cluster_size_mask;
+		vcn = (VCN)block << blocksize_bits;
+		vcn_ofs = vcn & vol->cluster_size_mask;
+		vcn >>= vol->cluster_size_bits;
 		if (!rl) {
 lock_retry_remap:
 			down_read(&ni->runlist.lock);
@@ -678,15 +666,17 @@ static int ntfs_write_block(struct writeback_control *wbc, struct page *page)
 			if (likely(!err))
 				goto lock_retry_remap;
 			rl = NULL;
+			lcn = err;
 		}
 		/* Failed to map the buffer, even after retrying. */
-		bh->b_blocknr = -1UL;
-		ntfs_error(vol->sb, "ntfs_rl_vcn_to_lcn(vcn = 0x%llx) failed "
-				"with error code 0x%llx%s.",
-				(unsigned long long)vcn,
-				(unsigned long long)-lcn,
-				is_retry ? " even after retrying" : "");
-		// FIXME: Depending on vol->on_errors, do something.
+		bh->b_blocknr = -1;
+		ntfs_error(vol->sb, "Failed to write to inode 0x%lx, "
+				"attribute type 0x%x, vcn 0x%llx, offset 0x%x "
+				"because its location on disk could not be "
+				"determined%s (error code %lli).", ni->mft_no,
+				ni->type, (unsigned long long)vcn,
+				vcn_ofs, is_retry ? " even after "
+				"retrying" : "", (long long)lcn);
 		if (!err)
 			err = -EIO;
 		break;
@@ -782,8 +772,8 @@ static int ntfs_write_block(struct writeback_control *wbc, struct page *page)

 /**
 * ntfs_write_mst_block - write a @page to the backing store
- * @wbc:	writeback control structure
 * @page:	page cache page to write out
+ * @wbc:	writeback control structure
 *
 * This function is for writing pages belonging to non-resident, mst protected
 * attributes to their backing store.  The only supported attributes are index
@@ -804,22 +794,24 @@ static int ntfs_write_block(struct writeback_control *wbc, struct page *page)
 * Based on ntfs_write_block(), ntfs_mft_writepage(), and
 * write_mft_record_nolock().
 */
-static int ntfs_write_mst_block(struct writeback_control *wbc,
-		struct page *page)
+static int ntfs_write_mst_block(struct page *page,
+		struct writeback_control *wbc)
 {
 	sector_t block, dblock, rec_block;
 	struct inode *vi = page->mapping->host;
 	ntfs_inode *ni = NTFS_I(vi);
 	ntfs_volume *vol = ni->vol;
 	u8 *kaddr;
-	unsigned int bh_size = 1 << vi->i_blkbits;
+	unsigned char bh_size_bits = vi->i_blkbits;
+	unsigned int bh_size = 1 << bh_size_bits;
 	unsigned int rec_size = ni->itype.index.block_size;
 	ntfs_inode *locked_nis[PAGE_CACHE_SIZE / rec_size];
-	struct buffer_head *bh, *head, *tbh;
+	struct buffer_head *bh, *head, *tbh, *rec_start_bh;
 	int max_bhs = PAGE_CACHE_SIZE / bh_size;
 	struct buffer_head *bhs[max_bhs];
-	int i, nr_locked_nis, nr_recs, nr_bhs, bhs_per_rec, err;
-	unsigned char bh_size_bits, rec_size_bits;
+	runlist_element *rl;
+	int i, nr_locked_nis, nr_recs, nr_bhs, bhs_per_rec, err, err2;
+	unsigned rec_size_bits;
 	BOOL sync, is_mft, page_is_dirty, rec_is_dirty;

 	ntfs_debug("Entering for inode 0x%lx, attribute type 0x%x, page index "
@@ -827,6 +819,12 @@ static int ntfs_write_mst_block(struct writeback_control *wbc,
 	BUG_ON(!NInoNonResident(ni));
 	BUG_ON(!NInoMstProtected(ni));
 	is_mft = (S_ISREG(vi->i_mode) && !vi->i_ino);
+	/*
+	 * NOTE: ntfs_write_mst_block() would be called for $MFTMirr if a page
+	 * in its page cache were to be marked dirty.  However this should
+	 * never happen with the current driver and considering we do not
+	 * handle this case here we do want to BUG(), at least for now.
+	 */
 	BUG_ON(!(is_mft || S_ISDIR(vi->i_mode) ||
 			(NInoAttr(ni) && ni->type == AT_INDEX_ALLOCATION)));
 	BUG_ON(!max_bhs);
@@ -839,7 +837,6 @@ static int ntfs_write_mst_block(struct writeback_control *wbc,
 	bh = head = page_buffers(page);
 	BUG_ON(!bh);

-	bh_size_bits = vi->i_blkbits;
 	rec_size_bits = ni->itype.index.block_size_bits;
 	BUG_ON(!(PAGE_CACHE_SIZE >> rec_size_bits));
 	bhs_per_rec = rec_size >> bh_size_bits;
@@ -852,25 +849,18 @@ static int ntfs_write_mst_block(struct writeback_control *wbc,
 	/* The first out of bounds block for the data size. */
 	dblock = (vi->i_size + bh_size - 1) >> bh_size_bits;

-	err = nr_bhs = nr_recs = nr_locked_nis = 0;
+	rl = NULL;
+	err = err2 = nr_bhs = nr_recs = nr_locked_nis = 0;
 	page_is_dirty = rec_is_dirty = FALSE;
+	rec_start_bh = NULL;
 	do {
+		BOOL is_retry = FALSE;
+
+		if (likely(block < rec_block)) {
 			if (unlikely(block >= dblock)) {
-			/*
-			 * Mapped buffers outside i_size will occur, because
-			 * this page can be outside i_size when there is a
-			 * truncate in progress.  The contents of such buffers
-			 * were zeroed by ntfs_writepage().
-			 *
-			 * FIXME: What about the small race window where
-			 * ntfs_writepage() has not done any clearing because
-			 * the page was within i_size but before we get here,
-			 * vmtruncate() modifies i_size?
-			 */
 				clear_buffer_dirty(bh);
 				continue;
 			}
-		if (likely(block < rec_block)) {
 			/*
 			 * This block is not the first one in the record.  We
 			 * ignore the buffer's dirty state because we could
@@ -878,22 +868,121 @@ static int ntfs_write_mst_block(struct writeback_control *wbc,
 			 */
 			if (!rec_is_dirty)
 				continue;
+			if (unlikely(err2)) {
+				if (err2 != -ENOMEM)
+					clear_buffer_dirty(bh);
+				continue;
+			}
 		} else /* if (block == rec_block) */ {
 			BUG_ON(block > rec_block);
 			/* This block is the first one in the record. */
 			rec_block += bhs_per_rec;
+			err2 = 0;
+			if (unlikely(block >= dblock)) {
+				clear_buffer_dirty(bh);
+				continue;
+			}
 			if (!buffer_dirty(bh)) {
 				/* Clean records are not written out. */
 				rec_is_dirty = FALSE;
 				continue;
 			}
 			rec_is_dirty = TRUE;
+			rec_start_bh = bh;
+		}
+		/* Need to map the buffer if it is not mapped already. */
+		if (unlikely(!buffer_mapped(bh))) {
+			VCN vcn;
+			LCN lcn;
+			unsigned int vcn_ofs;
+
+			/* Obtain the vcn and offset of the current block. */
+			vcn = (VCN)block << bh_size_bits;
+			vcn_ofs = vcn & vol->cluster_size_mask;
+			vcn >>= vol->cluster_size_bits;
+			if (!rl) {
+lock_retry_remap:
+				down_read(&ni->runlist.lock);
+				rl = ni->runlist.rl;
+			}
+			if (likely(rl != NULL)) {
+				/* Seek to element containing target vcn. */
+				while (rl->length && rl[1].vcn <= vcn)
+					rl++;
+				lcn = ntfs_rl_vcn_to_lcn(rl, vcn);
+			} else
+				lcn = LCN_RL_NOT_MAPPED;
+			/* Successful remap. */
+			if (likely(lcn >= 0)) {
+				/* Setup buffer head to correct block. */
+				bh->b_blocknr = ((lcn <<
+						vol->cluster_size_bits) +
+						vcn_ofs) >> bh_size_bits;
+				set_buffer_mapped(bh);
+			} else {
+				/*
+				 * Remap failed.  Retry to map the runlist once
+				 * unless we are working on $MFT which always
+				 * has the whole of its runlist in memory.
+				 */
+				if (!is_mft && !is_retry &&
+						lcn == LCN_RL_NOT_MAPPED) {
+					is_retry = TRUE;
+					/*
+					 * Attempt to map runlist, dropping
+					 * lock for the duration.
+					 */
+					up_read(&ni->runlist.lock);
+					err2 = ntfs_map_runlist(ni, vcn);
+					if (likely(!err2))
+						goto lock_retry_remap;
+					if (err2 == -ENOMEM)
+						page_is_dirty = TRUE;
+					lcn = err2;
+				} else
+					err2 = -EIO;
+				/* Hard error.  Abort writing this record. */
+				if (!err || err == -ENOMEM)
+					err = err2;
+				bh->b_blocknr = -1;
+				ntfs_error(vol->sb, "Cannot write ntfs record "
+						"0x%llx (inode 0x%lx, "
+						"attribute type 0x%x) because "
+						"its location on disk could "
+						"not be determined (error "
+						"code %lli).", (s64)block <<
+						bh_size_bits >>
+						vol->mft_record_size_bits,
+						ni->mft_no, ni->type,
+						(long long)lcn);
+				/*
+				 * If this is not the first buffer, remove the
+				 * buffers in this record from the list of
+				 * buffers to write and clear their dirty bit
+				 * if not error -ENOMEM.
+				 */
+				if (rec_start_bh != bh) {
+					while (bhs[--nr_bhs] != rec_start_bh)
+						;
+					if (err2 != -ENOMEM) {
+						do {
+							clear_buffer_dirty(
+								rec_start_bh);
+						} while ((rec_start_bh =
+								rec_start_bh->
+								b_this_page) !=
+								bh);
+					}
+				}
+				continue;
+			}
 		}
-		BUG_ON(!buffer_mapped(bh));
 		BUG_ON(!buffer_uptodate(bh));
+		BUG_ON(nr_bhs >= max_bhs);
 		bhs[nr_bhs++] = bh;
-		BUG_ON(nr_bhs > max_bhs);
 	} while (block++, (bh = bh->b_this_page) != head);
+	if (unlikely(rl))
+		up_read(&ni->runlist.lock);
 	/* If there were no dirty buffers, we are done. */
 	if (!nr_bhs)
 		goto done;
@@ -945,9 +1034,11 @@ static int ntfs_write_mst_block(struct writeback_control *wbc,
 				locked_nis[nr_locked_nis++] = tni;
 		}
 		/* Apply the mst protection fixups. */
-		err = pre_write_mst_fixup((NTFS_RECORD*)(kaddr + ofs),
+		err2 = pre_write_mst_fixup((NTFS_RECORD*)(kaddr + ofs),
 				rec_size);
-		if (unlikely(err)) {
+		if (unlikely(err2)) {
+			if (!err || err == -ENOMEM)
+				err = -EIO;
 			ntfs_error(vol->sb, "Failed to apply mst fixups "
 					"(inode 0x%lx, attribute type 0x%x, "
 					"page index 0x%lx, page offset 0x%x)!"
@@ -1001,6 +1092,7 @@ static int ntfs_write_mst_block(struct writeback_control *wbc,
 					"0x%lx, page offset 0x%lx)!  Unmount "
 					"and run chkdsk.", vi->i_ino, ni->type,
 					page->index, bh_offset(tbh));
+			if (!err || err == -ENOMEM)
 				err = -EIO;
 			/*
 			 * Set the buffer uptodate so the page and buffer
@@ -1071,13 +1163,18 @@ static int ntfs_write_mst_block(struct writeback_control *wbc,
 		atomic_dec(&tni->count);
 		iput(VFS_I(base_tni));
 	}
-	if (unlikely(err)) {
-		SetPageError(page);
-		NVolSetErrors(vol);
-	}
 	SetPageUptodate(page);
 	kunmap(page);
 done:
+	if (unlikely(err && err != -ENOMEM)) {
+		/*
+		 * Set page error if there is only one ntfs record in the page.
+		 * Otherwise we would loose per-record granularity.
+		 */
+		if (ni->itype.index.block_size == PAGE_CACHE_SIZE)
+			SetPageError(page);
+		NVolSetErrors(vol);
+	}
 	if (page_is_dirty) {
 		ntfs_debug("Page still contains one or more dirty ntfs "
 				"records.  Redirtying the page starting at "
@@ -1117,7 +1214,8 @@ static int ntfs_write_mst_block(struct writeback_control *wbc,
 * For resident attributes, OTOH, ntfs_writepage() writes the @page by copying
 * the data to the mft record (which at this stage is most likely in memory).
 * The mft record is then marked dirty and written out asynchronously via the
- * vfs inode dirty code path.
+ * vfs inode dirty code path for the inode the mft record belongs to or via the
+ * vm page dirty code path for the page the mft record is in.
 *
 * Based on ntfs_readpage() and fs/buffer.c::block_write_full_page().
 *
@@ -1141,6 +1239,11 @@ static int ntfs_writepage(struct page *page, struct writeback_control *wbc)
 	/* Is the page fully outside i_size? (truncate in progress) */
 	if (unlikely(page->index >= (vi->i_size + PAGE_CACHE_SIZE - 1) >>
 			PAGE_CACHE_SHIFT)) {
+		/*
+		 * The page may have dirty, unmapped buffers.  Make them
+		 * freeable here, so the page does not leak.
+		 */
+		block_invalidatepage(page, 0);
 		unlock_page(page);
 		ntfs_debug("Write outside i_size - truncated?");
 		return 0;
@@ -1191,14 +1294,13 @@ static int ntfs_writepage(struct page *page, struct writeback_control *wbc)
 		}
 		/* Handle mst protected attributes. */
 		if (NInoMstProtected(ni))
-			return ntfs_write_mst_block(wbc, page);
+			return ntfs_write_mst_block(page, wbc);
 		/* Normal data stream. */
-		return ntfs_write_block(wbc, page);
+		return ntfs_write_block(page, wbc);
 	}
-
 	/*
-	 * Attribute is resident, implying it is not compressed, encrypted, or
-	 * mst protected.
+	 * Attribute is resident, implying it is not compressed, encrypted,
+	 * sparse, or mst protected.
 	 */
 	BUG_ON(page_has_buffers(page));
 	BUG_ON(!PageUptodate(page));
@@ -1277,6 +1379,9 @@ static int ntfs_writepage(struct page *page, struct writeback_control *wbc)
 	 * zeroing below is enabled, we MUST move the unlock_page() from above
 	 * to after the kunmap_atomic(), i.e. just before the
 	 * end_page_writeback().
+	 * UPDATE: ntfs_prepare/commit_write() do the zeroing on i_size
+	 * increases for resident attributes so those are ok.
+	 * TODO: ntfs_truncate(), others?
 	 */

 	kaddr = kmap_atomic(page, KM_USER0);
@@ -1350,11 +1455,10 @@ static int ntfs_prepare_nonresident_write(struct page *page,
 	vol = ni->vol;

 	ntfs_debug("Entering for inode 0x%lx, attribute type 0x%x, page index "
-			"0x%lx, from = %u, to = %u.", vi->i_ino, ni->type,
+			"0x%lx, from = %u, to = %u.", ni->mft_no, ni->type,
 			page->index, from, to);

 	BUG_ON(!NInoNonResident(ni));
-	BUG_ON(NInoMstProtected(ni));

 	blocksize_bits = vi->i_blkbits;
 	blocksize = 1 << blocksize_bits;
@@ -1545,21 +1649,24 @@ static int ntfs_prepare_nonresident_write(struct page *page,
 					if (likely(!err))
 						goto lock_retry_remap;
 					rl = NULL;
+					lcn = err;
 				}
 				/*
 				 * Failed to map the buffer, even after
 				 * retrying.
 				 */
-				bh->b_blocknr = -1UL;
-				ntfs_error(vol->sb, "ntfs_rl_vcn_to_lcn(vcn = "
-						"0x%llx) failed with error "
-						"code 0x%llx%s.",
+				bh->b_blocknr = -1;
+				ntfs_error(vol->sb, "Failed to write to inode "
+						"0x%lx, attribute type 0x%x, "
+						"vcn 0x%llx, offset 0x%x "
+						"because its location on disk "
+						"could not be determined%s "
+						"(error code %lli).",
+						ni->mft_no, ni->type,
 						(unsigned long long)vcn,
-						(unsigned long long)-lcn,
-						is_retry ? " even after "
-						"retrying" : "");
-				// FIXME: Depending on vol->on_errors, do
-				// something.
+						vcn_ofs, is_retry ? " even "
+						"after retrying" : "",
+						(long long)lcn);
 				if (!err)
 					err = -EIO;
 				goto err_out;
@@ -1682,8 +1789,8 @@ static int ntfs_prepare_nonresident_write(struct page *page,
 * ntfs_prepare_write - prepare a page for receiving data
 *
 * This is called from generic_file_write() with i_sem held on the inode
- * (@page->mapping->host). The @page is locked and kmap()ped so page_address()
- * can simply be used. The source data has not yet been copied into the @page.
+ * (@page->mapping->host).  The @page is locked but not kmap()ped.  The source
+ * data has not yet been copied into the @page.
 *
 * Need to extend the attribute/fill in holes if necessary, create blocks and
 * make partially overwritten blocks uptodate,
@@ -1693,8 +1800,8 @@ static int ntfs_prepare_nonresident_write(struct page *page,
 * Return 0 on success or -errno on error.
 *
 * Should be using block_prepare_write() [support for sparse files] or
- * cont_prepare_write() [no support for sparse files]. Can't do that due to
- * ntfs specifics but can look at them for implementation guidancea.
+ * cont_prepare_write() [no support for sparse files].  Cannot do that due to
+ * ntfs specifics but can look at them for implementation guidance.
 *
 * Note: In the range, @from is inclusive and @to is exclusive, i.e. @from is
 * the first byte in the page that will be written to and @to is the first byte
@@ -1703,18 +1810,40 @@ static int ntfs_prepare_nonresident_write(struct page *page,
 static int ntfs_prepare_write(struct file *file, struct page *page,
 		unsigned from, unsigned to)
 {
+	s64 new_size;
 	struct inode *vi = page->mapping->host;
-	ntfs_inode   *ni = NTFS_I(vi);
+	ntfs_inode *base_ni = NULL, *ni = NTFS_I(vi);
+	ntfs_volume *vol = ni->vol;
+	ntfs_attr_search_ctx *ctx = NULL;
+	MFT_RECORD *m = NULL;
+	ATTR_RECORD *a;
+	u8 *kaddr;
+	u32 attr_len;
+	int err;

 	ntfs_debug("Entering for inode 0x%lx, attribute type 0x%x, page index "
 			"0x%lx, from = %u, to = %u.", vi->i_ino, ni->type,
 			page->index, from, to);
-
 	BUG_ON(!PageLocked(page));
 	BUG_ON(from > PAGE_CACHE_SIZE);
 	BUG_ON(to > PAGE_CACHE_SIZE);
 	BUG_ON(from > to);
-
+	BUG_ON(NInoMstProtected(ni));
+	/*
+	 * If a previous ntfs_truncate() failed, repeat it and abort if it
+	 * fails again.
+	 */
+	if (unlikely(NInoTruncateFailed(ni))) {
+		down_write(&vi->i_alloc_sem);
+		err = ntfs_truncate(vi);
+		up_write(&vi->i_alloc_sem);
+		if (err || NInoTruncateFailed(ni)) {
+			if (!err)
+				err = -EIO;
+			goto err_out;
+		}
+	}
+	/* If the attribute is not resident, deal with it elsewhere. */
 	if (NInoNonResident(ni)) {
 		/*
 		 * Only unnamed $DATA attributes can be compressed, encrypted,
@@ -1743,33 +1872,112 @@ static int ntfs_prepare_write(struct file *file, struct page *page,
 				return -EOPNOTSUPP;
 			}
 		}
-
-		// TODO: Implement and remove this check.
-		if (NInoMstProtected(ni)) {
-			ntfs_error(vi->i_sb, "Writing to MST protected "
-					"attributes is not supported yet. "
-					"Sorry.");
-			return -EOPNOTSUPP;
-		}
-
 		/* Normal data stream. */
 		return ntfs_prepare_nonresident_write(page, from, to);
 	}
-
 	/*
 	 * Attribute is resident, implying it is not compressed, encrypted, or
-	 * mst protected.
+	 * sparse.
 	 */
 	BUG_ON(page_has_buffers(page));
+	new_size = ((s64)page->index << PAGE_CACHE_SHIFT) + to;
+	/* If we do not need to resize the attribute allocation we are done. */
+	if (new_size <= vi->i_size)
+		goto done;

-	/* Do we need to resize the attribute? */
-	if (((s64)page->index << PAGE_CACHE_SHIFT) + to > vi->i_size) {
-		// TODO: Implement resize...
-		ntfs_error(vi->i_sb, "Writing beyond the existing file size is "
-				"not supported yet. Sorry.");
+	// FIXME: We abort for now as this code is not safe.
+	ntfs_error(vi->i_sb, "Changing the file size is not supported yet.  "
+			"Sorry.");
 	return -EOPNOTSUPP;
-	}

+	/* Map, pin, and lock the (base) mft record. */
+	if (!NInoAttr(ni))
+		base_ni = ni;
+	else
+		base_ni = ni->ext.base_ntfs_ino;
+	m = map_mft_record(base_ni);
+	if (IS_ERR(m)) {
+		err = PTR_ERR(m);
+		m = NULL;
+		ctx = NULL;
+		goto err_out;
+	}
+	ctx = ntfs_attr_get_search_ctx(base_ni, m);
+	if (unlikely(!ctx)) {
+		err = -ENOMEM;
+		goto err_out;
+	}
+	err = ntfs_attr_lookup(ni->type, ni->name, ni->name_len,
+			CASE_SENSITIVE, 0, NULL, 0, ctx);
+	if (unlikely(err)) {
+		if (err == -ENOENT)
+			err = -EIO;
+		goto err_out;
+	}
+	m = ctx->mrec;
+	a = ctx->attr;
+	/* The total length of the attribute value. */
+	attr_len = le32_to_cpu(a->data.resident.value_length);
+	BUG_ON(vi->i_size != attr_len);
+	/* Check if new size is allowed in $AttrDef. */
+	err = ntfs_attr_size_bounds_check(vol, ni->type, new_size);
+	if (unlikely(err)) {
+		if (err == -ERANGE) {
+			ntfs_error(vol->sb, "Write would cause the inode "
+					"0x%lx to exceed the maximum size for "
+					"its attribute type (0x%x).  Aborting "
+					"write.", vi->i_ino,
+					le32_to_cpu(ni->type));
+		} else {
+			ntfs_error(vol->sb, "Inode 0x%lx has unknown "
+					"attribute type 0x%x.  Aborting "
+					"write.", vi->i_ino,
+					le32_to_cpu(ni->type));
+			err = -EIO;
+		}
+		goto err_out2;
+	}
+	/*
+	 * Extend the attribute record to be able to store the new attribute
+	 * size.
+	 */
+	if (new_size >= vol->mft_record_size || ntfs_attr_record_resize(m, a,
+			le16_to_cpu(a->data.resident.value_offset) +
+			new_size)) {
+		/* Not enough space in the mft record. */
+		ntfs_error(vol->sb, "Not enough space in the mft record for "
+				"the resized attribute value.  This is not "
+				"supported yet.  Aborting write.");
+		err = -EOPNOTSUPP;
+		goto err_out2;
+	}
+	/*
+	 * We have enough space in the mft record to fit the write.  This
+	 * implies the attribute is smaller than the mft record and hence the
+	 * attribute must be in a single page and hence page->index must be 0.
+	 */
+	BUG_ON(page->index);
+	/*
+	 * If the beginning of the write is past the old size, enlarge the
+	 * attribute value up to the beginning of the write and fill it with
+	 * zeroes.
+	 */
+	if (from > attr_len) {
+		memset((u8*)a + le16_to_cpu(a->data.resident.value_offset) +
+				attr_len, 0, from - attr_len);
+		a->data.resident.value_length = cpu_to_le32(from);
+		/* Zero the corresponding area in the page as well. */
+		if (PageUptodate(page)) {
+			kaddr = kmap_atomic(page, KM_USER0);
+			memset(kaddr + attr_len, 0, from - attr_len);
+			kunmap_atomic(kaddr, KM_USER0);
+			flush_dcache_page(page);
+		}
+	}
+	flush_dcache_mft_record_page(ctx->ntfs_ino);
+	mark_mft_record_dirty(ctx->ntfs_ino);
+	ntfs_attr_put_search_ctx(ctx);
+	unmap_mft_record(base_ni);
 	/*
 	 * Because resident attributes are handled by memcpy() to/from the
 	 * corresponding MFT record, and because this form of i/o is byte
@@ -1779,26 +1987,30 @@ static int ntfs_prepare_write(struct file *file, struct page *page,
 	 * generic_file_write() does the copying from userspace.
 	 *
 	 * We thus defer the uptodate bringing of the page region outside the
-	 * region written to to ntfs_commit_write(). The reason for doing this
-	 * is that we save one round of:
-	 *	map_mft_record(), ntfs_attr_get_search_ctx(),
-	 *	ntfs_attr_lookup(), kmap_atomic(), kunmap_atomic(),
-	 *	ntfs_attr_put_search_ctx(), unmap_mft_record().
-	 * Which is obviously a very worthwhile save.
-	 *
-	 * Thus we just return success now...
+	 * region written to to ntfs_commit_write(), which makes the code
+	 * simpler and saves one atomic kmap which is good.
 	 */
+done:
 	ntfs_debug("Done.");
 	return 0;
+err_out:
+	if (err == -ENOMEM)
+		ntfs_warning(vi->i_sb, "Error allocating memory required to "
+				"prepare the write.");
+	else {
+		ntfs_error(vi->i_sb, "Resident attribute prepare write failed "
+				"with error %i.", err);
+		NVolSetErrors(vol);
+		make_bad_inode(vi);
+	}
+err_out2:
+	if (ctx)
+		ntfs_attr_put_search_ctx(ctx);
+	if (m)
+		unmap_mft_record(base_ni);
+	return err;
 }

-/*
- * NOTES: There is a disparity between the apparent need to extend the
- * attribute in prepare write but to update i_size only in commit write.
- * Need to make sure i_sem protection is sufficient. And if not will need to
- * handle this in some way or another.
- */
-
 /**
 * ntfs_commit_nonresident_write -
 *
@@ -1807,24 +2019,21 @@ static int ntfs_commit_nonresident_write(struct page *page,
 		unsigned from, unsigned to)
 {
 	s64 pos = ((s64)page->index << PAGE_CACHE_SHIFT) + to;
-	struct inode *vi;
+	struct inode *vi = page->mapping->host;
 	struct buffer_head *bh, *head;
 	unsigned int block_start, block_end, blocksize;
 	BOOL partial;

-	vi = page->mapping->host;
-
 	ntfs_debug("Entering for inode 0x%lx, attribute type 0x%x, page index "
 			"0x%lx, from = %u, to = %u.", vi->i_ino,
 			NTFS_I(vi)->type, page->index, from, to);
-
 	blocksize = 1 << vi->i_blkbits;

-	// FIXME: We need a whole slew of special cases in here for MST
-	// protected attributes for example. For compressed files, too...
+	// FIXME: We need a whole slew of special cases in here for compressed
+	// files for example...
 	// For now, we know ntfs_prepare_write() would have failed so we can't
 	// get here in any of the cases which we have to special case, so we
-	// are just a ripped off unrolled generic_commit_write() at present.
+	// are just a ripped off, unrolled generic_commit_write().

 	bh = head = page_buffers(page);
 	block_start = 0;
@@ -1839,7 +2048,6 @@ static int ntfs_commit_nonresident_write(struct page *page,
 			mark_buffer_dirty(bh);
 		}
 	} while (block_start = block_end, (bh = bh->b_this_page) != head);
-
 	/*
 	 * If this is a partial write which happened to make all buffers
 	 * uptodate then we can optimize away a bogus ->readpage() for the next
@@ -1848,7 +2056,6 @@ static int ntfs_commit_nonresident_write(struct page *page,
 	 */
 	if (!partial)
 		SetPageUptodate(page);
-
 	/*
 	 * Not convinced about this at all.  See disparity comment above.  For
 	 * now we know ntfs_prepare_write() would have failed in the write
@@ -1869,8 +2076,10 @@ static int ntfs_commit_nonresident_write(struct page *page,
 * ntfs_commit_write - commit the received data
 *
 * This is called from generic_file_write() with i_sem held on the inode
- * (@page->mapping->host). The @page is locked and kmap()ped so page_address()
- * can simply be used. The source data has already been copied into the @page.
+ * (@page->mapping->host).  The @page is locked but not kmap()ped.  The source
+ * data has already been copied into the @page.  ntfs_prepare_write() has been
+ * called before the data was copied and it returned success so we can take the
+ * results of various BUG checks and some error handling for granted.
 *
 * Need to mark modified blocks dirty so they get written out later when
 * ntfs_writepage() is invoked by the VM.
@@ -1880,107 +2089,60 @@ static int ntfs_commit_nonresident_write(struct page *page,
 * Should be using generic_commit_write().  This marks buffers uptodate and
 * dirty, sets the page uptodate if all buffers in the page are uptodate, and
 * updates i_size if the end of io is beyond i_size.  In that case, it also
- * marks the inode dirty. - We could still use this (obviously except for
- * NInoMstProtected() attributes, where we will need to duplicate the core code
- * because we need our own async_io completion handler) but we could just do
- * the i_size update in prepare write, when we resize the attribute. Then
- * we would avoid the i_size update and mark_inode_dirty() happening here.
+ * marks the inode dirty.
 *
- * Can't use generic_commit_write() due to ntfs specialities but can look at
+ * Cannot use generic_commit_write() due to ntfs specialities but can look at
 * it for implementation guidance.
 *
 * If things have gone as outlined in ntfs_prepare_write(), then we do not
 * need to do any page content modifications here at all, except in the write
 * to resident attribute case, where we need to do the uptodate bringing here
- * which we combine with the copying into the mft record which means we only
- * need to map the mft record and find the attribute record in it only once.
+ * which we combine with the copying into the mft record which means we save
+ * one atomic kmap.
 */
 static int ntfs_commit_write(struct file *file, struct page *page,
 		unsigned from, unsigned to)
 {
-	s64 attr_pos;
-	struct inode *vi;
-	ntfs_inode *ni, *base_ni;
+	struct inode *vi = page->mapping->host;
+	ntfs_inode *base_ni, *ni = NTFS_I(vi);
 	char *kaddr, *kattr;
 	ntfs_attr_search_ctx *ctx;
 	MFT_RECORD *m;
-	u32 attr_len, bytes;
+	ATTR_RECORD *a;
+	u32 attr_len;
 	int err;

-	vi = page->mapping->host;
-	ni = NTFS_I(vi);
-
 	ntfs_debug("Entering for inode 0x%lx, attribute type 0x%x, page index "
 			"0x%lx, from = %u, to = %u.", vi->i_ino, ni->type,
 			page->index, from, to);
-
+	/* If the attribute is not resident, deal with it elsewhere. */
 	if (NInoNonResident(ni)) {
-		/*
-		 * Only unnamed $DATA attributes can be compressed, encrypted,
-		 * and/or sparse.
-		 */
+		/* Only unnamed $DATA attributes can be compressed/encrypted. */
 		if (ni->type == AT_DATA && !ni->name_len) {
-			/* If file is encrypted, deny access, just like NT4. */
+			/* Encrypted files need separate handling. */
 			if (NInoEncrypted(ni)) {
-				// Should never get here!
-				ntfs_debug("Denying write access to encrypted "
-						"file.");
-				return -EACCES;
+				// We never get here at present!
+				BUG();
 			}
 			/* Compressed data streams are handled in compress.c. */
 			if (NInoCompressed(ni)) {
-				// TODO: Implement and replace this check with
+				// TODO: Implement this!
 				// return ntfs_write_compressed_block(page);
-				// Should never get here!
-				ntfs_error(vi->i_sb, "Writing to compressed "
-						"files is not supported yet. "
-						"Sorry.");
-				return -EOPNOTSUPP;
-			}
-			// TODO: Implement and remove this check.
-			if (NInoSparse(ni)) {
-				// Should never get here!
-				ntfs_error(vi->i_sb, "Writing to sparse files "
-						"is not supported yet. Sorry.");
-				return -EOPNOTSUPP;
+				// We never get here at present!
+				BUG();
 			}
 		}
-
-		// TODO: Implement and remove this check.
-		if (NInoMstProtected(ni)) {
-			// Should never get here!
-			ntfs_error(vi->i_sb, "Writing to MST protected "
-					"attributes is not supported yet. "
-					"Sorry.");
-			return -EOPNOTSUPP;
-		}
-
 		/* Normal data stream. */
 		return ntfs_commit_nonresident_write(page, from, to);
 	}
-
 	/*
 	 * Attribute is resident, implying it is not compressed, encrypted, or
-	 * mst protected.
+	 * sparse.
 	 */
-
-	/* Do we need to resize the attribute? */
-	if (((s64)page->index << PAGE_CACHE_SHIFT) + to > vi->i_size) {
-		// TODO: Implement resize...
-		// pos = ((s64)page->index << PAGE_CACHE_SHIFT) + to;
-		// vi->i_size = pos;
-		// mark_inode_dirty(vi);
-		// Should never get here!
-		ntfs_error(vi->i_sb, "Writing beyond the existing file size is "
-				"not supported yet. Sorry.");
-		return -EOPNOTSUPP;
-	}
-
 	if (!NInoAttr(ni))
 		base_ni = ni;
 	else
 		base_ni = ni->ext.base_ntfs_ino;
-
 	/* Map, pin, and lock the mft record. */
 	m = map_mft_record(base_ni);
 	if (IS_ERR(m)) {
@@ -1996,61 +2158,36 @@ static int ntfs_commit_write(struct file *file, struct page *page,
 	}
 	err = ntfs_attr_lookup(ni->type, ni->name, ni->name_len,
 			CASE_SENSITIVE, 0, NULL, 0, ctx);
-	if (unlikely(err))
-		goto err_out;
-
-	/* Starting position of the page within the attribute value. */
-	attr_pos = page->index << PAGE_CACHE_SHIFT;
-
-	/* The total length of the attribute value. */
-	attr_len = le32_to_cpu(ctx->attr->data.resident.value_length);
-
-	if (unlikely(vi->i_size != attr_len)) {
-		ntfs_error(vi->i_sb, "BUG()! i_size (0x%llx) doesn't match "
-				"attr_len (0x%x). Aborting write.", vi->i_size,
-				attr_len);
-		err = -EIO;
-		goto err_out;
-	}
-	if (unlikely(attr_pos >= attr_len)) {
-		ntfs_error(vi->i_sb, "BUG()! attr_pos (0x%llx) > attr_len "
-				"(0x%x). Aborting write.",
-				(unsigned long long)attr_pos, attr_len);
+	if (unlikely(err)) {
+		if (err == -ENOENT)
 			err = -EIO;
 		goto err_out;
 	}
-
-	bytes = attr_len - attr_pos;
-	if (unlikely(bytes > PAGE_CACHE_SIZE))
-		bytes = PAGE_CACHE_SIZE;
-
-	/*
-	 * Calculate the address of the attribute value corresponding to the
-	 * beginning of the current data @page.
-	 */
-	kattr = (u8*)ctx->attr + le16_to_cpu(
-			ctx->attr->data.resident.value_offset) + attr_pos;
-
+	a = ctx->attr;
+	/* The total length of the attribute value. */
+	attr_len = le32_to_cpu(a->data.resident.value_length);
+	BUG_ON(from > attr_len);
+	kattr = (u8*)a + le16_to_cpu(a->data.resident.value_offset);
 	kaddr = kmap_atomic(page, KM_USER0);
-
 	/* Copy the received data from the page to the mft record. */
 	memcpy(kattr + from, kaddr + from, to - from);
-	flush_dcache_mft_record_page(ctx->ntfs_ino);
-
-	if (!PageUptodate(page)) {
+	/* Update the attribute length if necessary. */
+	if (to > attr_len) {
+		attr_len = to;
+		a->data.resident.value_length = cpu_to_le32(attr_len);
+	}
 	/*
-		 * Bring the out of bounds area(s) uptodate by copying data
-		 * from the mft record to the page.
+	 * If the page is not uptodate, bring the out of bounds area(s)
+	 * uptodate by copying data from the mft record to the page.
 	 */
+	if (!PageUptodate(page)) {
 		if (from > 0)
 			memcpy(kaddr, kattr, from);
-		if (to < bytes)
-			memcpy(kaddr + to, kattr + to, bytes - to);
-
+		if (to < attr_len)
+			memcpy(kaddr + to, kattr + to, attr_len - to);
 		/* Zero the region outside the end of the attribute value. */
-		if (likely(bytes < PAGE_CACHE_SIZE))
-			memset(kaddr + bytes, 0, PAGE_CACHE_SIZE - bytes);
-
+		if (attr_len < PAGE_CACHE_SIZE)
+			memset(kaddr + attr_len, 0, PAGE_CACHE_SIZE - attr_len);
 		/*
 		 * The probability of not having done any of the above is
 		 * extremely small, so we just flush unconditionally.
@@ -2059,10 +2196,14 @@ static int ntfs_commit_write(struct file *file, struct page *page,
 		SetPageUptodate(page);
 	}
 	kunmap_atomic(kaddr, KM_USER0);
-
+	/* Update i_size if necessary. */
+	if (vi->i_size < attr_len) {
+		ni->allocated_size = ni->initialized_size = attr_len;
+		i_size_write(vi, attr_len);
+	}
 	/* Mark the mft record dirty, so it gets written back. */
+	flush_dcache_mft_record_page(ctx->ntfs_ino);
 	mark_mft_record_dirty(ctx->ntfs_ino);
-
 	ntfs_attr_put_search_ctx(ctx);
 	unmap_mft_record(base_ni);
 	ntfs_debug("Done.");
@@ -2077,17 +2218,18 @@ static int ntfs_commit_write(struct file *file, struct page *page,
 					"later on by the VM.");
 			/*
 			 * Put the page on mapping->dirty_pages, but leave its
-			 * buffer's dirty state as-is.
+			 * buffers' dirty state as-is.
 			 */
 			__set_page_dirty_nobuffers(page);
 			err = 0;
 		} else
 			ntfs_error(vi->i_sb, "Page is not uptodate.  Written "
-					"data has been lost. )-:");
+					"data has been lost.");
 	} else {
-		ntfs_error(vi->i_sb, "Resident attribute write failed with "
-				"error %i. Setting page error flag.", -err);
-		SetPageError(page);
+		ntfs_error(vi->i_sb, "Resident attribute commit write failed "
+				"with error %i.", err);
+		NVolSetErrors(ni->vol);
+		make_bad_inode(vi);
 	}
 	if (ctx)
 		ntfs_attr_put_search_ctx(ctx);
@@ -2136,28 +2278,43 @@ struct address_space_operations ntfs_mst_aops = {
 * @page:	page containing the ntfs record to mark dirty
 * @ofs:	byte offset within @page at which the ntfs record begins
 *
- * If the ntfs record is the same size as the page cache page @page, set all
- * buffers in the page dirty.  Otherwise, set only the buffers in which the
- * ntfs record is located dirty.
+ * Set the buffers and the page in which the ntfs record is located dirty.
+ *
+ * The latter also marks the vfs inode the ntfs record belongs to dirty
+ * (I_DIRTY_PAGES only).
 *
- * Also, set the page containing the ntfs record dirty, which also marks the
- * vfs inode the ntfs record belongs to dirty (I_DIRTY_PAGES).
+ * If the page does not have buffers, we create them and set them uptodate.
+ * The page may not be locked which is why we need to handle the buffers under
+ * the mapping->private_lock.  Once the buffers are marked dirty we no longer
+ * need the lock since try_to_free_buffers() does not free dirty buffers.
 */
 void mark_ntfs_record_dirty(struct page *page, const unsigned int ofs) {
-	ntfs_inode *ni;
-	struct buffer_head *bh, *head;
+	struct address_space *mapping = page->mapping;
+	ntfs_inode *ni = NTFS_I(mapping->host);
+	struct buffer_head *bh, *head, *buffers_to_free = NULL;
 	unsigned int end, bh_size, bh_ofs;

-	BUG_ON(!page);
-	BUG_ON(!page_has_buffers(page));
-	ni = NTFS_I(page->mapping->host);
-	BUG_ON(!ni);
-	if (ni->itype.index.block_size == PAGE_CACHE_SIZE) {
-		__set_page_dirty_buffers(page);
-		return;
-	}
+	BUG_ON(!PageUptodate(page));
 	end = ofs + ni->itype.index.block_size;
-	bh_size = ni->vol->sb->s_blocksize;
+	bh_size = 1 << VFS_I(ni)->i_blkbits;
+	spin_lock(&mapping->private_lock);
+	if (unlikely(!page_has_buffers(page))) {
+		spin_unlock(&mapping->private_lock);
+		bh = head = alloc_page_buffers(page, bh_size, 1);
+		spin_lock(&mapping->private_lock);
+		if (likely(!page_has_buffers(page))) {
+			struct buffer_head *tail;
+
+			do {
+				set_buffer_uptodate(bh);
+				tail = bh;
+				bh = bh->b_this_page;
+			} while (bh);
+			tail->b_this_page = head;
+			attach_page_buffers(page, head);
+		} else
+			buffers_to_free = bh;
+	}
 	bh = head = page_buffers(page);
 	do {
 		bh_ofs = bh_offset(bh);
@@ -2167,7 +2324,15 @@ void mark_ntfs_record_dirty(struct page *page, const unsigned int ofs) {
 			break;
 		set_buffer_dirty(bh);
 	} while ((bh = bh->b_this_page) != head);
+	spin_unlock(&mapping->private_lock);
 	__set_page_dirty_nobuffers(page);
+	if (unlikely(buffers_to_free)) {
+		do {
+			bh = buffers_to_free->b_this_page;
+			free_buffer_head(buffers_to_free);
+			buffers_to_free = bh;
+		} while (buffers_to_free);
+	}
 }

 #endif /* NTFS_RW */
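The resident branch of ntfs_commit_write() above copies the written range from the page into the attribute value, grows the value length if the write extended it, and then brings the rest of a not-uptodate page in line with the attribute. The following user-space sketch mirrors only that byte shuffling. The function name, the flat buffers, and SKETCH_PAGE_SIZE are hypothetical stand-ins for the mft record, the page cache page, and PAGE_CACHE_SIZE. It assumes the caller has already made the attribute buffer large enough (the job of the prepare step), that from <= attr_len, and that to <= SKETCH_PAGE_SIZE.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical page size for the sketch; stands in for PAGE_CACHE_SIZE. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Copy the written range [from, to) from @page into @attr_val, extend the
 * value length if the write went past the old end, and, if the page was not
 * uptodate, fill the parts of the page outside the written range from the
 * attribute value and zero everything beyond the end of the value.
 *
 * Returns the (possibly grown) attribute value length.
 */
static uint32_t commit_resident_sketch(uint8_t *attr_val, uint32_t attr_len,
		uint8_t *page, int page_uptodate, unsigned from, unsigned to)
{
	/* Copy the received data from the page to the attribute value. */
	memcpy(attr_val + from, page + from, to - from);
	/* Update the attribute length if necessary. */
	if (to > attr_len)
		attr_len = to;
	if (!page_uptodate) {
		/* Bring the regions outside [from, to) uptodate. */
		if (from > 0)
			memcpy(page, attr_val, from);
		if (to < attr_len)
			memcpy(page + to, attr_val + to, attr_len - to);
		/* Zero the region past the end of the attribute value. */
		if (attr_len < SKETCH_PAGE_SIZE)
			memset(page + attr_len, 0, SKETCH_PAGE_SIZE - attr_len);
	}
	return attr_len;
}
```

In the real code the same step is followed by the i_size update and by marking the mft record dirty; the sketch leaves both out.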
--- a/fs/ntfs/aops.h
+++ b/fs/ntfs/aops.h
@@ -55,6 +55,13 @@ static inline void ntfs_unmap_page(struct page *page)
 * method defined in the address space operations of @mapping and the page is
 * added to the page cache of @mapping in the process.
 *
+ * If the page belongs to an mst protected attribute and it is marked as such
+ * in its ntfs inode (NInoMstProtected()) the mst fixups are applied but no
+ * error checking is performed.  This means the caller has to verify whether
+ * the ntfs record(s) contained in the page are valid or not using one of the
+ * ntfs_is_XXXX_record{,p}() macros, where XXXX is the record type you are
+ * expecting to see.  (For details of the macros, see fs/ntfs/layout.h.)
+ *
 * If the page is in high memory it is mapped into memory directly addressable
 * by the kernel.
 *

--- a/fs/ntfs/attrib.c
+++ b/fs/ntfs/attrib.c
@@ -24,8 +24,10 @@

 #include "attrib.h"
 #include "debug.h"
+#include "layout.h"
 #include "mft.h"
 #include "ntfs.h"
+#include "types.h"

 /**
 * ntfs_map_runlist - map (a part of) a runlist of an ntfs inode
@@ -248,19 +250,10 @@ static int ntfs_attr_find(const ATTR_TYPE type, const ntfschar *name,
 		const u8 *val, const u32 val_len, ntfs_attr_search_ctx *ctx)
 {
 	ATTR_RECORD *a;
-	ntfs_volume *vol;
-	ntfschar *upcase;
-	u32 upcase_len;
+	ntfs_volume *vol = ctx->ntfs_ino->vol;
+	ntfschar *upcase = vol->upcase;
+	u32 upcase_len = vol->upcase_len;

-	if (ic == IGNORE_CASE) {
-		vol = ctx->ntfs_ino->vol;
-		upcase = vol->upcase;
-		upcase_len = vol->upcase_len;
-	} else {
-		vol = NULL;
-		upcase = NULL;
-		upcase_len = 0;
-	}
 	/*
 	 * Iterate over attributes in mft record starting at @ctx->attr, or the
 	 * attribute following that, if @ctx->is_first is TRUE.
@@ -352,7 +345,7 @@ static int ntfs_attr_find(const ATTR_TYPE type, const ntfschar *name,
 				return -ENOENT;
 		}
 	}
-	ntfs_error(NULL, "Inode is corrupt.  Run chkdsk.");
+	ntfs_error(vol->sb, "Inode is corrupt.  Run chkdsk.");
 	NVolSetErrors(vol);
 	return -EIO;
 }
@@ -662,7 +655,6 @@ static int ntfs_external_attr_find(const ATTR_TYPE type,
 				ctx->mrec = map_extent_mft_record(base_ni,
 						le64_to_cpu(
 						al_entry->mft_reference), &ni);
-				ctx->ntfs_ino = ni;
 				if (IS_ERR(ctx->mrec)) {
 					ntfs_error(vol->sb, "Failed to map "
 							"extent mft record "
@@ -674,8 +666,11 @@ static int ntfs_external_attr_find(const ATTR_TYPE type,
 					err = PTR_ERR(ctx->mrec);
 					if (err == -ENOENT)
 						err = -EIO;
+					/* Cause @ctx to be sanitized below. */
+					ni = NULL;
 					break;
 				}
+				ctx->ntfs_ino = ni;
 			}
 			ctx->attr = (ATTR_RECORD*)((u8*)ctx->mrec +
 					le16_to_cpu(ctx->mrec->attrs_offset));
@@ -747,6 +742,7 @@ static int ntfs_external_attr_find(const ATTR_TYPE type,
 		err = -EIO;
 	}
 	if (ni != base_ni) {
+		if (ni)
 			unmap_extent_mft_record(ni);
 		ctx->ntfs_ino = base_ni;
 		ctx->mrec = ctx->base_mrec;
@@ -949,6 +945,133 @@ void ntfs_attr_put_search_ctx(ntfs_attr_search_ctx *ctx)
 	return;
 }

+/**
+ * ntfs_attr_find_in_attrdef - find an attribute in the $AttrDef system file
+ * @vol:	ntfs volume to which the attribute belongs
+ * @type:	attribute type which to find
+ *
+ * Search for the attribute definition record corresponding to the attribute
+ * @type in the $AttrDef system file.
+ *
+ * Return the attribute type definition record if found and NULL if not found.
+ */
+static ATTR_DEF *ntfs_attr_find_in_attrdef(const ntfs_volume *vol,
+		const ATTR_TYPE type)
+{
+	ATTR_DEF *ad;
+
+	BUG_ON(!vol->attrdef);
+	BUG_ON(!type);
+	for (ad = vol->attrdef; (u8*)ad - (u8*)vol->attrdef <
+			vol->attrdef_size && ad->type; ++ad) {
+		/* We have not found it yet, carry on searching. */
+		if (likely(le32_to_cpu(ad->type) < le32_to_cpu(type)))
+			continue;
+		/* We found the attribute; return it. */
+		if (likely(ad->type == type))
+			return ad;
+		/* We have gone too far already.  No point in continuing. */
+		break;
+	}
+	/* Attribute not found. */
+	ntfs_debug("Attribute type 0x%x not found in $AttrDef.",
+			le32_to_cpu(type));
+	return NULL;
+}
+
+/**
+ * ntfs_attr_size_bounds_check - check a size of an attribute type for validity
+ * @vol:	ntfs volume to which the attribute belongs
+ * @type:	attribute type which to check
+ * @size:	size which to check
+ *
+ * Check whether the @size in bytes is valid for an attribute of @type on the
+ * ntfs volume @vol.  This information is obtained from $AttrDef system file.
+ *
+ * Return 0 if valid, -ERANGE if not valid, or -ENOENT if the attribute is not
+ * listed in $AttrDef.
+ */
+int ntfs_attr_size_bounds_check(const ntfs_volume *vol, const ATTR_TYPE type,
+		const s64 size)
+{
+	ATTR_DEF *ad;
+
+	BUG_ON(size < 0);
+	/*
+	 * $ATTRIBUTE_LIST has a maximum size of 256kiB, but this is not
+	 * listed in $AttrDef.
+	 */
+	if (unlikely(type == AT_ATTRIBUTE_LIST && size > 256 * 1024))
+		return -ERANGE;
+	/* Get the $AttrDef entry for the attribute @type. */
+	ad = ntfs_attr_find_in_attrdef(vol, type);
+	if (unlikely(!ad))
+		return -ENOENT;
+	/* Do the bounds check. */
+	if (((sle64_to_cpu(ad->min_size) > 0) &&
+			size < sle64_to_cpu(ad->min_size)) ||
+			((sle64_to_cpu(ad->max_size) > 0) && size >
+			sle64_to_cpu(ad->max_size)))
+		return -ERANGE;
+	return 0;
+}
+
+/**
+ * ntfs_attr_can_be_non_resident - check if an attribute can be non-resident
+ * @vol:	ntfs volume to which the attribute belongs
+ * @type:	attribute type which to check
+ *
+ * Check whether the attribute of @type on the ntfs volume @vol is allowed to
+ * be non-resident.  This information is obtained from $AttrDef system file.
+ *
+ * Return 0 if the attribute is allowed to be non-resident, -EPERM if not, or
+ * -ENOENT if the attribute is not listed in $AttrDef.
+ */
+int ntfs_attr_can_be_non_resident(const ntfs_volume *vol, const ATTR_TYPE type)
+{
+	ATTR_DEF *ad;
+
+	/*
+	 * $DATA is always allowed to be non-resident even if $AttrDef does not
+	 * specify this in the flags of the $DATA attribute definition record.
+	 */
+	if (type == AT_DATA)
+		return 0;
+	/* Find the attribute definition record in $AttrDef. */
+	ad = ntfs_attr_find_in_attrdef(vol, type);
+	if (unlikely(!ad))
+		return -ENOENT;
+	/* Check the flags and return the result. */
+	if (ad->flags & CAN_BE_NON_RESIDENT)
+		return 0;
+	return -EPERM;
+}
+
+/**
+ * ntfs_attr_can_be_resident - check if an attribute can be resident
+ * @vol:	ntfs volume to which the attribute belongs
+ * @type:	attribute type which to check
+ *
+ * Check whether the attribute of @type on the ntfs volume @vol is allowed to
+ * be resident.  This information is derived from our ntfs knowledge and may
+ * not be completely accurate, especially when user defined attributes are
+ * present.  Basically we allow everything to be resident except for index
+ * allocation and $EA attributes.
+ *
+ * Return 0 if the attribute is allowed to be resident and -EPERM if not.
+ *
+ * Warning: In the system file $MFT the attribute $Bitmap must be non-resident
+ *	    otherwise windows will not boot (blue screen of death)!  We cannot
+ *	    check for this here as we do not know which inode's $Bitmap is
+ *	    being asked about so the caller needs to special case this.
+ */
+int ntfs_attr_can_be_resident(const ntfs_volume *vol, const ATTR_TYPE type)
+{
+	if (type != AT_INDEX_ALLOCATION && type != AT_EA)
+		return 0;
+	return -EPERM;
+}
+
 /**
 * ntfs_attr_record_resize - resize an attribute record
 * @m:		mft record containing attribute record
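The two $AttrDef helpers added in this hunk combine an early-exit scan of a table sorted by attribute type with a bounds check in which a non-positive limit means "no limit". Below is a minimal user-space sketch of that idea. The struct layout, the function names, and the values in demo_table are illustrative assumptions, not the on-disk ATTR_DEF format or real $AttrDef contents, and the sketch works on host-endian integers instead of the little-endian fields the driver converts.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an $AttrDef entry (host-endian). */
struct attr_def_sketch {
	uint32_t type;		/* 0 terminates the table */
	int64_t min_size;	/* <= 0 means no lower bound */
	int64_t max_size;	/* <= 0 means no upper bound */
};

/* Illustrative table sorted by ascending type; the sizes are made up. */
static const struct attr_def_sketch demo_table[] = {
	{ 0x10, 48, 72 },
	{ 0x30, 68, 578 },
	{ 0x80, 0, 0 },
	{ 0, 0, 0 },
};

/* Scan a table sorted by type and stop as soon as we have gone past @type. */
static const struct attr_def_sketch *find_attr_def(
		const struct attr_def_sketch *tbl, size_t count, uint32_t type)
{
	size_t i;

	for (i = 0; i < count && tbl[i].type; i++) {
		/* Not found yet, carry on searching. */
		if (tbl[i].type < type)
			continue;
		if (tbl[i].type == type)
			return &tbl[i];
		/* Gone too far already, no point in continuing. */
		break;
	}
	return NULL;
}

/* Return 0 if @size is valid for @type, -ERANGE if not, -ENOENT if unknown. */
static int size_bounds_check(const struct attr_def_sketch *tbl, size_t count,
		uint32_t type, int64_t size)
{
	const struct attr_def_sketch *ad = find_attr_def(tbl, count, type);

	if (!ad)
		return -ENOENT;
	if ((ad->min_size > 0 && size < ad->min_size) ||
			(ad->max_size > 0 && size > ad->max_size))
		return -ERANGE;
	return 0;
}
```

A caller would treat -ERANGE as "the write would make the attribute too large or too small" and -ENOENT as "unknown attribute type", which is how the resident prepare-write path in this commit distinguishes its two error messages.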

--- a/fs/ntfs/attrib.h
+++ b/fs/ntfs/attrib.h
@@ -29,6 +29,7 @@
 #include "layout.h"
 #include "inode.h"
 #include "runlist.h"
+#include "volume.h"

 /**
 * ntfs_attr_search_ctx - used in attribute search functions
@@ -84,6 +85,13 @@ extern ntfs_attr_search_ctx *ntfs_attr_get_search_ctx(ntfs_inode *ni,
 		MFT_RECORD *mrec);
 extern void ntfs_attr_put_search_ctx(ntfs_attr_search_ctx *ctx);

+extern int ntfs_attr_size_bounds_check(const ntfs_volume *vol,
+		const ATTR_TYPE type, const s64 size);
+extern int ntfs_attr_can_be_non_resident(const ntfs_volume *vol,
+		const ATTR_TYPE type);
+extern int ntfs_attr_can_be_resident(const ntfs_volume *vol,
+		const ATTR_TYPE type);
+
 extern int ntfs_attr_record_resize(MFT_RECORD *m, ATTR_RECORD *a, u32 new_size);

 extern int ntfs_attr_set(ntfs_inode *ni, const s64 ofs, const s64 cnt,

--- a/fs/ntfs/debug.c
+++ b/fs/ntfs/debug.c
@@ -127,8 +127,8 @@ void __ntfs_debug (const char *file, int line, const char *function,
 	va_start(args, fmt);
 	vsnprintf(err_buf, sizeof(err_buf), fmt, args);
 	va_end(args);
-	printk(KERN_DEBUG "NTFS-fs DEBUG (%s, %d): %s(): %s\n",
-		file, line, flen ? function : "", err_buf);
+	printk(KERN_DEBUG "NTFS-fs DEBUG (%s, %d): %s(): %s\n", file, line,
+			flen ? function : "", err_buf);
 	spin_unlock(&err_buf_lock);
 }

@@ -141,8 +141,7 @@ void ntfs_debug_dump_runlist(const runlist_element *rl)

 	if (!debug_msgs)
 		return;
-	printk(KERN_DEBUG "NTFS-fs DEBUG: Dumping runlist (values "
-			"in hex):\n");
+	printk(KERN_DEBUG "NTFS-fs DEBUG: Dumping runlist (values in hex):\n");
 	if (!rl) {
 		printk(KERN_DEBUG "Run list not present.\n");
 		return;

--- a/fs/ntfs/dir.c
+++ b/fs/ntfs/dir.c
@@ -300,7 +300,6 @@ MFT_REF ntfs_lookup_inode_by_name(ntfs_inode *dir_ni, const ntfschar *uname,
 		ntfs_error(sb, "No index allocation attribute but index entry "
 				"requires one. Directory inode 0x%lx is "
 				"corrupt or driver bug.", dir_ni->mft_no);
-		err = -EIO;
 		goto err_out;
 	}
 	/* Get the starting vcn of the index_block holding the child node. */
@@ -338,7 +337,13 @@ MFT_REF ntfs_lookup_inode_by_name(ntfs_inode *dir_ni, const ntfschar *uname,
 	if ((u8*)ia < kaddr || (u8*)ia > kaddr + PAGE_CACHE_SIZE) {
 		ntfs_error(sb, "Out of bounds check failed. Corrupt directory "
 				"inode 0x%lx or driver bug.", dir_ni->mft_no);
-		err = -EIO;
+		goto unm_err_out;
+	}
+	/* Catch multi sector transfer fixup errors. */
+	if (unlikely(!ntfs_is_indx_record(ia->magic))) {
+		ntfs_error(sb, "Directory index record with vcn 0x%llx is "
+				"corrupt.  Corrupt inode 0x%lx.  Run chkdsk.",
+				(unsigned long long)vcn, dir_ni->mft_no);
 		goto unm_err_out;
 	}
 	if (sle64_to_cpu(ia->index_block_vcn) != vcn) {
@@ -348,7 +353,6 @@ MFT_REF ntfs_lookup_inode_by_name(ntfs_inode *dir_ni, const ntfschar *uname,
 				"bug.", (unsigned long long)
 				sle64_to_cpu(ia->index_block_vcn),
 				(unsigned long long)vcn, dir_ni->mft_no);
-		err = -EIO;
 		goto unm_err_out;
 	}
 	if (le32_to_cpu(ia->index.allocated_size) + 0x18 !=
@@ -360,7 +364,6 @@ MFT_REF ntfs_lookup_inode_by_name(ntfs_inode *dir_ni, const ntfschar *uname,
 				(unsigned long long)vcn, dir_ni->mft_no,
 				le32_to_cpu(ia->index.allocated_size) + 0x18,
 				dir_ni->itype.index.block_size);
-		err = -EIO;
 		goto unm_err_out;
 	}
 	index_end = (u8*)ia + dir_ni->itype.index.block_size;
@@ -370,7 +373,6 @@ MFT_REF ntfs_lookup_inode_by_name(ntfs_inode *dir_ni, const ntfschar *uname,
 				"Cannot access! This is probably a bug in the "
 				"driver.", (unsigned long long)vcn,
 				dir_ni->mft_no);
-		err = -EIO;
 		goto unm_err_out;
 	}
 	index_end = (u8*)&ia->index + le32_to_cpu(ia->index.index_length);
@@ -378,7 +380,6 @@ MFT_REF ntfs_lookup_inode_by_name(ntfs_inode *dir_ni, const ntfschar *uname,
 		ntfs_error(sb, "Size of index buffer (VCN 0x%llx) of directory "
 				"inode 0x%lx exceeds maximum size.",
 				(unsigned long long)vcn, dir_ni->mft_no);
-		err = -EIO;
 		goto unm_err_out;
 	}
 	/* The first index entry. */
@@ -398,7 +399,6 @@ MFT_REF ntfs_lookup_inode_by_name(ntfs_inode *dir_ni, const ntfschar *uname,
 			ntfs_error(sb, "Index entry out of bounds in "
 					"directory inode 0x%lx.",
 					dir_ni->mft_no);
-			err = -EIO;
 			goto unm_err_out;
 		}
 		/*
@@ -551,7 +551,6 @@ MFT_REF ntfs_lookup_inode_by_name(ntfs_inode *dir_ni, const ntfschar *uname,
 			ntfs_error(sb, "Index entry with child node found in "
 					"a leaf node in directory inode 0x%lx.",
 					dir_ni->mft_no);
-			err = -EIO;
 			goto unm_err_out;
 		}
 		/* Child node present, descend into it. */
@@ -572,7 +571,6 @@ MFT_REF ntfs_lookup_inode_by_name(ntfs_inode *dir_ni, const ntfschar *uname,
 		}
 		ntfs_error(sb, "Negative child node vcn in directory inode "
 				"0x%lx.", dir_ni->mft_no);
-		err = -EIO;
 		goto unm_err_out;
 	}
 	/*
@@ -591,6 +589,8 @@ MFT_REF ntfs_lookup_inode_by_name(ntfs_inode *dir_ni, const ntfschar *uname,
 	unlock_page(page);
 	ntfs_unmap_page(page);
 err_out:
+	if (!err)
+		err = -EIO;
 	if (ctx)
 		ntfs_attr_put_search_ctx(ctx);
 	if (m)
@@ -602,7 +602,6 @@ MFT_REF ntfs_lookup_inode_by_name(ntfs_inode *dir_ni, const ntfschar *uname,
 	return ERR_MREF(err);
 dir_err_out:
 	ntfs_error(sb, "Corrupt directory.  Aborting lookup.");
-	err = -EIO;
 	goto err_out;
 }

@@ -780,7 +779,6 @@ u64 ntfs_lookup_inode_by_name(ntfs_inode *dir_ni, const ntfschar *uname,
 		ntfs_error(sb, "No index allocation attribute but index entry "
 				"requires one. Directory inode 0x%lx is "
 				"corrupt or driver bug.", dir_ni->mft_no);
-		err = -EIO;
 		goto err_out;
 	}
 	/* Get the starting vcn of the index_block holding the child node. */
@@ -818,7 +816,13 @@ u64 ntfs_lookup_inode_by_name(ntfs_inode *dir_ni, const ntfschar *uname,
 	if ((u8*)ia < kaddr || (u8*)ia > kaddr + PAGE_CACHE_SIZE) {
 		ntfs_error(sb, "Out of bounds check failed. Corrupt directory "
 				"inode 0x%lx or driver bug.", dir_ni->mft_no);
-		err = -EIO;
+		goto unm_err_out;
+	}
+	/* Catch multi sector transfer fixup errors. */
+	if (unlikely(!ntfs_is_indx_record(ia->magic))) {
+		ntfs_error(sb, "Directory index record with vcn 0x%llx is "
+				"corrupt.  Corrupt inode 0x%lx.  Run chkdsk.",
+				(unsigned long long)vcn, dir_ni->mft_no);
 		goto unm_err_out;
 	}
 	if (sle64_to_cpu(ia->index_block_vcn) != vcn) {
@@ -828,7 +832,6 @@ u64 ntfs_lookup_inode_by_name(ntfs_inode *dir_ni, const ntfschar *uname,
 				"bug.", (unsigned long long)
 				sle64_to_cpu(ia->index_block_vcn),
 				(unsigned long long)vcn, dir_ni->mft_no);
-		err = -EIO;
 		goto unm_err_out;
 	}
 	if (le32_to_cpu(ia->index.allocated_size) + 0x18 !=
@@ -840,7 +843,6 @@ u64 ntfs_lookup_inode_by_name(ntfs_inode *dir_ni, const ntfschar *uname,
 				(unsigned long long)vcn, dir_ni->mft_no,
 				le32_to_cpu(ia->index.allocated_size) + 0x18,
 				dir_ni->itype.index.block_size);
-		err = -EIO;
 		goto unm_err_out;
 	}
 	index_end = (u8*)ia + dir_ni->itype.index.block_size;
@@ -850,7 +852,6 @@ u64 ntfs_lookup_inode_by_name(ntfs_inode *dir_ni, const ntfschar *uname,
 				"Cannot access! This is probably a bug in the "
 				"driver.", (unsigned long long)vcn,
 				dir_ni->mft_no);
-		err = -EIO;
 		goto unm_err_out;
 	}
 	index_end = (u8*)&ia->index + le32_to_cpu(ia->index.index_length);
@@ -858,7 +859,6 @@ u64 ntfs_lookup_inode_by_name(ntfs_inode *dir_ni, const ntfschar *uname,
 		ntfs_error(sb, "Size of index buffer (VCN 0x%llx) of directory "
 				"inode 0x%lx exceeds maximum size.",
 				(unsigned long long)vcn, dir_ni->mft_no);
-		err = -EIO;
 		goto unm_err_out;
 	}
 	/* The first index entry. */
@@ -878,7 +878,6 @@ u64 ntfs_lookup_inode_by_name(ntfs_inode *dir_ni, const ntfschar *uname,
 			ntfs_error(sb, "Index entry out of bounds in "
 					"directory inode 0x%lx.",
 					dir_ni->mft_no);
-			err = -EIO;
 			goto unm_err_out;
 		}
 		/*
@@ -962,7 +961,6 @@ u64 ntfs_lookup_inode_by_name(ntfs_inode *dir_ni, const ntfschar *uname,
 			ntfs_error(sb, "Index entry with child node found in "
 					"a leaf node in directory inode 0x%lx.",
 					dir_ni->mft_no);
-			err = -EIO;
 			goto unm_err_out;
 		}
 		/* Child node present, descend into it. */
@@ -982,7 +980,6 @@ u64 ntfs_lookup_inode_by_name(ntfs_inode *dir_ni, const ntfschar *uname,
 		}
 		ntfs_error(sb, "Negative child node vcn in directory inode "
 				"0x%lx.", dir_ni->mft_no);
-		err = -EIO;
 		goto unm_err_out;
 	}
 	/* No child node, return -ENOENT. */
@@ -992,6 +989,8 @@ u64 ntfs_lookup_inode_by_name(ntfs_inode *dir_ni, const ntfschar *uname,
 	unlock_page(page);
 	ntfs_unmap_page(page);
 err_out:
+	if (!err)
+		err = -EIO;
 	if (ctx)
 		ntfs_attr_put_search_ctx(ctx);
 	if (m)
@@ -999,7 +998,6 @@ u64 ntfs_lookup_inode_by_name(ntfs_inode *dir_ni, const ntfschar *uname,
 	return ERR_MREF(err);
 dir_err_out:
 	ntfs_error(sb, "Corrupt directory. Aborting lookup.");
-	err = -EIO;
 	goto err_out;
 }

@@ -1340,6 +1338,14 @@ static int ntfs_readdir(struct file *filp, void *dirent, filldir_t filldir)
 				"inode 0x%lx or driver bug.", vdir->i_ino);
 		goto err_out;
 	}
+	/* Catch multi sector transfer fixup errors. */
+	if (unlikely(!ntfs_is_indx_record(ia->magic))) {
+		ntfs_error(sb, "Directory index record with vcn 0x%llx is "
+				"corrupt.  Corrupt inode 0x%lx.  Run chkdsk.",
+				(unsigned long long)ia_pos >>
+				ndir->itype.index.vcn_size_bits, vdir->i_ino);
+		goto err_out;
+	}
 	if (unlikely(sle64_to_cpu(ia->index_block_vcn) != (ia_pos &
 			~(s64)(ndir->itype.index.block_size - 1)) >>
 			ndir->itype.index.vcn_size_bits)) {

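Note on the fs/ntfs/dir.c hunks above: the per-site "err = -EIO;" assignments are dropped and the err_out label supplies -EIO whenever no specific code was set. The following is a minimal standalone sketch of that pattern, not driver code. The function check_record() and its checks are hypothetical and exist only for illustration.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical record validator, not part of the ntfs driver.
 * Returns 0 on success or a negative errno value on failure.
 */
int check_record(const unsigned char *rec, size_t len, size_t block_size)
{
	int err = 0;

	if (!rec) {
		/* A specific cause: set the code explicitly. */
		err = -EINVAL;
		goto err_out;
	}
	/* Corruption checks only jump; no per-site "err = -EIO". */
	if (len < 4)
		goto err_out;
	if (rec[0] != 'I' || rec[1] != 'N' || rec[2] != 'D' || rec[3] != 'X')
		goto err_out;
	if (len > block_size)
		goto err_out;
	return 0;
err_out:
	/* Default for every path that did not choose its own code. */
	if (!err)
		err = -EIO;
	return err;
}
```

One consequence of this layout is that a path which sets a specific code (for example -ENOMEM) keeps it, while every plain "goto" reports -EIO.
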
--- a/fs/ntfs/file.c
+++ b/fs/ntfs/file.c
@@ -145,7 +145,7 @@ struct file_operations ntfs_file_ops = {

 struct inode_operations ntfs_file_inode_ops = {
 #ifdef NTFS_RW
-	.truncate	= ntfs_truncate,
+	.truncate	= ntfs_truncate_vfs,
 	.setattr	= ntfs_setattr,
 #endif /* NTFS_RW */
 };

--- a/fs/ntfs/index.c
+++ b/fs/ntfs/index.c
@@ -263,7 +263,6 @@ int ntfs_index_lookup(const void *key, const int key_len,
 		ntfs_error(sb, "No index allocation attribute but index entry "
 				"requires one.  Inode 0x%lx is corrupt or "
 				"driver bug.", idx_ni->mft_no);
-		err = -EIO;
 		goto err_out;
 	}
 	/* Get the starting vcn of the index_block holding the child node. */
@@ -301,7 +300,13 @@ int ntfs_index_lookup(const void *key, const int key_len,
 	if ((u8*)ia < kaddr || (u8*)ia > kaddr + PAGE_CACHE_SIZE) {
 		ntfs_error(sb, "Out of bounds check failed.  Corrupt inode "
 				"0x%lx or driver bug.", idx_ni->mft_no);
-		err = -EIO;
+		goto unm_err_out;
+	}
+	/* Catch multi sector transfer fixup errors. */
+	if (unlikely(!ntfs_is_indx_record(ia->magic))) {
+		ntfs_error(sb, "Index record with vcn 0x%llx is corrupt.  "
+				"Corrupt inode 0x%lx.  Run chkdsk.",
+				(long long)vcn, idx_ni->mft_no);
 		goto unm_err_out;
 	}
 	if (sle64_to_cpu(ia->index_block_vcn) != vcn) {
@@ -311,7 +316,6 @@ int ntfs_index_lookup(const void *key, const int key_len,
 				(unsigned long long)
 				sle64_to_cpu(ia->index_block_vcn),
 				(unsigned long long)vcn, idx_ni->mft_no);
-		err = -EIO;
 		goto unm_err_out;
 	}
 	if (le32_to_cpu(ia->index.allocated_size) + 0x18 !=
@@ -323,7 +327,6 @@ int ntfs_index_lookup(const void *key, const int key_len,
 				idx_ni->mft_no,
 				le32_to_cpu(ia->index.allocated_size) + 0x18,
 				idx_ni->itype.index.block_size);
-		err = -EIO;
 		goto unm_err_out;
 	}
 	index_end = (u8*)ia + idx_ni->itype.index.block_size;
@@ -333,7 +336,6 @@ int ntfs_index_lookup(const void *key, const int key_len,
 				"access!  This is probably a bug in the "
 				"driver.", (unsigned long long)vcn,
 				idx_ni->mft_no);
-		err = -EIO;
 		goto unm_err_out;
 	}
 	index_end = (u8*)&ia->index + le32_to_cpu(ia->index.index_length);
@@ -341,7 +343,6 @@ int ntfs_index_lookup(const void *key, const int key_len,
 		ntfs_error(sb, "Size of index buffer (VCN 0x%llx) of inode "
 				"0x%lx exceeds maximum size.",
 				(unsigned long long)vcn, idx_ni->mft_no);
-		err = -EIO;
 		goto unm_err_out;
 	}
 	/* The first index entry. */
@@ -359,11 +360,10 @@ int ntfs_index_lookup(const void *key, const int key_len,
 				(u8*)ie + le16_to_cpu(ie->length) > index_end) {
 			ntfs_error(sb, "Index entry out of bounds in inode "
 					"0x%lx.", idx_ni->mft_no);
-			err = -EIO;
 			goto unm_err_out;
 		}
 		/*
-		 * The last entry cannot contain a ket.  It can however contain
+		 * The last entry cannot contain a key.  It can however contain
 		 * a pointer to a child node in the B+tree so we just break out.
 		 */
 		if (ie->flags & INDEX_ENTRY_END)
@@ -377,7 +377,6 @@ int ntfs_index_lookup(const void *key, const int key_len,
 				le16_to_cpu(ie->length)) {
 			ntfs_error(sb, "Index entry out of bounds in inode "
 					"0x%lx.", idx_ni->mft_no);
-			err = -EIO;
 			goto unm_err_out;
 		}
 		/* If the keys match perfectly, we setup @ictx and return 0. */
@@ -424,7 +423,6 @@ int ntfs_index_lookup(const void *key, const int key_len,
 	if ((ia->index.flags & NODE_MASK) == LEAF_NODE) {
 		ntfs_error(sb, "Index entry with child node found in a leaf "
 				"node in inode 0x%lx.", idx_ni->mft_no);
-		err = -EIO;
 		goto unm_err_out;
 	}
 	/* Child node present, descend into it. */
@@ -446,11 +444,12 @@ int ntfs_index_lookup(const void *key, const int key_len,
 	}
 	ntfs_error(sb, "Negative child node vcn in inode 0x%lx.",
 			idx_ni->mft_no);
-	err = -EIO;
 unm_err_out:
 	unlock_page(page);
 	ntfs_unmap_page(page);
 err_out:
+	if (!err)
+		err = -EIO;
 	if (actx)
 		ntfs_attr_put_search_ctx(actx);
 	if (m)
@@ -458,6 +457,5 @@ int ntfs_index_lookup(const void *key, const int key_len,
 	return err;
 idx_err_out:
 	ntfs_error(sb, "Corrupt index.  Aborting lookup.");
-	err = -EIO;
 	goto err_out;
 }
--- a/fs/ntfs/inode.c
+++ b/fs/ntfs/inode.c
@@ -352,7 +352,7 @@ static inline ntfs_inode *ntfs_alloc_extent_inode(void)
 	return NULL;
 }

-void ntfs_destroy_extent_inode(ntfs_inode *ni)
+static void ntfs_destroy_extent_inode(ntfs_inode *ni)
 {
 	ntfs_debug("Entering.");
 	BUG_ON(ni->page);
@@ -564,13 +564,11 @@ static int ntfs_read_locked_inode(struct inode *vi)
 	}

 	if (!(m->flags & MFT_RECORD_IN_USE)) {
-		ntfs_error(vi->i_sb, "Inode is not in use! You should "
-				"run chkdsk.");
+		ntfs_error(vi->i_sb, "Inode is not in use!");
 		goto unm_err_out;
 	}
 	if (m->base_mft_record) {
-		ntfs_error(vi->i_sb, "Inode is an extent inode! You should "
-				"run chkdsk.");
+		ntfs_error(vi->i_sb, "Inode is an extent inode!");
 		goto unm_err_out;
 	}

@@ -667,7 +665,7 @@ static int ntfs_read_locked_inode(struct inode *vi)
 	if (err) {
 		if (unlikely(err != -ENOENT)) {
 			ntfs_error(vi->i_sb, "Failed to lookup attribute list "
-					"attribute. You should run chkdsk.");
+					"attribute.");
 			goto unm_err_out;
 		}
 	} else /* if (!err) */ {
@@ -679,9 +677,7 @@ static int ntfs_read_locked_inode(struct inode *vi)
 				ctx->attr->flags & ATTR_COMPRESSION_MASK ||
 				ctx->attr->flags & ATTR_IS_SPARSE) {
 			ntfs_error(vi->i_sb, "Attribute list attribute is "
-					"compressed/encrypted/sparse. Not "
-					"allowed. Corrupt inode. You should "
-					"run chkdsk.");
+					"compressed/encrypted/sparse.");
 			goto unm_err_out;
 		}
 		/* Now allocate memory for the attribute list. */
@@ -697,9 +693,7 @@ static int ntfs_read_locked_inode(struct inode *vi)
 			NInoSetAttrListNonResident(ni);
 			if (ctx->attr->data.non_resident.lowest_vcn) {
 				ntfs_error(vi->i_sb, "Attribute list has non "
-						"zero lowest_vcn. Inode is "
-						"corrupt. You should run "
-						"chkdsk.");
+						"zero lowest_vcn.");
 				goto unm_err_out;
 			}
 			/*
@@ -712,10 +706,7 @@ static int ntfs_read_locked_inode(struct inode *vi)
 				err = PTR_ERR(ni->attr_list_rl.rl);
 				ni->attr_list_rl.rl = NULL;
 				ntfs_error(vi->i_sb, "Mapping pairs "
-						"decompression failed with "
-						"error code %i. Corrupt "
-						"attribute list in inode.",
-						-err);
+						"decompression failed.");
 				goto unm_err_out;
 			}
 			/* Now load the attribute list. */
@@ -770,9 +761,18 @@ static int ntfs_read_locked_inode(struct inode *vi)
 			goto unm_err_out;
 		}
 		/* Set up the state. */
-		if (ctx->attr->non_resident) {
-			ntfs_error(vi->i_sb, "$INDEX_ROOT attribute is "
-					"not resident. Not allowed.");
+		if (unlikely(ctx->attr->non_resident)) {
+			ntfs_error(vol->sb, "$INDEX_ROOT attribute is not "
+					"resident.");
+			goto unm_err_out;
+		}
+		/* Ensure the attribute name is placed before the value. */
+		if (unlikely(ctx->attr->name_length &&
+				(le16_to_cpu(ctx->attr->name_offset) >=
+				le16_to_cpu(ctx->attr->data.resident.
+				value_offset)))) {
+			ntfs_error(vol->sb, "$INDEX_ROOT attribute name is "
+					"placed after the attribute value.");
 			goto unm_err_out;
 		}
 		/*
@@ -786,8 +786,7 @@ static int ntfs_read_locked_inode(struct inode *vi)
 		if (ctx->attr->flags & ATTR_IS_ENCRYPTED) {
 			if (ctx->attr->flags & ATTR_COMPRESSION_MASK) {
 				ntfs_error(vi->i_sb, "Found encrypted and "
-						"compressed attribute. Not "
-						"allowed.");
+						"compressed attribute.");
 				goto unm_err_out;
 			}
 			NInoSetEncrypted(ni);
@@ -811,12 +810,12 @@ static int ntfs_read_locked_inode(struct inode *vi)
 		}
 		if (ir->type != AT_FILE_NAME) {
 			ntfs_error(vi->i_sb, "Indexed attribute is not "
-					"$FILE_NAME. Not allowed.");
+					"$FILE_NAME.");
 			goto unm_err_out;
 		}
 		if (ir->collation_rule != COLLATION_FILE_NAME) {
 			ntfs_error(vi->i_sb, "Index collation rule is not "
-					"COLLATION_FILE_NAME. Not allowed.");
+					"COLLATION_FILE_NAME.");
 			goto unm_err_out;
 		}
 		ni->itype.index.collation_rule = ir->collation_rule;
@@ -883,8 +882,7 @@ static int ntfs_read_locked_inode(struct inode *vi)
 			if (err == -ENOENT)
 				ntfs_error(vi->i_sb, "$INDEX_ALLOCATION "
 						"attribute is not present but "
-						"$INDEX_ROOT indicated it "
-						"is.");
+						"$INDEX_ROOT indicated it is.");
 			else
 				ntfs_error(vi->i_sb, "Failed to lookup "
 						"$INDEX_ALLOCATION "
@@ -896,6 +894,19 @@ static int ntfs_read_locked_inode(struct inode *vi)
 					"is resident.");
 			goto unm_err_out;
 		}
+		/*
+		 * Ensure the attribute name is placed before the mapping pairs
+		 * array.
+		 */
+		if (unlikely(ctx->attr->name_length &&
+				(le16_to_cpu(ctx->attr->name_offset) >=
+				le16_to_cpu(ctx->attr->data.non_resident.
+				mapping_pairs_offset)))) {
+			ntfs_error(vol->sb, "$INDEX_ALLOCATION attribute name "
+					"is placed after the mapping pairs "
+					"array.");
+			goto unm_err_out;
+		}
 		if (ctx->attr->flags & ATTR_IS_ENCRYPTED) {
 			ntfs_error(vi->i_sb, "$INDEX_ALLOCATION attribute "
 					"is encrypted.");
@@ -914,8 +925,7 @@ static int ntfs_read_locked_inode(struct inode *vi)
 		if (ctx->attr->data.non_resident.lowest_vcn) {
 			ntfs_error(vi->i_sb, "First extent of "
 					"$INDEX_ALLOCATION attribute has non "
-					"zero lowest_vcn. Inode is corrupt. "
-					"You should run chkdsk.");
+					"zero lowest_vcn.");
 			goto unm_err_out;
 		}
 		vi->i_size = sle64_to_cpu(
@@ -997,8 +1007,7 @@ static int ntfs_read_locked_inode(struct inode *vi)
 				goto no_data_attr_special_case;
 			// FIXME: File is corrupt! Hot-fix with empty data
 			// attribute if recovery option is set.
-			ntfs_error(vi->i_sb, "$DATA attribute is "
-					"missing.");
+			ntfs_error(vi->i_sb, "$DATA attribute is missing.");
 			goto unm_err_out;
 		}
 		/* Setup the state. */
@@ -1029,9 +1038,7 @@ static int ntfs_read_locked_inode(struct inode *vi)
 					ntfs_error(vi->i_sb, "Found "
 						"nonstandard compression unit "
 						"(%u instead of 4).  Cannot "
-						"handle this. This might "
-						"indicate corruption so you "
-						"should run chkdsk.",
+						"handle this.",
 						ctx->attr->data.non_resident.
 						compression_unit);
 					err = -EOPNOTSUPP;
@@ -1057,8 +1064,7 @@ static int ntfs_read_locked_inode(struct inode *vi)
 			if (ctx->attr->data.non_resident.lowest_vcn) {
 				ntfs_error(vi->i_sb, "First extent of $DATA "
 						"attribute has non zero "
-						"lowest_vcn. Inode is corrupt. "
-						"You should run chkdsk.");
+						"lowest_vcn.");
 				goto unm_err_out;
 			}
 			/* Setup all the sizes. */
@@ -1127,9 +1133,11 @@ static int ntfs_read_locked_inode(struct inode *vi)
 	if (m)
 		unmap_mft_record(ni);
 err_out:
-	ntfs_error(vi->i_sb, "Failed with error code %i. Marking inode 0x%lx "
-			"as bad.", -err, vi->i_ino);
+	ntfs_error(vol->sb, "Failed with error code %i.  Marking corrupt "
+			"inode 0x%lx as bad.  Run chkdsk.", err, vi->i_ino);
 	make_bad_inode(vi);
+	if (err != -EOPNOTSUPP && err != -ENOMEM)
+		NVolSetErrors(vol);
 	return err;
 }

@@ -1200,15 +1208,21 @@ static int ntfs_read_locked_attr_inode(struct inode *base_vi, struct inode *vi)
 		goto unm_err_out;

 	if (!ctx->attr->non_resident) {
+		/* Ensure the attribute name is placed before the value. */
+		if (unlikely(ctx->attr->name_length &&
+				(le16_to_cpu(ctx->attr->name_offset) >=
+				le16_to_cpu(ctx->attr->data.resident.
+				value_offset)))) {
+			ntfs_error(vol->sb, "Attribute name is placed after "
+					"the attribute value.");
+			goto unm_err_out;
+		}
 		if (NInoMstProtected(ni) || ctx->attr->flags) {
 			ntfs_error(vi->i_sb, "Found mst protected attribute "
 					"or attribute with non-zero flags but "
-					"the attribute is resident (mft_no "
-					"0x%lx, type 0x%x, name_len %i). "
-					"Please report you saw this message "
-					"to linux-ntfs-dev@lists."
-					"sourceforge.net",
-					vi->i_ino, ni->type, ni->name_len);
+					"the attribute is resident.  Please "
+					"report you saw this message to "
+					"linux-ntfs-dev@lists.sourceforge.net");
 			goto unm_err_out;
 		}
 		/*
@@ -1219,58 +1233,61 @@ static int ntfs_read_locked_attr_inode(struct inode *base_vi, struct inode *vi)
 			le32_to_cpu(ctx->attr->data.resident.value_length);
 	} else {
 		NInoSetNonResident(ni);
+		/*
+		 * Ensure the attribute name is placed before the mapping pairs
+		 * array.
+		 */
+		if (unlikely(ctx->attr->name_length &&
+				(le16_to_cpu(ctx->attr->name_offset) >=
+				le16_to_cpu(ctx->attr->data.non_resident.
+				mapping_pairs_offset)))) {
+			ntfs_error(vol->sb, "Attribute name is placed after "
+					"the mapping pairs array.");
+			goto unm_err_out;
+		}
 		if (ctx->attr->flags & ATTR_COMPRESSION_MASK) {
 			if (NInoMstProtected(ni)) {
 				ntfs_error(vi->i_sb, "Found mst protected "
 						"attribute but the attribute "
-						"is compressed (mft_no 0x%lx, "
-						"type 0x%x, name_len %i). "
-						"Please report you saw this "
-						"message to linux-ntfs-dev@"
-						"lists.sourceforge.net",
-						vi->i_ino, ni->type,
-						ni->name_len);
+						"is compressed.  Please report "
+						"you saw this message to "
+						"linux-ntfs-dev@lists."
+						"sourceforge.net");
 				goto unm_err_out;
 			}
 			NInoSetCompressed(ni);
 			if ((ni->type != AT_DATA) || (ni->type == AT_DATA &&
 					ni->name_len)) {
-				ntfs_error(vi->i_sb, "Found compressed non-"
-						"data or named data attribute "
-						"(mft_no 0x%lx, type 0x%x, "
-						"name_len %i). Please report "
+				ntfs_error(vi->i_sb, "Found compressed "
+						"non-data or named data "
+						"attribute.  Please report "
 						"you saw this message to "
 						"linux-ntfs-dev@lists."
-						"sourceforge.net",
-						vi->i_ino, ni->type,
-						ni->name_len);
+						"sourceforge.net");
 				goto unm_err_out;
 			}
 			if (vol->cluster_size > 4096) {
-				ntfs_error(vi->i_sb, "Found "
-					"compressed attribute but "
-					"compression is disabled due "
-					"to cluster size (%i) > 4kiB.",
+				ntfs_error(vi->i_sb, "Found compressed "
+						"attribute but compression is "
+						"disabled due to cluster size "
+						"(%i) > 4kiB.",
 						vol->cluster_size);
 				goto unm_err_out;
 			}
 			if ((ctx->attr->flags & ATTR_COMPRESSION_MASK)
 					!= ATTR_IS_COMPRESSED) {
 				ntfs_error(vi->i_sb, "Found unknown "
-						"compression method or "
-						"corrupt file.");
+						"compression method.");
 				goto unm_err_out;
 			}
 			ni->itype.compressed.block_clusters = 1U <<
 					ctx->attr->data.non_resident.
 					compression_unit;
-			if (ctx->attr->data.non_resident.compression_unit != 4) {
-				ntfs_error(vi->i_sb, "Found "
-					"nonstandard compression unit "
-					"(%u instead of 4). Cannot "
-					"handle this. This might "
-					"indicate corruption so you "
-					"should run chkdsk.",
+			if (ctx->attr->data.non_resident.compression_unit !=
+					4) {
+				ntfs_error(vi->i_sb, "Found nonstandard "
+						"compression unit (%u instead "
+						"of 4).  Cannot handle this.",
 						ctx->attr->data.non_resident.
 						compression_unit);
 				err = -EOPNOTSUPP;
@@ -1292,13 +1309,10 @@ static int ntfs_read_locked_attr_inode(struct inode *base_vi, struct inode *vi)
 			if (NInoMstProtected(ni)) {
 				ntfs_error(vi->i_sb, "Found mst protected "
 						"attribute but the attribute "
-						"is encrypted (mft_no 0x%lx, "
-						"type 0x%x, name_len %i). "
-						"Please report you saw this "
-						"message to linux-ntfs-dev@"
-						"lists.sourceforge.net",
-						vi->i_ino, ni->type,
-						ni->name_len);
+						"is encrypted.  Please report "
+						"you saw this message to "
+						"linux-ntfs-dev@lists."
+						"sourceforge.net");
 				goto unm_err_out;
 			}
 			NInoSetEncrypted(ni);
@@ -1307,21 +1321,17 @@ static int ntfs_read_locked_attr_inode(struct inode *base_vi, struct inode *vi)
 			if (NInoMstProtected(ni)) {
 				ntfs_error(vi->i_sb, "Found mst protected "
 						"attribute but the attribute "
-						"is sparse (mft_no 0x%lx, "
-						"type 0x%x, name_len %i). "
-						"Please report you saw this "
-						"message to linux-ntfs-dev@"
-						"lists.sourceforge.net",
-						vi->i_ino, ni->type,
-						ni->name_len);
+						"is sparse.  Please report "
+						"you saw this message to "
+						"linux-ntfs-dev@lists."
+						"sourceforge.net");
 				goto unm_err_out;
 			}
 			NInoSetSparse(ni);
 		}
 		if (ctx->attr->data.non_resident.lowest_vcn) {
 			ntfs_error(vi->i_sb, "First extent of attribute has "
-					"non-zero lowest_vcn. Inode is "
-					"corrupt. You should run chkdsk.");
+					"non-zero lowest_vcn.");
 			goto unm_err_out;
 		}
 		/* Setup all the sizes. */
@@ -1372,10 +1382,15 @@ static int ntfs_read_locked_attr_inode(struct inode *base_vi, struct inode *vi)
 		ntfs_attr_put_search_ctx(ctx);
 	unmap_mft_record(base_ni);
 err_out:
-	ntfs_error(vi->i_sb, "Failed with error code %i while reading "
-			"attribute inode (mft_no 0x%lx, type 0x%x, name_len "
-			"%i.", -err, vi->i_ino, ni->type, ni->name_len);
+	ntfs_error(vol->sb, "Failed with error code %i while reading attribute "
+			"inode (mft_no 0x%lx, type 0x%x, name_len %i).  "
+			"Marking corrupt inode and base inode 0x%lx as bad.  "
+			"Run chkdsk.", err, vi->i_ino, ni->type, ni->name_len,
+			base_vi->i_ino);
 	make_bad_inode(vi);
+	make_bad_inode(base_vi);
+	if (err != -ENOMEM)
+		NVolSetErrors(vol);
 	return err;
 }

@@ -1460,16 +1475,24 @@ static int ntfs_read_locked_index_inode(struct inode *base_vi, struct inode *vi)
 		goto unm_err_out;
 	}
 	/* Set up the state. */
-	if (ctx->attr->non_resident) {
-		ntfs_error(vi->i_sb, "$INDEX_ROOT attribute is not resident.  "
-				"Not allowed.");
+	if (unlikely(ctx->attr->non_resident)) {
+		ntfs_error(vol->sb, "$INDEX_ROOT attribute is not resident.");
+		goto unm_err_out;
+	}
+	/* Ensure the attribute name is placed before the value. */
+	if (unlikely(ctx->attr->name_length &&
+			(le16_to_cpu(ctx->attr->name_offset) >=
+			le16_to_cpu(ctx->attr->data.resident.
+			value_offset)))) {
+		ntfs_error(vol->sb, "$INDEX_ROOT attribute name is placed "
+				"after the attribute value.");
 		goto unm_err_out;
 	}
 	/* Compressed/encrypted/sparse index root is not allowed. */
 	if (ctx->attr->flags & (ATTR_COMPRESSION_MASK | ATTR_IS_ENCRYPTED |
 			ATTR_IS_SPARSE)) {
 		ntfs_error(vi->i_sb, "Found compressed/encrypted/sparse index "
-				"root attribute.  Not allowed.");
+				"root attribute.");
 		goto unm_err_out;
 	}
 	ir = (INDEX_ROOT*)((u8*)ctx->attr +
@@ -1485,8 +1508,8 @@ static int ntfs_read_locked_index_inode(struct inode *base_vi, struct inode *vi)
 		goto unm_err_out;
 	}
 	if (ir->type) {
-		ntfs_error(vi->i_sb, "Index type is not 0 (type is 0x%x).  "
-				"Not allowed.", le32_to_cpu(ir->type));
+		ntfs_error(vi->i_sb, "Index type is not 0 (type is 0x%x).",
+				le32_to_cpu(ir->type));
 		goto unm_err_out;
 	}
 	ni->itype.index.collation_rule = ir->collation_rule;
@@ -1552,6 +1575,16 @@ static int ntfs_read_locked_index_inode(struct inode *base_vi, struct inode *vi)
 				"resident.");
 		goto unm_err_out;
 	}
+	/*
+	 * Ensure the attribute name is placed before the mapping pairs array.
+	 */
+	if (unlikely(ctx->attr->name_length && (le16_to_cpu(
+			ctx->attr->name_offset) >= le16_to_cpu(
+			ctx->attr->data.non_resident.mapping_pairs_offset)))) {
+		ntfs_error(vol->sb, "$INDEX_ALLOCATION attribute name is "
+				"placed after the mapping pairs array.");
+		goto unm_err_out;
+	}
 	if (ctx->attr->flags & ATTR_IS_ENCRYPTED) {
 		ntfs_error(vi->i_sb, "$INDEX_ALLOCATION attribute is "
 				"encrypted.");
@@ -1568,8 +1601,7 @@ static int ntfs_read_locked_index_inode(struct inode *base_vi, struct inode *vi)
 	}
 	if (ctx->attr->data.non_resident.lowest_vcn) {
 		ntfs_error(vi->i_sb, "First extent of $INDEX_ALLOCATION "
-				"attribute has non zero lowest_vcn.  Inode is "
-				"corrupt. You should run chkdsk.");
+				"attribute has non zero lowest_vcn.");
 		goto unm_err_out;
 	}
 	vi->i_size = sle64_to_cpu(ctx->attr->data.non_resident.data_size);
@@ -1595,16 +1627,16 @@ static int ntfs_read_locked_index_inode(struct inode *base_vi, struct inode *vi)
 	bni = NTFS_I(bvi);
 	if (NInoCompressed(bni) || NInoEncrypted(bni) ||
 			NInoSparse(bni)) {
-		ntfs_error(vi->i_sb, "$BITMAP attribute is compressed "
-				"and/or encrypted and/or sparse.");
+		ntfs_error(vi->i_sb, "$BITMAP attribute is compressed and/or "
+				"encrypted and/or sparse.");
 		goto iput_unm_err_out;
 	}
 	/* Consistency check bitmap size vs. index allocation size. */
 	if ((bvi->i_size << 3) < (vi->i_size >>
 			ni->itype.index.block_size_bits)) {
-		ntfs_error(vi->i_sb, "Index bitmap too small (0x%llx) "
-				"for index allocation (0x%llx).",
-				bvi->i_size << 3, vi->i_size);
+		ntfs_error(vi->i_sb, "Index bitmap too small (0x%llx) for "
+				"index allocation (0x%llx).", bvi->i_size << 3,
+				vi->i_size);
 		goto iput_unm_err_out;
 	}
 	ni->itype.index.bmp_ino = bvi;
@@ -1637,9 +1669,11 @@ static int ntfs_read_locked_index_inode(struct inode *base_vi, struct inode *vi)
 		unmap_mft_record(base_ni);
 err_out:
 	ntfs_error(vi->i_sb, "Failed with error code %i while reading index "
-			"inode (mft_no 0x%lx, name_len %i.", -err, vi->i_ino,
+			"inode (mft_no 0x%lx, name_len %i.", err, vi->i_ino,
 			ni->name_len);
 	make_bad_inode(vi);
+	if (err != -EOPNOTSUPP && err != -ENOMEM)
+		NVolSetErrors(vol);
 	return err;
 }

@@ -2099,7 +2133,7 @@ void ntfs_put_inode(struct inode *vi)
 	}
 }

-void __ntfs_clear_inode(ntfs_inode *ni)
+static void __ntfs_clear_inode(ntfs_inode *ni)
 {
 	/* Free all allocated memory. */
 	down_write(&ni->runlist.lock);
@@ -2263,68 +2297,93 @@ int ntfs_show_options(struct seq_file *sf, struct vfsmount *mnt)
 * This implies for us that @vi is a file inode rather than a directory, index,
 * or attribute inode as well as that @vi is a base inode.
 *
+ * Returns 0 on success or -errno on error.
+ *
 * Called with ->i_sem held.  In all but one case ->i_alloc_sem is held for
 * writing.  The only case where ->i_alloc_sem is not held is
 * mm/filemap.c::generic_file_buffered_write() where vmtruncate() is called
 * with the current i_size as the offset which means that it is a noop as far
 * as ntfs_truncate() is concerned.
 */
-void ntfs_truncate(struct inode *vi)
+int ntfs_truncate(struct inode *vi)
 {
 	ntfs_inode *ni = NTFS_I(vi);
+	ntfs_volume *vol = ni->vol;
 	ntfs_attr_search_ctx *ctx;
 	MFT_RECORD *m;
+	const char *te = "  Leaving file length out of sync with i_size.";
 	int err;

+	ntfs_debug("Entering for inode 0x%lx.", vi->i_ino);
 	BUG_ON(NInoAttr(ni));
 	BUG_ON(ni->nr_extents < 0);
 	m = map_mft_record(ni);
 	if (IS_ERR(m)) {
+		err = PTR_ERR(m);
 		ntfs_error(vi->i_sb, "Failed to map mft record for inode 0x%lx "
-				"(error code %ld).", vi->i_ino, PTR_ERR(m));
-		if (PTR_ERR(m) != ENOMEM)
-			make_bad_inode(vi);
-		return;
+				"(error code %d).%s", vi->i_ino, err, te);
+		ctx = NULL;
+		m = NULL;
+		goto err_out;
 	}
 	ctx = ntfs_attr_get_search_ctx(ni, m);
 	if (unlikely(!ctx)) {
-		ntfs_error(vi->i_sb, "Failed to allocate a search context: "
-				"Not enough memory");
-		// FIXME: We can't report an error code upstream.  So what do
-		// we do?!?  make_bad_inode() seems a bit harsh...
-		unmap_mft_record(ni);
-		return;
+		ntfs_error(vi->i_sb, "Failed to allocate a search context for "
+				"inode 0x%lx (not enough memory).%s",
+				vi->i_ino, te);
+		err = -ENOMEM;
+		goto err_out;
 	}
 	err = ntfs_attr_lookup(ni->type, ni->name, ni->name_len,
 			CASE_SENSITIVE, 0, NULL, 0, ctx);
 	if (unlikely(err)) {
-		if (err == -ENOENT) {
+		if (err == -ENOENT)
 			ntfs_error(vi->i_sb, "Open attribute is missing from "
 					"mft record.  Inode 0x%lx is corrupt.  "
 					"Run chkdsk.", vi->i_ino);
-			make_bad_inode(vi);
-		} else {
+		else
 			ntfs_error(vi->i_sb, "Failed to lookup attribute in "
 					"inode 0x%lx (error code %d).",
 					vi->i_ino, err);
-			// FIXME: We can't report an error code upstream.  So
-			// what do we do?!?  make_bad_inode() seems a bit
-			// harsh...
-		}
-		goto out;
+		goto err_out;
 	}
 	/* If the size has not changed there is nothing to do. */
 	if (ntfs_attr_size(ctx->attr) == i_size_read(vi))
-		goto out;
+		goto done;
 	// TODO: Implement the truncate...
 	ntfs_error(vi->i_sb, "Inode size has changed but this is not "
 			"implemented yet.  Resetting inode size to old value. "
 			" This is most likely a bug in the ntfs driver!");
 	i_size_write(vi, ntfs_attr_size(ctx->attr)); 
-out:
+done:
 	ntfs_attr_put_search_ctx(ctx);
 	unmap_mft_record(ni);
-	return;
+	NInoClearTruncateFailed(ni);
+	ntfs_debug("Done.");
+	return 0;
+err_out:
+	if (err != -ENOMEM) {
+		NVolSetErrors(vol);
+		make_bad_inode(vi);
+	}
+	if (ctx)
+		ntfs_attr_put_search_ctx(ctx);
+	if (m)
+		unmap_mft_record(ni);
+	NInoSetTruncateFailed(ni);
+	return err;
+}
+
+/**
+ * ntfs_truncate_vfs - wrapper for ntfs_truncate() that has no return value
+ * @vi:		inode for which the i_size was changed
+ *
+ * Wrapper for ntfs_truncate() that has no return value.
+ *
+ * See ntfs_truncate() description above for details.
+ */
+void ntfs_truncate_vfs(struct inode *vi) {
+	ntfs_truncate(vi);
 }

 /**
@@ -2549,6 +2608,7 @@ int ntfs_write_inode(struct inode *vi, int sync)
 		ntfs_error(vi->i_sb, "Failed (error code %i):  Marking inode "
 				"as bad.  You should run chkdsk.", -err);
 		make_bad_inode(vi);
+		NVolSetErrors(ni->vol);
 	}
 	return err;
 }

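Note on the fs/ntfs/inode.c hunks above: several new checks reject an attribute whose name does not start before its value (resident case) or before its mapping pairs array (non-resident case). The following is a minimal standalone sketch of the resident comparison, not driver code. The struct demo_attr layout and both helper names are hypothetical; only the rule itself (a named attribute must have name_offset below value_offset, both little-endian 16-bit fields) is taken from the diff.

```c
#include <stdint.h>

/* Hypothetical, simplified attribute header holding raw on-disk bytes. */
struct demo_attr {
	uint8_t name_length;	/* Number of characters in the name, 0 if unnamed. */
	uint8_t name_offset[2];	/* Little-endian offset of the name. */
	uint8_t value_offset[2];	/* Little-endian offset of the value. */
};

/* Decode a little-endian 16-bit field independently of host byte order. */
static uint16_t demo_le16_to_cpu(const uint8_t b[2])
{
	return (uint16_t)(b[0] | (b[1] << 8));
}

/*
 * Return 1 if the layout is acceptable, 0 if the name is placed at or
 * after the value.  Unnamed attributes always pass, as in the diff.
 */
int demo_name_before_value(const struct demo_attr *a)
{
	if (!a->name_length)
		return 1;
	return demo_le16_to_cpu(a->name_offset) <
			demo_le16_to_cpu(a->value_offset);
}
```

The comparison is strict, so equal offsets are rejected as well, which matches the ">=" test used in the added driver checks.
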
--- a/fs/ntfs/inode.h
+++ b/fs/ntfs/inode.h
@@ -165,6 +165,7 @@ typedef enum {
 	NI_Sparse,		/* 1: Unnamed data attr is sparse (f).
 				   1: Create sparse files by default (d).
 				   1: Attribute is sparse (a). */
+	NI_TruncateFailed,	/* 1: Last ntfs_truncate() call failed. */
 } ntfs_inode_state_bits;

 /*
@@ -216,6 +217,7 @@ NINO_FNS(IndexAllocPresent)
 NINO_FNS(Compressed)
 NINO_FNS(Encrypted)
 NINO_FNS(Sparse)
+NINO_FNS(TruncateFailed)

 /*
 * The full structure containing a ntfs_inode and a vfs struct inode. Used for
@@ -300,7 +302,8 @@ extern int ntfs_show_options(struct seq_file *sf, struct vfsmount *mnt);

 #ifdef NTFS_RW

-extern void ntfs_truncate(struct inode *vi);
+extern int ntfs_truncate(struct inode *vi);
+extern void ntfs_truncate_vfs(struct inode *vi);

 extern int ntfs_setattr(struct dentry *dentry, struct iattr *attr);


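Note on the fs/ntfs/inode.h hunks above: ntfs_truncate() now returns an int, a void wrapper is kept for the ->truncate hook, and a per-inode flag records whether the last truncate failed. The following is a minimal standalone sketch of that arrangement, not driver code. The struct demo_inode type, the DEMO_TRUNCATE_FAILED bit and both function names are hypothetical, and a negative attr_size is used here as a stand-in for a corrupt record.

```c
#include <errno.h>

/* Hypothetical state bit, loosely modelled on NI_TruncateFailed. */
enum { DEMO_TRUNCATE_FAILED = 1 << 0 };

/* Hypothetical inode with the in-memory size and the on-disk size. */
struct demo_inode {
	long size;		/* Size the caller asked for. */
	long attr_size;		/* Size recorded on disk, negative if corrupt. */
	unsigned int flags;
};

/* Returns 0 on success or a negative errno value on failure. */
int demo_truncate(struct demo_inode *di)
{
	if (di->attr_size < 0) {
		/* Remember the failure so that later callers can see it. */
		di->flags |= DEMO_TRUNCATE_FAILED;
		return -EIO;
	}
	/* Resizing is not implemented in this sketch: restore the old size. */
	if (di->size != di->attr_size)
		di->size = di->attr_size;
	di->flags &= ~DEMO_TRUNCATE_FAILED;
	return 0;
}

/* Wrapper for callers whose hook has no return value. */
void demo_truncate_vfs(struct demo_inode *di)
{
	(void)demo_truncate(di);
}
```

With this split, internal callers can act on the return value, while a caller that goes through the void hook can still inspect the flag afterwards.
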
--- a/fs/ntfs/layout.h
+++ b/fs/ntfs/layout.h
@@ -589,8 +589,8 @@ typedef struct {
 					   FIXME: What does it mean? (AIA) */
 /* 88*/ COLLATION_RULE collation_rule;	/* Default collation rule. */
 /* 8c*/	ATTR_DEF_FLAGS flags;		/* Flags describing the attribute. */
-/* 90*/	le64 min_size;			/* Optional minimum attribute size. */
-/* 98*/	le64 max_size;			/* Maximum size of attribute. */
+/* 90*/	sle64 min_size;			/* Optional minimum attribute size. */
+/* 98*/	sle64 max_size;			/* Maximum size of attribute. */
 /* sizeof() = 0xa0 or 160 bytes */
 } __attribute__ ((__packed__)) ATTR_DEF;


--- a/fs/ntfs/lcnalloc.c
+++ b/fs/ntfs/lcnalloc.c
@@ -903,8 +903,8 @@ s64 __ntfs_cluster_free(struct inode *vi, const VCN start_vcn, s64 count,
 			 * Attempt to map runlist, dropping runlist lock for
 			 * the duration.
 			 */
-			up_read(&ni->runlist.lock);
 			vcn = rl->vcn;
+			up_read(&ni->runlist.lock);
 			err = ntfs_map_runlist(ni, vcn);
 			if (err) {
 				if (!is_rollback)

--- a/fs/ntfs/mft.c
+++ b/fs/ntfs/mft.c
@@ -68,20 +68,31 @@ static inline MFT_RECORD *map_mft_record_page(ntfs_inode *ni)
 		if (index > end_index || (mft_vi->i_size & ~PAGE_CACHE_MASK) <
 				ofs + vol->mft_record_size) {
 			page = ERR_PTR(-ENOENT);
+			ntfs_error(vol->sb, "Attempt to read mft record 0x%lx, "
+					"which is beyond the end of the mft.  "
+					"This is probably a bug in the ntfs "
+					"driver.", ni->mft_no);
 			goto err_out;
 		}
 	}
 	/* Read, map, and pin the page. */
 	page = ntfs_map_page(mft_vi->i_mapping, index);
 	if (likely(!IS_ERR(page))) {
+		/* Catch multi sector transfer fixup errors. */
+		if (likely(ntfs_is_mft_recordp((le32*)(page_address(page) +
+				ofs)))) {
 			ni->page = page;
 			ni->page_ofs = ofs;
 			return page_address(page) + ofs;
 		}
+		ntfs_error(vol->sb, "Mft record 0x%lx is corrupt.  "
+				"Run chkdsk.", ni->mft_no);
+		ntfs_unmap_page(page);
+		page = ERR_PTR(-EIO);
+	}
 err_out:
 	ni->page = NULL;
 	ni->page_ofs = 0;
-	ntfs_error(vol->sb, "Failed with error code %lu.", -PTR_ERR(page));
 	return (void*)page;
 }

@@ -455,8 +466,10 @@ int ntfs_sync_mft_mirror(ntfs_volume *vol, const unsigned long mft_no,
 	struct buffer_head *bhs[max_bhs];
 	struct buffer_head *bh, *head;
 	u8 *kmirr;
-	unsigned int block_start, block_end, m_start, m_end;
+	runlist_element *rl;
+	unsigned int block_start, block_end, m_start, m_end, page_ofs;
 	int i_bhs, nr_bhs, err = 0;
+	unsigned char blocksize_bits = vol->mftmirr_ino->i_blkbits;

 	ntfs_debug("Entering for inode 0x%lx.", mft_no);
 	BUG_ON(!max_bhs);
@@ -475,24 +488,32 @@ int ntfs_sync_mft_mirror(ntfs_volume *vol, const unsigned long mft_no,
 		err = PTR_ERR(page);
 		goto err_out;
 	}
-	/*
-	 * Exclusion against other writers.   This should never be a problem
-	 * since the page in which the mft record @m resides is also locked and
-	 * hence any other writers would be held up there but it is better to
-	 * make sure no one is writing from elsewhere.
-	 */
 	lock_page(page);
 	BUG_ON(!PageUptodate(page));
 	ClearPageUptodate(page);
+	/* Offset of the mft mirror record inside the page. */
+	page_ofs = (mft_no << vol->mft_record_size_bits) & ~PAGE_CACHE_MASK;
 	/* The address in the page of the mirror copy of the mft record @m. */
-	kmirr = page_address(page) + ((mft_no << vol->mft_record_size_bits) &
-			~PAGE_CACHE_MASK);
+	kmirr = page_address(page) + page_ofs;
 	/* Copy the mst protected mft record to the mirror. */
 	memcpy(kmirr, m, vol->mft_record_size);
-	/* Make sure we have mapped buffers. */
+	/* Create uptodate buffers if not present. */
+	if (unlikely(!page_has_buffers(page))) {
+		struct buffer_head *tail;
+
+		bh = head = alloc_page_buffers(page, blocksize, 1);
+		do {
+			set_buffer_uptodate(bh);
+			tail = bh;
+			bh = bh->b_this_page;
+		} while (bh);
+		tail->b_this_page = head;
+		attach_page_buffers(page, head);
 		BUG_ON(!page_has_buffers(page));
+	}
 	bh = head = page_buffers(page);
 	BUG_ON(!bh);
+	rl = NULL;
 	nr_bhs = 0;
 	block_start = 0;
 	m_start = kmirr - (u8*)page_address(page);
@@ -500,15 +521,61 @@ int ntfs_sync_mft_mirror(ntfs_volume *vol, const unsigned long mft_no,
 	do {
 		block_end = block_start + blocksize;
 		/* If the buffer is outside the mft record, skip it. */
-		if ((block_end <= m_start) || (block_start >= m_end))
+		if (block_end <= m_start)
 			continue;
-		BUG_ON(!buffer_mapped(bh));
+		if (unlikely(block_start >= m_end))
+			break;
+		/* Need to map the buffer if it is not mapped already. */
+		if (unlikely(!buffer_mapped(bh))) {
+			VCN vcn;
+			LCN lcn;
+			unsigned int vcn_ofs;
+
+			/* Obtain the vcn and offset of the current block. */
+			vcn = ((VCN)mft_no << vol->mft_record_size_bits) +
+					(block_start - m_start);
+			vcn_ofs = vcn & vol->cluster_size_mask;
+			vcn >>= vol->cluster_size_bits;
+			if (!rl) {
+				down_read(&NTFS_I(vol->mftmirr_ino)->
+						runlist.lock);
+				rl = NTFS_I(vol->mftmirr_ino)->runlist.rl;
+				/*
+				 * $MFTMirr always has the whole of its runlist
+				 * in memory.
+				 */
+				BUG_ON(!rl);
+			}
+			/* Seek to element containing target vcn. */
+			while (rl->length && rl[1].vcn <= vcn)
+				rl++;
+			lcn = ntfs_rl_vcn_to_lcn(rl, vcn);
+			/* For $MFTMirr, only lcn >= 0 is a successful remap. */
+			if (likely(lcn >= 0)) {
+				/* Setup buffer head to correct block. */
+				bh->b_blocknr = ((lcn <<
+						vol->cluster_size_bits) +
+						vcn_ofs) >> blocksize_bits;
+				set_buffer_mapped(bh);
+			} else {
+				bh->b_blocknr = -1;
+				ntfs_error(vol->sb, "Cannot write mft mirror "
+						"record 0x%lx because its "
+						"location on disk could not "
+						"be determined (error code "
+						"%lli).", mft_no,
+						(long long)lcn);
+				err = -EIO;
+			}
+		}
 		BUG_ON(!buffer_uptodate(bh));
 		BUG_ON(!nr_bhs && (m_start != block_start));
 		BUG_ON(nr_bhs >= max_bhs);
 		bhs[nr_bhs++] = bh;
 		BUG_ON((nr_bhs >= max_bhs) && (m_end != block_end));
 	} while (block_start = block_end, (bh = bh->b_this_page) != head);
+	if (unlikely(rl))
+		up_read(&NTFS_I(vol->mftmirr_ino)->runlist.lock);
 	if (likely(!err)) {
 		/* Lock buffers and start synchronous write i/o on them. */
 		for (i_bhs = 0; i_bhs < nr_bhs; i_bhs++) {
@@ -517,7 +584,6 @@ int ntfs_sync_mft_mirror(ntfs_volume *vol, const unsigned long mft_no,
 			if (unlikely(test_set_buffer_locked(tbh)))
 				BUG();
 			BUG_ON(!buffer_uptodate(tbh));
-			if (buffer_dirty(tbh))
 			clear_buffer_dirty(tbh);
 			get_bh(tbh);
 			tbh->b_end_io = end_buffer_write_sync;
@@ -602,13 +668,14 @@ int write_mft_record_nolock(ntfs_inode *ni, MFT_RECORD *m, int sync)
 {
 	ntfs_volume *vol = ni->vol;
 	struct page *page = ni->page;
-	unsigned int blocksize = vol->sb->s_blocksize;
+	unsigned char blocksize_bits = vol->mft_ino->i_blkbits;
+	unsigned int blocksize = 1 << blocksize_bits;
 	int max_bhs = vol->mft_record_size / blocksize;
 	struct buffer_head *bhs[max_bhs];
 	struct buffer_head *bh, *head;
+	runlist_element *rl;
 	unsigned int block_start, block_end, m_start, m_end;
 	int i_bhs, nr_bhs, err = 0;
-	BOOL rec_is_dirty = TRUE;

 	ntfs_debug("Entering for inode 0x%lx.", ni->mft_no);
 	BUG_ON(NInoAttr(ni));
@@ -625,6 +692,7 @@ int write_mft_record_nolock(ntfs_inode *ni, MFT_RECORD *m, int sync)
 	BUG_ON(!page_has_buffers(page));
 	bh = head = page_buffers(page);
 	BUG_ON(!bh);
+	rl = NULL;
 	nr_bhs = 0;
 	block_start = 0;
 	m_start = ni->page_ofs;
@@ -636,31 +704,65 @@ int write_mft_record_nolock(ntfs_inode *ni, MFT_RECORD *m, int sync)
 			continue;
 		if (unlikely(block_start >= m_end))
 			break;
+		/*
+		 * If this block is not the first one in the record, we ignore
+		 * the buffer's dirty state because we could have raced with a
+		 * parallel mark_ntfs_record_dirty().
+		 */
 		if (block_start == m_start) {
 			/* This block is the first one in the record. */
 			if (!buffer_dirty(bh)) {
+				BUG_ON(nr_bhs);
 				/* Clean records are not written out. */
-				rec_is_dirty = FALSE;
-				continue;
+				break;
 			}
-			rec_is_dirty = TRUE;
+		}
+		/* Need to map the buffer if it is not mapped already. */
+		if (unlikely(!buffer_mapped(bh))) {
+			VCN vcn;
+			LCN lcn;
+			unsigned int vcn_ofs;
+
+			/* Obtain the vcn and offset of the current block. */
+			vcn = ((VCN)ni->mft_no << vol->mft_record_size_bits) +
+					(block_start - m_start);
+			vcn_ofs = vcn & vol->cluster_size_mask;
+			vcn >>= vol->cluster_size_bits;
+			if (!rl) {
+				down_read(&NTFS_I(vol->mft_ino)->runlist.lock);
+				rl = NTFS_I(vol->mft_ino)->runlist.rl;
+				BUG_ON(!rl);
+			}
+			/* Seek to element containing target vcn. */
+			while (rl->length && rl[1].vcn <= vcn)
+				rl++;
+			lcn = ntfs_rl_vcn_to_lcn(rl, vcn);
+			/* For $MFT, only lcn >= 0 is a successful remap. */
+			if (likely(lcn >= 0)) {
+				/* Setup buffer head to correct block. */
+				bh->b_blocknr = ((lcn <<
+						vol->cluster_size_bits) +
+						vcn_ofs) >> blocksize_bits;
+				set_buffer_mapped(bh);
 			} else {
-			/*
-			 * This block is not the first one in the record.  We
-			 * ignore the buffer's dirty state because we could
-			 * have raced with a parallel mark_ntfs_record_dirty().
-			 */
-			if (!rec_is_dirty)
-				continue;
+				bh->b_blocknr = -1;
+				ntfs_error(vol->sb, "Cannot write mft record "
+						"0x%lx because its location "
+						"on disk could not be "
+						"determined (error code %lli).",
+						ni->mft_no, (long long)lcn);
+				err = -EIO;
+			}
 		}
-		BUG_ON(!buffer_mapped(bh));
 		BUG_ON(!buffer_uptodate(bh));
 		BUG_ON(!nr_bhs && (m_start != block_start));
 		BUG_ON(nr_bhs >= max_bhs);
 		bhs[nr_bhs++] = bh;
 		BUG_ON((nr_bhs >= max_bhs) && (m_end != block_end));
 	} while (block_start = block_end, (bh = bh->b_this_page) != head);
-	if (!rec_is_dirty)
+	if (unlikely(rl))
+		up_read(&NTFS_I(vol->mft_ino)->runlist.lock);
+	if (!nr_bhs)
 		goto done;
 	if (unlikely(err))
 		goto cleanup_out;
@@ -734,7 +836,8 @@ int write_mft_record_nolock(ntfs_inode *ni, MFT_RECORD *m, int sync)
 				"Redirtying so the write is retried later.");
 		mark_mft_record_dirty(ni);
 		err = 0;
-	}
+	} else
+		NVolSetErrors(vol);
 	return err;
 }
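
Note on the buffer mapping added to ntfs_sync_mft_mirror() and write_mft_record_nolock() above: both hunks turn a byte offset inside $MFT (or $MFTMirr) into a device block number by splitting the offset into a vcn and an offset within the cluster, looking the vcn up in the runlist, and shifting the result down by the block size bits. The standalone sketch below reproduces only that arithmetic in user space. `struct run`, `sketch_vcn_to_lcn()` and `sketch_block_nr()` are hypothetical names made up for illustration; they are simplified stand-ins and not the driver's `runlist_element` or `ntfs_rl_vcn_to_lcn()`, and no locking is modelled.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for the driver's runlist element. */
struct run {
	int64_t vcn;	/* First virtual cluster covered by this run. */
	int64_t lcn;	/* First logical cluster, negative if unmapped. */
	int64_t length;	/* Run length in clusters; 0 terminates the list. */
};

/* Hypothetical helper: return the lcn backing @vcn, or -1 if unmapped. */
static int64_t sketch_vcn_to_lcn(const struct run *rl, int64_t vcn)
{
	for (; rl->length; rl++) {
		if (vcn >= rl->vcn && vcn < rl->vcn + rl->length)
			return rl->lcn < 0 ? -1 : rl->lcn + (vcn - rl->vcn);
	}
	return -1;
}

/*
 * Hypothetical helper: device block number of the block that starts
 * @block_ofs bytes into mft record @mft_no, or -1 if it cannot be mapped.
 */
static int64_t sketch_block_nr(const struct run *rl, uint64_t mft_no,
		unsigned int block_ofs, unsigned int record_size_bits,
		unsigned int cluster_size_bits, unsigned int blocksize_bits)
{
	/* Byte offset of the block from the start of the mft data. */
	int64_t ofs = (int64_t)(mft_no << record_size_bits) + block_ofs;
	/* Split into an offset within the cluster and the cluster number. */
	int64_t vcn_ofs = ofs & ((1LL << cluster_size_bits) - 1);
	int64_t vcn = ofs >> cluster_size_bits;
	int64_t lcn = sketch_vcn_to_lcn(rl, vcn);

	/* As in the patch, only lcn >= 0 counts as a successful mapping. */
	if (lcn < 0)
		return -1;
	return ((lcn << cluster_size_bits) + vcn_ofs) >> blocksize_bits;
}
```

For example, with 1024-byte records, 4096-byte clusters, 512-byte blocks and a first run mapping vcn 0-3 to lcn 100-103, the second block of record 5 lies 1536 bytes into vcn 1 and so lands on device block 811. The patch handles the failure case by setting `bh->b_blocknr = -1` and `err = -EIO`; the sketch simply returns -1.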


--- a/fs/ntfs/namei.c
+++ b/fs/ntfs/namei.c
@@ -127,11 +127,12 @@ static struct dentry *ntfs_lookup(struct inode *dir_ino, struct dentry *dent,
 		dent_inode = ntfs_iget(vol->sb, dent_ino);
 		if (likely(!IS_ERR(dent_inode))) {
 			/* Consistency check. */
-			if (MSEQNO(mref) == NTFS_I(dent_inode)->seq_no ||
+			if (is_bad_inode(dent_inode) || MSEQNO(mref) ==
+					NTFS_I(dent_inode)->seq_no ||
 					dent_ino == FILE_MFT) {
 				/* Perfect WIN32/POSIX match. -- Case 1. */
 				if (!name) {
-					ntfs_debug("Done.");
+					ntfs_debug("Done.  (Case 1.)");
 					return d_splice_alias(dent_inode, dent);
 				}
 				/*
@@ -181,6 +182,7 @@ static struct dentry *ntfs_lookup(struct inode *dir_ino, struct dentry *dent,

 	nls_name.name = NULL;
 	if (name->type != FILE_NAME_DOS) {			/* Case 2. */
+		ntfs_debug("Case 2.");
 		nls_name.len = (unsigned)ntfs_ucstonls(vol,
 				(ntfschar*)&name->name, name->len,
 				(unsigned char**)&nls_name.name, 0);
@@ -188,6 +190,7 @@ static struct dentry *ntfs_lookup(struct inode *dir_ino, struct dentry *dent,
 	} else /* if (name->type == FILE_NAME_DOS) */ {		/* Case 3. */
 		FILE_NAME_ATTR *fn;

+		ntfs_debug("Case 3.");
 		kfree(name);

 		/* Find the WIN32 name corresponding to the matched DOS name. */
@@ -271,12 +274,17 @@ static struct dentry *ntfs_lookup(struct inode *dir_ino, struct dentry *dent,
 			dput(real_dent);
 		else
 			new_dent = real_dent;
+		ntfs_debug("Done.  (Created new dentry.)");
 		return new_dent;
 	}
 	kfree(nls_name.name);
 	/* Matching dentry exists, check if it is negative. */
 	if (real_dent->d_inode) {
-		BUG_ON(real_dent->d_inode != dent_inode);
+		if (unlikely(real_dent->d_inode != dent_inode)) {
+			/* This can happen because bad inodes are unhashed. */
+			BUG_ON(!is_bad_inode(dent_inode));
+			BUG_ON(!is_bad_inode(real_dent->d_inode));
+		}
 		/*
 		 * Already have the inode and the dentry attached, decrement
 		 * the reference count to balance the ntfs_iget() we did
@@ -285,6 +293,7 @@ static struct dentry *ntfs_lookup(struct inode *dir_ino, struct dentry *dent,
 		 * about any NFS/disconnectedness issues here.
 		 */
 		iput(dent_inode);
+		ntfs_debug("Done.  (Already had inode and dentry.)");
 		return real_dent;
 	}
 	/*
@@ -295,6 +304,7 @@ static struct dentry *ntfs_lookup(struct inode *dir_ino, struct dentry *dent,
 	if (!S_ISDIR(dent_inode->i_mode)) {
 		/* Not a directory; everything is easy. */
 		d_instantiate(real_dent, dent_inode);
+		ntfs_debug("Done.  (Already had negative file dentry.)");
 		return real_dent;
 	}
 	spin_lock(&dcache_lock);
@@ -308,6 +318,7 @@ static struct dentry *ntfs_lookup(struct inode *dir_ino, struct dentry *dent,
 		real_dent->d_inode = dent_inode;
 		spin_unlock(&dcache_lock);
 		security_d_instantiate(real_dent, dent_inode);
+		ntfs_debug("Done.  (Already had negative directory dentry.)");
 		return real_dent;
 	}
 	/*
@@ -327,6 +338,8 @@ static struct dentry *ntfs_lookup(struct inode *dir_ino, struct dentry *dent,
 	/* Throw away real_dent. */
 	dput(real_dent);
 	/* Use new_dent as the actual dentry. */
+	ntfs_debug("Done.  (Already had negative, disconnected directory "
+			"dentry.)");
 	return new_dent;

 eio_err_out:
@@ -338,6 +351,7 @@ static struct dentry *ntfs_lookup(struct inode *dir_ino, struct dentry *dent,
 	if (m)
 		unmap_mft_record(ni);
 	iput(dent_inode);
+	ntfs_error(vol->sb, "Failed, returning error code %i.", err);
 	return ERR_PTR(err);
   }
 }

--- a/fs/ntfs/ntfs.h
+++ b/fs/ntfs/ntfs.h
@@ -53,7 +53,6 @@ extern kmem_cache_t *ntfs_attr_ctx_cache;
 extern kmem_cache_t *ntfs_index_ctx_cache;

 /* The various operations structs defined throughout the driver files. */
-extern struct super_operations ntfs_sops;
 extern struct address_space_operations ntfs_aops;
 extern struct address_space_operations ntfs_mst_aops;

@@ -86,8 +85,6 @@ extern void free_compression_buffers(void);

 /* From fs/ntfs/super.c */
 #define default_upcase_len 0x10000
-extern ntfschar *default_upcase;
-extern unsigned long ntfs_nr_upcase_users;
 extern struct semaphore ntfs_lock;

 typedef struct {

--- a/fs/ntfs/quota.c
+++ b/fs/ntfs/quota.c
@@ -52,7 +52,7 @@ BOOL ntfs_mark_quotas_out_of_date(ntfs_volume *vol)
 	ictx = ntfs_index_ctx_get(NTFS_I(vol->quota_q_ino));
 	if (!ictx) {
 		ntfs_error(vol->sb, "Failed to get index context.");
-		return FALSE;
+		goto err_out;
 	}
 	err = ntfs_index_lookup(&qid, sizeof(qid), ictx);
 	if (err) {
@@ -108,6 +108,7 @@ BOOL ntfs_mark_quotas_out_of_date(ntfs_volume *vol)
 	ntfs_debug("Done.");
 	return TRUE;
 err_out:
+	if (ictx)
 		ntfs_index_ctx_put(ictx);
 	up(&vol->quota_q_ino->i_sem);
 	return FALSE;

--- a/fs/ntfs/runlist.c
+++ b/fs/ntfs/runlist.c
@@ -139,30 +139,6 @@ static inline void __ntfs_rl_merge(runlist_element *dst, runlist_element *src)
 	dst->length += src->length;
 }

-/**
- * ntfs_rl_merge - test if two runlists can be joined together and merge them
- * @dst:	original, destination runlist
- * @src:	new runlist to merge with @dst
- *
- * Test if two runlists can be joined together. For this, their VCNs and LCNs
- * must be adjacent. If they can be merged, perform the merge, writing into
- * the destination runlist @dst.
- *
- * It is up to the caller to serialize access to the runlists @dst and @src.
- *
- * Return: TRUE   Success, the runlists have been merged.
- *	   FALSE  Failure, the runlists cannot be merged and have not been
- *		  modified.
- */
-static inline BOOL ntfs_rl_merge(runlist_element *dst, runlist_element *src)
-{
-	BOOL merge = ntfs_are_rl_mergeable(dst, src);
-
-	if (merge)
-		__ntfs_rl_merge(dst, src);
-	return merge;
-}
-
 /**
 * ntfs_rl_append - append a runlist after a given element
 * @dst:	original runlist to be worked on

--- a/fs/ntfs/super.c
+++ b/fs/ntfs/super.c
@@ -44,6 +44,10 @@
 /* Number of mounted file systems which have compression enabled. */
 static unsigned long ntfs_nr_compression_users;

+/* A global default upcase table and a corresponding reference count. */
+static ntfschar *default_upcase = NULL;
+static unsigned long ntfs_nr_upcase_users = 0;
+
 /* Error constants/strings used in inode.c::ntfs_show_options(). */
 typedef enum {
 	/* One of these must be present, default is ON_ERRORS_CONTINUE. */
@@ -742,6 +746,18 @@ static BOOL parse_ntfs_boot_sector(ntfs_volume *vol, const NTFS_BOOT_SECTOR *b)
 			vol->mft_record_size_mask);
 	ntfs_debug("vol->mft_record_size_bits = %i (0x%x)",
 			vol->mft_record_size_bits, vol->mft_record_size_bits);
+	/*
+	 * We cannot support mft record sizes above the PAGE_CACHE_SIZE since
+	 * we store $MFT/$DATA, the table of mft records in the page cache.
+	 */
+	if (vol->mft_record_size > PAGE_CACHE_SIZE) {
+		ntfs_error(vol->sb, "Mft record size %i (0x%x) exceeds the "
+				"page cache size on your system %lu (0x%lx).  "
+				"This is not supported.  Sorry.",
+				vol->mft_record_size, vol->mft_record_size,
+				PAGE_CACHE_SIZE, PAGE_CACHE_SIZE);
+		return FALSE;
+	}
 	clusters_per_index_record = b->clusters_per_index_record;
 	ntfs_debug("clusters_per_index_record = %i (0x%x)",
 			clusters_per_index_record, clusters_per_index_record);
@@ -785,8 +801,8 @@ static BOOL parse_ntfs_boot_sector(ntfs_volume *vol, const NTFS_BOOT_SECTOR *b)
 	if (sizeof(unsigned long) < 8) {
 		if ((ll << vol->cluster_size_bits) >= (1ULL << 41)) {
 			ntfs_error(vol->sb, "Volume size (%lluTiB) is too "
-					"large for this architecture. Maximum "
-					"supported is 2TiB. Sorry.",
+					"large for this architecture.  "
+					"Maximum supported is 2TiB.  Sorry.",
 					(unsigned long long)ll >> (40 -
 					vol->cluster_size_bits));
 			return FALSE;
@@ -967,6 +983,10 @@ static BOOL load_and_init_mft_mirror(ntfs_volume *vol)
 * @vol:	ntfs super block describing device whose mft mirror to check
 *
 * Return TRUE on success or FALSE on error.
+ *
+ * Note, this function also results in the mft mirror runlist being completely
+ * mapped into memory.  The mft mirror write code requires this and will BUG()
+ * should it find an unmapped runlist element.
 */
 static BOOL check_mft_mirror(ntfs_volume *vol)
 {
@@ -2163,7 +2183,7 @@ static int ntfs_statfs(struct super_block *sb, struct kstatfs *sfs)
 /**
 * The complete super operations.
 */
-struct super_operations ntfs_sops = {
+static struct super_operations ntfs_sops = {
 	.alloc_inode	= ntfs_alloc_big_inode,	  /* VFS: Allocate new inode. */
 	.destroy_inode	= ntfs_destroy_big_inode, /* VFS: Deallocate inode. */
 	.put_inode	= ntfs_put_inode,	  /* VFS: Called just before
@@ -2581,10 +2601,6 @@ static void ntfs_big_inode_init_once(void *foo, kmem_cache_t *cachep,
 kmem_cache_t *ntfs_attr_ctx_cache;
 kmem_cache_t *ntfs_index_ctx_cache;

-/* A global default upcase table and a corresponding reference count. */
-ntfschar *default_upcase = NULL;
-unsigned long ntfs_nr_upcase_users = 0;
-
 /* Driver wide semaphore. */
 DECLARE_MUTEX(ntfs_lock);

@@ -2742,6 +2758,7 @@ static void __exit exit_ntfs_fs(void)

 MODULE_AUTHOR("Anton Altaparmakov <aia21@cantab.net>");
 MODULE_DESCRIPTION("NTFS 1.2/3.x driver - Copyright (c) 2001-2004 Anton Altaparmakov");
+MODULE_VERSION(NTFS_VERSION);
 MODULE_LICENSE("GPL");
 #ifdef DEBUG
 module_param(debug_msgs, bool, 0);