Commits · 9d0be50230b333005635967f7ecd4897dbfd181b · nexedi / linux

01 Jan, 2010 2 commits

ext4: Calculate metadata requirements more accurately · 9d0be502

Theodore Ts'o authored Jan 01, 2010

In the past, ext4_calc_metadata_amount(), and its sub-functions
ext4_ext_calc_metadata_amount() and ext4_indirect_calc_metadata_amount()
badly over-estimated the number of metadata blocks that might be
required for delayed allocation blocks. This didn't matter as much
when functions which managed the reserved metadata blocks were more
aggressive about dropping reserved metadata blocks as delayed
allocation blocks were written, but unfortunately they were too
aggressive. This was fixed in commit 0637c6f4, but as a result the
over-estimation by ext4_calc_metadata_amount() would lead to reserving
2-3 times the number of pending delayed allocation blocks as
potentially required metadata blocks. So if there is 1 megabyte of
blocks which have not yet been allocated, up to 3 megabytes of
space would get reserved out of the user's quota and from the file
system free space pool until all of the inode's data blocks have been
allocated.

This commit addresses this problem by much more accurately estimating
the number of metadata blocks that will be required. It will still
somewhat over-estimate the number of blocks needed, since it must make
a worst case estimate not knowing which physical blocks will be
needed, but it is much more accurate than before.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
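
To illustrate why a worst-case estimate is unavoidable for block-mapped files, here is a minimal sketch of how many levels of indirect blocks sit between an inode and a given logical block in the classic ext2/3 layout (12 direct pointers, then single, double and triple indirection). It is not the kernel's ext4_indirect_calc_metadata_amount(); the function name indirect_depth() and its parameters are hypothetical.

```c
#include <stdint.h>

#define NDIR_BLOCKS 12 /* direct block pointers stored in the inode */

/*
 * Hypothetical helper: number of indirect-block levels needed to reach
 * logical block `lblock`, given `addrs_per_block` pointers per block.
 * A newly written block can require at most this many new metadata blocks.
 * Callers are assumed to pass a block number that is mappable at all.
 */
int indirect_depth(uint64_t lblock, uint64_t addrs_per_block)
{
    if (lblock < NDIR_BLOCKS)
        return 0;                       /* reached directly from the inode */
    lblock -= NDIR_BLOCKS;
    if (lblock < addrs_per_block)
        return 1;                       /* single indirect */
    lblock -= addrs_per_block;
    if (lblock < addrs_per_block * addrs_per_block)
        return 2;                       /* double indirect */
    return 3;                           /* triple indirect */
}
```

With 4 KiB blocks and 4-byte pointers (1024 addresses per block), block 11 needs no metadata, block 12 may need one indirect block, and block 12 + 1024 may need two.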

ext4: Fix accounting of reserved metadata blocks · ee5f4d9c

Theodore Ts'o authored Jan 01, 2010

Commit 0637c6f4 had a typo which caused the reserved metadata blocks to
not be released correctly.  Fix this.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

30 Dec, 2009 2 commits

ext4: Patch up how we claim metadata blocks for quota purposes · 0637c6f4

Theodore Ts'o authored Dec 30, 2009

As reported in Kernel Bugzilla #14936, commit d21cd8f1 triggered a BUG
in the function ext4_da_update_reserve_space() found in
fs/ext4/inode.c.  The root cause of this BUG() was the fact
that ext4_calc_metadata_amount() can severely over-estimate how many
metadata blocks will be needed, especially when using direct
block-mapped files.

In addition, it can also badly *under* estimate how much space is
needed, since ext4_calc_metadata_amount() assumes that the blocks are
contiguous, and this is not always true.  If the application is
writing blocks to a sparse file, the number of metadata blocks
necessary can be severely underestimated by the functions
ext4_da_reserve_space(), ext4_da_update_reserve_space() and
ext4_da_release_space().  This was the cause of the dq_claim_space
reports found on kerneloops.org.

Unfortunately, doing this right means that we need to massively
over-estimate the amount of free space needed.  So in some cases we
may need to force the inode to be written to disk asynchronously in
order to avoid spurious quota failures.

http://bugzilla.kernel.org/show_bug.cgi?id=14936
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: Ensure zeroout blocks have no dirty metadata · 515f41c3

Aneesh Kumar K.V authored Dec 29, 2009

This fixes a bug (found by Curt Wohlgemuth) in which new blocks
returned from an extent created with ext4_ext_zeroout() can have dirty
metadata still associated with them.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Curt Wohlgemuth <curtw@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

25 Dec, 2009 1 commit

ext4: return correct wbc.nr_to_write in ext4_da_writepages · 2faf2e19

Richard Kennedy authored Dec 25, 2009

When ext4_da_writepages increases the nr_to_write in writeback_control
then it must always re-base the return value.  Originally there was a
(misguided) attempt to prevent wbc.nr_to_write from going negative.  In
fact, it's necessary to allow nr_to_write to be negative so that
wb_writeback() can correctly calculate how many pages were actually
written.
Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
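
The re-basing described above can be sketched in user space as follows. This is a simplified model, not the code in ext4_da_writepages(); struct wbc_sketch, da_writepages_sketch() and pages_written() are hypothetical names, and the bump size is whatever the file system chose to add.

```c
/* Hypothetical stand-in for the nr_to_write field of writeback_control. */
struct wbc_sketch {
    long nr_to_write;
};

/*
 * Model of a writepages routine that enlarges the caller's page budget by
 * `bump`, writes `written` pages, and then always removes the bump again.
 * The result is deliberately allowed to go negative.
 */
void da_writepages_sketch(struct wbc_sketch *wbc, long bump, long written)
{
    wbc->nr_to_write += bump;     /* temporarily enlarge the budget */
    wbc->nr_to_write -= written;  /* one decrement per page written */
    wbc->nr_to_write -= bump;     /* re-base unconditionally */
}

/* What the caller computes: budget it handed in minus what is left. */
long pages_written(long original_budget, const struct wbc_sketch *wbc)
{
    return original_budget - wbc->nr_to_write;
}
```

If the caller asks for 100 pages, the budget is bumped by 300 and 250 pages are written, nr_to_write ends at -150 and the caller correctly computes 250 pages written. Clamping the value at zero would make the caller see only 100.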

24 Dec, 2009 1 commit

ext4: Update documentation to correct the inode_readahead_blks option name · 6d3b82f2

Fang Wenqi authored Dec 24, 2009

Per commit 240799cd, the option name for readahead should be
inode_readahead_blks, not inode_readahead.
Signed-off-by: Fang Wenqi <antonf@turbolinux.com.cn>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

23 Dec, 2009 6 commits

jbd2: don't use __GFP_NOFAIL in journal_init_common() · 3ebfdf88

Andrew Morton authored Dec 23, 2009

It triggers the warning in get_page_from_freelist(), and it isn't
appropriate to use __GFP_NOFAIL here anyway.

Addresses http://bugzilla.kernel.org/show_bug.cgi?id=14843
Reported-by: Christian Casteyde <casteyde.christian@free.fr>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: flush delalloc blocks when space is low · c8afb446

Eric Sandeen authored Dec 23, 2009

Creating many small files in rapid succession on a small
filesystem can lead to spurious ENOSPC; on a 104MB filesystem:

for i in `seq 1 22500`; do
    echo -n > $SCRATCH_MNT/$i
    echo XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX > $SCRATCH_MNT/$i
done

leads to ENOSPC even though after a sync, 40% of the fs is free
again.

This is because we reserve worst-case metadata for delalloc writes,
and when data is allocated that worst-case reservation is not
usually needed.

When freespace is low, kicking off an async writeback will start
converting that worst-case space usage into something more realistic,
almost always freeing up space to continue.

This resolves the testcase for me, and survives all 4 generic
ENOSPC tests in xfstests.

We'll still need a hard synchronous sync to squeeze out the last bit,
but this fixes things up to a large degree.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
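
The effect described above, worst-case reservations exhausting free space until writeback converts them into real usage, can be shown with a toy accounting model. Everything here is an assumption made for illustration: the struct, the function names and the block counts are hypothetical and do not mirror ext4's actual counters.

```c
#include <errno.h>

/* Toy model of a file system's block accounting. */
struct space_sketch {
    long free_blocks;      /* blocks neither allocated nor reserved */
    long reserved_blocks;  /* blocks held for pending delalloc writes */
};

/* Reserve data blocks plus a worst-case metadata estimate. */
int reserve_delalloc(struct space_sketch *s, long data, long worst_meta)
{
    long need = data + worst_meta;

    if (s->free_blocks < need)
        return -ENOSPC;    /* may be spurious: the estimate is pessimistic */
    s->free_blocks -= need;
    s->reserved_blocks += need;
    return 0;
}

/* Writeback allocates for real and returns the unused part of the estimate. */
void writeback_one(struct space_sketch *s, long data, long worst_meta,
                   long actual_meta)
{
    s->reserved_blocks -= data + worst_meta;
    s->free_blocks += worst_meta - actual_meta;
}
```

In this model a 100-block file system accepts three reservations of 10 data blocks with 20 blocks of worst-case metadata each and rejects the fourth, even though only a few metadata blocks will really be used. Writing back two of the pending reservations frees enough space for the fourth to succeed.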

fs-writeback: Add helper function to start writeback if idle · 17bd55d0

Eric Sandeen authored Dec 23, 2009

ext4, at least, would like to start pushing on writeback if it starts
to get close to ENOSPC when reserving worst-case blocks for delalloc
writes.  Writing out delalloc data will convert those worst-case
predictions into usually smaller actual usage, freeing up space
before we hit ENOSPC based on this speculation.

Thanks to Jens for the suggestion for the helper function,
& the naming help.

I've made the helper return status on whether writeback was
started even though I don't plan to use it in the ext4 patch;
it seems like it would be potentially useful to test this
in some cases.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Acked-by: Jan Kara <jack@suse.cz>
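
A minimal user-space sketch of a "start writeback only if idle" helper that reports whether it started anything, as the message describes. The struct and function names are hypothetical stand-ins; the real helper operates on kernel writeback state that is not modelled here.

```c
/* Hypothetical stand-in for per-device writeback state. */
struct bdi_sketch {
    int writeback_in_progress;  /* non-zero while writeback is running */
    int starts;                 /* how many times writeback was kicked off */
};

/*
 * Start writeback only when none is running.
 * Returns 1 if writeback was started, 0 if it was already in progress.
 */
int start_writeback_if_idle(struct bdi_sketch *bdi)
{
    if (bdi->writeback_in_progress)
        return 0;
    bdi->writeback_in_progress = 1;
    bdi->starts++;
    return 1;
}
```

Returning the status costs nothing for callers that ignore it and lets other callers tell "already busy" apart from "just started".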

ext4: Eliminate potential double free on error path · d3533d72

Julia Lawall authored Dec 23, 2009

b_entry_name and buffer are initially NULL, are initialized within a loop
to the result of calling kmalloc, and are freed at the bottom of this loop.
The loop contains gotos to cleanup, which also frees b_entry_name and
buffer.  Some of these gotos are before the reinitializations of
b_entry_name and buffer.  To maintain the invariant that b_entry_name and
buffer are NULL at the top of the loop, and thus acceptable arguments to
kfree, these variables are now set to NULL after the kfrees.

This seems to be the simplest solution.  A more complicated solution
would be to introduce more labels in the error handling code at the end of
the function.

A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
@r@
identifier E;
expression E1;
iterator I;
statement S;
@@

*kfree(E);
... when != E = E1
    when != I(E,...) S
    when != &E
*kfree(E);
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
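
The pattern described in this message can be reproduced in user space with malloc()/free() in place of kmalloc()/kfree(). The sketch below is illustrative only: process_rounds() and its fail_at parameter are hypothetical, and the loop body is a placeholder rather than the ext4 code.

```c
#include <stdlib.h>

/*
 * Allocate two buffers per round and free them at the bottom of the loop.
 * `fail_at` is a hypothetical knob: the round (0-based) in which an error
 * is detected before the buffers are re-allocated; pass -1 for no failure.
 * Returns 0 on success, -1 on error.
 */
int process_rounds(int rounds, int fail_at)
{
    char *name = NULL;
    char *buffer = NULL;
    int err = 0;

    for (int i = 0; i < rounds; i++) {
        if (i == fail_at) {         /* error path taken before re-allocation */
            err = -1;
            goto cleanup;
        }
        name = malloc(32);
        buffer = malloc(64);
        if (!name || !buffer) {
            err = -1;
            goto cleanup;
        }

        /* ... work with name and buffer ... */

        free(name);
        name = NULL;                /* restore the "NULL at loop top" invariant */
        free(buffer);
        buffer = NULL;
    }

cleanup:
    free(name);                     /* free(NULL) is a no-op, like kfree(NULL) */
    free(buffer);
    return err;
}
```

Without the two NULL assignments, a failure in the second round would reach the cleanup label with pointers that were already freed in the first round.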

ext4: fix unsigned long long printk warning in super.c · a6b43e38

Andrew Morton authored Dec 23, 2009

sparc64 allmodconfig:

fs/ext4/super.c: In function `lifetime_write_kbytes_show':
fs/ext4/super.c:2174: warning: long long unsigned int format, long unsigned int arg (arg 4)
fs/ext4/super.c:2174: warning: long long unsigned int format, long unsigned int arg (arg 4)
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
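
The warning arises because unsigned long and unsigned long long are distinct types even on platforms where both are 64 bits wide. One common way to silence it, shown here as a user-space sketch with snprintf() rather than the actual super.c change, is an explicit cast to the type the format expects. The function name format_kbytes() is hypothetical.

```c
#include <stdio.h>

/*
 * Format a counter held in an unsigned long with a %llu conversion.
 * The cast makes the argument type match the format on every architecture.
 * Returns the number of characters written, as snprintf() does.
 */
int format_kbytes(char *buf, size_t len, unsigned long kbytes_written)
{
    return snprintf(buf, len, "%llu\n", (unsigned long long)kbytes_written);
}
```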

ext4, jbd2: Add barriers for file systems with external journals · cc3e1bea

Theodore Ts'o authored Dec 23, 2009

This is a bit complicated because we are trying to optimize when we
send barriers to the fs data disk. We could just throw in an extra
barrier to the data disk whenever we send a barrier to the journal
disk, but that's not always strictly necessary.

We only need to send a barrier during a commit when there are data
blocks which must be written out due to an inode written in
ordered mode, or if fsync() depends on the commit to force data blocks
to disk. Finally, before we drop transactions from the beginning of
the journal during a checkpoint operation, we need to guarantee that
any blocks that were flushed out to the data disk are firmly on the
rust platter before we drop the transaction from the journal.

Thanks to Oleg Drokin for pointing out this flaw in ext3/ext4.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

14 Dec, 2009 1 commit

ext4: replace BUG() with return -EIO in ext4_ext_get_blocks · 034fb4c9

Surbhi Palande authored Dec 14, 2009

This patch fixes the Kernel BZ #14286. When the address of an extent
corresponding to a valid block is corrupted, a -EIO should be reported
instead of a BUG(). This situation should not normally occur
except in the case of a corrupted filesystem. If however it does,
then the system should not panic directly but depending on the mount
time options appropriate action should be taken. If the mount options
so permit, the I/O should be gracefully aborted by returning a -EIO.

http://bugzilla.kernel.org/show_bug.cgi?id=14286
Signed-off-by: Surbhi Palande <surbhi.palande@canonical.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
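
The general idea, reporting corruption to the caller instead of halting, can be sketched as below. The validity test used here (a physical block outside the file system's range) is an assumption chosen for illustration and is not the condition checked in ext4_ext_get_blocks(); struct fs_sketch and map_block_sketch() are hypothetical.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical description of the valid physical block range. */
struct fs_sketch {
    uint64_t first_data_block;
    uint64_t blocks_count;
};

/*
 * Return 0 and store the block in *out when it lies inside the file system,
 * or -EIO when the on-disk value is corrupt. The caller decides what to do
 * next (for example, according to the errors= mount option) rather than the
 * whole system stopping in a BUG().
 */
int map_block_sketch(const struct fs_sketch *fs, uint64_t pblock, uint64_t *out)
{
    if (pblock < fs->first_data_block || pblock >= fs->blocks_count)
        return -EIO;
    *out = pblock;
    return 0;
}
```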

21 Dec, 2009 2 commits

ext4: add module aliases for ext2 and ext3 · 51b7e3c9

Theodore Ts'o authored Dec 21, 2009

Add module aliases for ext2 and ext3 when CONFIG_EXT4_USE_FOR_EXT23 is
set.  This lets existing user-space tools such as mkinitrd keep working
as is.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: Don't ask about supporting ext2/3 in ext4 if ext4 is not configured · 84c66473

David Howells authored Dec 21, 2009

Don't offer to build ext2/3 support into ext4 if ext4 itself is not
configured on.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

14 Dec, 2009 1 commit

ext4: remove unused #include <linux/version.h> · 149feb00

Huang Weiyi authored Dec 14, 2009

Remove unused #include <linux/version.h>('s) in
  fs/ext4/block_validity.c
  fs/ext4/mballoc.h
Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

24 Dec, 2009 24 commits

Linux 2.6.33-rc2 · 6b7b2849

Linus Torvalds authored Dec 24, 2009

Merge branch 'sysctl' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc-2.6 · 0b5e2588

Linus Torvalds authored Dec 24, 2009

* 'sysctl' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc-2.6:
  SYSCTL: Add a mutex to the page_alloc zone order sysctl
  SYSCTL: Print binary sysctl warnings (nearly) only once

Merge branch 'hwpoison' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6 · 6067d7e4

Linus Torvalds authored Dec 24, 2009

* 'hwpoison' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6:
  HWPOISON: Add PROC_FS dependency to hwpoison injector v2


Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 · 71492fd1

Linus Torvalds authored Dec 24, 2009

* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (34 commits)
  classmate-laptop: add support for Classmate PC ACPI devices
  hp-wmi: Fix two memleaks
  acer-wmi, msi-wmi: Remove needless DMI MODULE_ALIAS
  dell-wmi: do not keep driver loaded on unsupported boxes
  wmi: Free the allocated acpi objects through wmi_get_event_data
  drivers/platform/x86/acerhdf.c: check BIOS information whether it begins with string of table
  acerhdf: add new BIOS versions
  acerhdf: limit modalias matching to supported
  toshiba_acpi: convert to seq_file
  asus_acpi: convert to seq_file
  ACPI: do not select ACPI_DOCK from ATA_ACPI
  sony-laptop: enumerate rfkill devices using SN06
  sony-laptop: rfkill support for newer models
  ACPI: fix OSC regression that caused aer and pciehp not to load
  MAINTAINERS: add maintainer for msi-wmi driver
  fujitu-laptop: fix tests of acpi_evaluate_integer() return value
  arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c: avoid cross-CPU interrupts by using smp_call_function_any()
  ACPI: processor: remove _PDC object list from struct acpi_processor
  ACPI: processor: change acpi_processor_set_pdc() interface
  ACPI: processor: open code acpi_processor_cleanup_pdc
  ...

Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2 · 45e62974

Linus Torvalds authored Dec 24, 2009

* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2:
  ocfs2/trivial: Use le16_to_cpu for a disk value in xattr.c
  ocfs2/trivial: Use proper mask for 2 places in hearbeat.c
  Ocfs2: Let ocfs2 support fiemap for symlink and fast symlink.
  Ocfs2: Should ocfs2 support fiemap for S_IFDIR inode?
  ocfs2: Use FIEMAP_EXTENT_SHARED
  fiemap: Add new extent flag FIEMAP_EXTENT_SHARED
  ocfs2: replace u8 by __u8 in ocfs2_fs.h
  ocfs2: explicit declare uninitialized var in user_cluster_connect()
  ocfs2-devel: remove redundant OCFS2_MOUNT_POSIX_ACL check in ocfs2_get_acl_nolock()
  ocfs2: return -EAGAIN instead of EAGAIN in dlm
  ocfs2/cluster: Make fence method configurable - v2
  ocfs2: Set MS_POSIXACL on remount
  ocfs2: Make acl use the default
  ocfs2: Always include ACL support

Merge branch 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm · 756fe285

Linus Torvalds authored Dec 24, 2009

* 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm:
  VIDEO: cyberpro: pci_request_regions needs a persistent name
  ARM: dma-isa: request cascade channel after registering it
  ARM: footbridge: trim down old ISA rtc setup
  ARM: fix PAGE_KERNEL
  ARM: Fix wrong shared bit for CPU write buffer bug test
  ARM: 5857/1: ARM: dmabounce: fix build
  ARM: 5856/1: Fix bug of uart0 platfrom data for nuc900
  ARM: 5855/1: putc support for nuc900
  ARM: 5854/1: fix compiling error for NUC900
  ARM: 5849/1: ARMv7: fix Oprofile events count
  ARM: add missing include to nwflash.c
  ARM: Kill CONFIG_CPU_32
  ARM: Convert VFP/Crunch/XscaleCP thread_release() to exit_thread()
  ARM: 5853/1: ARM: Fix build break on ARM v6 and v7

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp · eec74a41

Linus Torvalds authored Dec 24, 2009

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
  edac, pci: remove pesky debug printk
  amd64_edac: restrict PCI config space access
  amd64_edac: fix forcing module load/unload
  amd64_edac: make driver loading more robust
  amd64_edac: fix driver instance freeing
  amd64_edac: fix K8 chip select reporting

Merge branch 'sh/for-2.6.33' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6 · ef2c55e5

Linus Torvalds authored Dec 24, 2009

* 'sh/for-2.6.33' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
  sh: Ensure all PG_dcache_dirty pages are written back.
  sh: mach-ecovec24: setup.c detailed correction
  serial: sh-sci: Convert tremaining ctrl_xxx I/O routines to __raw_xxx.
  serial: sh-sci: earlyprintk zero uartclk fix
  sh: Only use bl bit toggling for sleeping idle.
  sh: Restore bl bit toggling in idle loop.
  sh: Fix up MAX_DMA_CHANNELS definition when DMA is disabled.
  sh: dmaengine support for SH7785
  sh: dmaengine support for sh7724.

VIDEO: cyberpro: pci_request_regions needs a persistent name · ed5a35ac

Russell King authored Dec 24, 2009

Don't pass a name pointer from the kernel stack; it will not survive
and will result in corrupted /proc/iomem output.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
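
The lifetime problem can be illustrated in user space. The registration function below stores only the pointer it is given, which is the behaviour that makes a stack buffer unsafe; the fix is to build the name in storage that lives as long as the device. All names here (struct region_sketch, struct device_sketch, claim_region(), probe_sketch(), the "CyberPro%d" format) are hypothetical.

```c
#include <stdio.h>

/* A registered resource keeps a pointer to its name, not a copy. */
struct region_sketch {
    const char *name;
};

/* Per-device state; the name buffer lives as long as the device does. */
struct device_sketch {
    char name[16];
    struct region_sketch res;
};

/* Stand-in for a registration call that records the caller's pointer. */
void claim_region(struct region_sketch *r, const char *name)
{
    r->name = name;
}

void probe_sketch(struct device_sketch *dev, int index)
{
    /*
     * Format the name into the device structure rather than a local array:
     * a local array would be gone once this function returns, leaving
     * r->name dangling.
     */
    snprintf(dev->name, sizeof dev->name, "CyberPro%d", index);
    claim_region(&dev->res, dev->name);
}
```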

ARM: dma-isa: request cascade channel after registering it · e8b8f5ef

Russell King authored Dec 24, 2009

We can't request the cascade channel before it's been registered, so
move it afterwards.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

ARM: footbridge: trim down old ISA rtc setup · 382b4480

Russell King authored Dec 24, 2009

This fixes a "start_kernel(): bug: interrupts were enabled early".

rtc_cmos now takes care of initializing the ISA RTC and reading the
current time and date from it; there's no need to repeat that here,
thereby causing interrupts to be enabled too early.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

ARM: fix PAGE_KERNEL · 6dc995a3

Russell King authored Dec 24, 2009

PAGE_KERNEL should not be executable; any area marked executable can
be prefetched into the instruction cache.  We don't want vmalloc areas
to be read in this way.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

edac, pci: remove pesky debug printk · 5213c32f

Borislav Petkov authored Dec 21, 2009

Do not spam the logs needlessly with the sole info that
edac_pci_dev_parity_clear is being called.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

amd64_edac: restrict PCI config space access · 92389102

Borislav Petkov authored Dec 21, 2009

Do not access F2x19[0,4] on K8 since they're undefined there.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

amd64_edac: fix forcing module load/unload · 43f5e687

Borislav Petkov authored Dec 21, 2009

Clear the override flag after force-loading the module.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

amd64_edac: make driver loading more robust · 56b34b91

Borislav Petkov authored Dec 21, 2009

Currently, the module does not initialize fully when the DIMMs aren't
ECC but still remains loaded. Propagate the error when no instance of
the driver is properly initialized and prevent further loading.

Reorganize and polish error handling in amd64_edac_init() while at it.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

amd64_edac: fix driver instance freeing · 8f68ed97

Borislav Petkov authored Dec 21, 2009

Fix use-after-free errors by pushing all memory-freeing calls to the end
of amd64_remove_one_instance().
Reported-by: Darren Jenkins <darrenrjenkins@gmail.com>
LKML-Reference: <1261370306.11354.52.camel@ICE-BOX>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

amd64_edac: fix K8 chip select reporting · 603adaf6

Borislav Petkov authored Dec 21, 2009

Fix the case where amd64_debug_display_dimm_sizes() reports only half the
installed DRAM because it does not account for a single DCT operating
in 128-bit mode and merging chip selects from different DIMMs.
Reported-by: Johannes Hirte <johannes.hirte@fem.tu-ilmenau.de>
LKML-Reference: <200912112202.48173.johannes.hirte@fem.tu-ilmenau.de>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

Merge branch 'misc-2.6.33' into release · fcb11235

Len Brown authored Dec 24, 2009

Merge branch 'tc1100-wmi' into release · 78a5331d

Len Brown authored Dec 24, 2009

Merge branch 'sony' into release · fe7fa9c5

Len Brown authored Dec 24, 2009

Merge branch 'classmate' into release · 6d3bf681

Len Brown authored Dec 24, 2009

Merge branch 'pdc' into release · da3df858

Len Brown authored Dec 24, 2009

Merge branches 'bugzilla-14446', 'bugzilla-14753' and 'bugzilla-14824' into release · 309ddc53

Len Brown authored Dec 24, 2009
