Commits · 459d979e146378ae00368aa14d1255300291e009 · nexedi / linux

04 Mar, 2003 1 commit

[PATCH] dcache/inode hlist patchkit · 18bc0cec

Andi Kleen authored 21 years ago

 - Inode and dcache Hash table only needs half the memory/cache because
   of using hlists.
 - Simplify dcache-rcu code.  With NULL end markers in the hlists
   is_bucket is not needed anymore.  Also the list walking code
   generates better code on x86 now because it doesn't need to dedicate
   a register for the list head.
 - Reorganize struct dentry to be more cache friendly.  All the state
   accessed for the hash walk is in one chunk now together with the
   inline name (all at the end)
 - Add prefetching for all the list walks.  Old hash lookup code didn't
   use it.
 - Some other minor cleanup.
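
The memory saving in the first two points comes from the shape of the list heads. The following is a minimal userspace sketch, not the patch itself: the struct layouts follow the kernel's `list_head` and `hlist` definitions, while `hlist_count()` is a hypothetical helper added here only to show a walk that stops at the NULL end marker.

```c
#include <stddef.h>

/* Circular doubly linked list: each bucket head stores two pointers. */
struct list_head {
    struct list_head *next, *prev;
};

/* hlist: each bucket head stores one pointer and chains end in NULL. */
struct hlist_node {
    struct hlist_node *next;
    struct hlist_node **pprev; /* address of the pointer that points at us */
};

struct hlist_head {
    struct hlist_node *first;
};

/* Insert a node at the front of a bucket. */
static void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
    struct hlist_node *first = h->first;

    n->next = first;
    if (first)
        first->pprev = &n->next;
    h->first = n;
    n->pprev = &h->first;
}

/* Unlink a node without needing to know which bucket it is in. */
static void hlist_del(struct hlist_node *n)
{
    *n->pprev = n->next;
    if (n->next)
        n->next->pprev = n->pprev;
}

/* Hypothetical helper: walk one bucket until the NULL end marker. */
static size_t hlist_count(const struct hlist_head *h)
{
    size_t n = 0;

    for (const struct hlist_node *p = h->first; p; p = p->next)
        n++;
    return n;
}
```

Because the walk terminates on NULL rather than on a comparison with the head, the loop does not need to keep the head address around, which is the register saving the message mentions.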

18bc0cec

27 Feb, 2003 1 commit

d_validate() needs to use "__dget_locked()" since it's holding the dcache lock. · ea3d5d12

Linus Torvalds authored 21 years ago

Found by Maneesh Soni <maneesh@in.ibm.com>

ea3d5d12

25 Feb, 2003 1 commit

[PATCH] Check for zero d_count in dget() · 3e38f30e

Andrew Morton authored 22 years ago

Patch from Maneesh Soni <maneesh@in.ibm.com>

Turns out that sysfs is doing dget() on a zero-ref dentry.  That's a bug, but
dcache is no longer detecting it.

The check was removed because with lockless d_lookup, there can be cases when
d_lookup and dput are going on concurrently.  If d_lookup happens earlier then
it may do dget() on a dentry for which dput() has decremented the ref count
to zero.  This race is handled by taking the per dentry lock and checking the
DCACHE_UNHASHED flag.

The patch open-codes that part of d_lookup(), and restores the BUG check in
dget().

3e38f30e

15 Feb, 2003 1 commit

[PATCH] dcache_rcu · d8a55dda

Andrew Morton authored 22 years ago

Patch from Maneesh Soni <maneesh@in.ibm.com>, Dipankar Sarma
<dipankar@in.ibm.com> and probably others.

This patch provides a dcache_lock-free d_lookup() using RCU.  Al pointed
out races between d_move and the lock-free d_lookup() while a concurrent
rename is going on.  We tested this with a test doing a million renames
in each of 50 threads on 50 different ramfs filesystems, while
simultaneously running millions of "ls".  The tests were done on a 4-way
SMP box.

1. Lookup goes to a different bucket because the current dentry is
moved to a different bucket by a rename. This is solved by
having a list_head pointer in the dentry structure which points
to the bucket head it belongs to. The bucket pointer is updated when the
dentry is added to the hash chain. If lookup finds that the current
dentry belongs to a different bucket, the cached lookup fails
and a real lookup is done. This condition occurred about
100 times during the heavy_rename test.

2. Lookup has found the dentry it is looking for and is comparing
various keys when a rename operation moves the dentry.
This is solved by using a per-dentry counter (d_move_count) which
is updated at the end of d_move. Lookup takes a snapshot of
d_move_count before comparing the keys and, once the comparison
succeeds, takes the per-dentry lock to check d_move_count
again. If the count differs, the dentry has been moved (or renamed)
and the lookup fails.

3. There can be a theoretical race when a dentry keeps coming back
to its original bucket due to double moves. Lookup may then conclude
that the dentry has never moved and can end up in an infinite loop.
This is solved by using a loop_counter which is compared with an
approximate maximum number of dentries per bucket. This was never
hit during the heavy_rename test.

4. There is one more change regarding the loop termination condition
in d_lookup: the next hash pointer is now compared with the current
dentry's bucket pointer (is_bucket()).

5. memcmp() in d_lookup() can go out of bounds if the name pointer and length
fields are not consistent. For this we used a pointer to a qstr to keep
the length and name pointer in one structure.

We also tried solving these by using a rwlock, but it could not compete
with the lockless solution.
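
The snapshot-and-recheck idea from item 2 can be sketched as follows. This is a single-threaded illustration under stated assumptions: `struct toy_dentry` and the `toy_*` functions are hypothetical stand-ins, and the per-dentry lock, RCU and memory barriers that the real patch relies on are deliberately left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily reduced stand-in for struct dentry. */
struct toy_dentry {
    unsigned long d_move_count; /* bumped at the end of every move */
    const char *name;
    size_t len;
};

/* Rename: change the key first, update the counter last. */
static void toy_d_move(struct toy_dentry *d, const char *new_name)
{
    d->name = new_name;
    d->len = strlen(new_name);
    d->d_move_count++;
}

/* Key comparison that lookup performs without holding any lock. */
static bool toy_key_matches(const struct toy_dentry *d,
                            const char *name, size_t len)
{
    return d->len == len && memcmp(d->name, name, len) == 0;
}

/*
 * Final check. The real code takes the per-dentry lock here; this
 * sketch only compares the counter with the earlier snapshot.
 */
static bool toy_lookup_still_valid(const struct toy_dentry *d,
                                   unsigned long snapshot)
{
    return d->d_move_count == snapshot;
}
```

A lookup would read `d_move_count` into a local snapshot, call `toy_key_matches()`, and accept the dentry only if `toy_lookup_still_valid()` still holds afterwards; any rename in between changes the counter and forces the slow path.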

d8a55dda

02 Feb, 2003 1 commit

[PATCH] properly handle too long pathnames in d_path · 28b6394d

Andrew Morton authored 22 years ago

Forward port of a 2.4 patch by Christoph Hellwig.

See http://cert.uni-stuttgart.de/archive/bugtraq/2002/03/msg00384.html
for the security implications.

28b6394d

18 Nov, 2002 1 commit

[PATCH] dcache usage cleanups · 3a708694

Maneesh Soni authored 22 years ago

This cleans up the dcache code to always use the proper dcache functions
(d_unhashed and __d_drop) instead of accessing the dentry lists
directly.

In other words: use "d_unhashed(dentry)" instead of doing a manual
"list_empty(&dentry->d_hash)" test.  And use "__d_drop(dentry)" instead
of doing "list_del_init(&dentry->d_hash)" by hand.

This will help the dcache-rcu patches.
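
As a rough illustration of why the accessors help, the sketch below wraps the two open-coded list operations in helpers. It is a userspace approximation: the list primitives are reimplemented locally, and `struct toy_dentry`, `toy_d_unhashed()` and `toy_d_drop()` are hypothetical names standing in for the kernel's `d_unhashed()` and `__d_drop()`.

```c
/* Minimal circular doubly linked list, modelled on the kernel's. */
struct list_head {
    struct list_head *next, *prev;
};

static void list_init(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

static int list_empty(const struct list_head *h)
{
    return h->next == h;
}

static void list_add(struct list_head *n, struct list_head *head)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

static void list_del_init(struct list_head *e)
{
    e->prev->next = e->next;
    e->next->prev = e->prev;
    list_init(e);
}

/* Hypothetical reduced dentry: only the hash linkage is kept. */
struct toy_dentry {
    struct list_head d_hash;
};

/* Callers ask "is it hashed?" instead of poking at d_hash directly. */
static int toy_d_unhashed(const struct toy_dentry *d)
{
    return list_empty(&d->d_hash);
}

/* Callers ask for a drop instead of calling list_del_init() by hand. */
static void toy_d_drop(struct toy_dentry *d)
{
    list_del_init(&d->d_hash);
}
```

With all callers going through the two helpers, a later change to how hashing state is represented only has to touch the helper bodies.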

3a708694

16 Nov, 2002 3 commits

[PATCH] include mount.h explicitly where needed · 754c5c66

Christoph Hellwig authored 22 years ago

This is a preparation to get rid of the implicit includes in
dcache.h and fs_struct.h.

754c5c66

[PATCH] better inode reclaim balancing · 9c716856

Andrew Morton authored 22 years ago

The inode reclaim is too aggressive at present - it is causing the
shootdown of lots of recently-used pagecache.  Simple testcase: run a
huge `dd' while running a concurrent `watch -n1 cat /proc/meminfo'.
The program text for `cat' gets loaded from disk once per second.

This is in fact because the dentry_unused reclaim is too aggressive.

(The general approach to inode reclaim is that it _not_ happen at the
inode level.  All the aging and lru activity happens at the dcache
level.)

The problem is partly due to a bug: shrink_dcache_memory() is returning
the *total* number of dentries to the VM, rather than the number of
unused dentries.

This patch fixes that, and goes a little further.

We do want to keep some unused dentries around.  Reclaiming the last
few thousand dentries is pretty pointless, and will allow reclaim of
the last few thousand inodes and their attached pagecache.

So the algorithm I have used is to not allow the number of unused
dentries to fall below the number of used ones.  This keeps a
reasonable number of dentries in cache while providing a level of
scaling to the system size and the current system activity.

(Magic number alert: why not pin nr_unused to seven times nr_used,
rather than one times??)

shrink_dcache_memory() has been changed to tell the VM that the number
of shrinkable dentries is:

	zero if (nr_unused < nr_used)
	otherwise (nr_unused - nr_used)

so when there is memory pressure the VM will prune the unused dentry
cache down to the size of the used dentry cache, but not below that.
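
The rule above is small enough to write out. This is a sketch of the arithmetic only; `toy_shrinkable_dentries()` is a hypothetical name, and the real function also does the pruning and runs under dcache_lock.

```c
/*
 * Number of dentries reported to the VM as shrinkable:
 * zero while unused dentries are fewer than used ones,
 * otherwise only the surplus of unused over used.
 */
static int toy_shrinkable_dentries(int nr_unused, int nr_used)
{
    if (nr_unused < nr_used)
        return 0;
    return nr_unused - nr_used;
}
```

For example, with 50 unused and 20 used dentries the VM is told about 30 shrinkable ones, so pruning stops once the unused count has dropped to 20.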

The patch also arranges (awkwardly) for all modifications of
dentry_stat.nr_dentry to occur inside dcache_lock - it was racy.

9c716856

[PATCH] Remove d_path from sched.h · cd574b74

Matthew Wilcox authored 22 years ago

This patch from William Lee Irwin III privatizes __d_path() to dcache.c,
uninlines d_path(), moves its declaration to dcache.h, moves it to
dcache.c, and exports d_path() instead of __d_path().

cd574b74

15 Oct, 2002 1 commit

[PATCH] oprofile - dcookies · 7e1aee05

John Levon authored 22 years ago

This implements the persistent path-to-dcookies mapping, and adds a
system call for the user-space profiler to look up the profile data, so
it can tag profiles to specific binaries.

7e1aee05

13 Oct, 2002 1 commit

[PATCH] batched slab shrink and registration API · 71419dc7

Andrew Morton authored 22 years ago

From Ed Tomlinson, then mauled by yours truly.

The current shrinking of the dentry, inode and dquot caches seems to
work OK, but it is slightly CPU-inefficient: we call the shrinking
functions many times, for tiny numbers of objects.

So here, we just batch that up - shrinking happens at the same rate but
we perform it in larger units of work.

To do this, we need a way of knowing how many objects are currently in
use by individual caches.  slab does not actually track this
information, but the existing shrinkable caches do have this on hand.
So rather than adding the counters to slab, we require that the
shrinker callback functions keep their own count - we query that via
the callback.

We add a simple registration API which is exported to modules.  A
subsystem may register its own callback function via set_shrinker().

set_shrinker() simply takes a function pointer.  The function is called
with

	int (*shrinker)(int nr_to_shrink, unsigned int gfp_mask);

The shrinker callback must scan `nr_to_scan' objects and free all
freeable scanned objects.  Note: it doesn't have to *free* `nr_to_scan'
objects.  It need only scan that many.  Which is a fairly pedantic
detail, really.

The shrinker callback must return the number of objects which are in
its cache at the end of the scanning attempt.  It will be called with
nr_to_scan == 0 when we're just querying the cache size.

The set_shrinker() registration API is passed a hint as to how many
disk seeks a single cache object is worth.  Everything uses "2" at
present.

I saw no need to add the traditional `here is my void *data' to the
registration/callback.  Because there is a one-to-one relationship
between caches and their shrinkers.
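
A callback that follows this contract might look like the sketch below. Everything in it is an assumption made for illustration: the toy cache, its fixed size and the `toy_*` names are invented, `gfp_mask` is ignored, and registration through set_shrinker() is not shown. The parameter is named `nr_to_scan` to match the description in the text.

```c
#define TOY_CACHE_SIZE 8

/* Hypothetical cache object: present in the cache, possibly in use. */
struct toy_object {
    int present;
    int in_use;
};

static struct toy_object toy_cache[TOY_CACHE_SIZE];
static int toy_nr_objects; /* the cache keeps its own object count */
static int toy_scan_pos;   /* where the next scan resumes */

/* Fill the cache; every even slot is marked as in use. */
static void toy_cache_fill(void)
{
    toy_nr_objects = 0;
    toy_scan_pos = 0;
    for (int i = 0; i < TOY_CACHE_SIZE; i++) {
        toy_cache[i].present = 1;
        toy_cache[i].in_use = (i % 2 == 0);
        toy_nr_objects++;
    }
}

/*
 * Scan up to nr_to_scan objects and free the freeable ones.
 * Returns the number of objects left in the cache, so a call
 * with nr_to_scan == 0 is a pure size query.
 */
static int toy_shrinker(int nr_to_scan, unsigned int gfp_mask)
{
    (void)gfp_mask; /* ignored in this sketch */

    for (int i = 0; i < nr_to_scan && i < TOY_CACHE_SIZE; i++) {
        struct toy_object *obj = &toy_cache[toy_scan_pos];

        toy_scan_pos = (toy_scan_pos + 1) % TOY_CACHE_SIZE;
        if (obj->present && !obj->in_use) {
            obj->present = 0;
            toy_nr_objects--;
        }
    }
    return toy_nr_objects;
}
```

Note that scanning four objects here frees only two of them, which is the "scan, not free" distinction the message calls pedantic.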


Various cleanups became possible:

- shrink_icache_memory() is no longer exported to modules.

- shrink_icache_memory() is now static to fs/inode.c

- prune_icache() is now static to fs/inode.c, and made inline (single caller)

- shrink_dcache_memory() is made static to fs/dcache.c

- prune_dcache() is no longer exported to modules

- prune_dcache() is made static to fs/dcache.c

- shrink_dqcache_memory() is made static to fs/dquot.c

- All the quota init code has been moved from fs/dcache.c into fs/dquot.c

- All modifications to inodes_stat.nr_inodes are now inside
  inode_lock - the dispose_list one was racy.

71419dc7

08 Oct, 2002 1 commit

[PATCH] named initialisers for dcache · 16270bc9

Dave Jones authored 22 years ago

(Also a printk level addition)

16270bc9

25 Sep, 2002 1 commit

[PATCH] slab reclaim balancing · b65bbded

Andrew Morton authored 22 years ago

A patch from Ed Tomlinson which improves the way in which the kernel
reclaims slab objects.

The theory is: a cached object's usefulness is measured in terms of the
number of disk seeks which it saves.  Furthermore, we assume that one
dentry or inode saves as many seeks as one pagecache page.

So we reap slab objects at the same rate as we reclaim pages.  For each
1% of reclaimed pagecache we reclaim 1% of slab.  (Actually, we _scan_
1% of slab for each 1% of scanned pages).

Furthermore we assume that one swapout costs twice as many seeks as one
pagecache page, and twice as many seeks as one slab object.  So we
double the pressure on slab when anonymous pages are being considered
for eviction.

The code works nicely, and smoothly.  Possibly it does not shrink slab
hard enough, but that is now very easy to tune up and down.  It is just:

	ratio *= 3;

in shrink_caches().

Slab caches no longer hold onto completely empty pages.  Instead, pages
are freed as soon as they have zero objects.  This is possibly a
performance hit for slabs which have constructors, but it's doubtful.
Most allocations after a batch of frees are satisfied from inside
internally-fragmented pages and by the time slab gets back onto using
the wholly-empty pages they'll be cache-cold.  slab would be better off
going and requesting a new, cache-warm page and reconstructing the
objects therein.  (Once we have the per-cpu hot-page allocator in
place.  It's happening).

As a consequence of the above, kmem_cache_shrink() is now unused.  No
great loss there - the serialising effect of kmem_cache_shrink and its
semaphore in front of page reclaim was measurably bad.

Still todo:

- batch up the shrinking so we don't call into prune_dcache and
  friends at high frequency asking for a tiny number of objects.

- Maybe expose the shrink ratio via a tunable.

- clean up slab.c

- highmem page reclaim in prune_icache: highmem pages can pin
  inodes.

b65bbded

14 Jul, 2002 1 commit

Mark the dentry referenced at dput time. · afa29791

Linus Torvalds authored 22 years ago

afa29791

03 Jun, 2002 1 commit

[PATCH] fix NULL dereferencing in dcache.c · c4023a9c

Dan Aloni authored 22 years ago

  Unrelated to my first dcache patch, this is something more crucial
  and should be applied first.

  fs/dcache.c:
   - handle d_alloc() returning NULL.

c4023a9c

27 May, 2002 1 commit

[PATCH] dcache.c spelling · 1e8b2524

Rusty Russell authored 22 years ago

Dan Aloni <da-x@gmx.net>: fs_dcache.c - typo:

1e8b2524

22 May, 2002 1 commit

[PATCH] (4/) BKL removal in d_move() · 726761cc

Alexander Viro authored 22 years ago

	... is finally done.

726761cc

07 May, 2002 1 commit

[PATCH] PATCH - kNFSd in 2.5.14 - Add a kernel_lock in d_splice_alias · 7f045822

Neil Brown authored 22 years ago

d_move wants the kernel to be locked, so
d_splice_alias now takes that lock.

7f045822

30 Apr, 2002 1 commit

[PATCH] remove buffer unused_list · 4beda7c1

Andrew Morton authored 22 years ago

Removes the buffer_head unused list.  Use a mempool instead.

The reduced lock contention provided about a 10% boost on Anton's
12-way.

4beda7c1

29 Apr, 2002 1 commit

[PATCH] Re: 2.5.11 breakage · 85d217f4

Alexander Viro authored 22 years ago

	OK, here comes.  Patch below is an attempt to do the fastwalk
stuff in the right way and so far it seems to be working.

 - dentry leak is plugged
 - locked/unlocked state of nameidata doesn't depend on history - it
   depends only on point in code.
 - LOOKUP_LOCKED is gone.
 - following mounts and .. doesn't drop dcache_lock
 - light-weight permission check distinguishes between "don't know" and
   "permission denied", so we don't call full-blown permission() unless
   we have to.
 - code that changes root/pwd holds dcache_lock _and_ write lock on
   current->fs->lock.  I.e. if we hold dcache_lock we can safely
   access our ->fs->{root,pwd}{,mnt}
 - __d_lookup() does not increment refcount; callers do dget_locked()
   if they need it (behaviour of d_lookup() didn't change, obviously).
 - link_path_walk() logics had been (somewhat) cleaned up.

85d217f4

24 Apr, 2002 1 commit

[PATCH] FastWalk Dcache · 898683e9

Hanna V. Linder authored 22 years ago

Reduce cacheline bouncing when a dentry is in the cache.

Specifically, the d_count reference counter is not incremented and
decremented for every dentry in a path during path walking if the dentry
is in the dcache.  Excessive atomic inc/dec's are expensive on SMP
systems due to the cacheline bouncing.

898683e9

15 Apr, 2002 1 commit

[PATCH] dcache changes for preparing for "export_operations" interface for nfsd to use. · 0de4fa30

Neil Brown authored 22 years ago

Prepare for new export_operations interface (for filehandle lookup):

 - define d_splice_alias and d_alloc_anon.
 - define shrink_dcache_anon for removing anonymous dentries
 - modify d_move to work with anonymous dentries (IS_ROOT dentries)
 - modify d_find_alias to avoid anonymous dentries where possible
   as d_splice_alias and d_alloc_anon use this
 - put in place infrastructure for s_anon allocation and cleaning
 - replace a piece of code that is in nfsfh, reiserfs and fat
   with a call to d_alloc_anon
 - Rename DCACHE_NFSD_DISCONNECTED to DCACHE_DISCONNECTED
 - Add documentation at Documentation/filesystems/Exporting

0de4fa30

15 Mar, 2002 1 commit

[PATCH] nfsd as filesystem · 063b009f

Alexander Viro authored 22 years ago

* introduces a new filesystem - nfsd.  No, it's not a typo.  It's a small
  tree with fixed topology defined by nfsd and IO on its files does what
  we used to do by hand in nfsctl.c.
* turns sys_nfsservctl() into a sequence of open()/write()/read()/close()
  It works as it used to - we don't need nfsd to be mounted anywhere, etc.
* nfsd_linkage ugliness is gone.
* getfs and getfh demonstrate (rather trivial) example of "descriptor as
  transaction descriptor" behaviour.
* we are fairly close to the situation when driver-defined filesystems can
  be done with practically zero code overhead.  We are still not there, but
  it's a matter of adding a couple of helpers for populating the tree.

	One thing we get immediately is a cleanup of sys_nfsservctl() -
it got _much_ better.  Moreover, we get an alternative interface that
uses normal file IO and can be used without magic syscalls.

063b009f

25 Feb, 2002 1 commit

[PATCH] ->d_parent fixes · 5e37545c

Alexander Viro authored 23 years ago

Protect d_parent with "dparent_lock", making ready to get rid of
BKL for d_move().

5e37545c

06 Feb, 2002 1 commit

[PATCH] Automatic file-max sizing · d8fbaf73

Andi Kleen authored 23 years ago

The default for NR_FILES of 8192 is far too low for many workloads. This
patch does dynamic sizing for it instead. It assumes file+inode+dentry
are roughly 1K and will use up to 10% of the memory for it.

Also removes two obsolete prototypes.
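
The sizing rule can be worked through as plain arithmetic. The function below is a sketch of the calculation as the message describes it, not a copy of the patch: `toy_max_files()` and its parameters are hypothetical, and any lower bound the kernel applies is left out.

```c
/*
 * Estimate a file limit from memory size, assuming each open file
 * costs about 1K (file + inode + dentry) and that at most 10% of
 * memory may be spent on them.
 */
static unsigned long toy_max_files(unsigned long mempages,
                                   unsigned long page_size)
{
    unsigned long mem_kb = mempages * (page_size / 1024);

    return mem_kb / 10; /* 10% of memory, at 1K per file */
}
```

On a machine with 1 GiB of RAM and 4 KiB pages (262144 pages), this gives 104857 files, compared with the old fixed default of 8192.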

d8fbaf73

05 Feb, 2002 15 commits

v2.5.1.11 -> v2.5.2 · 5fb612aa

Linus Torvalds authored 23 years ago

- Matt Domsch: combine common crc32 library
- Pete Zaitcev: ymfpci update
- Davide Libenzi: scheduler improvements
- Al Viro: almost there: "struct block_device *" everywhere
- Richard Gooch: devfs cpqarray update, race fix
- Rusty Russell: PATH_MAX should include the final '\0' count
- David Miller: various random updates (mainly net and sparc)

5fb612aa

v2.5.0.8 -> v2.5.0.9 · b1507c9a

Linus Torvalds authored 23 years ago

- Jeff Garzik: separate out handling of older tulip chips
- Jens Axboe: more bio stuff
- Anton Altaparmakov: NTFS 1.1.21 update

b1507c9a

v2.5.0.4 -> v2.5.0.5 · cc5979c3

Linus Torvalds authored 23 years ago

- Patrick Mochel: driver model infrastructure, part 1
- Jens Axboe: more bio fixes, cleanups
- Andrew Morton: release locking fixes
- Al Viro: superblock/mount handling
- Kai Germaschewski: AVM Fritz!Card ISDN driver
- Christoph Hellwig: make cramfs SMP-safe.

cc5979c3

v2.4.9.10 -> v2.4.9.11 · a880f45a

Linus Torvalds authored 23 years ago

  - Neil Brown: md cleanups/fixes
  - Andrew Morton: console locking merge
  - Andrea Arkangeli: major VM merge

a880f45a

v2.4.7.3 -> v2.4.7.4 · 70d68bd3

Linus Torvalds authored 23 years ago

  - David Mosberger: IA64 update
  - Geert Uytterhoeven: cleanup, new atyfb
  - Marcelo Tosatti: zone aging fixes
  - me, others: limit IO requests sanely

70d68bd3

v2.4.6.2 -> v2.4.6.3 · d62f43c5

Linus Torvalds authored 23 years ago

  - merge with Alan (SCSI subsystem)
  - Jeff Garzik: make serial driver PCI hotplug-aware

d62f43c5

v2.4.5.1 -> v2.4.5.2 · 4fdbe71c

Linus Torvalds authored 23 years ago

  - Takanori Kawano: brlock indexing bugfix
  - Ingo Molnar, Jeff Garzik: softirq updates and fixes
  - Al Viro: rampage of superblock cleanups.
  - Jean Tourrilhes: Orinoco driver update v6, IrNET update
  - Trond Myklebust: NFS brown-paper-bag thing
  - Tim Waugh: parport update
  - David Miller: networking and sparc updates
  - Jes Sorensen: m68k update.
  - Ben Fennema: UDF update
  - Geert Uytterhoeven: fbdev logo updates
  - Willem Riede: osst driver updates
  - Paul Mackerras: PPC update
  - Marcelo Tosatti: unlazy swap cache
  - Mikulas Patocka: hpfs update

4fdbe71c

v2.4.4.4 -> v2.4.4.5 · 560e8996

Linus Torvalds authored 23 years ago

  - Al Viro: fs cleanups
  - David Miller: sparc semaphores
  - Christoph Hellwig: VxFS update
  - Asit Mallick: set machine check bit with set_in_cr4
  - Richard Henderson: fix alpha pci_controller_num(), sg_fill, SRM poweroff.
  - Johannes Erdfelt: USB updates
  - Cort Dougan: bitkeeper Id's on the ppc side
  - Matt Chapman: NFS file locking SMP lock fix
  - Alan Cox: further merging

560e8996

v2.4.3.8 -> v2.4.4 · 7216d3e9

Linus Torvalds authored 23 years ago

  - Andrea Arkangeli: raw-io fixes
  - Johannes Erdfelt: USB updates
  - reiserfs update
  - Al Viro: fsync/umount race fix
  - Rusty Russell: netfilter sync

7216d3e9

v2.4.3.7 -> v2.4.3.8 · 4095b99c

Linus Torvalds authored 23 years ago

  - Al Viro: fix d_flags race between low-level fs and VFS layer.
  - David Miller: sparc updates
  - S390 update

4095b99c

v2.4.3.2 -> v2.4.3.3 · 1a015350

Linus Torvalds authored 23 years ago

  - Hui-Fen Hsu: sis900 driver update
  - NIIBE Yutaka: Super-H update
  - Alan Cox: more resyncs (ARM down, but more to go)
  - David Miller: network zerocopy, Sparc sync, qlogic,FC fix, etc.
  - David Miller/me: get rid of various drivers hacks to do mmap
  alignment behind the back of the VM layer. Create a real
  protocol for it.

1a015350

v2.4.2.3 -> v2.4.2.4 · 8565fe85

Linus Torvalds authored 23 years ago

  - Petr Vandrovec, Al Viro: dentry revalidation fixes
  - Stephen Tweedie / Manfred Spraul: kswapd and ptrace race
  - Neil Brown: nfsd/rpc/raid cleanups and fixes

8565fe85

v2.4.2.1 -> v2.4.2.2 · 44e8778c

Linus Torvalds authored 23 years ago

  - Jens Axboe: fix loop device deadlocks
  - Greg KH: USB updates
  - Alan Cox: continued merging
  - Tim Waugh: parport and documentation updates
  - Cort Dougan: PowerPC merge
  - Jeff Garzik: network driver updates
  - Justin Gibbs: new and much improved aic7xxx driver 6.1.5

44e8778c

v2.4.1.4 -> v2.4.2 · 6db68906

Linus Torvalds authored 23 years ago

  - sync up more with Alan
  - Urban Widmark: smbfs and HIGHMEM fix
  - Chris Mason: reiserfs tail unpacking fix ("null bytes in reiserfs files")
  - Adan Richter: new cpia usb ID
  - Hugh Dickins: misc small sysv ipc fixes
  - Andries Brouwer: remove overly restrictive sector size check for
  SCSI cd-roms

6db68906

v2.4.1.2 -> v2.4.1.3 · c8ebfc88

Linus Torvalds authored 23 years ago

  - Jens: better ordering of requests when unable to merge
  - Neil Brown: make md work as a module again (we cannot autodetect
  in modules, not enough background information)
  - Neil Brown: raid5 SMP locking cleanups
  - Neil Brown: nfsd: handle Irix NFS clients named pipe behavior and
  dentry leak fix
  - maestro3 shutdown fix
  - fix dcache hash calculation that could cause bad hashes under certain
  circumstances (Dean Gaudet)
  - David Miller: networking and sparc updates
  - Jeff Garzik: include file cleanups
  - Andy Grover: ACPI update
  - Coda-fs error return fixes
  - rth: alpha Jensen update

c8ebfc88