Commits · a1016ebc4ac37affa671892b43cf24d33144744c · nexedi / linux

07 Mar, 2003 1 commit

[PATCH] no need for kernel_flag on UP · 339bf945

Robert Love authored 21 years ago

This is a minor cleanup.  We currently define and declare the BKL's
kernel_flag spinlock on either SMP or PREEMPT, which means a UP+PREEMPT
machine gets it.

We only need the actual lock on SMP.

339bf945

02 Mar, 2003 1 commit

[PATCH] loop: Fix OOM and oops · 29da03f1

Andrew Morton authored 21 years ago

The loop driver takes a copy of the data which it is writing. When this
happens on the try_to_free_pages() path, loop can easily consume ALL memory
and bio_copy() will fail to allocate a page.

Loop forgets to check the bio_copy() return value and oopses.

Fix this by dropping PF_MEMALLOC and throttling to the block writeout speed.

The patch exports blk_congestion_wait() to modules for this. This is a
needed export: several filesystems have a "try to allocate and yield if it
failed" loop and blk_congestion_wait() is a more appropriate way of
implementing the sleep in this situation.

29da03f1

18 Feb, 2003 1 commit

[PATCH] export add_to_page_cache() and __pagevec_lru_add to · 0d51be59

Andrew Morton authored 21 years ago

CIFS is using these.

Given that the readpages() address_space op is supposed to add the pages to
pagecache, it makes sense to make these functions available to modules.

I can't say that I put a lot of thought into the readpages API.  It was
designed as just enough functionality to be able to stuff a bunch of
readahead pages into a single BIO.  The only reason I made it an a_op at all
was because we have to enter the fs to pick up the ->get_block callback's
address.

But a couple of filesystems seem to be making use of it now.  Reiser4 will
need access at the do_page_cache_readahead() level too.

0d51be59

10 Feb, 2003 2 commits

[PATCH] uninline get_jiffies_64() for 32-bit architectures · 9bba8dd6
Andrew Morton authored 21 years ago
```
uninline get_jiffies_64() for 32-bit architectures
```
9bba8dd6

[PATCH] Fix synchronous writers to wait properly for the result · 8d49bf3f

Andrew Morton authored 21 years ago

Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz> points out a bug in
ll_rw_block() usage.

Typical usage is:

	mark_buffer_dirty(bh);
	ll_rw_block(WRITE, 1, &bh);
	wait_on_buffer(bh);

The problem is that if the buffer was locked on entry to this code sequence
(due to in-progress I/O), ll_rw_block() will neither wait nor start new I/O.  So
this code will wait on the _old_ I/O, and will then continue execution,
leaving the buffer dirty.

It turns out that all callers were only writing one buffer, and they were all
waiting on that writeout.  So I added a new sync_dirty_buffer() function:

	void sync_dirty_buffer(struct buffer_head *bh)
	{
		lock_buffer(bh);
		if (test_clear_buffer_dirty(bh)) {
			get_bh(bh);
			bh->b_end_io = end_buffer_io_sync;
			submit_bh(WRITE, bh);
		} else {
			unlock_buffer(bh);
		}
	}

which allowed a fair amount of code to be removed, while adding the desired
data-integrity guarantees.

UFS has its own wrappers around ll_rw_block() which got in the way, so this
operation was open-coded in that case.
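
The difference between the two patterns can be sketched with a single-threaded toy model. Everything below is hypothetical (the `toy_` names, the flag fields, and the way "waiting" is modelled by simply clearing the lock flag); it is not the kernel's buffer_head code, only an illustration of why skipping a locked buffer leaves it dirty.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct buffer_head; all names here are hypothetical. */
struct toy_bh {
	bool locked;   /* I/O currently in flight */
	bool dirty;    /* contents are newer than what is on disk */
	int  writes;   /* number of writes submitted by this model */
};

/* Models waiting for in-flight I/O: when it completes, the lock drops. */
static void toy_wait_on_buffer(struct toy_bh *bh)
{
	bh->locked = false;
}

/* Old pattern: a buffer that is already locked is skipped without waiting. */
static void toy_ll_rw_block_write(struct toy_bh *bh)
{
	if (bh->locked)
		return;                  /* skipped: no new I/O is started */
	bh->locked = true;
	if (bh->dirty) {
		bh->dirty = false;
		bh->writes++;            /* write submitted, lock held until done */
	} else {
		bh->locked = false;
	}
}

/* New pattern: take the lock first (waiting out old I/O), then write. */
static void toy_sync_dirty_buffer(struct toy_bh *bh)
{
	toy_wait_on_buffer(bh);          /* models lock_buffer() sleeping */
	bh->locked = true;
	if (bh->dirty) {
		bh->dirty = false;
		bh->writes++;
		toy_wait_on_buffer(bh);  /* caller waits for its own write */
	} else {
		bh->locked = false;
	}
}
```

Starting from a buffer that is both locked and dirty, the old sequence ends with the buffer still dirty and no write issued, while the new one ends with a clean buffer and exactly one write.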

8d49bf3f

05 Feb, 2003 1 commit

[PATCH] seqlock for xtime · bb59cfa4

Stephen Hemminger authored 21 years ago

Add "seqlock" infrastructure for doing low-overhead optimistic reader
locks (writer increments a sequence number, reader verifies that no
writers came in during the critical region, and lots of careful memory
barriers to take care of business).

Make xtime/get_jiffies_64() use this new locking.
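
As a rough userspace illustration of the idea (not the kernel's seqlock_t), the sketch below uses C11 atomics with default sequentially consistent ordering instead of hand-placed barriers. The `toy_` names and the two-field payload are assumptions made for the example, and a single writer is assumed; a real seqlock would also serialize writers with a spinlock.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical userspace analogue of a sequence lock. */
struct toy_seqlock {
	atomic_uint seq;    /* odd while a write is in progress */
	atomic_long sec;    /* protected data, loosely modelled on xtime */
	atomic_long nsec;
};

/* Writer: bump the sequence before and after the update. */
static void toy_write(struct toy_seqlock *sl, long sec, long nsec)
{
	atomic_fetch_add(&sl->seq, 1);   /* sequence becomes odd */
	atomic_store(&sl->sec, sec);
	atomic_store(&sl->nsec, nsec);
	atomic_fetch_add(&sl->seq, 1);   /* sequence becomes even again */
}

/* Reader: retry if a writer was active or came in during the read. */
static void toy_read(struct toy_seqlock *sl, long *sec, long *nsec)
{
	unsigned int start;

	do {
		start = atomic_load(&sl->seq);
		*sec = atomic_load(&sl->sec);
		*nsec = atomic_load(&sl->nsec);
	} while ((start & 1u) || atomic_load(&sl->seq) != start);
}
```

Readers never write to shared state, which is what makes the scheme cheap for read-mostly data such as the current time.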

bb59cfa4

03 Feb, 2003 1 commit

kbuild: Rename CONFIG_MODVERSIONING -> CONFIG_MODVERSIONS · d5ea3bb5

Kai Germaschewski authored 21 years ago

CONFIG_MODVERSIONING was a temporary name introduced to distinguish
between the old and new module version implementation. Since the
traces of the old implementation are now gone from the build system,
we rename the config option back in order to not confuse users more
than necessary in 2.6.
 
Also, remove some historic modversions cruft throughout the tree.

d5ea3bb5

02 Feb, 2003 1 commit

[PATCH] Fix inode size accounting race · 7619fd2b

Andrew Morton authored 21 years ago

Since Jan removed the lock_kernel()s in inode_add_bytes() and
inode_sub_bytes(), these functions have been racy.

One problematic workload has been discovered in which concurrent writepage
and truncate on SMP quickly causes i_blocks to go negative. writepage() does
not take i_sem, and it seems that for ext2, there are no other locks in
force when inode_add_bytes() is called.

Putting the BKL back in there is not acceptable. To fix this race I have
added a new spinlock "i_lock" to the inode.

That lock is presently used to protect i_bytes and i_blocks. We could use it
to protect i_size as well.
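
A minimal userspace sketch of the locking described above, assuming i_blocks counts 512-byte units and i_bytes holds the remainder. The `toy_` names and the atomic_flag spinlock are stand-ins chosen for the example, not the kernel's types.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical inode fragment: only the fields the lock protects. */
struct toy_inode {
	atomic_flag   i_lock;     /* stand-in for the new spinlock */
	unsigned long i_blocks;   /* used space in 512-byte units */
	unsigned int  i_bytes;    /* remainder, always 0..511 */
};

static void toy_lock(struct toy_inode *inode)
{
	while (atomic_flag_test_and_set_explicit(&inode->i_lock,
						 memory_order_acquire))
		;  /* spin until the holder releases the lock */
}

static void toy_unlock(struct toy_inode *inode)
{
	atomic_flag_clear_explicit(&inode->i_lock, memory_order_release);
}

/* Both fields change under one lock, so they cannot drift apart. */
static void toy_inode_add_bytes(struct toy_inode *inode, unsigned long long bytes)
{
	toy_lock(inode);
	inode->i_blocks += bytes >> 9;
	inode->i_bytes += bytes & 511;
	if (inode->i_bytes >= 512) {
		inode->i_blocks++;
		inode->i_bytes -= 512;
	}
	toy_unlock(inode);
}

static void toy_inode_sub_bytes(struct toy_inode *inode, unsigned long long bytes)
{
	unsigned int rem = bytes & 511;

	toy_lock(inode);
	inode->i_blocks -= bytes >> 9;
	if (inode->i_bytes < rem) {      /* borrow one block */
		inode->i_blocks--;
		inode->i_bytes += 512;
	}
	inode->i_bytes -= rem;
	toy_unlock(inode);
}
```

Without the lock, a concurrent add and subtract could each read a stale i_bytes and apply the carry or borrow twice, which is one way i_blocks can go negative.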

The splitting of the used disk space into i_blocks and i_bytes is silly - we
should nuke all that and just have a bare loff_t i_usedbytes. Later.

7619fd2b

13 Jan, 2003 1 commit
- [PATCH] IPMI (Intelligent Platform Management Interface) driver · ba3e3dba
  Corey Minyard authored 22 years ago
  
  ba3e3dba
11 Jan, 2003 1 commit

[PATCH] Fix an SMP+preempt latency problem · 2faf4338

Andrew Morton authored 22 years ago

Here is spin_lock():

#define spin_lock(lock) \
do { \
        preempt_disable(); \
        _raw_spin_lock(lock); \
} while(0)


Here is the scenario:

CPU0:
	spin_lock(some_lock);
	do_very_long_thing();	/* This has cond_resched()s in it */

CPU1:
	spin_lock(some_lock);

Now suppose that the scheduler tries to schedule a task on CPU1.  Nothing
happens, because CPU1 is spinning on the lock with preemption disabled.  CPU0
will happily hold the lock for a long time because nobody has set
need_resched() against CPU0.

This problem can cause scheduling latencies of many tens of milliseconds on
SMP on kernels which handle UP quite happily.


This patch fixes the problem by changing the spin_lock() and write_lock()
contended slowpath to spin on the lock by hand, while polling for preemption
requests.
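
The contended slowpath can be pictured roughly as follows. This is a userspace sketch under stated assumptions: `toy_need_resched` stands in for a pending preemption request, `sched_yield()` stands in for actually being preempted, and the lock is a plain C11 atomic_flag rather than the kernel's spinlock.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type for the sketch. */
struct toy_lock {
	atomic_flag held;
};

static atomic_bool toy_need_resched;  /* stand-in for a preemption request */
static atomic_int  toy_yields;        /* how often the spinner gave way */

static bool toy_trylock(struct toy_lock *l)
{
	return !atomic_flag_test_and_set_explicit(&l->held,
						  memory_order_acquire);
}

static void toy_unlock(struct toy_lock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/*
 * Spin "by hand": each failed trylock is followed by a check for a
 * pending reschedule, so the waiter can be scheduled away while it
 * waits instead of spinning with preemption disabled.
 */
static void toy_lock_polling(struct toy_lock *l)
{
	while (!toy_trylock(l)) {
		if (atomic_exchange(&toy_need_resched, false)) {
			atomic_fetch_add(&toy_yields, 1);
			sched_yield();
		}
	}
}
```

The uncontended case costs one trylock; only waiters pay for the extra polling.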

I would have done read_lock() too, but we don't seem to have read_trylock()
primitives.

The patch also shrinks the kernel by 30k due to not having separate
out-of-line spinning code for each spin_lock() callsite.

2faf4338

08 Jan, 2003 2 commits

[PATCH] add v4l1-compat module. · 3019e9c0

Gerd Knorr authored 22 years ago

This adds the v4l1-compat module. This is a module which can translate
most (old) v4l1 ioctls into the new v4l2 API. This makes it easier for
v4l2 drivers to present both old v4l and new v4l2 APIs to video4linux
applications. The saa7134 driver uses this for example.

3019e9c0

[PATCH] AIO support for raw/O_DIRECT · 08e6749e

Andrew Morton authored 22 years ago

Patch from Badari Pulavarty <pbadari@us.ibm.com> and myself

This patch adds the infrastructure for performing asynchronous (AIO) blockdev
direct-IO.

- Adds generic_file_aio_write_nolock() and make other
  generic_file_*_write() to use it.

- Modify generic_file_direct_IO() and ->direct_IO() functions to take
  "kiocb *" instead of "file *".

- Renames generic_direct_IO() to blockdev_direct_IO().

- Move generic_file_direct_IO() to mm/filemap.c (it is not
  blockdev-specific, whereas the rest of fs/direct-io.c is).

- Add AIO read/write support to the raw driver.

08e6749e

30 Dec, 2002 1 commit

[PATCH] kmalloc_percpu -- stripped down version · 29621f41

Andrew Morton authored 22 years ago

Patch from Ravikiran G Thirumalai <kiran@in.ibm.com>

Creates a simple "kmalloc for each CPU" API.  This will be used for net
statistics, disk statistics, etc.  (davem has acked the net patches which use
this code).

kmalloc_percpu() is available to modules, unlike the current static per-cpu
infrastructure.
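
To illustrate what a "kmalloc for each CPU" interface provides, here is a small userspace sketch. The `toy_` functions, the fixed `TOY_NR_CPUS`, and the use of calloc() are all assumptions for the example; the patch's real interface and allocator are not reproduced here.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_NR_CPUS 4   /* hypothetical CPU count for the sketch */

/* One private, zeroed object per CPU. */
struct toy_percpu {
	void *ptrs[TOY_NR_CPUS];
};

static void toy_free_percpu(struct toy_percpu *p)
{
	if (!p)
		return;
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		free(p->ptrs[cpu]);      /* free(NULL) is harmless */
	free(p);
}

static struct toy_percpu *toy_alloc_percpu(size_t size)
{
	struct toy_percpu *p = calloc(1, sizeof(*p));

	if (!p)
		return NULL;
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		p->ptrs[cpu] = calloc(1, size);
		if (!p->ptrs[cpu]) {
			toy_free_percpu(p);
			return NULL;
		}
	}
	return p;
}

/* Each CPU updates only its own copy, so updates need no shared lock. */
static void *toy_per_cpu_ptr(struct toy_percpu *p, int cpu)
{
	return p->ptrs[cpu];
}

/* A reader that wants the total walks every CPU's copy. */
static unsigned long toy_percpu_sum(const struct toy_percpu *p)
{
	unsigned long total = 0;

	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		total += *(const unsigned long *)p->ptrs[cpu];
	return total;
}
```

This trade suits statistics counters: updates are frequent and cheap, while the summing read is rare.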

29621f41

29 Dec, 2002 1 commit
- [PATCH] Simplify ramfs_getattr() and move it to the generic libfs.c · a869e179
  Hirofumi Ogawa authored 22 years ago
```
This moves ramfs_getattr() to fs/libfs.c as simple_getattr()
```
  a869e179
20 Dec, 2002 1 commit

[PATCH] Make some symbol exports conditional on CONFIG_MMU · 31c9fa59

Miles Bader authored 22 years ago

A few symbols are only defined when CONFIG_MMU=y, but are exported
(by kernel/ksyms.c) unconditionally.  This patch makes them conditional.

31c9fa59

15 Dec, 2002 1 commit

[PATCH] Fix filesystems that cannot do mmap writeback · 91ec8aa9

Andrew Morton authored 22 years ago

The writepage-removal patch broke filesystems which do not want to
support writeable mappings.

Fix that up by making those filesystems point their mmap vector at the
new generic_file_readonly_mmap().
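
A sketch of the kind of check such a helper can make, assuming it rejects mappings that are shared and could become writable (the only mappings that would need writeback). The flag values and `toy_` function names are invented for the example and do not match kernel definitions.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical flag bits; the kernel's VM_* values differ. */
#define TOY_VM_SHARED   0x1u
#define TOY_VM_MAYWRITE 0x2u

/* Stand-in for the ordinary mmap path, which always succeeds here. */
static int toy_generic_mmap(unsigned int vm_flags)
{
	(void)vm_flags;
	return 0;
}

/*
 * Refuse shared mappings that may be written; private (copy-on-write)
 * and read-only mappings never dirty the file, so they pass through.
 */
static int toy_readonly_mmap(unsigned int vm_flags)
{
	if ((vm_flags & TOY_VM_SHARED) && (vm_flags & TOY_VM_MAYWRITE))
		return -EINVAL;
	return toy_generic_mmap(vm_flags);
}
```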

91ec8aa9

11 Dec, 2002 1 commit
- share some code between get_sb_bdev and xfs log/rtdev handling · 8c88cd21
  Christoph Hellwig authored 22 years ago
  
  8c88cd21
04 Dec, 2002 1 commit

[PATCH] binfmt_* need ptrace_notify (resend) · 536f1067

Randy Dunlap authored 22 years ago

Originally by Ivan Kokshaysky <ink@jurassic.park.msu.ru>,
who said on lkml:

binfmt_elf and binfmt_aout need this.

536f1067

01 Dec, 2002 1 commit

[SERIAL] uart_get_divisor() and uart_get_baud_rate() takes termios. · 493c6685

Russell King authored 22 years ago

Currently, uart_get_divisor() and uart_get_baud_rate() take a tty
structure.  We really want them to take a termios structure so we
can avoid passing a tty structure all the way down to the low level
drivers.

In order to do this, we need to be able to convert a termios
structure to a numeric baud rate - we provide tty_termios_baud_rate() in
tty_io.c for this purpose.  It performs a subset of the
tty_get_baud_rate() functionality, but without any "alt_speed"
kludge.
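
The termios-to-number conversion can be illustrated in userspace with the POSIX termios API. The function name `toy_termios_baud_rate` and the short lookup table are assumptions for the sketch; it covers only a handful of standard speeds and, like the helper described above, has no "alt_speed" handling.

```c
#include <assert.h>
#include <stddef.h>
#include <termios.h>

/*
 * Map the speed code stored in a termios structure to a numeric baud
 * rate. Unknown codes yield 0. Only a few standard speeds are listed.
 */
static unsigned int toy_termios_baud_rate(const struct termios *t)
{
	static const struct {
		speed_t code;
		unsigned int rate;
	} table[] = {
		{ B0, 0 },         { B1200, 1200 },   { B2400, 2400 },
		{ B4800, 4800 },   { B9600, 9600 },   { B19200, 19200 },
		{ B38400, 38400 },
	};
	speed_t code = cfgetospeed(t);

	for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++)
		if (table[i].code == code)
			return table[i].rate;
	return 0;
}
```

A low-level driver given only the termios structure can then compute its own divisor from the numeric rate, with no tty structure involved.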

We finally export uart_get_baud_rate() and uart_get_divisor() for
low level drivers to use.  We now have all the functions in place
to support ports which want to have access to the real baud rate
rather than a divisor value.

493c6685

25 Nov, 2002 1 commit
- [PATCH] CONFIG_DEBUG_SPINLOCK_SLEEP · 6b5738f2
  Dave Jones authored 22 years ago
```
This makes the sleep-under-spinlock-held check a CONFIG_ option.
```
  6b5738f2
19 Nov, 2002 1 commit

[PATCH] rename get_lease to break_lease · 9de88958

Matthew Wilcox authored 22 years ago

Al pointed out that the current name of get_lease is extremely confusing
and I agree.

This (a) renames it to break_lease and (b) fixes a bug noticed by Dave
Hansen which could cause a NULL pointer dereference under high load.

9de88958

17 Nov, 2002 2 commits

[PATCH] nanosecond stat timefields · 5d62665d

Andi Kleen authored 22 years ago

stat64 has been changed to return jiffies granularity as nsec in previously
unused fields. This allows make to make better decisions on when
to recompile a file. Follows loosely the Solaris API.

CURRENT_TIME has been redefined to return struct timespec. The users
who don't use it in an inode/attr context have been changed to use a new
get_seconds() function. CURRENT_TIME is implemented by an out-of-line
function.

There is a small performance penalty in this patch. The previous
filemap code had an optimization to flush atime only once a second.
This is currently gone, which will increase flushes a bit. I believe
the correct solution if it should be a problem is to have per super
block fields that give an arbitrary atime flush granularity - so that you
can set it to be only flushed once an hour if you prefer that. I will
work on that later in separate patches if the need should arise.

struct inode and the attr struct have been changed to store struct
timespec instead of time_t for [cma]time. Not all file systems support
this granularity, but some like XFS, NFSv3, CIFS, JFS do. The others will
currently truncate the nsec part on flushing to disk. There was some
discussion on this rounding on l-k previously. I went for simple
truncation because there is not much evidence IMHO that the more
complicated roundings have any advantages. In practice applications will
be rather unlikely to notice the rounding anyways - they can only see a
difference when an inode is flushed from memory and reloaded in less than
a second, which is rather unlikely.
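
The truncation case can be made concrete with a tiny userspace sketch. The helper names are hypothetical; the point is only that a filesystem storing whole seconds gives back a timestamp whose nanosecond part is zero.

```c
#include <assert.h>
#include <time.h>

/* What a seconds-only filesystem effectively does when flushing. */
static struct timespec toy_truncate_to_seconds(struct timespec ts)
{
	ts.tv_nsec = 0;   /* sub-second part is dropped, not rounded */
	return ts;
}

static int toy_timespec_equal(const struct timespec *a,
			      const struct timespec *b)
{
	return a->tv_sec == b->tv_sec && a->tv_nsec == b->tv_nsec;
}
```

An application comparing the in-memory timestamp with the one read back after a flush and reload sees the same second but a different nsec value, which is the only visible effect of the truncation.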

5d62665d

[PATCH] additional cleanup for f_op->poll · 1f688548

Manfred Spraul authored 22 years ago

This splits poll_table into one structure used by f_op->poll and one
structure used by the implementation of sys_poll/sys_select: poll_table
contains just the callback function pointer.  struct poll_wrapper
additionally contains err and table, i.e.  the members used by the poll
implementation.

Changes:
- split poll_table into 2 structures
- reorder the declarations in <linux/poll.h> accordingly
- uninline poll_initwait().
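
The shape of the split can be sketched as below. The field names, the `toy_` prefix, and the way the wrapper is recovered from the embedded table are assumptions for illustration; the actual layout in <linux/poll.h> may differ.

```c
#include <assert.h>
#include <stddef.h>

struct toy_poll_table;
typedef void (*toy_queue_proc)(struct toy_poll_table *pt, int fd);

/* What a driver's poll method sees: just the callback. */
struct toy_poll_table {
	toy_queue_proc qproc;
};

/* What the poll/select implementation keeps for itself. */
struct toy_poll_wrapper {
	struct toy_poll_table pt;
	int err;      /* first error hit while queueing */
	int queued;   /* stands in for the wait-queue table */
};

#define toy_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Implementation-side callback: recover the wrapper from the table. */
static void toy_pollwait(struct toy_poll_table *pt, int fd)
{
	struct toy_poll_wrapper *w =
		toy_container_of(pt, struct toy_poll_wrapper, pt);

	if (fd < 0) {
		w->err = -1;
		return;
	}
	w->queued++;
}

/* Driver side: knows nothing about err or the table. */
static void toy_driver_poll(struct toy_poll_table *pt, int fd)
{
	if (pt && pt->qproc)
		pt->qproc(pt, fd);
}
```

Keeping the private members out of the structure drivers see means other users of f_op->poll can supply their own callback and bookkeeping.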

1f688548

16 Nov, 2002 2 commits

[PATCH] include mount.h explicitly where needed · 754c5c66
Christoph Hellwig authored 22 years ago
```
This is a preparation to get rid of the implicit includes in
dcache.h and fs_struct.h.
```
754c5c66

[PATCH] Remove d_path from sched.h · cd574b74

Matthew Wilcox authored 22 years ago

This patch from William Lee Irwin III privatizes __d_path() to dcache.c,
uninlines d_path(), moves its declaration to dcache.h, moves it to
dcache.c, and exports d_path() instead of __d_path().

cd574b74

15 Nov, 2002 1 commit

[PATCH] epoll bits 0.46 ... · 424980a8

Davide Libenzi authored 22 years ago

- A more uniform poll queueing interface with tips from Manfred

- The f_op->poll() is done outside the irqlock to maintain compatibility
	with existing drivers that assume they are called with IRQs enabled

- Moved event mask setting inside ep_modify() with tips from John

- Fixed locking to fit the new "poll() outside the lock" approach

- Buffered userspace event delivery to reduce irq_lock/irq_unlock switching
	rate and to reduce the number of __copy_to_user() calls

- Comments added

424980a8

11 Nov, 2002 1 commit

[PATCH] In-kernel Module Loader · aa65be3f

Rusty Russell authored 22 years ago

This is an implementation of the in-kernel module loader extending
the try_inc_mod_count() primitive and making its use compulsory.
This has the benefit of simplicity, and similarity to the existing
scheme.  To reduce the cost of the constant increments and
decrements, reference counters are lockless and per-cpu.
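
A rough userspace sketch of per-CPU reference counting with a "try" style get, under loud assumptions: the `toy_` names, the fixed CPU count, and the `live` flag are invented for the example, and the window between checking `live` and incrementing is deliberately ignored here (a real implementation has to close it during unload).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define TOY_NR_CPUS 4   /* hypothetical CPU count */

struct toy_module {
	atomic_bool live;                 /* false once unload has begun */
	atomic_long ref[TOY_NR_CPUS];     /* one counter per CPU */
};

/* Take a reference on the calling CPU's counter, unless unloading. */
static bool toy_try_get(struct toy_module *m, int cpu)
{
	if (!atomic_load(&m->live))
		return false;
	atomic_fetch_add_explicit(&m->ref[cpu], 1, memory_order_relaxed);
	return true;
}

/* A put may run on a different CPU than the matching get. */
static void toy_put(struct toy_module *m, int cpu)
{
	atomic_fetch_sub_explicit(&m->ref[cpu], 1, memory_order_relaxed);
}

/* Individual counters can go negative; only the sum is meaningful. */
static long toy_refcount(struct toy_module *m)
{
	long total = 0;

	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		total += atomic_load(&m->ref[cpu]);
	return total;
}
```

The common get/put path touches only one CPU's counter, so no cache line is shared between CPUs in the fast path; the expensive summation happens only when someone asks whether the module is still in use.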

Eliminated (coming in following patches):
 o Modversions
 o Module parameters
 o kallsyms
 o EXPORT_SYMBOL_GPL and MODULE_LICENCE checks
 o DEVICE_TABLE support.

New features:
 o Typesafe symbol_get/symbol_put
 o Single "insert this module" syscall interface allows trivial userspace.
 o Raceless loading and unloading

You will need the trivial replacement module utilities from:
	http://ozlabs.org/~rusty/module-init-tools-0.6.tar.gz

aa65be3f

06 Nov, 2002 1 commit
- export find_trylock_page for XFS · e1488fb5
  Christoph Hellwig authored 22 years ago
  
  e1488fb5
05 Nov, 2002 2 commits

[PATCH] Convert NFS client to use ->readpages() · b9a2dd76

Trond Myklebust authored 22 years ago

  - Add the library function read_cache_pages(), which is used in a
    similar fashion to the single page 'read_cache_page()'. It hides
    the details of the LRU cache etc. from a filesystem that wants to
    populate an address space with a list of pages.

  - Fix NFS so that readahead uses the ->readpages() interface. Means
    that we can immediately schedule an RPC call in order to complete
    the I/O, rather than relying on somebody later triggering it by
    calling lock_page() (and hence sync_page()). The sync_page()
    method is race-prone, since the waiting page may try to call it
    before we've finished initializing the 'struct nfs_page'.

  - Clear out nfs_sync_page(), the nfs_inode->read list, and
    friends. When the I/O completion gets scheduled in ->readpage(),
    ->readpages(), they have no reason to exist.

b9a2dd76

[PATCH] `event' removal: kill it · 3cf803fb

Andrew Morton authored 22 years ago

Final act, from Manfred:

The attached patch removes 'event' entirely from the kernel: it's not
used anymore.

All event users [vfat dentry revalidation; ext2/3 inode generation;
readdir() file position revalidation in several filesystems] were
converted to local counters.

3cf803fb

31 Oct, 2002 1 commit

[PATCH] make kernel_stat use per-cpu infrastructure · fd3e6205

Andrew Morton authored 22 years ago

Patch from Ravikiran G Thirumalai <kiran@in.ibm.com>

1. Break out disk stats from kernel_stat and move disk stat to blkdev.h

2. Group cpu stat in kernel_stat and make them "per_cpu" instead of
   the NR_CPUS array

3. Remove EXPORT_SYMBOL(kstat) from ksyms.c (as I noticed that no module is
   using kstat)

fd3e6205

30 Oct, 2002 1 commit

[PATCH] kNFSd: Convert nfsd to use a list of pages instead of one big buffer · a0e7d495

Neil Brown authored 22 years ago

This means:
  1/ We don't need an order-4 allocation for each nfsd that starts
  2/ We don't need an order-4 allocation in skb_linearize when
     we receive a 32K write request
  3/ It will be easier to incorporate the zero-copy read changes
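
For a sense of scale, the arithmetic behind the list above can be sketched as follows, assuming 4096-byte pages (the page size is an assumption; it varies by architecture). An order-N allocation is 2^N physically contiguous pages, whereas a page list needs only individual pages.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096u   /* assumed page size for the example */

/* Bytes covered by one contiguous order-N allocation. */
static size_t toy_order_bytes(unsigned int order)
{
	return (size_t)TOY_PAGE_SIZE << order;
}

/* Number of separate pages needed to hold len bytes. */
static size_t toy_pages_needed(size_t len)
{
	return (len + TOY_PAGE_SIZE - 1) / TOY_PAGE_SIZE;
}
```

Under that assumption an order-4 buffer is 64 KiB of contiguous memory, while a 32K write payload fits in eight independent pages, which are much easier to obtain on a fragmented system.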

The pages are handed around using an xdr_buf (instead of svc_buf)
much like the NFS client so future crypto code can use the same
data structure for both client and server.

The code assumes that most requests and replies fit in a single page.
The exceptions are assumed to have some largish 'data' bit, and the
rest must fit in a single page.
The 'data' bits are file data, readdir data, and symlinks.
There must be only one 'data' bit per request.
This is all fine for nfs/nlm.

This isn't complete:
  1/ NFSv4 hasn't been converted yet (it won't compile)
  2/ NFSv3 allows symlinks up to 4096, but the code will only support
     up to about 3800 at the moment
  3/ readdir responses are limited to about 3800.

but I thought that patch was big enough, and the rest can come
later.


This patch introduces vfs_readv and vfs_writev as parallels to
vfs_read and vfs_write.  This means there is a fair bit of
duplication in read_write.c that should probably be tidied up...

a0e7d495

29 Oct, 2002 3 commits

[PATCH] Get rid of check_resource() before it becomes a problem · 5b8e28f3

Rusty Russell authored 22 years ago

The new resource interface foolishly replicated the (obsolete,
racy) spirit of the check_region call as check_resource.  You
should use request_resource/release_resource instead.
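
Why a "check" call is racy while a "request" call is not can be shown with a one-region toy model. The names and the single atomic flag are assumptions for the sketch and bear no relation to the kernel's resource tree.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool toy_region_busy;   /* one hypothetical I/O region */

/*
 * check-style call: reports whether the region looks free, but claims
 * nothing. Another caller can take it before this answer is acted on.
 */
static bool toy_check_region(void)
{
	return !atomic_load(&toy_region_busy);
}

/* request-style call: test and claim in a single atomic step. */
static bool toy_request_region(void)
{
	bool expected = false;

	return atomic_compare_exchange_strong(&toy_region_busy,
					      &expected, true);
}

static void toy_release_region(void)
{
	atomic_store(&toy_region_busy, false);
}
```

Two drivers that both call the check function can both see "free"; with the request function exactly one of them succeeds.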

5b8e28f3

[PATCH] add a file_ra_state init function · 6b390b3b

Andrew Morton authored 22 years ago

Provide a function in core kernel to initialise a file_ra_state structure.

Previously this was all taken care of by the fact that new struct
file's are all zeroed out.  But now a file_ra_state may be
independently allocated, and we don't want users of it to have to know
how to initialise it.

6b390b3b

[PATCH] move ramfs a_ops into libfs · 3ee477f0

Andrew Morton authored 22 years ago

From Bill Irwin.

Abstract out ramfs readpage(), prepare_write(), and commit_write()
operations.

Ram-backed filesystems are going to be doing a lot of zero-filled read
and write operations.  So in this patch, ramfs' implementations are
moved to libfs in anticipation of other callers.

3ee477f0

28 Oct, 2002 2 commits
- [PATCH] r/o state moved to gendisks · 4d466c1f
  Alexander Viro authored 22 years ago
  
  4d466c1f
- [PATCH] blk_dev[] is gone · d5f24b98
  Alexander Viro authored 22 years ago
```
	* remove blk_dev[]
	* removed BLK_DEFAULT_QUEUE
	* moved definition of CURRENT into drivers that used it
	* removed definition of QUEUE from headers
```
  d5f24b98
17 Oct, 2002 1 commit

[PATCH] do_generic_file_read / readahead adjustments · 9de05205

David Howells authored 22 years ago

This does the following three things:

 (1) Makes the functions in mm/readahead.c only use struct file* to pass to
     readpage(). address_mapping* and file_ra_state* are used instead to keep
     track of readahead stuff.

 (2) Adds a new function do_generic_mapping_read() that is similar to
     do_generic_file_read(), except that it uses a mapping pointer and a
     readahead state pointer to access a file. The file* is only used to pass
     to readpage().

 (3) Turns do_generic_file_read() into an inline function in linux/fs.h that
     simply wraps do_generic_mapping_read().

This should mean that it is no longer necessary to have a struct file to
access a file in this manner. Just an inode or address space should be
sufficient.

It also means alternate read-ahead structures can be maintained.

The reason I want this is that I'm writing a general cache manager for
filesystems such as AFS, NFSv4, and Lustre. Block devices are made available
to the "cache manager" by means of a filesystem that can be mounted. I'm
storing meta data in an inode in the cache, but to scan this at the moment I
need to gain a "struct file" to use with do_generic_file_read().

This involves either creating a dummy dentry and struct file (which will cause
Al Viro to come looking for me with a shotgun), or using an extra auxiliary
filesystem mounted with do_kern_mount(), neither of which is particularly
appealing.

This patch is the alternative... it provides a function that I can pass an
address_space to. This also allows me to make use of readahead semantics
without having to reinvent them for myself.

9de05205

16 Oct, 2002 1 commit
- [PATCH] make filemap_sync static · e57c2ae2
  Andrew Morton authored 22 years ago
```
From Christoph Hellwig.

Make filemap_sync() static, and not exported to modules
```
  e57c2ae2
13 Oct, 2002 1 commit

[PATCH] remove kiobufs · 2dcb8ff9

Andrew Morton authored 22 years ago

This patch from Christoph Hellwig removes the kiobuf/kiovec
infrastructure.

This affects three subsystems:

video-buf.c:

   This patch includes an earlier diff from Gerd which converts
    video-buf.c to use get_user_pages() directly.

   Gerd has acked this patch.

LVM1:

   Is now even more broken.

drivers/mtd/devices/blkmtd.c:

   blkmtd is broken by this change.  I contacted Simon Evans, who
   said "I had done a rewrite of blkmtd anyway and just need to convert
   it to BIO.  Feel free to break it in the 2.5 tree, it will force me
   to finish my code."

Neither EVMS nor LVM2 use kiobufs.  The only remaining breakage
of which I am aware is a proprietary MPEG2 streaming module.  It
could use get_user_pages().

2dcb8ff9