Commits · 906355ca26698195c4a6f543ab78e461d078d437 · Kirill Smelkov / linux

14 May, 2004 40 commits

[PATCH] H/8300 pic support fix · 906355ca

Andrew Morton authored May 14, 2004

From: Yoshinori Sato <ysato@users.sourceforge.jp>

Sorry.  One file was missing from the previous patch.

906355ca

[PATCH] H8/300: pic support · ebd99675

Andrew Morton authored May 14, 2004

From: Yoshinori Sato <ysato@users.sourceforge.jp>

- add PIC binary support

ebd99675

[PATCH] H8/300: ldscripts fix · e51c5d04

Andrew Morton authored May 14, 2004

From: Yoshinori Sato <ysato@users.sourceforge.jp>

- symbol prefix support (used by h8300 and v850)
- include headers

e51c5d04

[PATCH] H8/300: bitops.h add find_next_bit · dd2deeb9

Andrew Morton authored May 14, 2004

From: Yoshinori Sato <ysato@users.sourceforge.jp>

- add find_next_bit

dd2deeb9

[PATCH] dentry layout tweaks · 35181da9

Andrew Morton authored May 14, 2004

Lookup typically touches three fields of the dentry: d_bucket, d_name.hash and
d_parent.

Change the layout of things so that these will always be in the same
cacheline.
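
The layout idea can be sketched in user space as follows.  This is only an
illustration, not the kernel's struct dentry: the struct name, the d_hash and
d_cold members and the 64-byte cacheline size are all assumptions.  The
compile-time checks fail the build if one of the three lookup fields drifts
out of the first cacheline.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

#define SKETCH_CACHELINE 64 /* assumed cacheline size in bytes */

/* Hypothetical layout: the three lookup fields sit together at the front. */
struct sketch_dentry {
	alignas(SKETCH_CACHELINE) void *d_bucket; /* hash chain of this entry */
	unsigned int d_hash;                      /* stands in for d_name.hash */
	void *d_parent;                           /* parent directory entry */
	char d_cold[128];                         /* fields lookup rarely touches */
};

/* True if bytes [off, off + size) lie within the first cacheline. */
static int sketch_in_first_line(size_t off, size_t size)
{
	return off + size <= SKETCH_CACHELINE;
}

/* Compile-time guards: break the build if a hot field leaves the line. */
static_assert(offsetof(struct sketch_dentry, d_bucket) + sizeof(void *)
	      <= SKETCH_CACHELINE, "d_bucket left the first cacheline");
static_assert(offsetof(struct sketch_dentry, d_hash) + sizeof(unsigned int)
	      <= SKETCH_CACHELINE, "d_hash left the first cacheline");
static_assert(offsetof(struct sketch_dentry, d_parent) + sizeof(void *)
	      <= SKETCH_CACHELINE, "d_parent left the first cacheline");
```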

35181da9

[PATCH] more dentry shrinkage · 66ca0978

Andrew Morton authored May 14, 2004

- d_vfs_flags can be removed - just use d_flags.  All modifications of
  dentry->d_flags are under dentry->d_lock.

On x86 this takes the internal string size up to 40 bytes.  The
internal/external ratio on my 1.5M files hits 96%.

66ca0978

[PATCH] dentry d_bucket fix · b7b5563e

Andrew Morton authored May 14, 2004

The gap between checking d_bucket and sampling d_move_count looks like a bug
to me.

It feels safer to be checking d_bucket after taking the lock, when we know
that it is stable.

And it's a little faster to check d_bucket after having checked the hash
rather than before.

b7b5563e

[PATCH] dentry qstr consolidation · 90b163a4

Andrew Morton authored May 14, 2004

When dentries are given an external name we currently allocate an entire qstr
for the external name.

This isn't needed.  We can use the internal qstr and kmalloc only the string
itself.  This saves 12 bytes from externally-allocated names and 4 bytes from
the dentry itself.

The saving of 4 bytes from the dentry doesn't actually decrease the dentry's
storage requirements, but it makes four more bytes available for internal
names, taking the internal/external ratio from 89% up to 93% on my 1.5M files.
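
A rough user-space sketch of the storage scheme described above, under stated
assumptions: the sketch_* names and the 36-byte inline capacity are
hypothetical, and malloc stands in for kmalloc.  The dentry embeds a single
qstr; only the string bytes are allocated separately when the name does not
fit inline.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_INLINE_LEN 36 /* assumed inline capacity, illustration only */

/* Hypothetical stand-in for a qstr: hash, length, pointer to the bytes. */
struct sketch_qstr {
	unsigned int hash;
	unsigned int len;
	const char *name;
};

/* Hypothetical stand-in for a dentry: one embedded qstr plus inline bytes. */
struct sketch_dentry {
	struct sketch_qstr d_name;
	char d_iname[SKETCH_INLINE_LEN];
};

/* Nonzero if the name bytes live outside the dentry. */
static int sketch_name_is_external(const struct sketch_dentry *d)
{
	return d->d_name.name != d->d_iname;
}

/*
 * Store a name in a dentry that holds no external name yet.  Short names go
 * into d_iname; long names get a malloc'd string only, with no separately
 * allocated qstr.  Returns 0 on success, -1 if allocation fails.
 */
static int sketch_set_name(struct sketch_dentry *d, const char *name)
{
	size_t len;
	char *buf = d->d_iname;

	assert(name != NULL);
	len = strlen(name);
	if (len >= SKETCH_INLINE_LEN) {
		buf = malloc(len + 1);
		if (buf == NULL)
			return -1;
	}
	memcpy(buf, name, len + 1);
	d->d_name.name = buf;
	d->d_name.len = (unsigned int)len;
	d->d_name.hash = 0; /* hashing is out of scope for this sketch */
	return 0;
}

/* Release an external name, if any, and point back at the inline buffer. */
static void sketch_free_name(struct sketch_dentry *d)
{
	if (sketch_name_is_external(d))
		free((void *)d->d_name.name);
	d->d_name.name = d->d_iname;
	d->d_name.len = 0;
}
```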


Fix:

The qstr consolidation wasn't quite right, because it can cause qstr->len to
be unstable during lookup's lockless traversal.

Fix that up by taking d_lock earlier in lookup.  This serialises against
d_move.

Take the lock after comparing the parent and hash to preserve the
mostly-lockless behaviour.

This obsoletes d_movecount, which is removed.

90b163a4

[PATCH] dentry shrinkage · fd2d8760

Andrew Morton authored May 14, 2004

Rework dentries so that the inline name length is between 31 and 48 bytes.

On SMP P4-compiled x86 each dentry consumes 160 bytes (24 per page).

Here's the histogram of name lengths on all 1.5M files on my workstation:

1:  0%
2:  0%
3:  1%
4:  5%
5:  8%
6:  13%
7:  19%
8:  26%
9:  33%
10:  42%
11:  49%
12:  55%
13:  60%
14:  64%
15:  67%
16:  69%
17:  71%
18:  73%
19:  75%
20:  76%
21:  78%
22:  79%
23:  80%
24:  81%
25:  82%
26:  83%
27:  85%
28:  86%
29:  87%
30:  88%
31:  89%
32:  90%
33:  91%
34:  92%
35:  93%
36:  94%
37:  95%
38:  96%
39:  96%
40:  96%
41:  96%
42:  96%
43:  96%
44:  97%
45:  97%
46:  97%
47:  97%
48:  97%
49:  98%
50:  98%
51:  98%
52:  98%
53:  98%
54:  98%
55:  98%
56:  98%
57:  98%
58:  98%
59:  98%
60:  99%
61:  99%
62:  99%
63:  99%
64:  99%

So on x86 we'll fit 89% of filenames into the inline name.


The patch also removes the NAME_ALLOC_LEN() rounding-up of the storage for the
out-of-line names.  That seems unnecessary.

fd2d8760

[PATCH] d_vfs_flags locking fix · 75fb13cd

Andrew Morton authored May 14, 2004

Be consistent about d_vfs_flags locking: take dentry->d_lock when modifying
it.

75fb13cd

[PATCH] d_flags locking fixes · 87ada13e

Andrew Morton authored May 14, 2004

A few filesystems modify dentry.d_flags under non-obvious locking.  To
consolidate that field with d_vfs_flags they need to take ->d_lock.

87ada13e

[PATCH] I2O subsystem fixing and cleanup for 2.6 - i2o-makefile-cleanup.patch · 12102e4e

Andrew Morton authored May 14, 2004

From: Markus Lidel <Markus.Lidel@shadowconnect.com>

* The Kconfig and Makefile in drivers/message/i2o still had a
  CONFIG_I2O_PCI entry, which is not used anymore.  It is replaced by a
  CONFIG_I2O_CONFIG entry, which now builds the i2o_config module.

12102e4e

[PATCH] I2O subsystem fixing and cleanup for 2.6 - i2o-64-bit-fix.patch · 7c11ccc5

Andrew Morton authored May 14, 2004

From: Markus Lidel <Markus.Lidel@shadowconnect.com>

* provides i2o_context_list_*() functions, which map 64-bit pointers to
  32-bit context IDs in a dynamic list.  On 32-bit systems the functions are
  replaced with a static inline.

* i2o_scsi now uses the i2o_context_list_*() functions for transaction
  context, and therefore now works on 64-bit systems too.
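
The pointer-to-ID mapping can be pictured with the following user-space
sketch.  It is not the i2o code: the ctx_* names, the singly linked list and
the use of malloc are assumptions made for illustration.  Each registered
pointer gets a small 32-bit ID that can travel in a 32-bit field and later be
translated back.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* One list node pairing a 32-bit context id with the pointer it stands for. */
struct ctx_entry {
	uint32_t id;
	void *ptr;
	struct ctx_entry *next;
};

struct ctx_list {
	struct ctx_entry *head;
	uint32_t next_id; /* ids start at 1 so that 0 can mean "no context" */
};

static void ctx_list_init(struct ctx_list *l)
{
	l->head = NULL;
	l->next_id = 1;
}

/* Register ptr and return its 32-bit id, or 0 if allocation fails. */
static uint32_t ctx_list_add(struct ctx_list *l, void *ptr)
{
	struct ctx_entry *e;

	assert(ptr != NULL);
	e = malloc(sizeof(*e));
	if (e == NULL)
		return 0;
	e->id = l->next_id++; /* sketch only: id wraparound is not handled */
	e->ptr = ptr;
	e->next = l->head;
	l->head = e;
	return e->id;
}

/* Translate id back to its pointer; unlink the entry if remove is nonzero. */
static void *ctx_list_get(struct ctx_list *l, uint32_t id, int remove)
{
	struct ctx_entry **pp;

	for (pp = &l->head; *pp != NULL; pp = &(*pp)->next) {
		if ((*pp)->id == id) {
			struct ctx_entry *e = *pp;
			void *ptr = e->ptr;

			if (remove) {
				*pp = e->next;
				free(e);
			}
			return ptr;
		}
	}
	return NULL; /* unknown id */
}
```

On a 32-bit system a pointer already fits in the 32-bit field, which is why
the list can collapse to a trivial inline there.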

7c11ccc5

[PATCH] I2O subsystem fixing and cleanup for 2.6 - i2o_block-cleanup.patch · 96005e5a

Andrew Morton authored May 14, 2004

From: Markus Lidel <Markus.Lidel@shadowconnect.com>

* more than 3 "visible" disks (hda, hdb, hdc, hdd) lead to kernel panics.

* removes some unused code with partitions.

* I2O_LOCK was often called with the addresses of the controller, and not
  with the address of the device.  Fixed.

* the cleanup function for gendisk (del_gendisk) doesn't work if the queue
  is shared between different devices.  As a workaround, the queue is
  removed first.

* removed redundant code in module initialization and removal; use
  i2ob_new_device and i2ob_del_device instead.

* removed atomic_t queue_depth

* removed unnecessary and bogus code for queue handling

96005e5a

[PATCH] i2o: 64-bit fixes · a94a0bb4

Andrew Morton authored May 14, 2004

From: Markus Lidel <Markus.Lidel@shadowconnect.com>

Fix 64-bit problems.

a94a0bb4

[PATCH] I2O subsystem fixing and cleanup for 2.6 - i2o-passthru.patch · 8c95df95

Andrew Morton authored May 14, 2004

From: Markus Lidel <Markus.Lidel@shadowconnect.com>

* Add a pass-thru ioctl to i2o_config, which is needed to work with the
  Adaptec management software.

8c95df95

[PATCH] I2O subsystem fixing and cleanup for 2.6 - i2o-config-clean.patch · 9bf41bd3

Andrew Morton authored May 14, 2004

From: Markus Lidel <Markus.Lidel@shadowconnect.com>

* Changes the formatting of the header in i2o_config.c

9bf41bd3

[PATCH] Module ref counting for vt console drivers · 5770ced9

Andrew Morton authored May 14, 2004

From: Herbert Xu <herbert@gondor.apana.org.au>

The following patch adds basic module reference counting to vt console
drivers.  Currently modules like fbcon are not counted at all.

5770ced9

[PATCH] ia64 cpu hotplug: core · f887808c

Andrew Morton authored May 14, 2004

From: Ashok Raj <ashok.raj@intel.com>

Supports basic ability to enable hotplug functions for IA64.
Code is just evolving, and there are several loose ends to tie up.

What this code drop does
- Support logical online and offline
- Handles interrupt migration without loss of interrupts.
- Handles stress fine > 24+ hrs with make -j/ftp/rcp workloads
- Handles irq migration from a dying cpu without loss of interrupts.

What needs to be done
- Boot CPU removal support, with platform level authentication
- Putting cpu being removed in BOOT_RENDEZ mode.

f887808c

[PATCH] Revisited: ia64-cpu-hotplug-cpu_present.patch · fda94eff

Andrew Morton authored May 14, 2004

From: Paul Jackson <pj@sgi.com>

With a hotplug capable kernel, there is a requirement to distinguish a
possible CPU from one actually present.  The set of possible CPU numbers
doesn't change during a single system boot, but the set of present CPUs
changes as CPUs are physically inserted into or removed from a system.  The
cpu_possible_map does not change once initialized at boot, but the
cpu_present_map changes dynamically as CPUs are inserted or removed.


Paul Jackson <pj@sgi.com> provided an expanded explanation:


Ashok's cpu hot plug patch adds a cpu_present_map, resulting in the following
cpu maps being available.  All the following maps are fixed size bitmaps of
size NR_CPUS.

#ifdef CONFIG_HOTPLUG_CPU
	cpu_possible_map - map with all NR_CPUS bits set
	cpu_present_map - map with bit 'cpu' set iff cpu is populated
	cpu_online_map - map with bit 'cpu' set iff cpu available to scheduler
#else
	cpu_possible_map - map with bit 'cpu' set iff cpu is populated
	cpu_present_map - copy of cpu_possible_map
	cpu_online_map - map with bit 'cpu' set iff cpu available to scheduler
#endif

In either case, NR_CPUS is fixed at compile time, as the static size of these
bitmaps.  The cpu_possible_map is fixed at boot time, as the set of CPU IDs
that might ever be plugged in at any time during the life of that system
boot.  The cpu_present_map is dynamic(*), representing which CPUs are
currently plugged in.  And cpu_online_map is the dynamic subset of
cpu_present_map, indicating those CPUs available for scheduling.

If HOTPLUG is enabled, then cpu_possible_map is forced to have all NR_CPUS
bits set, otherwise it is just the set of CPUs that ACPI reports present at
boot.

If HOTPLUG is enabled, then cpu_present_map varies dynamically, depending on
what ACPI reports as currently plugged in, otherwise cpu_present_map is just a
copy of cpu_possible_map.

(*) Well, cpu_present_map is dynamic in the hotplug case.  If not hotplug,
    it's the same as cpu_possible_map, hence fixed at boot.
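
The relationship between the three maps can be sketched with plain bitmasks.
This is a user-space illustration only: the sketch_* names, the 8-CPU limit
and the use of a uint32_t instead of the kernel's bitmap type are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NR_CPUS 8 /* assumed compile-time limit */
static_assert(SKETCH_NR_CPUS < 32, "the maps must fit in a uint32_t");
#define SKETCH_ALL_CPUS ((1u << SKETCH_NR_CPUS) - 1u)

struct sketch_cpu_maps {
	uint32_t possible; /* fixed once set up at boot */
	uint32_t present;  /* CPUs currently plugged in */
	uint32_t online;   /* CPUs available to the scheduler */
};

/* Build the boot-time maps from the set of CPUs reported as populated. */
static struct sketch_cpu_maps sketch_boot_maps(uint32_t populated, bool hotplug)
{
	struct sketch_cpu_maps m;

	populated &= SKETCH_ALL_CPUS;
	if (hotplug) {
		m.possible = SKETCH_ALL_CPUS; /* every id might appear later */
		m.present = populated;
	} else {
		m.possible = populated;
		m.present = m.possible;       /* plain copy, fixed at boot */
	}
	m.online = 0;
	return m;
}

/* Bring a CPU online; only a present CPU may be handed to the scheduler. */
static bool sketch_cpu_up(struct sketch_cpu_maps *m, unsigned int cpu)
{
	if (cpu >= SKETCH_NR_CPUS || !(m->present & (1u << cpu)))
		return false;
	m->online |= 1u << cpu;
	return true;
}

/* Invariant from the text: online is a subset of present, present of possible. */
static bool sketch_maps_consistent(const struct sketch_cpu_maps *m)
{
	return (m->online & ~m->present) == 0 &&
	       (m->present & ~m->possible) == 0;
}
```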

fda94eff

[PATCH] ia64 cpu hotplug: /proc rework · 4af52c23

Andrew Morton authored May 14, 2004

From: Ashok Raj <ashok.raj@intel.com>

Changes proc entries for cpu hotplug to be created via the cpu hotplug
notifier callbacks.  Also fixed a bug in the removal code that did not remove
proc entries as expected.

4af52c23

[PATCH] ia64 cpu hotplug: IRQ affinity work · f53c027a

Andrew Morton authored May 14, 2004

From: Ashok Raj <ashok.raj@intel.com>

Setting irq affinity via /proc was forcing iosapic rte programming.
The correct way to do this is to perform it when an interrupt is pending.

f53c027a

[PATCH] ia64 cpu hotplug: sysfs additions · 68a50f57

Andrew Morton authored May 14, 2004

From: Ashok Raj <ashok.raj@intel.com>

topology_init() creates the sysfs entries.  The online control file is
created separately when cpu_up is invoked in arch-independent code.

68a50f57

[PATCH] ia64 cpu hotplug: init section fixes · c4dff897

Andrew Morton authored May 14, 2004

From: Ashok Raj <ashok.raj@intel.com>

Contains changes from __init to __devinit to support cpu hotplug.  Changes
only arch/ia64 portions of the kernel tree.

c4dff897

[PATCH] ia64 cpu hotplug: core kernel initialisation · 8fe08444

Andrew Morton authored May 14, 2004

From: Ashok Raj <ashok.raj@intel.com>

This patch changes __init to __devinit for init_idle so that when a new cpu
arrives, it can call these functions at a later time.

8fe08444

[PATCH] swap speedups and fix · 2e27bd98

Andrew Morton authored May 14, 2004

From: Andrea Arcangeli <andrea@suse.de>

I don't think we need install_swap_bdev/remove_swap_bdev anymore; we should
use swap_info->bdev, not swap_bdevs.  The swap_info already has a ->bdev
field.  The only point of remove_swap_bdev/install_swap_bdev was to unplug
all devices as efficiently as possible, and we don't need that anymore with
the page parameter.

Plus the semaphore should be a rwsem to allow parallel unplug from multiple
pages.

After that I don't need to take the semaphore anymore during swapon: no
swapcache with swp_type() pointing to such a bdev will be allowed until swapon
is complete (SWP_ACTIVE is set a lot later after setting p->bdev).

In swapoff I only need a dummy serialization with the readers, after
try_to_unuse is complete:

 	err = try_to_unuse(type);
 	current->flags &= ~PF_SWAPOFF;

 	/* wait for any unplug function to finish */
 	down_write(&swap_unplug_sem);
 	up_write(&swap_unplug_sem);


that's all, no other locking and no install_swap_bdev/remove_swap_bdev.

(and the swap_bdevs[] compression code was busted)

2e27bd98

[PATCH] blk_run_page(): we don't trust bh->b_page · 4e36c118

Andrew Morton authored May 14, 2004

We don't trust bh->b_page to point to the right thing across all filesystems,
so revert this bit.

4e36c118

[PATCH] blk_run_page(): fixup for swap_unplug_io_fn() · 3a1e4697

Andrew Morton authored May 14, 2004

3a1e4697

[PATCH] Add blk_run_page() · e059d5da

Andrew Morton authored May 14, 2004

From: Andrea Arcangeli <andrea@suse.de>

From: Jens Axboe

Add blk_run_page() API.  This is so that we can pass the target page all the
way down to (for example) the swap unplug function.  So swap can work out
which blockdevs back this particular page.

e059d5da

[PATCH] rmap-5-swap_unplug-page-revert · 485ba3c3

Andrew Morton authored May 14, 2004

Revert the pre-2.6.6 per-address-space unplugging changes. This removes a
swapper_space exceptionality, syncs things with Andrea and provides for
simplification of the swap unplug function.

485ba3c3

[PATCH] rename rmap_lock to page_map_lock · c78a6f26

Andrew Morton authored May 14, 2004

Sync this up with Andrea's patches.

c78a6f26

[PATCH] filtered wakeups: apply to buffer_head functions · 70d1f017

Andrew Morton authored May 14, 2004

From: William Lee Irwin III <wli@holomorphy.com>

This patch implements wake-one semantics for buffer_head wakeups in a single
step. The buffer_head being waited on is passed to the waiter's wakeup
function by the waker, and the wakeup function compares that to a pointer
stored in its on-stack structure and also checks the readiness of the bit
there. Wake-one semantics are achieved by using WQ_FLAG_EXCLUSIVE in the
codepaths waiting to acquire the bit for mutual exclusion.

70d1f017

[PATCH] filtered wakeups: apply to pagecache functions · 08aaf1cc

Andrew Morton authored May 14, 2004

From: William Lee Irwin III <wli@holomorphy.com>

This patch implements wake-one semantics for page wakeups in a single step.
Discrimination between distinct pages is achieved by passing the page to the
wakeup function, which compares it to a pointer in its own on-stack structure
containing the waitqueue element and the page. Bit discrimination is achieved
by storing the bit number in that same structure and testing the bit in the
wakeup function. Wake-one semantics are achieved by using WQ_FLAG_EXCLUSIVE
in the codepaths waiting to acquire the bit for mutual exclusion.
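
The filtering described above can be sketched in user space as follows.  This
is an illustration under assumptions, not the kernel's waitqueue code: the
sketch_* names, the simple linked queue and the woken flag are hypothetical.
The waker passes the page as a key, each waiter's wake function rejects
wakeups for other pages or for a bit that is still set, and the walk stops
after the first exclusive waiter that accepts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_EXCLUSIVE 0x1u /* stands in for WQ_FLAG_EXCLUSIVE */

struct sketch_page {
	unsigned long flags;
};

/* On-stack waiter: waitqueue linkage plus the page and bit being waited on. */
struct sketch_waiter {
	const struct sketch_page *page;
	unsigned int bit;
	unsigned int wq_flags;
	bool woken;
	struct sketch_waiter *next;
};

/* Accept the wakeup only if it is for our page and our bit is now clear. */
static bool sketch_page_wake_function(struct sketch_waiter *w, const void *key)
{
	assert(w->bit < sizeof(unsigned long) * 8);
	if (w->page != key)
		return false; /* hash collision: some other page's event */
	if (w->page->flags & (1UL << w->bit))
		return false; /* bit still held, keep sleeping */
	w->woken = true;
	return true;
}

/* Walk the queue with the key; stop after one exclusive waiter accepts. */
static int sketch_wake_up(struct sketch_waiter *head, const void *key)
{
	struct sketch_waiter *w;
	int woken = 0;

	for (w = head; w != NULL; w = w->next) {
		if (w->woken || !sketch_page_wake_function(w, key))
			continue;
		woken++;
		if (w->wq_flags & SKETCH_EXCLUSIVE)
			break;
	}
	return woken;
}
```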

08aaf1cc

[PATCH] filtered wakeups: wakeup enhancements · 2afafa3b

Andrew Morton authored May 14, 2004

From: William Lee Irwin III <wli@holomorphy.com>

This patch provides an additional argument to __wake_up_common() so that the
information wakefunc.patch made waiters ready to receive may be passed to them
by wakers. This is provided as a separate patch so that the overhead of the
additional argument to __wake_up_common() can be measured in isolation. No
change in performance was observable here.

2afafa3b

[PATCH] filtered wakeups · 2f242854

Andrew Morton authored May 14, 2004

From: William Lee Irwin III <wli@holomorphy.com>

This patch series is solving the "thundering herd" problem that occurs in the
mainline implementation of hashed waitqueues.  There are two sources of
spurious wakeups in such arrangements:

(a) Hash collisions that place waiters on different objects on the same
    waitqueue, which wakes threads falsely when any of the objects hashed to
    the same queue receives a wakeup.  i.e.  loss of information about which
    object a wakeup event is related to.

(b) Loss of information about which object a given waiter is waiting on.
    This precludes wake-one semantics for mutual exclusion scenarios.  For
    instance, a lock bit may be slept on.  If there are any waiters on the
    object, a lock bit release event must wake at least one of them so as to
    prevent deadlock.  But without information as to which waiter is waiting
    on which object, we must resort to waking all waiters who could possibly
    be waiting on it.  Now, as the lock bit provides mutual exclusion, only
    one of the waiters woken can proceed, and the remainder will go back to
    sleep and wait for another event, creating unnecessary system load.  Once
    wake-one semantics are established, only one of the waiters waiting to
    acquire a lock bit needs to be woken, which measurably reduces system load
    and improves efficiency (i.e.  it's the subject of the benchmarking I've
    been sending to you).

Even beyond the measurable efficiency gains, there are reasons of robustness
and responsiveness to motivate addressing the issue of thundering herds.  In a
real-life scenario I've been personally involved in resolving, the thundering
herd issue caused powerful modern SMP machines with fast IO systems to be
unresponsive to user input for a minute at a time or more.  Analogues of these
patches for the distro kernels involved fully resolved the issue to the
customer's satisfaction and obviated workarounds to limit the pagecache's
size.

The latest spin of these patches basically shoves more pieces of the logic
into the wakeup functions, with some efficiency gains from sharing the hot
codepath with the rest of the kernel, and a slightly larger diff than the
patches with the newly-introduced entrypoint.  Writing these was motivated by
the push to insulate sched.c from more of the details of wakeup semantics by
putting more of the logic into the wakeup functions.  In order to accomplish
this while still solving (b), the wakeup functions grew a new argument through
which the waker passes the object that a wakeup event is related to.

=========

This patch provides an additional argument to wakeup functions so that
information may be passed from the waker to the waiter.  This is provided as a
separate patch so that the overhead of the additional argument can be measured
in isolation.  No change in performance was observable here.

2f242854

[PATCH] do_mounts_rd-malloc-fix · 5a930dd9

Andrew Morton authored May 14, 2004

gcc-3.4.0 sez:

init/do_mounts_rd.c:309: warning: conflicting types for built-in function 'malloc'

5a930dd9

[PATCH] VM accounting fix · e46bdb8d

Andrew Morton authored May 14, 2004

From: Hugh Dickins <hugh@veritas.com>

Stas Sergeev <stsp@aknet.ru> wrote:

   mprotect() fails to merge VMAs because one VMA can end up with
   VM_ACCOUNT flag set, and another without that flag.  That makes several
   apps of mine to malfuncate.


Great find!  Someone has got their test the wrong way round.  Since that
VM_MAYACCT macro is being used in one place only, and just hiding what it's
actually about, fold it into its callsite.

e46bdb8d

[PATCH] revert the process-migration-speedup patch · 64525acc

Andrew Morton authored May 14, 2004

David Mosberger asked that this be backed out:

"I do not believe that flushing the TLB before migration is be the right thing
to do on ia64 machines which support global TLB purges (i.e., all but SGI's
machines)."

It was of huge benefit for the SGI machines, so work is ongoing.

64525acc

[PATCH] MSEC_TO_JIFFIES to msec_to_jiffies · 5975a1db

Andrew Morton authored May 14, 2004

Switch all users of MSEC[S]_TO_JIFFIES and JIFFIES_TO_MSEC[S] over to use
jiffies_to_msecs() and msecs_to_jiffies(). Withdraw MSECS_TO_JIFFIES() and
JIFFIES_TO_MSECS() from the kernel API.
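
For readers unfamiliar with the conversion, here is a minimal sketch of what
such helpers compute.  It is not the kernel implementation: the sketch_*
names and the tick rate of 100 are assumptions, and the sketch only handles
tick rates that divide 1000 evenly.  It rounds milliseconds up to whole
ticks so that a timeout is never shorter than requested.

```c
#include <assert.h>

#define SKETCH_HZ 100u /* assumed tick rate, in ticks per second */

static_assert(1000u % SKETCH_HZ == 0, "this sketch assumes HZ divides 1000");

#define SKETCH_MSEC_PER_TICK (1000u / SKETCH_HZ)

/* Ticks to milliseconds: exact, since each tick is a whole number of ms. */
static unsigned int sketch_jiffies_to_msecs(unsigned long j)
{
	return (unsigned int)(j * SKETCH_MSEC_PER_TICK);
}

/* Milliseconds to ticks, rounded up to the next whole tick. */
static unsigned long sketch_msecs_to_jiffies(unsigned int m)
{
	return ((unsigned long)m + SKETCH_MSEC_PER_TICK - 1) /
	       SKETCH_MSEC_PER_TICK;
}
```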

5975a1db

[PATCH] Covert drivers to use msec_to_jiffies · b3dafee7

Andrew Morton authored May 14, 2004

Remove various private implementations of msecs_to_jiffies() and
jiffies_to_msecs().

There are various uppercase versions which should be consolidated.

b3dafee7