Commits · 1e102760dc2928b11ebe694d7fb34a315d08fdb3 · Kirill Smelkov / linux

12 Mar, 2004 40 commits

Merge intel.com:/home/lenb/src/linux-acpi-test-2.6.4 · 1e102760
Len Brown authored Mar 12, 2004
```
into intel.com:/home/lenb/src/linux-acpi-test-2.6.5
```
1e102760

[ACPI] add boot parameters "acpi_osi=" and "acpi_serialize" · fea5b72f

Len Brown authored Mar 12, 2004

  acpi_osi= will disable the _OSI method -- which by default
  tells the BIOS to behave as if Windows is the OS.
  acpi_serialize is for debugging AE_ALREADY_EXISTS failures

fea5b72f

Merge intel.com:/home/lenb/src/linux-acpi-test-2.6.4 · 13b908cd
Len Brown authored Mar 12, 2004
```
into intel.com:/home/lenb/src/linux-acpi-test-2.6.5
```
13b908cd
Merge intel.com:/home/lenb/bk/linux-2.6.5 · 8e0088b5
Len Brown authored Mar 12, 2004
```
into intel.com:/home/lenb/src/linux-acpi-test-2.6.5
```
8e0088b5

[ACPI] ACPICA 20040311 from Bob Moore · 954c9189

Len Brown authored Mar 12, 2004

Fixed a problem where errors occurring during the parse phase of control
method execution did not abort cleanly.  For example, objects created
and installed in the namespace were not deleted.  This caused all
subsequent invocations of the method to return the AE_ALREADY_EXISTS
exception.

Implemented a mechanism to force a control method to "Serialized"
execution if the method attempts to create namespace objects.
(The root of the AE_ALREADY_EXISTS problem.)

Implemented support for the predefined _OSI "internal" control method.
Initial supported strings are "Linux", "Windows 2000", "Windows 2001",
and "Windows 2001.1", and can be easily upgraded for new strings as
necessary.  This feature allows Linux to execute
the fully tested "Windows" code path through the ASL code.

Global Lock Support:  Now allows multiple acquires and releases with any
internal thread.  Removed concept of "owning thread" for this special
mutex.

Fixed two functions that were inappropriately declaring large objects on
the CPU stack: ps_parse_loop() and ns_evaluate_relative().
Reduces the stack usage during method execution considerably.

Fixed a problem in the ACPI 2.0 FACS descriptor (actbl2.h) where the
S4Bios_f field was incorrectly defined as UINT32 instead of UINT32_BIT.

Fixed a problem where acpi_ev_gpe_detect() would fault
if there were no GPEs defined on the machine.

Implemented two runtime options:  One to force all control method
execution to "Serialized" to mimic Windows behavior, another to disable
_OSI support if it causes problems on a given machine.

954c9189

[ACPI] SMP poweroff (David Shaohua Li) · ae386697
Len Brown authored Mar 12, 2004
```
http://bugzilla.kernel.org/show_bug.cgi?id=1141
```
ae386697
Merge bk://gkernel.bkbits.net/libata-2.5 · a8b828f4
Linus Torvalds authored Mar 11, 2004
```
into ppc970.osdl.org:/home/torvalds/v2.5/linux
```
a8b828f4
Merge redhat.com:/spare/repo/linux-2.5 · 3f9d4e0f
Jeff Garzik authored Mar 12, 2004
```
into redhat.com:/spare/repo/libata-2.5
```
3f9d4e0f
Merge bk://kernel.bkbits.net/jgarzik/netconsole-2.5 · 2d0512a4
Linus Torvalds authored Mar 11, 2004
```
into ppc970.osdl.org:/home/torvalds/v2.5/linux
```
2d0512a4
Merge redhat.com:/spare/repo/netdev-2.6/netpoll · 9277cf69
Jeff Garzik authored Mar 12, 2004
```
into redhat.com:/spare/repo/netconsole-2.5
```
9277cf69
Add Promise SX8 (carmel) block driver. · 4061c061
Jeff Garzik authored Mar 12, 2004

4061c061
Merge bk://gkernel.bkbits.net/prism54-2.5 · 4d92fbee
Linus Torvalds authored Mar 11, 2004
```
into ppc970.osdl.org:/home/torvalds/v2.5/linux
```
4d92fbee
[wireless prism54] remove WIRELESS_EXT ifdefs · 0c1e7e8e
Jeff Garzik authored Mar 12, 2004

0c1e7e8e
[wireless] Add new Prism54 wireless driver. · 8eae4cbf
Jeff Garzik authored Mar 12, 2004

8eae4cbf
Merge bk://linux-scsi.bkbits.net/scsi-for-linus-2.6 · aba7eead
Linus Torvalds authored Mar 11, 2004
```
into ppc970.osdl.org:/home/torvalds/v2.5/linux
```
aba7eead
Merge http://lia64.bkbits.net/to-linus-2.5 · 60059a51
Linus Torvalds authored Mar 11, 2004
```
into ppc970.osdl.org:/home/torvalds/v2.5/linux
```
60059a51
Revert attribute_used changes in module.h. They were wrong. · dede844e
Linus Torvalds authored Mar 11, 2004
```
Cset exclude: akpm@osdl.org|ChangeSet|20040312161945|47751
```
dede844e

[PATCH] slab: avoid higher-order allocations · 29d18b52

Andrew Morton authored Mar 11, 2004

From: Manfred Spraul <manfred@colorfullife.com>

At present slab is using 2-order allocations for the size-2048 cache.  Of
course, this can affect networking quite seriously.

The patch ensures that slab will never use more than a 1-order allocation
for objects which have a size of less than 2*PAGE_SIZE.

29d18b52

[PATCH] vmscan: add lru_to_page() helper · 349055d0

Andrew Morton authored Mar 11, 2004

From: Nick Piggin <piggin@cyberone.com.au>

Add a little helper macro for a common list extraction operation in vmscan.c
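
A minimal userspace sketch of what such a helper can look like. The `list_head` and `page` definitions here are cut-down stand-ins (assumptions, not the kernel's own structures), and `list_init()`/`list_add()` are hypothetical helpers included only so the example is self-contained. The macro takes the entry at the tail of an LRU list and converts it back to its enclosing `struct page`:

```c
#include <stddef.h>

/* Cut-down stand-ins for the kernel types (assumptions for illustration). */
struct list_head {
	struct list_head *next, *prev;
};

struct page {
	unsigned long flags;
	struct list_head lru;
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Take the page at the tail of an LRU list (the entry before the head). */
#define lru_to_page(head) (list_entry((head)->prev, struct page, lru))

/* Hypothetical helper: make an empty circular list. */
static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Hypothetical helper: insert an entry right after the list head. */
static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}
```

With pages added at the head, the oldest page ends up at the tail, which is the one `lru_to_page()` hands back.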

349055d0

[PATCH] vm: balance inactive zone refill rates · fb5b4abe

Andrew Morton authored Mar 11, 2004

The current refill logic in refill_inactive_zone() takes an arbitrarily large
number of pages and chops it down to SWAP_CLUSTER_MAX*4, regardless of the
size of the zone.

This has the effect of reducing the amount of refilling of large zones
proportionately much more than of small zones.

We made this change in May 2003 and I'm damned if I remember why. Let's put
it back so we don't truncate the refill count and see what happens.

fb5b4abe

[PATCH] fix vm-batch-inactive-scanning.patch · 07a25779

Andrew Morton authored Mar 11, 2004

- prevent nr_scan_inactive from going negative

- compare `count' with SWAP_CLUSTER_MAX, not `max_scan'

- Use ">= SWAP_CLUSTER_MAX", not "> SWAP_CLUSTER_MAX".

07a25779

[PATCH] vmscan: batch up inactive list scanning work · ceb37d32

Andrew Morton authored Mar 11, 2004

From: Nick Piggin <piggin@cyberone.com.au>

Use a "refill_counter" for inactive list scanning, similar to the one used
for active list scanning.  This batches up scanning now that we precisely
balance ratios, and don't round up the amount to be done.

No observed benefits, but I imagine it would lower the acquisition
frequency of the lru locks in some cases, and make codepaths more efficient
in general due to cache niceness.
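
The batching idea can be sketched roughly as follows. This is an illustration under assumptions, not the patch itself: `struct zone_batch`, `batch_inactive_scan()` and the value of `SWAP_CLUSTER_MAX` are stand-ins chosen for the example. Small scan requests accumulate in a per-zone counter and are only acted on once a full batch has built up:

```c
/* Assumed batch size for this sketch. */
#define SWAP_CLUSTER_MAX 32

/* Hypothetical per-zone state: scan work deferred from earlier calls. */
struct zone_batch {
	unsigned long nr_scan_inactive;
};

/*
 * Add `requested' pages of scan work to the zone's counter.  Returns the
 * number of pages to scan now, or 0 while the batch is still filling up.
 */
static unsigned long batch_inactive_scan(struct zone_batch *zone,
					 unsigned long requested)
{
	unsigned long count;

	zone->nr_scan_inactive += requested;
	if (zone->nr_scan_inactive < SWAP_CLUSTER_MAX)
		return 0;

	/* Hand the whole accumulated amount to the caller and start over. */
	count = zone->nr_scan_inactive;
	zone->nr_scan_inactive = 0;
	return count;
}
```

In this sketch the LRU lock would only need to be taken on calls that return a non-zero count.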

ceb37d32

[PATCH] vmscan: less throttling of page allocators and kswapd · 085b4897

Andrew Morton authored Mar 11, 2004

This is just a random unsubstantiated tuning tweak: don't immediately
throttle page allocators and kswapd when the going is getting heavier: scan a
bit more of the LRU before throttling.

085b4897

[PATCH] fix the kswapd zone scanning algorithm · ffa0fb78

Andrew Morton authored Mar 11, 2004

This removes a vestige of the old algorithm. We don't want to skip zones if
all_zones_ok is true: we've already precalculated which zones need scanning
and this just stops us from ever performing kswapd reclaim from the DMA zone.

ffa0fb78

[PATCH] kswapd: fix lumpy page reclaim · 519ab68b

Andrew Morton authored Mar 11, 2004

As kswapd is now scanning zones in the highmem->normal->dma direction it can
get into competition with the page allocator: kswapd keeps on trying to free
pages from highmem, then kswapd moves onto lowmem. By the time kswapd has
done proportional scanning in lowmem, someone has come in and allocated a few
pages from highmem. So kswapd goes back and frees some highmem, then some
lowmem again. But nobody has allocated any lowmem yet. So we keep on and on
scanning lowmem in response to highmem page allocations.

With a simple `dd' on a 1G box we get:

 r  b  swpd    free  buff   cache  si  so  bi     bo    in   cs  us  sy  wa  id
 0  3     0   59340  4628  922348   0   0   4  28188  1072  808   0  10  46  44
 0  3     0   29932  4660  951760   0   0   0  30752  1078  441   1   6  30  64
 0  3     0   57568  4556  924052   0   0   0  30748  1075  478   0   8  43  49
 0  3     0   29664  4584  952176   0   0   0  30752  1075  472   0   6  34  60
 0  3     0    5304  4620  976280   0   0   4  40484  1073  456   1   7  52  41
 0  3     0  104856  4508  877112   0   0   0  18452  1074   97   0   7  67  26
 0  3     0   70768  4540  911488   0   0   0  35876  1078  746   0   7  34  59
 1  2     0   42544  4568  939680   0   0   0  21524  1073  556   0   5  43  51
 0  3     0    5520  4608  976428   0   0   4  37924  1076  836   0   7  41  51
 0  2     0    4848  4632  976812   0   0  32  12308  1092   94   0   1  33  66

Simple fix: go back to scanning the zones in the dma->normal->highmem
direction so we meet the page allocator in the middle somewhere.

 r  b  swpd    free  buff   cache  si  so  bi     bo    in   cs  us  sy  wa  id
 1  3     0    5152  3468  976548   0   0   4  37924  1071  650   0   8  64  28
 1  2     0    4888  3496  976588   0   0   0  23576  1075  726   0   6  66  27
 0  3     0    5336  3532  976348   0   0   0  31264  1072  708   0   8  60  32
 0  3     0    6168  3560  975504   0   0   0  40992  1072  683   0   6  63  31
 0  3     0    4560  3580  976844   0   0   0  18448  1073  233   0   4  59  37
 0  3     0    5840  3624  975712   0   0   4  26660  1072  800   1   8  46  45
 0  3     0    4816  3648  976640   0   0   0  40992  1073  526   0   6  47  47
 0  3     0    5456  3672  976072   0   0   0  19984  1070  320   0   5  60  35

519ab68b

[PATCH] kswapd: avoid unnecessary reclaiming from higher zones · 9ef935c2

Andrew Morton authored Mar 11, 2004

Currently kswapd walks across all zones in dma->normal->highmem order,
performing proportional scanning until all zones are OK. This means that
pressure against ZONE_NORMAL causes unnecessary reclaim of ZONE_HIGHMEM.

To fix that up we change kswapd so that it walks the zones in the
high->normal->dma direction, skipping zones which are OK. Once it encounters
a zone which needs some reclaim kswapd will perform proportional scanning
against that zone as well as all the succeeding lower zones.

We scan the lower zones even if they have sufficient free pages. This is
because

a) the lower zone may be above pages_high, but because of the incremental
min, the lower zone may still not be eligible for allocations. That's bad
because cache in that lower zone will then not be scanned at the correct
rate.

b) pages in this lower zone are usable for allocations against the higher
zone. So we do want to scan all the relevant zones at an equal rate.
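
The zone walk described above can be sketched like this. It is a simplified illustration, not the kernel code: `struct zone_walk` and `select_zones_to_scan()` are hypothetical names, zones are assumed to be indexed from DMA (0) up to highmem (last), and "needs reclaim" is reduced to a single comparison against `pages_high`:

```c
#include <stdbool.h>

/* Hypothetical per-zone state; index 0 is DMA, the last index is highmem. */
struct zone_walk {
	unsigned long free_pages;
	unsigned long pages_high;
	bool scan;	/* set when this pass should scan the zone */
};

/*
 * Walk from the highest zone down, skipping zones that are OK.  The first
 * zone found at or below its pages_high mark, and every zone beneath it,
 * is flagged for scanning.  Returns the number of zones flagged.
 */
static int select_zones_to_scan(struct zone_walk *zones, int nr_zones)
{
	int end_zone = -1;
	int flagged = 0;

	for (int i = nr_zones - 1; i >= 0; i--) {
		if (zones[i].free_pages <= zones[i].pages_high) {
			end_zone = i;
			break;
		}
	}

	/* Lower zones are scanned even if they have plenty of free pages. */
	for (int i = 0; i < nr_zones; i++) {
		zones[i].scan = (i <= end_zone);
		if (zones[i].scan)
			flagged++;
	}
	return flagged;
}
```

With pressure only on the middle zone, this flags the DMA and normal zones and leaves highmem alone, which is the behaviour the message is after.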

9ef935c2

[PATCH] vmscan: avoid bogus throttling · bcf2fb27

Andrew Morton authored Mar 11, 2004

- If max_scan evaluates to zero due to a very small inactive list and high
  `priority' numbers, we don't want to throttle yet.

- In balance_pgdat(), we may end up not scanning any pages because all
  zones happened to be above pages_high.  Avoid throttling in this case too.

bcf2fb27

[PATCH] Balance inter-zone scan rates · e5f02647

Andrew Morton authored Mar 11, 2004

When page reclaim is working out how many pages to scan in a zone (max_scan)
it presently rounds that number up if it looks too small - for work batching.

Problem is, this can result in excessive scanning against small zones which
have few inactive pages. So remove it.

Note that it is possible for max_scan to be zero. That's OK - it'll become
non-zero as the priority increases.
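
As a rough illustration of the arithmetic, assume max_scan is the inactive page count shifted right by the priority number, with a starting priority of 12 and a batch size of 32 (all three are assumptions made for this sketch, as are the function names). The old behaviour rounds small results up, the new one lets them fall to zero:

```c
/* Assumed values for this sketch. */
#define DEF_PRIORITY		12
#define SWAP_CLUSTER_MAX	32

/* New behaviour: scan in strict proportion to the zone's inactive list. */
static unsigned long max_scan_proportional(unsigned long nr_inactive,
					   int priority)
{
	return nr_inactive >> priority;
}

/* Old behaviour: round small results up to a full batch. */
static unsigned long max_scan_rounded(unsigned long nr_inactive, int priority)
{
	unsigned long max_scan = nr_inactive >> priority;

	if (max_scan < SWAP_CLUSTER_MAX)
		max_scan = SWAP_CLUSTER_MAX;
	return max_scan;
}
```

For a zone with 1000 inactive pages at the starting priority, the rounded version asks for 32 pages while the proportional one asks for none, and only starts asking once the priority number has dropped far enough.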

e5f02647

[PATCH] vmscan: drive everything via nr_to_scan · 5954a8b0

Andrew Morton authored Mar 11, 2004

Page reclaim is currently a bit schizo: sometimes we say "go and scan this
many pages and tell me how many pages were freed" and at other times we say
"go and scan this many pages, but stop if you freed this many".

It makes the logic harder to control and to understand. This patch converts
everything into the "go and scan this many pages and tell me how many pages
were freed" model.

It doesn't seem to affect performance much either way.

5954a8b0

[PATCH] vmscan: zone balancing fix · b532f4af

Andrew Morton authored Mar 11, 2004

We currently have a problem with the balancing of reclaim between zones: much
more reclaim happens against highmem than against lowmem.

This patch partially fixes this by changing the direct reclaim path so it
does not bail out of the zone walk after having reclaimed sufficient pages
from highmem: go on to reclaim from lowmem regardless of how many pages we
reclaimed from highmem.

b532f4af

[PATCH] vm: scan slab in response to highmem scanning · 768c4fcc

Andrew Morton authored Mar 11, 2004

The patch which went in six months or so back which said "only reclaim slab
if we're scanning lowmem pagecache" was wrong.  I must have been asleep at
the time.

We do need to scan slab in response to highmem page reclaim as well.  Because
all the math is based around the total amount of memory in the machine, and
we know that if we're performing highmem page reclaim then the lower zones
have no free memory.

768c4fcc

[PATCH] vmscan: fix calculation of number of pages scanned · a5cc10d5

Andrew Morton authored Mar 11, 2004

From: Nick Piggin <piggin@cyberone.com.au>

The logic which calculates the number of pages which were scanned is mucked
up.  Fix.

a5cc10d5

[PATCH] vm: shrink slab evenly in try_to_free_pages() · b488ea81

Andrew Morton authored Mar 11, 2004

From: Nick Piggin <piggin@cyberone.com.au>

In try_to_free_pages(), put even pressure on the slab even if we have
reclaimed enough pages from the LRU.

b488ea81

[PATCH] shrink_slab: math precision fix · dee96113

Andrew Morton authored Mar 11, 2004

From: Nick Piggin <piggin@cyberone.com.au>

In shrink_slab(), do the multiply before the divide to avoid losing
precision.
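
A small standalone illustration of why the order matters with integer arithmetic. The function and parameter names are hypothetical and the formula is a generic "scale by a ratio", not the actual shrink_slab() expression:

```c
#include <stdint.h>

/* Divide first: the ratio is truncated before it is scaled up. */
static uint64_t scale_divide_first(uint64_t scanned, uint64_t total,
				   uint64_t objects)
{
	return (scanned / total) * objects;
}

/* Multiply first: only the final result is truncated. */
static uint64_t scale_multiply_first(uint64_t scanned, uint64_t total,
				     uint64_t objects)
{
	return (scanned * objects) / total;
}
```

With scanned = 32, total = 1000 and objects = 5000, dividing first gives 0 while multiplying first gives 160. The trade-off is that the intermediate product can overflow, so the multiply needs a wide enough type.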

dee96113

[PATCH] vmscan: preserve page referenced info in refill_inactive() · 29d8c59c

Andrew Morton authored Mar 11, 2004

From: Nick Piggin <piggin@cyberone.com.au>

If refill_inactive_zone() is running in its dont-reclaim-mapped-memory mode
we are tossing away the referenced information on active mapped pages.

So put that info back if we're not going to deactivate the page.

29d8c59c

[PATCH] kswapd throttling fixes · b6c1702e

Andrew Morton authored Mar 11, 2004

The logic in balance_pgdat() is all bollixed up.

- the incoming arg `nr_pages' should be used to determine if we're being
  asked to free a specific number of pages, not `to_free'.

- local variable `to_free' is not appropriate for the determination of
  whether we failed to bring all zones to appropriate free pages levels.

  Fix this by correctly calculating `all_zones_ok' and then use
  all_zones_ok to determine whether we need to throttle kswapd.

So the logic now is:


	for (increasing priority) {

		all_zones_ok = 1;

		for (all zones) {
			to_reclaim = number of pages to try to reclaim
				     from this zone;
			max_scan = number of pages to scan in this pass
				   (gets larger as `priority' decreases)
			/*
			 * set `reclaimed' to the number of pages which were
			 * actually freed up
			 */
			reclaimed = scan(max_scan pages);
			reclaimed += shrink_slab();

			to_free -= reclaimed;	/* for the `nr_pages>0' case */

			/*
			 * If this scan failed to reclaim `to_reclaim' or more
			 * pages, we're getting into trouble.  Need to scan
			 * some more, and throttle kswapd.   Note that this
			 * zone may now have sufficient free pages due to
			 * freeing activity by some other process.   That's
			 * OK - we'll pick that info up on the next pass
			 * through the loop.
			 */
			if (reclaimed < to_reclaim)
				all_zones_ok = 0;
		}
		if (to_free > 0)
			continue;	/* swsusp: need to do more work */
		if (all_zones_ok)
			break;		/* kswapd is done */
		/*
		 * OK, kswapd is getting into trouble.  Take a nap, then take
		 * another pass across the zones.
		 */
		blk_congestion_wait();
	}
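
The pseudocode above can be turned into a small userspace simulation. Everything here is a stand-in chosen for illustration: `struct zone_sim`, `scan_sim()` (which pretends that half of the scanned pages get freed), the `inactive >> priority` formula for max_scan, and `DEF_PRIORITY`. Slab shrinking is left out, and the nap is just counted rather than slept:

```c
#include <stdbool.h>

#define DEF_PRIORITY 12	/* assumed starting priority */

/* Hypothetical zone model for the simulation. */
struct zone_sim {
	long inactive;		/* pages on the inactive list */
	long to_reclaim;	/* pages we would like back from this zone */
};

/* Pretend scan: half of the scanned pages are freed. */
static long scan_sim(struct zone_sim *zone, long max_scan)
{
	long freed = max_scan / 2;

	if (freed > zone->inactive)
		freed = zone->inactive;
	zone->inactive -= freed;
	return freed;
}

/* Returns how many times kswapd would have napped. */
static int balance_sim(struct zone_sim *zones, int nr_zones, long nr_pages)
{
	long to_free = nr_pages;
	int naps = 0;

	for (int priority = DEF_PRIORITY; priority >= 0; priority--) {
		bool all_zones_ok = true;

		for (int i = 0; i < nr_zones; i++) {
			long max_scan = zones[i].inactive >> priority;
			long reclaimed = scan_sim(&zones[i], max_scan);

			to_free -= reclaimed;	/* for the nr_pages > 0 case */
			if (reclaimed < zones[i].to_reclaim)
				all_zones_ok = false;
		}
		if (to_free > 0)
			continue;	/* asked for a specific amount: keep going */
		if (all_zones_ok)
			break;		/* done */
		naps++;			/* stands in for blk_congestion_wait() */
	}
	return naps;
}
```

A zone that wants nothing reclaimed finishes on the first pass with no naps; a zone that cannot meet its target at low scan rates naps between passes until the scan window grows large enough.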

b6c1702e

[PATCH] mm/vmscan.c: remove unused priority argument. · 13095f7a

Andrew Morton authored Mar 11, 2004

From: Nikita Danilov <Nikita@Namesys.COM>

Now that the decision to reclaim mapped memory is taken on the basis of
zone->prev_priority, the priority argument is no longer needed.

13095f7a

[PATCH] Narrow blk_congestion_wait races · c05d7ab9

Andrew Morton authored Mar 11, 2004

From: Nick Piggin <piggin@cyberone.com.au>

The addition of the smp_mb and the other change is to try to close the
window for races a bit.  Obviously they can still happen, it's a racy
interface and it doesn't matter much.

c05d7ab9

[PATCH] return remaining jiffies from blk_congestion_wait() · f3179458

Andrew Morton authored Mar 11, 2004

Teach blk_congestion_wait() to return the number of jiffies remaining.  This
is for debug, but it is also nicely consistent.

f3179458

[PATCH] vm: per-zone vmscan instrumentation · 760d95b5

Andrew Morton authored Mar 11, 2004

To check on zone balancing, split the /proc/vmstat:pgsteal, pgreclaim, pgalloc
and pgscan stats into per-zone counters.

Additionally, split the pgscan stats into pgscan_direct and pgscan_kswapd to
see who's doing how much scanning.

And add a metric for the number of slab objects which were scanned.

760d95b5