Commits · 5c0a32fc2cd0be912511199449a37a4a6f0f582d · Kirill Smelkov / linux

27 Mar, 2006 33 commits

[PATCH] autofs4: add new packet type for v5 communications · 5c0a32fc

Ian Kent authored Mar 27, 2006

This patch define a new autofs packet for autofs v5 and updates the waitq.c
functions to handle the additional packet type.
Signed-off-by: Ian Kent <raven@themaw.net>
Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

5c0a32fc

[PATCH] autofs4: add v5 expire logic · 3a15e2ab

Ian Kent authored Mar 27, 2006

This patch adds expire logic for autofs direct mounts.
Signed-off-by: Ian Kent <raven@themaw.net>
Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

3a15e2ab

[PATCH] autofs4: add v5 follow_link mount trigger method · 34ca959c

Ian Kent authored Mar 27, 2006

This patch adds a follow_link inode method for the root of an autofs direct
mount trigger.  It also adds the corresponding mount options and updates the
show_mount method.
Signed-off-by: Ian Kent <raven@themaw.net>
Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

34ca959c

[PATCH] autofs4: nameidata needs to be up to date for follow_link · 051d3812

Ian Kent authored Mar 27, 2006

In order to be able to trigger a mount using the follow_link inode method the
nameidata struct that is passed in needs to have the vfsmount of the autofs
trigger not its parent.

During a path walk if an autofs trigger is mounted on a dentry, when the
follow_link method is called, the nameidata struct contains the vfsmount and
mountpoint dentry of the parent mount while the dentry that is passed in is
the root of the autofs trigger mount.  I believe it is impossible to get the
vfsmount of the trigger mount, within the follow_link method, when only the
parent vfsmount and the root dentry of the trigger mount are known.

This patch updates the nameidata struct on entry to __do_follow_link if it
detects that it is out of date.  It moves the path_to_nameidata to above
__do_follow_link to facilitate calling it from there.  The dput_path is moved
as well as that seemed sensible.  No changes are made to these two functions.
Signed-off-by: Ian Kent <raven@themaw.net>
Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

051d3812

[PATCH] autofs4: increase module version · f75ba3ad

Ian Kent authored Mar 27, 2006

Update autofs4 version.
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

f75ba3ad

[PATCH] autofs4: change may_umount* functions to boolean · e3474a8e

Ian Kent authored Mar 27, 2006

Change the functions may_umount and may_umount_tree to boolean functions to
aid code readability.
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

e3474a8e

[PATCH] autofs4: rename simple_empty_nolock function · 90a59c7c

Ian Kent authored Mar 27, 2006

Rename the function simple_empty_nolock to __simple_empty in line with kernel
naming conventions.
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

90a59c7c

[PATCH] autofs4: white space cleanup for waitq.c · e77fbddf

Ian Kent authored Mar 27, 2006

Whitespace and formating changes to waitq code.
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

e77fbddf

[PATCH] autofs4: add a show mount options for proc filesystem · d7c4a5f1

Ian Kent authored Mar 27, 2006

Add show_options method to display autofs4 mount options in the proc
filesystem.
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

d7c4a5f1

[PATCH] autofs4: remove update_atime unused function · 862b110f

Ian Kent authored Mar 27, 2006

Remove the update of i_atime from autofs4 in favour of having VFS update it.
i_atime is never used for expire in autofs4.
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

862b110f

[PATCH] autofs4: expire mounts that hold no (extra) references only · e0a7aae9

Ian Kent authored Mar 27, 2006

Alter the expire semantics that define how "busyness" is determined.
Currently a last_used counter is updated on every revalidate from processes
other than the mount owner process group.

This patch changes that so that an expire candidate is busy only if it has a
reference count greater than the expected minimum, such as when there is an
open file or working directory in use.

This method is the only way that busyness can be established for direct mounts
within the new implementation.  For consistency the expire semantic is made
the same for all mounts.

A side effect of the patch is that mounts which remain mounted unessessarily
in the presence of some GUI programs that scan the filesystem should now
expire.
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

e0a7aae9

[PATCH] autofs4: fix false negative return from expire · 1aff3c8b

Ian Kent authored Mar 27, 2006

Fix the case where an expire returns busy on a tree mount when it is in fact
not busy. This case was overlooked when the patch to prevent the expiring
away of "scaffolding" directories for tree mounts was applied.

The problem arises when a tree of mounts is a member of a map with other keys.
The current logic will not expire the tree if any other mount in the map is
busy. The solution is to maintain a "minimum" use count for each autofs
dentry and compare this to the actual dentry usage count during expire.
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

1aff3c8b

[PATCH] autofs4: simplify expire tree traversal · 1ce12bad

Ian Kent authored Mar 27, 2006

Simplify the expire tree traversal code by using a function from namespace.c
to calculate the next entry in the top down tree traversals carried out during
the expire operation.
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

1ce12bad

[PATCH] autofs4: expire code readability cleanup · 1f5f2c30

Ian Kent authored Mar 27, 2006

Change the names of the boolean functions autofs4_check_mount and
autofs4_check_tree to autofs4_mount_busy and autofs4_tree_busy respectively
and alters their return codes to suit in order to aid code readabilty.

A couple of white space cleanups are included as well.
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

1f5f2c30

[PATCH] autofs4: can't mount due to mount point dir not empty · 2d753e62

Ian Kent authored Mar 27, 2006

Addresse a problem where stale dentrys stop mounts from happening.

When a mount point directory is pre-created and a non-existent entry within it
is requested a dentry ends up being created within the mount point directory
which stops future mounts.  The problem is solved by ignoring negative,
unhashed dentrys in the mount point d_subdirs list.

Additionally the apparent cacheing of -ENOENT returns from requests is
removed.  The test on d_time is a tautology and d_time is not initialised and
has an unexpected value.  In short it doesn't do what it's meant to.

The cacheing of failed requests to the daemon is important and will be
followed up later.
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

2d753e62

[PATCH] autofs4: use libfs routines for readdir · f360ce3b

Ian Kent authored Mar 27, 2006

Change readdir routines to use the cursor based routines in libfs.c.  This
removes reliance on old readdir code from 2.4 and should improve efficiency of
readdir in autofs4.
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

f360ce3b

[PATCH] autofs4: lookup white space cleanup · 718c604a

Ian Kent authored Mar 27, 2006

Whitespace and formating changes to lookup code.
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

718c604a

[PATCH] uml: fix hostfs stack corruption · 86c79cbc

Jeff Dike authored Mar 27, 2006

Noted by Oleg Drokin:
We initialized an extra slot of struct kstatfs.spare, sometimes
causing stack corruption.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Oleg Drokin <green@clusterfs.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

86c79cbc

[PATCH] uml: fix thread startup race · 5f4e8fd0

Jeff Dike authored Mar 27, 2006

This fixes a race in the starting of write_sigio_thread.  Previously, some of
the data needed by the thread was initialized after the clone.  If the thread
ran immediately, it would see the uninitialized data, including an empty
pollfds, which would cause it to hang.

We move the data initialization to before the clone, and adjust the error
paths and cleanup accordingly.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

5f4e8fd0

[PATCH] uml: prevent umid theft · 1fbbd684

Jeff Dike authored Mar 27, 2006

Behavior when booting two UMLs with the same umid was broken.  The second one
would steal the umid.  This fixes that, making the second UML take a random
umid instead.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

1fbbd684

[PATCH] uml: fix segfault on signal delivery · 98c18238

Jeff Dike authored Mar 27, 2006

This fixes a process segfault where a signal was being delivered such that a
new stack page needed to be allocated to hold the signal frame.  This was
tripping some logic in the page fault handler which wouldn't allocate the page
if the faulting address was more that 32 bytes lower than the current stack
pointer.  Since a signal frame is greater than 32 bytes, this exercised that
case.

It's fixed by updating the SP in the pt_regs before starting to copy the
signal frame.  Since those are the registers that will be copied on to the
stack, we have to be careful to put the original SP, not the new one which
points to the signal frame, on the stack.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

98c18238

[PATCH] uml: allow ubd devices to be shared in a cluster · 6c29256c

Jeff Dike authored Mar 27, 2006

This adds a 'c' option to the ubd switch which turns off host file locking so
that the device can be shared, as with a cluster.  There's also some
whitespace cleanup while I was in this file.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

6c29256c

[PATCH] uml: oS header cleanups · cf9165a5

Jeff Dike authored Mar 27, 2006

This rearranges the OS declarations by moving some declarations into os.h.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

cf9165a5

[PATCH] uml: move tty logging to os-Linux · c554f899

Jeff Dike authored Mar 27, 2006

The serial UML OS-abstraction layer patch (um/kernel dir).

This moves all systemcalls from tty_log.c file under os-Linux dir
Signed-off-by: Gennady Sharapov <Gennady.V.Sharapov@intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

c554f899

[PATCH] uml: more carefully test whether we are in a system call · 81efcd33

Bodo Stroesser authored Mar 27, 2006

For security reasons, UML in is_syscall() needs to have access to code in
vsyscall-page.  The current implementation grants this access by explicitly
allowing access to vsyscall in access_ok_skas().  With this change,
copy_from_user() may be used to read the code.  Ptrace access to vsyscall-page
for debugging already was implemented in get_user_pages() by mainline.  In
i386, copy_from_user can't access vsyscall-page, but returns EFAULT.

To make UML behave as i386 does, I changed is_syscall to use
access_process_vm(current) to read the code from vsyscall-page.  This doesn't
hurt security, but simplifies the code and prepares implementation of
stub-vmas.
Signed-off-by: Bodo Stroesser <bstroesser@fujitsu-siemens.com>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

81efcd33

[PATCH] uml: move sigio_user.c to os-Linux/sigio.c · f206aabb

Jeff Dike authored Mar 27, 2006

The serial UML OS-abstraction layer patch (um/kernel dir).

This moves sigio_user.c to os-Linux dir
Signed-off-by: Gennady Sharapov <Gennady.V.Sharapov@intel.com>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

f206aabb

[PATCH] uml: move SIGIO startup code to os-Linux/start_up.c · 8e367065

Jeff Dike authored Mar 27, 2006

The serial UML OS-abstraction layer patch (um/kernel dir).

This moves all startup code from sigio_user.c file under os-Linux dir
Signed-off-by: Gennady Sharapov <Gennady.V.Sharapov@intel.com>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

8e367065

[PATCH] uml: merge irq_user.c and irq.c · 9b4f018d

Jeff Dike authored Mar 27, 2006

The serial UML OS-abstraction layer patch (um/kernel dir).

This joins irq_user.c and irq.c files.
Signed-off-by: Gennady Sharapov <Gennady.V.Sharapov@intel.com>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

9b4f018d

[PATCH] uml: move libc-dependent irq code to os-Linux · 63ae2a94

Jeff Dike authored Mar 27, 2006

The serial UML OS-abstraction layer patch (um/kernel dir).

This moves all systemcalls from irq_user.c file under os-Linux dir
Signed-off-by: Gennady Sharapov <Gennady.V.Sharapov@intel.com>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

63ae2a94

[PATCH] uml: fix some printf formats · d9f8b62a

Jeff Dike authored Mar 27, 2006

Some printf formats are incorrect for large memory sizes.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

d9f8b62a

[PATCH] uml: fix declaration of exit() · c90e12b8

Jeff Dike authored Mar 27, 2006

This fixes a conflict between a header and what gcc "knows" the declaration'
to be.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

c90e12b8

[PATCH] uml: fix build warnings in __get_user · bb83da05

Jeff Dike authored Mar 27, 2006

Fix a gcc warning about losing qualifiers to the first argument of
copy_from_user.  The typeof change for correctness, and fixes a lot of the
warnings, but there are some cases where x has some extra qualifiers, like
volatile, which copy_from_user can't know about.  For these, the void * cast
seems to be necessary.

Also cleaned up some of the whitespace and got rid of the emacs comment at the
bottom.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

bb83da05

[PATCH] PM-Timer: don't use workaround if chipset is not buggy · dbffa471

OGAWA Hirofumi authored Mar 27, 2006

Current timer_pm.c reads I/O port triple times, in order to avoid the bug
of chipset.  But I/O port is slow.

2.6.16 (pmtmr)
Simple gettimeofday: 3.6532 microseconds

2.6.16+patch (pmtmr)
Simple gettimeofday: 1.4582 microseconds

[if chip is buggy, probably it will be 7us or more in 4.2% of probability.]

This patch adds blacklist of buggy chip, and if chip is not buggy, this
uses fast normal version instead of slow workaround version.

If chip is buggy, warnings "pmtmr is slow".  But sounds like there is gray
zone.  I found the PIIX4 errata, but I couldn't find the ICH4 errata.  But
some motherboard seems to have problem.

So, if we found a ICH4, generate warnings, and use a workaround version.
If user's ICH4 is good, the user can specify the "pmtmr_good" boot
parameter to use fast version.
Acked-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

dbffa471

26 Mar, 2006 7 commits

[SPARC64]: Kill duplicate exports of string library functions. · 5d5d7727
David S. Miller authored Mar 26, 2006
```
Kbuild now points these out.
Signed-off-by: David S. Miller <davem@davemloft.net>
```
5d5d7727
[SPARC64]: Update defconfig. · e7104b67
David S. Miller authored Mar 26, 2006
```
Signed-off-by: David S. Miller <davem@davemloft.net>
```
e7104b67

[PATCH] one ipc/sem.c->mutex.c converstion too many.. · 64bc0430

Manfred Spraul authored Mar 26, 2006

Ingo's sem2mutex patch incorrectly replaced one reference to ipc/sem.c
with ipc/mutex.c in a comment.
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

64bc0430

Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial · 9ae21d1b

Linus Torvalds authored Mar 26, 2006

* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial:
  drivers/char/ftape/lowlevel/fdc-io.c: Correct a comment
  Kconfig help: MTD_JEDECPROBE already supports Intel
  Remove ugly debugging stuff
  do_mounts.c: Minor ROOT_DEV comment cleanup
  BUG_ON() Conversion in drivers/s390/block/dasd_devmap.c
  BUG_ON() Conversion in mm/mempool.c
  BUG_ON() Conversion in mm/memory.c
  BUG_ON() Conversion in kernel/fork.c
  BUG_ON() Conversion in ipc/sem.c
  BUG_ON() Conversion in fs/ext2/
  BUG_ON() Conversion in fs/hfs/
  BUG_ON() Conversion in fs/dcache.c
  BUG_ON() Conversion in fs/buffer.c
  BUG_ON() Conversion in input/serio/hp_sdc_mlc.c
  BUG_ON() Conversion in md/dm-table.c
  BUG_ON() Conversion in md/dm-path-selector.c
  BUG_ON() Conversion in drivers/isdn
  BUG_ON() Conversion in drivers/char
  BUG_ON() Conversion in drivers/mtd/

9ae21d1b

drivers/char/ftape/lowlevel/fdc-io.c: Correct a comment · e9415777

Bastien Roucaries authored Mar 26, 2006

This patch correct a comment about cli().
Signed-off-by: Bastien Roucaries <roucaries.bastien@gmail.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

e9415777

Kconfig help: MTD_JEDECPROBE already supports Intel · 8917f6f7

Adrian Bunk authored Mar 26, 2006

Intel chips are already supported.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: David Woodhouse <dwmw2@infradead.org>

8917f6f7

[PATCH] bitops: hweight() speedup · f9b41929

Akinobu Mita authored Mar 26, 2006

<linux@horizon.com> wrote:

This is an extremely well-known technique.  You can see a similar version that
uses a multiply for the last few steps at
http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel whch
refers to "Software Optimization Guide for AMD Athlon 64 and Opteron
Processors"
http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/25112.PDF

It's section 8.6, "Efficient Implementation of Population-Count Function in
32-bit Mode", pages 179-180.

It uses the name that I am more familiar with, "popcount" (population count),
although "Hamming weight" also makes sense.

Anyway, the proof of correctness proceeds as follows:

	b = a - ((a >> 1) & 0x55555555);
	c = (b & 0x33333333) + ((b >> 2) & 0x33333333);
	d = (c + (c >> 4)) & 0x0f0f0f0f;
#if SLOW_MULTIPLY
	e = d + (d >> 8)
	f = e + (e >> 16);
	return f & 63;
#else
	/* Useful if multiply takes at most 4 cycles */
	return (d * 0x01010101) >> 24;
#endif

The input value a can be thought of as 32 1-bit fields each holding their own
hamming weight.  Now look at it as 16 2-bit fields.  Each 2-bit field a1..a0
has the value 2*a1 + a0.  This can be converted into the hamming weight of the
2-bit field a1+a0 by subtracting a1.

That's what the (a >> 1) & mask subtraction does.  Since there can be no
borrows, you can just do it all at once.

Enumerating the 4 possible cases:

0b00 = 0  ->  0 - 0 = 0
0b01 = 1  ->  1 - 0 = 1
0b10 = 2  ->  2 - 1 = 1
0b11 = 3  ->  3 - 1 = 2

The next step consists of breaking up b (made of 16 2-bir fields) into
even and odd halves and adding them into 4-bit fields.  Since the largest
possible sum is 2+2 = 4, which will not fit into a 4-bit field, the 2-bit
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                          "which will not fit into a 2-bit field"

fields have to be masked before they are added.

After this point, the masking can be delayed.  Each 4-bit field holds a
population count from 0..4, taking at most 3 bits.  These numbers can be added
without overflowing a 4-bit field, so we can compute c + (c >> 4), and only
then mask off the unwanted bits.

This produces d, a number of 4 8-bit fields, each in the range 0..8.  From
this point, we can shift and add d multiple times without overflowing an 8-bit
field, and only do a final mask at the end.

The number to mask with has to be at least 63 (so that 32 on't be truncated),
but can also be 128 or 255.  The x86 has a special encoding for signed
immediate byte values -128..127, so the value of 255 is slower.  On other
processors, a special "sign extend byte" instruction might be faster.

On a processor with fast integer multiplies (Athlon but not P4), you can
reduce the final few serially dependent instructions to a single integer
multiply.  Consider d to be 3 8-bit values d3, d2, d1 and d0, each in the
range 0..8.  The multiply forms the partial products:

	           d3 d2 d1 d0
	        d3 d2 d1 d0
	     d3 d2 d1 d0
	+ d3 d2 d1 d0
	----------------------
	           e3 e2 e1 e0

Where e3 = d3 + d2 + d1 + d0.   e2, e1 and e0 obviously cannot generate
any carries.
Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

f9b41929