1. 22 Sep, 2007 11 commits
  2. 20 Sep, 2007 11 commits
  3. 19 Sep, 2007 18 commits
    • Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus · a88a8eff
      Linus Torvalds authored
      * 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
        [MIPS] cpu-bugs64.c: GCC 3.3 constraint workaround
        [MIPS] DEC: Initialise ioasic_ssr_lock
    • Merge master.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvb · c39c06b9
      Linus Torvalds authored
      * master.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvb:
        V4L/DVB (6173a): Documentation: Remove reference to dead "cpia_pp=" boot-time option
        Revert "V4L/DVB (6173a): Documentation: Remove reference to dead "cpia_pp=" boot-time option"
    • Merge branch 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6 · a78feb7c
      Linus Torvalds authored
      * 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6:
        [XFS] Avoid replaying inode buffer initialisation log items if on-disk version is newer.
        [XFS] Ensure file size updates have been completed before writing inode to disk.
        [XFS] On-demand reaping of the MRU cache
    • Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6 · 91fe7d7c
      Linus Torvalds authored
      * master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
        [SUNSAB]: Fix several bugs.
    • Merge master.kernel.org:/pub/scm/linux/kernel/git/bart/ide-2.6 · d56c5c41
      Linus Torvalds authored
      * master.kernel.org:/pub/scm/linux/kernel/git/bart/ide-2.6:
        ide: remove unused variables from drivers/ide/ppc/pmac.c
        ide: ST320413A has the same problem as ST340823A
    • Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc · f15f4138
      Linus Torvalds authored
      * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
        [POWERPC] Fix timekeeping on PowerPC 601
        [POWERPC] Don't expose clock vDSO functions when CPU has no timebase
        [POWERPC] spusched: Fix null pointer dereference in find_victim
    • x86-64: page faults from user mode are always user faults · dbe3ed1c
      Linus Torvalds authored
      Randy Dunlap noticed an interesting "crashme" behaviour on his dual
      Prescott Xeon setup, where he gets page faults with the error code
      having a zero "user" bit, but the register state points back to user
      mode.
      
      This may be a CPU microcode buglet triggered by some strange instruction
      pattern that crashme generates, and loading a microcode update seems to
      possibly have fixed it.
      
      Regardless, we really should trust the register state more than the
      error code, since it's really the register state that determines whether
      we can actually send a signal, or whether we're in kernel mode and need
      to oops/kill the process in the case of a page fault.
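
      The idea can be sketched in user-space C. This is only an illustration
      under stated assumptions, not the kernel's actual code: `fake_regs`,
      `PF_USER_BIT`, `regs_in_user_mode()` and `fault_from_user()` are
      hypothetical names, and only the CS register is modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the saved register state; only CS is modelled. */
struct fake_regs {
    uint64_t cs;
};

/* Bit 2 of the x86 page-fault error code is the "user" bit. */
#define PF_USER_BIT (1UL << 2)

/* On x86 the low two bits of CS hold the privilege level; 3 is user mode. */
static bool regs_in_user_mode(const struct fake_regs *regs)
{
    return (regs->cs & 3) == 3;
}

/*
 * Treat the fault as a user fault whenever the registers say user mode,
 * even if the error code's user bit is clear.
 */
static bool fault_from_user(const struct fake_regs *regs,
                            unsigned long error_code)
{
    if (regs_in_user_mode(regs))
        return true;
    return (error_code & PF_USER_BIT) != 0;
}
```

      With a user-mode CS value and an error code of zero, fault_from_user()
      still reports a user fault, which is the case described above.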
      
      Cc: Randy Dunlap <rdunlap@xenotime.net>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • [MIPS] cpu-bugs64.c: GCC 3.3 constraint workaround · 09abbcff
      Maciej W. Rozycki authored
      Add a workaround to address warnings generated on the "n" constraint by
      GCC 3.3 and below.
      Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
    • [MIPS] DEC: Initialise ioasic_ssr_lock · 68835999
      Maciej W. Rozycki authored
      Fix the definition of the ioasic_ssr_lock spinlock to include a proper 
      initialisation.
      Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
    • Driver core: fix deprectated sysfs structure for nested class devices · 4f01a757
      Dmitry Torokhov authored
      Nested class devices used to have 'device' symlink point to a real
      (physical) device instead of a parent class device.  When converting
      subsystems to struct device we need to keep doing what class devices did if
      CONFIG_SYSFS_DEPRECATED is Y, otherwise parts of udev break.
      Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      Acked-by: Greg KH <greg@kroah.com>
      Tested-by: Anssi Hannula <anssi.hannula@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • uml: fix irqstack crash · 508a9274
      Jeff Dike authored
      This patch fixes a crash caused by an interrupt coming in when an IRQ stack
      is being torn down.  When this happens, handle_signal will loop, setting up
      the IRQ stack again because the tearing down had finished, and handling
      whatever signals had come in.
      
      However, to_irq_stack returns a mask of pending signals to be handled, plus
      bit zero is set if the IRQ stack was already active, and thus shouldn't be
      torn down.  This causes a problem because when handle_signal goes around
      the loop, sig will be zero, and to_irq_stack will duly set bit zero in the
      returned mask, faking handle_signal into believing that it shouldn't tear
      down the IRQ stack and return thread_info pointers back to their original
      values.
      
      This will eventually cause a crash, as the IRQ stack thread_info will
      continue pointing to the original task_struct and an interrupt will look
      into it after it has been freed.
      
      The fix is to stop passing a signal number into to_irq_stack.  Rather, the
      pending signals mask is initialized beforehand with the bit for sig already
      set.  References to sig in to_irq_stack can be replaced with references to
      the mask.
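
      The bit-zero collision can be shown in a few lines of C. This is a
      sketch of the problem only, assuming the old code folded the signal
      number into the mask with a shift; `STACK_ACTIVE_BIT`,
      `mask_from_sig()` and `looks_nested()` are hypothetical names.

```c
#include <stdbool.h>

/* Bit zero of the mask is reserved to mean "IRQ stack was already active". */
#define STACK_ACTIVE_BIT 1UL

/* Assumed old shape: the callee turns the signal number into a mask bit. */
static unsigned long mask_from_sig(int sig)
{
    return 1UL << sig;
}

/* The caller reads bit zero as "do not tear the IRQ stack down". */
static bool looks_nested(unsigned long mask)
{
    return (mask & STACK_ACTIVE_BIT) != 0;
}
```

      A real signal number such as 14 leaves bit zero clear, but a sig of
      zero on the second trip around the loop sets it, so the caller wrongly
      concludes that the stack was already active.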
      
      [akpm@linux-foundation.org: use UL]
      Signed-off-by: Jeff Dike <jdike@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Fix NUMA Memory Policy Reference Counting · 480eccf9
      Lee Schermerhorn authored
      This patch proposes fixes to the reference counting of memory policy in the
      page allocation paths and in show_numa_map().  Extracted from my "Memory
      Policy Cleanups and Enhancements" series as stand-alone.
      
      Shared policy lookup [shmem] has always added a reference to the policy,
      but this was never unrefed after page allocation or after formatting the
      numa map data.
      
      Default system policy should not require additional ref counting, nor
      should the current task's task policy.  However, show_numa_map() calls
      get_vma_policy() to examine what may be [likely is] another task's policy.
      The latter case needs protection against freeing of the policy.
      
      This patch adds a reference count to a mempolicy returned by
      get_vma_policy() when the policy is a vma policy or another task's
      mempolicy.  Again, shared policy is already reference counted on lookup.  A
      matching "unref" [__mpol_free()] is performed in alloc_page_vma() for
      shared and vma policies, and in show_numa_map() for shared and another
      task's mempolicy.  We can call __mpol_free() directly, saving an admittedly
      inexpensive inline NULL test, because we know we have a non-NULL policy.
      
      Handling policy ref counts for hugepages is a bit trickier.
      huge_zonelist() returns a zone list that might come from a shared or vma
      'BIND policy.  In this case, we should hold the reference until after the
      huge page allocation in dequeue_hugepage().  The patch modifies
      huge_zonelist() to return a pointer to the mempolicy if it needs to be
      unref'd after allocation.
      
      Kernel Build [16cpu, 32GB, ia64] - average of 10 runs:
      
                   w/o patch            w/ refcount patch
                   Avg      Std Devn    Avg      Std Devn
      Real:         100.59  0.38         100.63  0.43
      User:        1209.60  0.37        1209.91  0.31
      System:        81.52  0.42          81.64  0.34
      Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Acked-by: Andi Kleen <ak@suse.de>
      Cc: Christoph Lameter <clameter@sgi.com>
      Acked-by: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Fix user namespace exiting OOPs · 28f300d2
      Pavel Emelyanov authored
      It turned out that the user namespace is released during do_exit() in
      exit_task_namespaces(), but the struct user_struct is released only during
      put_task_struct(), i.e. much later.

      On debug kernels with poisoned slabs this causes an oops in
      uid_hash_remove(), because the head of the chain, which resides inside the
      struct user_namespace, has already been freed and poisoned.
      
      Since the uid hash itself is required only when someone can search it, i.e.
      when the namespace is alive, we can safely unhash all the user_struct-s from
      it during the namespace exiting.  The subsequent free_uid() will complete the
      user_struct destruction.
      
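      For example, this simple program

         #define _GNU_SOURCE	/* needed for clone() in <sched.h> */
         #include <sched.h>

         char stack[2 * 1024 * 1024];

         int f(void *foo)
         {
         	return 0;
         }

         int main(void)
         {
         	/* 0x10000000 is CLONE_NEWUSER: run the child in a new user namespace */
         	clone(f, stack + 1 * 1024 * 1024, 0x10000000, 0);
         	return 0;
         }

      run on a kernel with CONFIG_USER_NS turned on will oops the
      kernel immediately.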
      
      This was spotted during OpenVZ kernel testing.
      Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
      Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org>
      Acked-by: "Serge E. Hallyn" <serue@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Convert uid hash to hlist · 735de223
      Pavel Emelyanov authored
      Surprisingly (spotted by Alexey Dobriyan), the uid hash still uses
      list_heads, thus occupying twice as much space as it needs.  Convert it to
      hlist_heads.
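
      The factor of two follows from the layout of the two head types. The
      sketch below repeats the well-known list_head and hlist_head
      definitions in user-space C; the table names and `UIDHASH_SZ_SKETCH`
      are hypothetical and chosen only for illustration.

```c
#include <stddef.h>

/* Doubly linked circular list head: two pointers per bucket. */
struct list_head {
    struct list_head *next, *prev;
};

/* Hash-list node; pprev points at the previous element's next pointer. */
struct hlist_node {
    struct hlist_node *next, **pprev;
};

/* Hash-list head: a single pointer to the first node. */
struct hlist_head {
    struct hlist_node *first;
};

#define UIDHASH_SZ_SKETCH 64	/* hypothetical bucket count */

/* The same number of buckets, declared with each head type. */
static struct list_head  table_list[UIDHASH_SZ_SKETCH];
static struct hlist_head table_hlist[UIDHASH_SZ_SKETCH];
```

      Comparing sizeof(table_list) with sizeof(table_hlist) shows that the
      hlist version of the table takes half the memory, since each bucket
      holds one pointer instead of two.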
      Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
      Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org>
      Acked-by: Serge Hallyn <serue@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kernel/user.c: Use list_for_each_entry instead of list_for_each · d8a4821d
      Matthias Kaehlcke authored
      kernel/user.c: Convert list_for_each to list_for_each_entry in
      uid_hash_find()
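
      The difference between the two iterators can be sketched in user-space
      C with simplified copies of the list macros. `struct user_sketch`,
      `find_old()` and `find_new()` are hypothetical stand-ins, not the
      code in kernel/user.c, and the entry macro relies on the GCC/Clang
      `__typeof__` extension.

```c
#include <stddef.h>

struct list_head {
    struct list_head *next, *prev;
};

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

#define list_entry(ptr, type, member) container_of(ptr, type, member)

/* Iterate over the raw list nodes. */
#define list_for_each(pos, head) \
    for (pos = (head)->next; pos != (head); pos = pos->next)

/* Iterate over the containing structures directly. */
#define list_for_each_entry(pos, head, member)                      \
    for (pos = list_entry((head)->next, __typeof__(*pos), member);  \
         &pos->member != (head);                                    \
         pos = list_entry(pos->member.next, __typeof__(*pos), member))

/* Hypothetical element type standing in for the kernel's user_struct. */
struct user_sketch {
    unsigned int uid;
    struct list_head uidhash_list;
};

/* Before: walk the nodes and convert each one by hand. */
static struct user_sketch *find_old(struct list_head *hashent, unsigned int uid)
{
    struct list_head *up;

    list_for_each(up, hashent) {
        struct user_sketch *user =
            list_entry(up, struct user_sketch, uidhash_list);
        if (user->uid == uid)
            return user;
    }
    return NULL;
}

/* After: the macro performs the conversion on every step. */
static struct user_sketch *find_new(struct list_head *hashent, unsigned int uid)
{
    struct user_sketch *user;

    list_for_each_entry(user, hashent, uidhash_list) {
        if (user->uid == uid)
            return user;
    }
    return NULL;
}
```

      Both functions return the same element for the same list; the second
      form simply drops the temporary node pointer and the explicit
      list_entry() call.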
      Signed-off-by: Matthias Kaehlcke <matthias.kaehlcke@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ext34: ensure do_split leaves enough free space in both blocks · ef2b02d3
      Eric Sandeen authored
      The do_split() function for htree dir blocks is intended to split a leaf
      block to make room for a new entry.  It sorts the entries in the original
      block by hash value, then moves the last half of the entries to the new
      block - without accounting for how much space this actually moves.  (IOW,
      it moves half of the entry *count* not half of the entry *space*).  If by
      chance we have both large & small entries, and we move only the smallest
      entries, and we have a large new entry to insert, we may not have created
      enough space for it.
      
      The patch below stores each record size when calculating the dx_map, and
      then walks the hash-sorted dx_map, calculating how many entries must be
      moved to more evenly split the existing entries between the old block and
      the new block, guaranteeing enough space for the new entry.
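
      One possible way to pick the split point by space rather than by count
      is sketched below. It is an illustration under assumptions, not the
      patch itself: `map_entry_sketch` and `entries_to_move()` are
      hypothetical names, and the map is assumed to be already sorted by
      hash.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical map entry: hash plus a 16-bit offset and record size. */
struct map_entry_sketch {
    uint32_t hash;
    uint16_t offs;
    uint16_t size;
};

/*
 * Walk the hash-sorted map from the tail and count how many entries to
 * move so that roughly half of the block's bytes move, rather than half
 * of the entry count.
 */
static size_t entries_to_move(const struct map_entry_sketch *map, size_t count,
                              unsigned int blocksize)
{
    unsigned int moved_bytes = 0;
    size_t move = 0;

    for (size_t i = count; i-- > 0; ) {
        /* Stop if more than half of this entry would land past the half-block mark. */
        if (moved_bytes + map[i].size / 2 > blocksize / 2)
            break;
        moved_bytes += map[i].size;
        move++;
    }
    return move;
}
```

      For a 1024-byte block holding two 400-byte entries followed by four
      16-byte entries, a count-based split would move only three small
      entries (48 bytes), while this walk moves five entries (464 bytes) and
      leaves room in both blocks.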
      
      The dx_map "offs" member is reduced to u16 so that the overall map size
      does not change - it is temporarily stored at the end of the new block, and
      if it grows too large it may be overwritten.  By making offs and size both
      u16, we won't grow the map size.
      
      Also add a few comments to the functions involved.
      
      This fixes the testcase reported by hooanon05@yahoo.co.jp on the
      linux-ext4 list, "ext3 dir_index causes an error"
      
      Thanks to Andreas Dilger for discussing the problem & solution with me.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: Andreas Dilger <adilger@clusterfs.com>
      Tested-by: Junjiro Okajima <hooanon05@yahoo.co.jp>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: <linux-ext4@vger.kernel.org>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • disable sys_timerfd() for 2.6.23 · e4260197
      Andrew Morton authored
      There is still some confusion and disagreement over what this interface should
      actually do.  So it is best that we disable it in 2.6.23 until we get that
      fully sorted out.
      
      (sys_timerfd() was present in 2.6.22 but it was apparently broken, so here we
      assume that nobody is using it yet).
      
      Cc: Michael Kerrisk <mtk-manpages@gmx.net>
      Cc: Davide Libenzi <davidel@xmailserver.org>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • nfs: fix oops re sysctls and V4 support · 49af7ee1
      Alexey Dobriyan authored
      NFS unregisters sysctls only if V4 support is compiled in.  However, the
      sysctl table is not V4-specific, so always unregister it.
      
      Steps to reproduce:
      
      	[build nfs.ko with CONFIG_NFS_V4=n]
      	modprobe nfs
      	rmmod nfs
      	ls /proc/sys
      
      Unable to handle kernel paging request at ffffffff880661c0 RIP:
       [<ffffffff802af8e3>] proc_sys_readdir+0xd3/0x350
      PGD 203067 PUD 207063 PMD 7e216067 PTE 0
      Oops: 0000 [1] SMP
      CPU 1
      Modules linked in: lockd nfs_acl sunrpc
      Pid: 3335, comm: ls Not tainted 2.6.23-rc3-bloat #2
      RIP: 0010:[<ffffffff802af8e3>]  [<ffffffff802af8e3>] proc_sys_readdir+0xd3/0x350
      RSP: 0018:ffff81007fd93e78  EFLAGS: 00010286
      RAX: ffffffff880661c0 RBX: ffffffff80466370 RCX: ffffffff880661c0
      RDX: 00000000000014c0 RSI: ffff81007f3ad020 RDI: ffff81007efd8b40
      RBP: 0000000000000018 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000001 R11: ffffffff802a8570 R12: ffffffff880661c0
      R13: ffff81007e219640 R14: ffff81007efd8b40 R15: ffff81007ded7280
      FS:  00002ba25ef03060(0000) GS:ffff81007ff81258(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: ffffffff880661c0 CR3: 000000007dfaf000 CR4: 00000000000006e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process ls (pid: 3335, threadinfo ffff81007fd92000, task ffff81007d8a0000)
      Stack:  ffff81007f3ad150 ffffffff80283f30 ffff81007fd93f48 ffff81007efd8b40
       ffff81007ee00440 0000000422222222 0000000200035593 ffffffff88037e9a
       2222222222222222 ffffffff80466500 ffff81007e416400 ffff81007e219640
      Call Trace:
       [<ffffffff80283f30>] filldir+0x0/0xf0
       [<ffffffff80283f30>] filldir+0x0/0xf0
       [<ffffffff802840c7>] vfs_readdir+0xa7/0xc0
       [<ffffffff80284376>] sys_getdents+0x96/0xe0
       [<ffffffff8020bb3e>] system_call+0x7e/0x83
      
      Code: 41 8b 14 24 85 d2 74 dc 49 8b 44 24 08 48 85 c0 74 e7 49 3b
      RIP  [<ffffffff802af8e3>] proc_sys_readdir+0xd3/0x350
       RSP <ffff81007fd93e78>
      CR2: ffffffff880661c0
      Kernel panic - not syncing: Fatal exception
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>