1. 04 Jan, 2005 40 commits
    • [PATCH] Remove RCU abuse in cpu_idle() · f2f1b44c
      Zwane Mwaikambo authored
      Introduce cpu_idle_wait() on architectures requiring modification of
      pm_idle from modules; this ensures that all processors have updated
      their cached values of pm_idle upon exit.  This patch is to address the bug
      report at http://bugme.osdl.org/show_bug.cgi?id=1716 and replaces the
      current code fix which is in violation of normal RCU usage as pointed out
      by Stephen, Dipankar and Paul.
      Signed-off-by: Zwane Mwaikambo <zwane@arm.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      f2f1b44c
    • [PATCH] __getblk_slow can loop forever when pages are partially mapped · a61e7286
      Chris Mason authored
      When a block device is accessed via read/write, it is possible for some of
      the buffers on a page to be mapped and others not.  __getblk and friends
      assume this can't happen, and can end up looping forever when pages have
      some unmapped buffers.  Picture:
      
      lseek(/dev/xxx, 2048, SEEK_SET)
      write(/dev/xxx, 2048 bytes)
      
      Assuming the block size is 1k, page 0 has 4 buffers, two are mapped by
      __block_prepare_write and two are not.  Next, another process triggers
      getblk(/dev/xxx, blocknr = 0);
      
      __getblk_slow will loop forever.  __find_get_block fails because the buffer
      isn't mapped.  grow_dev_page does nothing because there are buffers on the
      page with the correct size.  madhav@veritas.com and others at Veritas
      tracked this down.
      
      The fix below has two parts.  First, it changes __find_get_block to avoid
      the buffer_error warnings when it finds unmapped buffers on the page.
      
      Second, it changes grow_dev_page to map the buffers on the page by calling
      init_page_buffers.  init_page_buffers is changed so we don't stomp on
      uptodate bits for the buffers.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      a61e7286
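The failure mode in the __getblk_slow message above can be illustrated with a small user-space model. This is only a sketch: every `toy_*` name is hypothetical, the loop is bounded so that it terminates, and the `fixed` flag stands in for the patched grow_dev_page() mapping the unmapped buffers.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_BUFFERS_PER_PAGE 4  /* 4 KiB page with a 1 KiB block size */

struct toy_buffer {
    bool mapped;            /* stand-in for buffer_mapped() */
    unsigned long blocknr;  /* only meaningful when mapped */
};

struct toy_page {
    struct toy_buffer bufs[TOY_BUFFERS_PER_PAGE];
};

/* Model of the 2048-byte write at offset 2048: only buffers 2 and 3 get mapped. */
static void toy_partial_write(struct toy_page *page)
{
    for (int i = 0; i < TOY_BUFFERS_PER_PAGE; i++) {
        page->bufs[i].mapped = (i >= 2);
        page->bufs[i].blocknr = (unsigned long)i;
    }
}

/* The lookup succeeds only for a mapped buffer with the requested block number. */
static struct toy_buffer *toy_find_get_block(struct toy_page *page,
                                             unsigned long blocknr)
{
    for (int i = 0; i < TOY_BUFFERS_PER_PAGE; i++)
        if (page->bufs[i].mapped && page->bufs[i].blocknr == blocknr)
            return &page->bufs[i];
    return NULL;
}

/* Patched behaviour (assumed): map what is unmapped, leave mapped buffers alone. */
static void toy_grow_dev_page(struct toy_page *page)
{
    for (int i = 0; i < TOY_BUFFERS_PER_PAGE; i++) {
        if (!page->bufs[i].mapped) {
            page->bufs[i].mapped = true;
            page->bufs[i].blocknr = (unsigned long)i;
        }
    }
}

/* Bounded slow path: returns the number of tries needed, or -1 if it never succeeds. */
static int toy_getblk_slow(struct toy_page *page, unsigned long blocknr,
                           bool fixed, int max_tries)
{
    for (int tries = 1; tries <= max_tries; tries++) {
        if (toy_find_get_block(page, blocknr))
            return tries;
        if (fixed)
            toy_grow_dev_page(page);
        /* Unfixed: nothing changes, so the next lookup fails again. */
    }
    return -1;
}
```

With `fixed` set to false, a lookup of block 0 never succeeds no matter how many tries are allowed, which corresponds to the endless loop described in the report.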
    • [PATCH] IRQ resource deallocation: ia64 · 5d25c798
      Kenji Kaneshige authored
      This is the ia64 portion of IRQ resource deallocation. It implements
      pcibios_disable_device() and acpi_unregister_gsi() for ia64.
      
          o acpi_unregister_gsi()
      
              Summary of changes for implementing this interface:
      
              - Add new function iosapic_unregister_intr() into
                arch/ia64/kernel/iosapic.c. This function frees an interrupt
                vector and related data structures.
      
              - Add new function free_irq_vector() into
                arch/ia64/kernel/irq_ia64.c. This frees an unused vector.
      
              - Change assign_irq_vector() to be able to support
                free_irq_vector().
      
          o pcibios_disable_device()
      
              This calls acpi_pci_irq_disable() to deallocate IRQ resources.
      Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      5d25c798
    • [PATCH] IRQ resource deallocation: ACPI · 0090012b
      Kenji Kaneshige authored
      Architecture-dependent IRQ resources, such as interrupt vectors for PCI
      devices, are allocated at pci_enable_device() time on the i386, x86-64 and
      ia64 platforms. Today, however, these IRQ resources are never
      deallocated even if they are no longer used. The following set of
      patches adds support for deallocating IRQ resources at
      pci_disable_device() time.
      
      The motivation of the set of patches is as follows:
      
          - IRQ resources such as interrupt vectors should be freed if they
            are no longer used because the amount of these resources is
            limited. By deallocating IRQ resources, we can recycle them.
      
          - I think some hardware will support hot-pluggable I/O units with
            I/O xAPICs in the near future, so I/O xAPIC hot-plug support in
            the OS will be needed soon. IRQ resource deallocation will be one
            of the most important pieces of I/O xAPIC hot-plug support.
      
      For now, the following set of patches has ia64 implementation only.
      i386 and x86_64 implementations are TBD.
      
      
      
      
      This patch is the ACPI portion of IRQ deallocation. This patch defines the
      following new interface. The implementation of this interface depends
      on each platform.
      
          o void acpi_unregister_gsi(u32 gsi)
      
              This is the counterpart of acpi_register_gsi(). It is
              responsible for deallocating IRQ resources associated with
              the specified GSI number.
      
              We need to consider the case of shared interrupt. In the case
              of shared interrupt, acpi_register_gsi() is called multiple
              times for one gsi. That is, registrations and unregistrations
              can be nested.
      
              This function undoes the effect of one call to
              acpi_register_gsi(). If this matches the last registration,
              IRQ resources associated with the specified GSI number are
              freed.
      
      This patch also adds the following new function.
      
          o void acpi_pci_irq_disable (struct pci_dev *dev)
      
              This function is the counterpart of
              acpi_pci_enable_irq(). It clears the device's Linux IRQ number
              and calls acpi_unregister_gsi() to deallocate IRQ resources.
      Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      0090012b
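The nesting rule described for acpi_unregister_gsi() above (each call undoes one registration, and only the last one frees the resources) is essentially reference counting. A minimal sketch, assuming a fixed-size table and hypothetical `toy_*` names rather than the real ACPI or ia64 code:

```c
#include <assert.h>

#define TOY_MAX_GSI 16  /* arbitrary table size for the illustration */

static int toy_gsi_refcount[TOY_MAX_GSI];

/* Returns 1 if this call allocated the IRQ resources, 0 if the GSI was already registered. */
static int toy_register_gsi(unsigned int gsi)
{
    assert(gsi < TOY_MAX_GSI);
    return toy_gsi_refcount[gsi]++ == 0;
}

/* Undoes one registration; returns 1 only when the last user is gone and resources are freed. */
static int toy_unregister_gsi(unsigned int gsi)
{
    assert(gsi < TOY_MAX_GSI);
    assert(toy_gsi_refcount[gsi] > 0);  /* unbalanced unregister is a caller bug */
    return --toy_gsi_refcount[gsi] == 0;
}
```

Two devices sharing one interrupt would each register the same GSI; the vector is released only when the second of them unregisters.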
    • [PATCH] fix missing wakeup in ipc/sem · 956cdd1b
      Manfred Spraul authored
      My patch that removed the spin_lock calls from the tail of sys_semtimedop
      introduced a bug:
      
      Before my patch was merged, every operation that altered an array called
      update_queue.  That call woke up threads that were waiting until a
      semaphore value becomes 0.  I've accidentally removed that call.
      
      The attached patch fixes that by modifying update_queue: the function now
      loops internally and wakes up all threads.  The patch also removes
      update_queue calls from the error path of sys_semtimedop: failed operations
      do not modify the array, so there is no need to rescan the list of waiting threads.
      Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      956cdd1b
    • [PATCH] Ext[23]: apply umask to symlinks with ACLs configured out · 79a35a44
      Andreas Gruenbacher authored
      Keith Young <stripyd@stripydog.com> has reported that when ACLs are not
      compiled in, the default implementation of ext[23]_init_acl applies the
      umask to all new files, including symlinks, which is wrong.  In this case
      the VFS already takes care of applying the umask when needed, so ext2 and
      ext3 need not bother about it.  Remove the superfluous statements.
      Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      79a35a44
    • [PATCH] get_blkdev_list() cleanup · eed6b962
      Andrew Morton authored
      - Move prototype to genhd.h
      
      - It is only needed for /proc
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      eed6b962
    • [PATCH] no one uses HAVE_ARCH_SI_CODES or HAVE_ARCH_SIGEVENT_T · 06901504
      Stephen Rothwell authored
      Since asm-generic/siginfo.h was created, the architectures have been slowly
      fixed/modified until no one uses HAVE_ARCH_SI_CODES or HAVE_ARCH_SIGEVENT_T
      any more, so this patch removes the checks for them.
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      06901504
    • [PATCH] loop device recursion avoidance · 577dfb53
      Franz Pletz authored
      With Andries Brouwer <Andries.Brouwer@cwi.nl>
      
      Fix various recursion scenarios wherein it was possible to mount a loop
      device on itself, either directly or via intermediate loop devices.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      577dfb53
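The recursion scenarios in the loop-device message above amount to a cycle check over the chain of backing devices. The sketch below is a user-space illustration under assumed data structures (an array mapping each loop device to the loop device backing it); `toy_would_recurse` is a hypothetical name and this is not the driver's actual code.

```c
/*
 * backing[i] is the index of the loop device that backs loop device i,
 * or -1 when loop device i is backed by something that is not a loop device.
 * Returns 1 if binding `dev` on top of `new_backing` would make `dev`
 * (directly or indirectly) its own backing store, 0 otherwise.
 */
static int toy_would_recurse(const int backing[], int n, int dev, int new_backing)
{
    int cur = new_backing;
    int steps = 0;

    /* The step bound keeps the walk finite even if the table already holds a cycle. */
    while (cur >= 0 && cur < n && steps++ < n) {
        if (cur == dev)
            return 1;
        cur = backing[cur];
    }
    return 0;
}
```

The direct case (a device bound on itself) is caught on the first iteration; the indirect case is caught while following the chain.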
    • [PATCH] noop iosched: remove unused includes · 24498885
      Pekka Enberg authored
      This patch removes unused includes from drivers/block/noop-iosched.c.
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      24498885
    • [PATCH] noop iosched: make code static · e7e22a3a
      Pekka Enberg authored
      This patch makes code static in drivers/block/noop-iosched.c and adds
      __init and __exit for module initialization and cleanup functions.
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      e7e22a3a
    • [PATCH] cpumask: range check before using value · 920e3328
      Randy Dunlap authored
      When setting the 'cpu_isolated_map' mask, check that the user input value
      is valid (in range 0 ..  NR_CPUS - 1).  Also fix up kernel-parameters.txt
      for this parameter.
      Signed-off-by: Randy Dunlap <rddunlap@osdl.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      920e3328
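The range check described in the cpumask message above can be sketched as follows. `TOY_NR_CPUS` and `toy_isolated_mask` are hypothetical, and the sketch simply ignores out-of-range values; how the kernel reports or handles them is not stated in the message.

```c
#define TOY_NR_CPUS 8  /* assumed CPU limit for the illustration */

/* Builds a bitmask from user-supplied CPU numbers, skipping anything outside 0 .. TOY_NR_CPUS - 1. */
static unsigned long toy_isolated_mask(const int *cpus, int count)
{
    unsigned long mask = 0;

    for (int i = 0; i < count; i++) {
        if (cpus[i] >= 0 && cpus[i] < TOY_NR_CPUS)
            mask |= 1UL << cpus[i];
        /* Without the check, a value such as 64 would shift out of range. */
    }
    return mask;
}
```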
    • [PATCH] fix alt-sysrq deadlock · 68764ad9
      Zwane Mwaikambo authored
      __handle_sysrq was modified to do a spin_lock_irqsave so we were entering
      smp_send_stop with interrupts disabled.  So reenable interrupts to prevent the
      possible smp_call_function() deadlock.
      
      (It's still deadlocky if the sysrq handler is again called via an
      interrupt from a different device, but that seems unlikely).
      Signed-off-by: Zwane Mwaikambo <zwane@holomorphy.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      68764ad9
    • [PATCH] Add PR_GET_NAME · e2901099
      Prasanna Meda authored
      A while back we added the PR_SET_NAME prctl, but no PR_GET_NAME.  I guess
      we should add this, if only to enable testing of PR_SET_NAME.
      Signed-off-by: Prasanna Meda <pmeda@akamai.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      e2901099
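The testing use case mentioned in the PR_GET_NAME message above might look like this from user space on Linux. The helper name `set_and_get_name` is hypothetical; the 16-byte buffer reflects the documented size of the task name including its terminating NUL.

```c
#include <string.h>
#include <sys/prctl.h>

/*
 * Sets the calling thread's name with PR_SET_NAME and reads it back with
 * PR_GET_NAME. `out` must have room for 16 bytes. Returns 0 on success,
 * -1 if either prctl() call fails.
 */
static int set_and_get_name(const char *name, char out[16])
{
    if (prctl(PR_SET_NAME, (unsigned long)name, 0, 0, 0) != 0)
        return -1;

    memset(out, 0, 16);
    if (prctl(PR_GET_NAME, (unsigned long)out, 0, 0, 0) != 0)
        return -1;

    return 0;
}
```

Names longer than 15 characters are silently truncated by PR_SET_NAME, so a round-trip test should compare against the truncated string.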
    • [PATCH] panic_timeout: move to kernel.h · 4ffd90a1
      Randy Dunlap authored
      Move 'panic_timeout' to linux/kernel.h.
      
      ipmi_watchdog.c wanted to know why panic_timeout isn't in some header file.
      However, ipmi_watchdog.c doesn't even use it, so that reference was
      deleted.  Other references now use kernel.h instead of straight extern int.
      Signed-off-by: Randy Dunlap <rddunlap@osdl.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      4ffd90a1
    • [PATCH] EDD: add edd=off and edd=skipmbr options · e9855e2c
      Matt Domsch authored
      EDD: add edd=off and edd=skipmbr command line options
         
      New command line options
      edd=off     (or edd=of)
      edd=skipmbr (or edd=sk)
      
      runtime options for disabling all EDD int13 calls completely, or for
      skipping the int13 READ SECTOR calls, respectively.
      
      These are provided to allow Linux distributions to include CONFIG_EDD=m, yet
      allow end-users to disable parts of EDD which may not work well with their
      system's BIOS.
      
      I incorporated comments from Randy Dunlap, and got an ack from Andi Kleen.
      Signed-off-by: Matt Domsch <Matt_Domsch@dell.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      e9855e2c
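The abbreviated forms listed in the EDD message above (edd=of, edd=sk) suggest that only the first two characters of the option value are significant. Under that assumption, the option handling can be sketched as below; the enum and `toy_parse_edd` are hypothetical names, not the kernel's setup code.

```c
#include <string.h>

enum toy_edd_mode {
    TOY_EDD_ON,       /* default: all EDD int13 calls are made */
    TOY_EDD_SKIPMBR,  /* skip the int13 READ SECTOR calls */
    TOY_EDD_OFF       /* disable all EDD int13 calls */
};

/* Interprets the value following "edd=", looking only at its first two characters. */
static enum toy_edd_mode toy_parse_edd(const char *value)
{
    if (strncmp(value, "of", 2) == 0)
        return TOY_EDD_OFF;
    if (strncmp(value, "sk", 2) == 0)
        return TOY_EDD_SKIPMBR;
    return TOY_EDD_ON;
}
```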
    • [PATCH] make gconfig work with gtk-2.4 · 9506e197
      J. A. Magallon authored
      I need this to make gconfig work under gtk-2.4.  Without this, it just
      coredumps.  There is some problem with pixmap creation/usage from XPM in
      the way it is done in gconf, so I just added some stock icons.  It is even
      prettier..;)
      
      Could someone test this still works on gtk-2.0 or 2.2 ?
      
      Changes:
      
      - change the widget class 'button' in glade files to something known to
        glade (GtkToolButton)
      - use 'stock-id' property for toolbar buttons instead of "stock_pixmap"
      - change unknown signal "pressed" to "clicked"
      - remove manual setting of icons in gconf.c
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      9506e197
    • [PATCH] sys_sched_setaffinity() on UP should fail for non-zero CPUs. · 07492792
      Rusty Russell authored
      Return EINVAL for invalid sched_setaffinity on UP.  I was a little
      surprised that sys_sched_setaffinity for CPU 1 didn't fail on my UP box.
      With CONFIG_SMP it would have.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      07492792
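The behaviour requested in the sched_setaffinity message above can be modelled in a few lines. This is a sketch under the assumption that a uniprocessor build has exactly one CPU, number 0; `toy_up_setaffinity` is a hypothetical stand-in, not the kernel function.

```c
#include <errno.h>

/*
 * Uniprocessor model: the only valid CPU is CPU 0, so a mask is acceptable
 * only if bit 0 is set. Anything else is rejected with -EINVAL.
 */
static int toy_up_setaffinity(unsigned long mask)
{
    return (mask & 1UL) ? 0 : -EINVAL;
}
```

A request for CPU 1 alone (mask 0x2) fails, while a mask that includes CPU 0 is accepted.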
    • [PATCH] smb_file_open() retval fix · c805134e
      Tvrtko A. Ursulin authored
      Correctly propagate the return value from smb_open(). 
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      c805134e
    • [PATCH] rcu: simplify quiescent state detection · 17b3fed1
      Manfred Spraul authored
      Based on an initial patch from Oleg Nesterov <oleg@tv-sign.ru>
      
      rcu_data.last_qsctr is not needed.  Actually, not even a counter is needed,
      just a flag that indicates that there was a quiescent state.
      Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      17b3fed1
    • [PATCH] rcu: make two internal structs static · 2f803905
      Manfred Spraul authored
      The patch below makes two needlessly global structs static.
      Signed-off-by: Adrian Bunk <bunk@stusta.de>
      Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      2f803905
    • [PATCH] rcu: eliminate rcu_ctrlblk.lock · a48d69a5
      Oleg Nesterov authored
      rcu_ctrlblk.lock is used to read the ->cur and ->next_pending
      atomically in __rcu_process_callbacks(). It can be replaced
      by a couple of memory barriers.
      
      rcu_start_batch:
      	rcp->next_pending = 0;
      	smp_wmb();
      	rcp->cur++;
      
      __rcu_process_callbacks:
      	rdp->batch = rcp->cur + 1;
      	smp_rmb();
      	if (!rcp->next_pending)
      		rcu_start_batch(rcp, rsp, 1);
      
      This way, if __rcu_process_callbacks() sees incremented ->cur value,
      it must also see that ->next_pending == 0 (or rcu_start_batch() is
      already in progress on another cpu).
      Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      a48d69a5
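The barrier pairing in the rcu_ctrlblk.lock message above can be approximated in user space with C11 atomics. This is a sketch under stated assumptions: a release fence stands in for smp_wmb() and an acquire fence for smp_rmb() (they are at least as strong, not exact equivalents), and the `toy_*` names and reduced structure are hypothetical.

```c
#include <stdatomic.h>

struct toy_rcu_ctrlblk {
    atomic_long cur;          /* current batch number */
    atomic_int next_pending;  /* non-zero if another batch has been requested */
};

/* Writer side: clear next_pending, then publish the new batch number. */
static void toy_rcu_start_batch(struct toy_rcu_ctrlblk *rcp)
{
    atomic_store_explicit(&rcp->next_pending, 0, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);  /* plays the role of smp_wmb() */
    atomic_fetch_add_explicit(&rcp->cur, 1, memory_order_relaxed);
}

/*
 * Reader side: sample cur first, then next_pending. If the incremented cur
 * is observed, the fence pairing guarantees next_pending == 0 (or a later
 * value) is observed too. Returns the batch to wait for.
 */
static long toy_rcu_pick_batch(struct toy_rcu_ctrlblk *rcp, int *pending)
{
    long batch = atomic_load_explicit(&rcp->cur, memory_order_relaxed) + 1;

    atomic_thread_fence(memory_order_acquire);  /* plays the role of smp_rmb() */
    *pending = atomic_load_explicit(&rcp->next_pending, memory_order_relaxed);
    return batch;
}
```

The ordering matters only when the two functions run concurrently on different CPUs; a single-threaded call sequence merely exercises the data flow.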
    • [PATCH] remove ip2 programs · 38f808dd
      Adrian Bunk authored
      drivers/char/ip2/ contained three programs. Apart from the fact that shipping
      programs in this place doesn't sound like a good idea, they didn't even all compile.
      
      The patch below removes them.
      Signed-off-by: Adrian Bunk <bunk@stusta.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      38f808dd
    • [PATCH] Sync in core time granularity with filesystems · 8ce13b01
      Andi Kleen authored
      This patch corrects a problem that was originally added with the nanosecond
      timestamps in stat patch.  The problem is that some file systems don't have
      enough space in their on disk inode to save nanosecond timestamps, so they
      truncate the c/a/mtime to seconds when flushing a dirty inode.  In core the
      inode would have full jiffies granularity.
      
      This can be observed by programs as a timestamp that jumps backwards under
      specific loads when an inode is flushed and then reloaded from disk.
      
      The problem was already known when the original patch went in, but it
      wasn't deemed important enough at that time.  So far there has been only
      one report of it causing problems.  Now Tridge is worried that it will
      break running Excel over samba4 because Excel seems to do very anal
      timestamp checking and samba4 will supply 100ns timestamps over the
      network.
      
      This patch solves it by putting the time resolution into the superblock of
      a fs and always rounding the in core timestamps to that granularity.
      
      This also supersedes some previous ext2/3 hacks to flush the inode less
      often when only the subsecond timestamp changes.
      
      I tried to keep the overhead low, in particular it tries to keep divisions
      out of fast paths as far as possible.
      
      The patch is quite big but 99% of it is just relatively straightforward
      search'n'replace in a lot of fs.  Unconverted filesystems will default to a
      1ns granularity, but may still show the problem if they continue to use
      CURRENT_TIME.  I converted all in tree fs.
      
      One possible future extension of this would be to have two time
      granularities per superblock - one that specifies the visible resolution,
      and the other to specify how often timestamps should be flushed to disk,
      which could be tuned with a mount option per fs (e.g.  often m/atimes don't
      need to be flushed every second).  Would be easy to do as an addon if
      someone is interested.
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      8ce13b01
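Rounding an in-core timestamp to a per-filesystem granularity, as described in the message above, can be sketched in user space as follows. `toy_timespec_trunc` is a hypothetical helper, and the granularity is assumed to lie between 1 ns and one second; the special cases reflect the stated goal of keeping divisions out of the common paths.

```c
#include <time.h>

/*
 * Truncates `t` down to a multiple of `gran_ns` nanoseconds.
 * Assumes 1 <= gran_ns <= 1000000000.
 */
static struct timespec toy_timespec_trunc(struct timespec t, long gran_ns)
{
    if (gran_ns <= 1) {
        /* Full nanosecond resolution: nothing to do. */
    } else if (gran_ns == 1000000000L) {
        /* One-second resolution: drop the sub-second part without dividing. */
        t.tv_nsec = 0;
    } else {
        t.tv_nsec -= t.tv_nsec % gran_ns;
    }
    return t;
}
```

A filesystem that stores only whole seconds would use a granularity of 1000000000, so the in-core value already matches what a later reload from disk returns and the timestamp cannot appear to jump backwards.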
    • [PATCH] sys_stime needs a compat function · 8fa29920
      Martin Schwidefsky authored
      I realized that the best way to get the sys_time/sys_stime problem fixed is
      to make sys_time 64 bit safe by using "time_t *" instead of "int *" and to
      introduce two proper compat functions compat_sys_time and compat_sys_stime.
      
      The prototype change of sys_time is transparent for 32 bit architectures
      because both "int" and "time_t" are 32 bit.  For 64 bit the type change
      would be wrong but luckily no 64 bit architecture uses sys_time/sys_stime
      in 64 bit mode.  The patch makes the following change:
      
      ia64     : Remove sys32_time, use compat_sys_time and
                 add (!!) compat_sys_stime to compat syscall table.
      mips     : Use compat_sys_time/compat_sys_stime in 32 bit syscall table.
                 Add #ifdef magic to compile sys_time/sys_stime and
                 compat_sys_time/compat_sys_stime only if needed.
      parisc   : Remove sys32_time, use compat_sys_time and compat_sys_stime.
      ppc64    : remove sys32_time, ppc64_sys32_stime and ppc64_sys_stime.
                 Use common compat_sys_time, compat_sys_stime and sys_stime.
      s390     : Use compat_sys_stime. Add #ifdef magic to compile
                 sys_time/sys_stime and compat_sys_time/compat_sys_stime only
                 if needed.
      sparc64  : Use compat_sys_time/compat_sys_stime in 32 bit syscall table.
      um       : Remove um_time and um_stime. Use common functions sys_time and
                 sys_stime. This adds a CAP_SYS_TIME check to UMs stime call.
      x86_64   : Remove sys32_time. Use compat_sys_time and compat_sys_stime
                 in 32 bit syscall table.
      
      The original stime bug is fixed for mips, parisc, s390, sparc64 and
      x86_64. Can the arch-maintainers please take a look at this?
      
      From: Martin Schwidefsky <schwidefsky@de.ibm.com>
      
      Convert compat_time_t to time_t in 32 bit emulation for sys_stime and
      consolidate all the different implementations of sys_time, sys_stime and
      their 32-bit emulation parts.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      8fa29920
    • [PATCH] compile with -ffreestanding · d6326c18
      Adrian Bunk authored
      For the kernel, it would be logical to use -ffreestanding.  The kernel is
      not a hosted environment with a standard C library.
      
      The gcc option -ffreestanding is supported by both gcc 2.95 and 3.4, which
      covers the whole range of currently supported compilers.
      
      Regarding changes caused by this patch:
      
      Andi Kleen reported:
        Newer gcc rewrites sprintf(buf,"%s",str) to strcpy(buf,str) transparently.
      
      This is only true with unit-at-a-time (disabled on i386 but enabled on
      x86_64).  The Linux kernel doesn't offer a standard C library, and such
      transparent replacements of kernel functions with builtins are quite
      fragile.
      
      Even with -ffreestanding, it's still possible to explicitly use a gcc
      builtin if desired.
      Signed-off-by: Adrian Bunk <bunk@stusta.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      d6326c18
    • [PATCH] Off by one in drivers/parport/probe.c · 064da5f6
      Alexander Nyberg authored
      This fixes a theoretical bug indicated in:
      http://bugme.osdl.org/show_bug.cgi?id=240
      
      It prevents overflow in case the required buffer is larger than the passed
      buffer.  This I found to be the minimally intrusive change.
      Signed-off-by: Alexander Nyberg <alexn@dsv.su.se>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      064da5f6
    • [PATCH] ext3: support for EA in inode · 78085a46
      Alex Tomas authored
      1) intent of the patch is to get possibility to store EAs in the body of large
         inode. it saves space and improves performance in some cases
      
      2) the patch is quite simple: it works the same way original xattr does, but
         using other storage (inode body). body has priority over separate block.
         original routines (ext3_xattr_get, ext3_xattr_list, ext3_xattr_set) are
         renamed to ext3_xattr_block_*. new routines that handle inode storage are
         added (ext3_xattr_ibody_get, ext3_xattr_ibody_list, ext3_xattr_ibody_set).
         routines ext3_xattr_get, ext3_xattr_list and ext3_xattr_set allow user to
         access both the storages transparently
      
      3) the change makes sense on filesystem with inode size >= 256 bytes only.
         2.4 kernels don't support such filesystems, AFAIK. 2.6 kernels do support
         and ignore EAs stored in a body w/o the patch
      
      4) debugfs and e2fsck need to be patched to deal with EAs in inode
         the patch will be sent later
      
      5) testing results:
      	a) Andrew Samba Master (tridge) has done successful tests
      	b) we've been using ea-in-inode feature in Lustre for many months
      Signed-off-by: Andreas Dilger <adilger@clusterfs.com>
      Signed-off-by: Alex Tomas <alex@clusterfs.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      78085a46
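The lookup order described in the ext3 EA message above (inode body first, separate block second, both hidden behind one entry point) can be sketched with plain tables. All `toy_*` names and the table layout are hypothetical; the real routines operate on on-disk structures.

```c
#include <stddef.h>
#include <string.h>

struct toy_xattr {
    const char *name;
    const char *value;
};

/* Linear search of one storage area; returns NULL when the name is absent. */
static const char *toy_xattr_lookup(const struct toy_xattr *tab, size_t n,
                                    const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(tab[i].name, name) == 0)
            return tab[i].value;
    return NULL;
}

/* Unified getter: the inode body has priority over the separate block. */
static const char *toy_xattr_get(const struct toy_xattr *ibody, size_t n_ibody,
                                 const struct toy_xattr *block, size_t n_block,
                                 const char *name)
{
    const char *value = toy_xattr_lookup(ibody, n_ibody, name);

    if (value != NULL)
        return value;
    return toy_xattr_lookup(block, n_block, name);
}
```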
    • [PATCH] Reduce i_sem usage during file sync operations · fbdce7d7
      Andrew Morton authored
      We hold i_sem during the various sync() operations to prevent livelocks:
      if another thread is dirtying the file, a sync() may never return.
      
      Or at least, that used to be true when we were using the per-address_space
      page lists.  Since writeback has used radix tree traversal it is not possible
      to livelock the sync() operations, because they only visit each page a single
      time.
      
      sync_page_range() (used by O_SYNC writes) has not been holding i_sem for quite
      some time, for the above reasons.
      
      The patch converts fsync(), fdatasync() and msync() to also not hold i_sem
      during the radix-tree-based writeback.
      
      Now, we _do_ still need to hold i_sem across the file->f_op->fsync() call,
      because that is still based on a list_head walk, and is still livelockable.
      
      But in the case of msync() I deliberately left i_sem untaken.  This is because
      we're currently deadlockable in msync, because mmap_sem is already held, and
      mmap_sem nests inside i_sem, due to direct-io.c.
      
      And yes, the ranking of down_read() versus down() does matter:
      
      	Task A			Task B		Task C
      
      	down_read(rwsem)
      				down(sem)
      						down_write(rwsem)
      	down(sem)
      				down_read(rwsem)
      
      
      C's down_write() will cause B's down_read to block.  B holds `sem', so A will
      never release `rwsem'.
      
      So the patch fixes a hard-to-hit triple-task deadlock, but adds a possible
      livelock in msync().  It is possible to fix sys_msync() so that it takes i_sem
      outside i_mmap_sem.  Later.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      fbdce7d7
    • [PATCH] suppress might_sleep() if oopsing · e486b6b7
      Andrew Morton authored
      We can call might_sleep() functions on the oops handling path (under do_exit).
      
      There seems little point in emitting spurious might_sleep() warnings into the
      logs after the kernel has oopsed.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      e486b6b7
    • [PATCH] fork: total_forks not counted under tasklist_lock · fe52f966
      Prasanna Meda authored
      Bring the total_forks under tasklist_lock.  When most of the fork code,
      including nr_threads, was moved to copy_process() from do_fork() in 2.6,
      this was left out.
      
      Although accuracy of total_forks is not important, it would be nice to add
      this.  It does not involve additional cost, and the code will be cleaner if
      it is grouped with nr_threads.  The difference is, total_forks will
      increase on fork, but nr_threads will increase on fork and decrease on the
      exit.
      
      I also moved the extern declaration to sched.h from proc_misc.c.
      Signed-off-by: Prasanna Meda <pmeda@akamai.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      fe52f966
    • [PATCH] time runs too fast after S3 · bb51bc59
      Li Shaohua authored
      After resume from S3, 'date' shows that time runs too fast.
      Signed-off-by: Li Shaohua <shaohua.li@intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      bb51bc59
    • [PATCH] cpumask_t initializers · 90b8f3ac
      Matthew Dobson authored
      In the course of another patch I've been working on, I stumbled across
      some weirdness with some of the SD_*_INIT sched_domains initializers.  A
      day or so of digging narrowed it down to the CPU_MASK_NONE initializer
      nested inside the sched_domain initializers.  The errors I got were:
      
      kernel/sched.c:4812: error: initializer element is not constant
      kernel/sched.c:4812: error: (near initialization for `sched_domain_dummy')
      kernel/sched.c:4812: error: initializer element is not constant
      
      which was this line:
      
      static struct sched_domain sched_domain_dummy = SD_CPU_INIT;
      
      Janis Johnson, a GCC hacker, told me the following:
      90b8f3ac
    • [PATCH] ext3: handle attempted double-delete of metadata. · a3192788
      Stephen C. Tweedie authored
      This patch improves ext3's ability to deal with corruption on-disk.  If we
      try to delete a metadata block twice, we confuse ext3's internal revoke
      error-checking, resulting in a BUG().  But this can occur in practice due
      to a corrupt indirect block, so we should attempt to fail gracefully.
      
      Downgrade the assert failure to a J_EXPECT_BH failure, and return EIO when
      it occurs.
      
      This is easily reproduced with a sample ext3 fs image containing an inode
      which references the same indirect block more than once.  Deleting that
      inode will BUG() an unfixed kernel with:
      
      Assertion failure in journal_revoke() at fs/jbd/revoke.c:379:
      "!buffer_revoked(bh)"
      
      With the fix, ext3 recovers gracefully.
      Signed-off-by: Stephen Tweedie <sct@redhat.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      a3192788
    • [PATCH] ext3: handle attempted delete of bitmap blocks. · c579b4e2
      Stephen C. Tweedie authored
      This patch improves ext3's ability to deal with corruption on-disk.  If we
      ever get a corrupt inode or indirect block, then an attempt to delete it
      can end up trying to remove any block on the fs, including bitmap blocks.
      This can cause ext3 to assert-fail as we end up trying to do an ext3_forget
      on a buffer with b_committed_data set.
      
      The fix is to downgrade this to an IO error and journal abort, so that we
      take the filesystem readonly but don't bring down the whole kernel.
      
      Make J_EXPECT_JH() return a value so it can be easily tested and yet still
      retained as an assert failure if we build ext3 with full internal debugging
      enabled.  Make journal_forget() return an error code so that in this case
      the error can be passed up to the caller.
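      
      A minimal userspace sketch of that pattern, not the actual jbd code:
      EXPECT(), struct buf and forget_buffer() are hypothetical stand-ins, and
      FULL_DEBUG is an assumed build switch standing in for full internal
      debugging.  The check stays a hard assert in debug builds; otherwise it
      logs, evaluates to 0, and lets the caller return -EIO.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for an "expect" macro that returns a value.
 * With FULL_DEBUG defined a failed expectation is a hard assert;
 * otherwise it is logged and the macro evaluates to 0 so that the
 * caller can test it and fail gracefully.
 */
#ifdef FULL_DEBUG
#define EXPECT(cond, why)	(assert(cond), 1)
#else
#define EXPECT(cond, why) \
	((cond) ? 1 : (fprintf(stderr, "unexpected failure: %s\n", (why)), 0))
#endif

/* Toy model of a journalled buffer (names are illustrative only). */
struct buf {
	int has_committed_data;	/* models b_committed_data being set */
	int forgotten;		/* set once the buffer has been forgotten */
};

/*
 * Returns 0 on success, or -EIO if the buffer is in a state that only
 * on-disk corruption should be able to produce.
 */
static int forget_buffer(struct buf *b)
{
	if (!EXPECT(!b->has_committed_data, "inconsistent data on disk"))
		return -EIO;	/* passed up to the caller, no crash */

	b->forgotten = 1;
	return 0;
}
```

      A caller would check the return value of forget_buffer() and abort the
      operation on -EIO rather than continue with a corrupt block.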
      
      This is easily reproduced with a sample ext3 fs image containing an inode
      whose direct and indirect blocks refer to a block bitmap block.  Allocating
      new blocks and then deleting that inode will BUG() with:
      
      Assertion failure in journal_forget() at fs/jbd/transaction.c:1228:
      "!jh->b_committed_data"
      
      With the fix, ext3 recovers gracefully.
      Signed-off-by: Stephen Tweedie <sct@redhat.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      c579b4e2
    • Stephen C. Tweedie's avatar
      [PATCH] ext3: cleanup handling of aborted transactions. · 046527de
      Stephen C. Tweedie authored
      This patch improves ext3's error logging when we encounter an on-disk
      corruption.  Previously, a transaction (such as a truncate) which encountered
      many corruptions (e.g. a single highly corrupt indirect block) would emit
      copious "aborting transaction" errors to the log.
      
      Even worse, encountering an aborted journal can count as such an error,
      leading to a flood of spurious "aborting transaction: Journal has aborted"
      errors.
      
      With the fix, only emit that message on the first error.  The patch also
      restores a missing \n in that printk path.
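      
      The "first error only" behaviour can be sketched as below.  This is an
      illustration under assumptions, not the ext3 code: struct handle,
      its aborted flag and abort_handle() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Toy transaction handle (illustrative only). */
struct handle {
	int aborted;	/* set once the first error has been reported */
};

/*
 * Mark the handle aborted and log the reason, but only the first time.
 * Returns 1 if a message was emitted, 0 if it was suppressed.
 */
static int abort_handle(struct handle *h, const char *caller, const char *err)
{
	assert(h != NULL);

	if (h->aborted)
		return 0;	/* already reported: stay quiet */

	h->aborted = 1;
	/* Note the trailing newline on the message. */
	fprintf(stderr, "%s: aborting transaction: %s\n", caller, err);
	return 1;
}
```

      Repeated failures on the same handle, including the follow-on "journal
      has aborted" condition, then produce a single log line.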
      Signed-off-by: Stephen Tweedie <sct@redhat.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      046527de
    • Adrian Bunk's avatar
      [PATCH] kill blk.h · 80ddd72d
      Adrian Bunk authored
      All blk.h users were converted in 2.5, and at the same time blk.h began 
      giving a warning.
      
      The patch below removes this obsolete file.
      Signed-off-by: Adrian Bunk <bunk@stusta.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      80ddd72d
    • Corey Minyard's avatar
      [PATCH] Cleanups for the IPMI driver · 4bf76b4a
      Corey Minyard authored
      This patch removes some unneeded cruft that Adrian found, and also turns
      off the shutdown of the timer when removing the module.  Since the timer is
      shut down when the driver is closed (unless no way out is specified), this is
      unnecessary and defeats the no way out option.
      
      - remove some completely unused code
      - make some needlessly global code static
      - remove some EXPORT_SYMBOL'ed code with zero users
      - remove the timer shutdown on module removal
      Signed-off-by: Adrian Bunk <bunk@stusta.de>
      Signed-off-by: Corey Minyard <minyard@acm.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      4bf76b4a
    • Robin Holt's avatar
      [PATCH] Hold BKL for shorter period in generic_shutdown_super(). · eb8c6834
      Robin Holt authored
      Testing revealed long pauses of the entire system while autofs initiated
      umounts as a result of timing out the mounts.
      
      It was noticed that during a umount, the BKL is held while scanning the
      inode_list and removing any inodes that are candidates.  This patch delays
      taking the lock until after the first pass has gone through the inode_list.
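      
      A rough userspace model of the idea, assuming a long first pass that
      needs no big lock and a short second pass that does.  All names here
      (inode_model, big_lock, evict_pass, shutdown_model) are hypothetical and
      the lock is only a flag, so this shows the ordering rather than real
      locking.

```c
#include <assert.h>
#include <stddef.h>

/* Toy inode: busy inodes cannot be freed (illustrative only). */
struct inode_model {
	int busy;
	int freed;
};

static int big_lock_held;		/* stand-in for the BKL */
static size_t visited_under_lock;	/* inodes examined while it was held */

static void big_lock(void)
{
	assert(!big_lock_held);
	big_lock_held = 1;
}

static void big_unlock(void)
{
	assert(big_lock_held);
	big_lock_held = 0;
}

/* Free every inode that is not busy; returns how many are left behind. */
static size_t evict_pass(struct inode_model *v, size_t n)
{
	size_t i, left = 0;

	for (i = 0; i < n; i++) {
		if (v[i].freed)
			continue;	/* already gone from the list */
		if (big_lock_held)
			visited_under_lock++;
		if (v[i].busy)
			left++;
		else
			v[i].freed = 1;
	}
	return left;
}

/* The long scan runs unlocked; only the short remainder holds the lock. */
static size_t shutdown_model(struct inode_model *v, size_t n)
{
	size_t left;

	evict_pass(v, n);		/* first pass: no big lock */

	big_lock();
	left = evict_pass(v, n);	/* second pass: leftovers only */
	big_unlock();

	return left;
}
```

      With millions of inodes and few leftovers, almost all of the scanning
      in this model happens before the lock is taken.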
      
      Testing revealed that on an ia64 machine with a filesystem that had 8.4
      million inodes, there were no observable pauses during the umount.  This
      was down from over 4 seconds without this patch.
      Signed-off-by: Robin Holt <holt@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      eb8c6834
    • Christoph Hellwig's avatar
      [PATCH] remove unused irq_cpustat fields · fe395411
      Christoph Hellwig authored
      The only common field in irq_cpustat is __softirq_pending; i386 and ppc
      have some of their own.
      
      Remove all unused obsolete fields from various architectures.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      fe395411