1. 24 Jun, 2004 12 commits
    • [PATCH] cpumask: rewrite cpumask.h - single bitmap based implementation · f3344dc3
      Andrew Morton authored
      From: Paul Jackson <pj@sgi.com>
      
      Major rewrite of cpumask to use a single implementation, as a struct-wrapped
      bitmap.
      
      This patch leaves some 26 include/asm-*/cpumask*.h header files orphaned - to
      be removed next patch.
      
      Some nine cpumask macros for const variants and to coerce and promote between
      an unsigned long and a cpumask are obsolete.  Simple emulation wrappers are
      provided in this patch for these obsolete macros, which can be removed once
      each of the 3 archs (i386, ppc64, x86_64) using them is recoded in follow-on
      patches to not need them.
      
      The CPU_MASK_ALL macro now avoids leaving possible garbage one bits in any
      unused portion of the high word.
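
As an illustration of the struct-wrapped bitmap and of the CPU_MASK_ALL tail handling described above, here is a minimal sketch in C.  All demo_ names and the CPU count of 70 are hypothetical and chosen only for illustration; this is not the code from include/linux/cpumask.h.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical names and sizes; a sketch of the idea, not the kernel header. */
#define DEMO_NR_CPUS 70
#define DEMO_BITS_PER_LONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))
#define DEMO_LONGS ((DEMO_NR_CPUS + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

/* A bitmap wrapped in a struct, so it can be assigned and passed by value. */
typedef struct { unsigned long bits[DEMO_LONGS]; } demo_cpumask_t;

/* Mask that keeps only the valid bits of the last (high) word. */
static unsigned long demo_last_word_mask(unsigned int nbits)
{
    unsigned int rem;

    assert(nbits > 0);
    rem = nbits % DEMO_BITS_PER_LONG;
    return rem ? (1UL << rem) - 1 : ~0UL;
}

/* "All CPUs" mask that leaves no stray one bits above DEMO_NR_CPUS. */
static demo_cpumask_t demo_mask_all(void)
{
    demo_cpumask_t m;
    unsigned int i;

    for (i = 0; i < DEMO_LONGS; i++)
        m.bits[i] = ~0UL;
    m.bits[DEMO_LONGS - 1] &= demo_last_word_mask(DEMO_NR_CPUS);
    return m;
}

/* Count the set bits across all words of the mask. */
static unsigned int demo_weight(const demo_cpumask_t *m)
{
    unsigned int i, n = 0;

    for (i = 0; i < DEMO_LONGS; i++) {
        unsigned long w = m->bits[i];
        while (w) {
            w &= w - 1; /* clear the lowest set bit */
            n++;
        }
    }
    return n;
}
```

Because the high word is masked, counting the bits of the "all" mask yields exactly DEMO_NR_CPUS regardless of the word size.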
      
      An improved comment lists all available operators, for convenient browsing.
      
      From: Mikael Pettersson <mikpe@csd.uu.se>
      
        2.6.7-rc3-mm1 changed CPU_MASK_NONE into something that isn't a valid
        rvalue (it only works inside struct initializers).  This caused compile-time
        errors in perfctr in UP x86 builds.
      
      From: Arnd Bergmann <arnd@arndb.de>
      
        cpumask-5-10-rewrite-cpumaskh-single-bitmap-based from 2.6.7-rc3-mm1
        causes include2/asm/smp.h:54:1: warning: "cpu_online" redefined
      Signed-off-by: Paul Jackson <pj@sgi.com>
      Signed-off-by: Mikael Pettersson <mikpe@csd.uu.se>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] cpumask: bitmap inlining and optimizations · d6cf71d3
      Andrew Morton authored
      From: Paul Jackson <pj@sgi.com>
      
      These bitmap improvements make it a suitable basis for fully supporting
      cpumask_t and nodemask_t.  Inline macros with compile-time checks enable
      generating tight code on both small and large systems (large meaning cpumask_t
      requires more than one unsigned long's worth of bits).
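
The inline-wrapper pattern described above can be sketched roughly as follows.  This is a simplified illustration with hypothetical demo_ names, not the contents of include/linux/bitmap.h: when the bit count is a compile-time constant that fits in one word, the compiler can reduce the wrapper to a single AND; otherwise it calls the out-of-line loop.

```c
#include <assert.h>
#include <limits.h>

#define DEMO_BITS_PER_LONG ((int)(sizeof(unsigned long) * CHAR_BIT))
#define DEMO_BITS_TO_LONGS(n) (((n) + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

/* Out-of-line general case (hypothetical name): loops over every word. */
static void demo_slow_bitmap_and(unsigned long *dst, const unsigned long *a,
                                 const unsigned long *b, int nbits)
{
    int i;

    for (i = 0; i < DEMO_BITS_TO_LONGS(nbits); i++)
        dst[i] = a[i] & b[i];
}

/*
 * Inline wrapper: the size check is resolved at compile time when nbits
 * is a constant, so small bitmaps get a single-word operation and no call.
 */
static inline void demo_bitmap_and(unsigned long *dst, const unsigned long *a,
                                   const unsigned long *b, int nbits)
{
    assert(nbits > 0);
    if (nbits <= DEMO_BITS_PER_LONG)
        *dst = *a & *b;
    else
        demo_slow_bitmap_and(dst, a, b, nbits);
}
```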
      
      The existing bitmap_<op> macros in lib/bitmap.c are renamed to __bitmap_<op>,
      and wrappers for each bitmap_<op> are exposed in include/linux/bitmap.h.
      
      This patch _includes_ Bill Irwin's rewrite of the bitmap_shift operators to not
      require a fixed length intermediate bitmap.
      
      Improved comments list each available operator for easy browsing.
      Signed-off-by: Paul Jackson <pj@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] cpumask: bitmap cleanup preparation for cpumask overhaul · ea0c1929
      Andrew Morton authored
      From: Paul Jackson <pj@sgi.com>
      
      Document the bitmap bit model and handling of unused bits.
      
      Tighten up bitmap so it does not generate nonzero bits in the unused tail if
      it is not given any on input.
      
      Add intersects, subset, xor and andnot operators.  Change bitmap_complement to
      take two operands.
      
      Add a couple of missing 'const' qualifiers on bitops test_bit and bitmap_equal
      args.
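
To make the new operators concrete, here is a rough sketch of intersects, subset, and a two-operand complement that keeps the unused tail bits zero.  The demo_ names are hypothetical and the code is an illustration of the semantics described above, not the lib/bitmap.c implementation.

```c
#include <assert.h>
#include <limits.h>

#define DEMO_BPL ((int)(sizeof(unsigned long) * CHAR_BIT))
#define DEMO_LONGS(n) (((n) + DEMO_BPL - 1) / DEMO_BPL)

/* Mask selecting the valid bits of the final word of an nbits bitmap. */
static unsigned long demo_tail_mask(int nbits)
{
    int rem = nbits % DEMO_BPL;

    return rem ? (1UL << rem) - 1 : ~0UL;
}

/* dst = ~src over nbits; bits beyond nbits in the last word are cleared. */
static void demo_complement(unsigned long *dst, const unsigned long *src, int nbits)
{
    int i, n = DEMO_LONGS(nbits);

    assert(nbits > 0);
    for (i = 0; i < n; i++)
        dst[i] = ~src[i];
    dst[n - 1] &= demo_tail_mask(nbits);
}

/* Nonzero if a and b have at least one valid bit in common. */
static int demo_intersects(const unsigned long *a, const unsigned long *b, int nbits)
{
    int i, n = DEMO_LONGS(nbits);

    assert(nbits > 0);
    for (i = 0; i < n - 1; i++)
        if (a[i] & b[i])
            return 1;
    return (a[n - 1] & b[n - 1] & demo_tail_mask(nbits)) != 0;
}

/* Nonzero if every valid bit set in a is also set in b. */
static int demo_subset(const unsigned long *a, const unsigned long *b, int nbits)
{
    int i, n = DEMO_LONGS(nbits);

    assert(nbits > 0);
    for (i = 0; i < n - 1; i++)
        if (a[i] & ~b[i])
            return 0;
    return (a[n - 1] & ~b[n - 1] & demo_tail_mask(nbits)) == 0;
}
```
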
      Signed-off-by: Paul Jackson <pj@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] cpumask: make cpu_present_map real even on non-smp · d2cec97b
      Andrew Morton authored
      From: Paul Jackson <pj@sgi.com>
      
      This patch makes cpu_present_map a real map for all configurations, instead of
      a constant for non-SMP.  It also moves the definition of cpu_present_map out
      of kernel/cpu.c into kernel/sched.c, because cpu.c isn't compiled into non-SMP
      kernels.
      
      The pattern is that each of the possible, present and online cpu maps are
      actual kernel global cpumask_t variables, for all configurations.  They are
      documented in include/linux/cpumask.h.  Some of the UP (NR_CPUS=1) code
      cheats, and hardcodes the assumption that the single bit position of these
      maps is always set, as an optimization.
      Signed-off-by: Paul Jackson <pj@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] rcu: avoid passing an argument to the callback function · 8c1ce9d6
      Andrew Morton authored
      From: Dipankar Sarma <dipankar@in.ibm.com>
      
      This patch changes the call_rcu() API and avoids passing an argument to the
      callback function as suggested by Rusty.  Instead, it is assumed that the
      user has embedded the rcu head into a structure that is useful in the
      callback and the rcu_head pointer is passed to the callback.  The callback
      can use container_of() to get the pointer to its structure and work with
      it.  Together with the rcu-singly-link patch, it reduces the rcu_head size
      by 50%.  Considering that we use these in things like struct dentry and
      struct dst_entry, this is good savings in space.
      
      An example :
      
      struct my_struct {
      	struct rcu_head rcu;
      	int x;
      	int y;
      };
      
      void my_rcu_callback(struct rcu_head *head)
      {
      	struct my_struct *p = container_of(head, struct my_struct, rcu);
      	kfree(p);
      }
      
      void my_delete(struct my_struct *p)
      {
      	...
      	call_rcu(&p->rcu, my_rcu_callback);
      	...
      }
      Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] reduce rcu_head size - core · b659a6fb
      Andrew Morton authored
      From: Dipankar Sarma <dipankar@in.ibm.com>
      
      This reduces the RCU head size by using a singly linked list to maintain them.
      The ordering of the callbacks is still maintained as before by using a tail
      pointer for the next list.
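
The tail-pointer technique can be sketched as follows.  The demo_ types are hypothetical stand-ins rather than the kernel's rcu_head or per-cpu RCU data: each node carries a single next pointer, and the list keeps a pointer to the last next field so that appends are O(1) and callbacks stay in submission order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node: one link pointer instead of a doubly linked pair. */
struct demo_head {
    struct demo_head *next;
};

/* List with a tail pointer that addresses the last 'next' field,
 * or 'head' itself while the list is empty. */
struct demo_list {
    struct demo_head *head;
    struct demo_head **tail;
};

static void demo_list_init(struct demo_list *l)
{
    l->head = NULL;
    l->tail = &l->head;
}

/* Append in O(1) while preserving first-in, first-out order. */
static void demo_list_add_tail(struct demo_list *l, struct demo_head *h)
{
    assert(h != NULL);
    h->next = NULL;
    *l->tail = h;        /* link after the current last element */
    l->tail = &h->next;  /* the new element's 'next' is now the tail */
}
```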
      
      Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] rcu lock update: Code move & cleanup · 72914d30
      Andrew Morton authored
      From: Manfred Spraul <manfred@colorfullife.com>
      
      Step three for reducing cacheline thrashing within rcupdate.c:
      
      Cleanup and code move from <linux/rcupdate.h> to kernel/rcupdate.c: Remove
      internal details from the header file.
      Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] rcu lock update: Use a sequence lock for starting batches · 720e8a63
      Andrew Morton authored
      From: Manfred Spraul <manfred@colorfullife.com>
      
      Step two for reducing cacheline thrashing within rcupdate.c:
      
      rcu_process_callbacks always acquires rcu_ctrlblk.state.mutex and calls
      rcu_start_batch, even if the batch is already running or already scheduled to
      run.
      
      This can be avoided with a sequence lock: A sequence lock allows reading the
      current batch number and next_pending atomically.  If next_pending is already
      set, then there is no need to acquire the global mutex.
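
A minimal user-space sketch of the sequence-lock read pattern follows, written with C11 atomics.  The struct and function names are hypothetical, a single writer (serialized by some external lock) is assumed, and this is not the kernel's seqlock_t API.  The reader retries until it sees the same even sequence number before and after reading both fields, so it obtains a consistent pair without taking any lock.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the batch state guarded by the sequence lock. */
struct demo_batch {
    atomic_uint seq;         /* odd while an update is in progress */
    atomic_long cur;         /* current batch number */
    atomic_int next_pending; /* another batch should start afterwards */
};

static void demo_batch_init(struct demo_batch *b)
{
    atomic_init(&b->seq, 0u);
    atomic_init(&b->cur, 0L);
    atomic_init(&b->next_pending, 0);
}

/* Writer side; the caller is assumed to exclude other writers. */
static void demo_batch_write(struct demo_batch *b, long cur, int next_pending)
{
    unsigned int s = atomic_fetch_add(&b->seq, 1u); /* sequence becomes odd */

    assert((s & 1u) == 0); /* no other write may be in progress */
    atomic_store(&b->cur, cur);
    atomic_store(&b->next_pending, next_pending);
    atomic_fetch_add(&b->seq, 1u);                  /* sequence becomes even */
}

/* Reader side: lock-free, retries if a write overlapped the read. */
static void demo_batch_read(struct demo_batch *b, long *cur, int *next_pending)
{
    unsigned int start;

    do {
        start = atomic_load(&b->seq);
        *cur = atomic_load(&b->cur);
        *next_pending = atomic_load(&b->next_pending);
    } while ((start & 1u) || atomic_load(&b->seq) != start);
}
```

In terms of the text above, a caller that reads next_pending as already set could skip acquiring the global mutex.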
      
      This means that for each grace period, there will be
      
      - one write access to the rcu_ctrlblk.batch cacheline
      
      - lots of read accesses to rcu_ctrlblk.batch (3-10*cpus_online()).  Behavior
        similar to the jiffies cacheline, shouldn't be a problem.
      
      - cpus_online()+1 write accesses to rcu_ctrlblk.state, all of them starting
        with spin_lock(&rcu_ctrlblk.state.mutex).
      
        For large enough cpus_online() this will be a problem, but all except two
        of the spin_lock calls only protect the rcu_cpu_mask bitmap, thus a
        hierarchical bitmap would allow splitting the write accesses across
        multiple cachelines.
      
      Tested on an 8-way with reaim.  Unfortunately it probably won't help with Jack
      Steiner's 'ls' test since in this test only one cpu generates rcu entries.
      Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] rcu lock update: Add per-cpu batch counter · 5c60169a
      Andrew Morton authored
      From: Manfred Spraul <manfred@colorfullife.com>
      
      Below is one of the patches from my rcu lock update.  Jack Steiner tested
      the first one on a 512p and it resolved the rcu cache line thrashing.  All were
      tested on osdl with STP.
      
      Step one for reducing cacheline thrashing within rcupdate.c:
      
      The current code uses the rcu_cpu_mask bitmap both for keeping track of the
      cpus that haven't gone through a quiescent state and for checking if a cpu
      should look for quiescent states.  The bitmap is frequently changed and the
      check is done by polling - together this causes cache line thrashing.
      
      If it's cheaper to access a (mostly) read-only cacheline than a cacheline that
      is frequently dirtied, then it's possible to reduce the thrashing by splitting
      the rcu_cpu_mask bitmap into two cachelines:
      
      The patch adds a generation counter and moves it into a separate cacheline.
      This allows removing all accesses to rcu_cpu_mask (in the read-write
      cacheline) from rcu_pending and at least 50% of the accesses from
      rcu_check_quiescent_state.  rcu_pending and all but one call per cpu to
      rcu_check_quiescent_state access the read-only cacheline.  Probably not enough
      for 512p, but it's a start, just for 128 bytes more memory use, without slowing
      down rcu grace periods.  Obviously the read-only cacheline is not really
      read-only: it's written once per grace period to indicate that a new grace
      period is running.
      
      Tests on an 8-way Pentium III with reaim showed some improvement:
      
      oprofile hits:
      Reference: http://khack.osdl.org/stp/293075/
      Hits	   %
      23741     0.0994  rcu_pending
      19057     0.0798  rcu_check_quiescent_state
      6530      0.0273  rcu_check_callbacks
      
      Patched: http://khack.osdl.org/stp/293076/
      8291      0.0579  rcu_pending
      5475      0.0382  rcu_check_quiescent_state
      3604      0.0252  rcu_check_callbacks
      
      The total runtime differs between both runs, thus the % number must
      be compared: Around 50% faster. I've uninlined rcu_pending for the
      test.
      
      Tested with reaim and kernbench.
      
      Description:
      
      - per-cpu quiescbatch and qs_pending fields introduced: quiescbatch contains
        the number of the last quiescent period that the cpu has seen and qs_pending
        is set if the cpu has not yet reported the quiescent state for the current
        period.  With these two fields a cpu can test if it should report a
        quiescent state without having to look at the frequently written
        rcu_cpu_mask bitmap.
      
      - curbatch split into two fields: rcu_ctrlblk.batch.completed and
        rcu_ctrlblk.batch.cur.  This makes it possible to figure out if a grace
        period is running (completed != cur) without accessing the rcu_cpu_mask
        bitmap.
      
      - rcu_ctrlblk.maxbatch removed and replaced with a true/false next_pending
        flag: next_pending=1 means that another grace period should be started
        immediately after the end of the current period.  Previously, this was
        achieved by maxbatch: curbatch==maxbatch means don't start, curbatch!=
        maxbatch means start.  A flag improves the readability: The only possible
        values for maxbatch were curbatch and curbatch+1.
      
      - rcu_ctrlblk split into two cachelines for better performance.
      
      - common code from rcu_offline_cpu and rcu_check_quiescent_state merged into
        cpu_quiet.
      
      - rcu_offline_cpu: replace spin_lock_irq with spin_lock_bh, there are no
        accesses from irq context (and there are accesses to the spinlock with
        enabled interrupts from tasklet context).
      
      - rcu_restart_cpu introduced, s390 should call it after changing nohz:
        Theoretically the global batch counter could wrap around and end up at
        RCU_quiescbatch(cpu).  Then the cpu would not look for a quiescent state and
        rcu would lock up.
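
The first two items of the description can be sketched roughly as below.  The demo_ structs and functions are hypothetical simplifications, not the code in kernel/rcupdate.c; only the field names quiescbatch, qs_pending, cur and completed come from the text.  The point is that a CPU can decide whether it owes a quiescent-state report by comparing its own counter with the global batch number, without touching the frequently written bitmap.

```c
#include <assert.h>

/* Hypothetical, mostly-read global state (one cacheline in the real design). */
struct demo_ctrl {
    long cur;       /* number of the batch currently in progress */
    long completed; /* number of the last completed batch */
};

/* Hypothetical per-cpu state. */
struct demo_cpu {
    long quiescbatch; /* last batch number this cpu has seen */
    int qs_pending;   /* cpu still has to report a quiescent state */
};

/* A grace period is running whenever the two counters differ. */
static int demo_grace_period_running(const struct demo_ctrl *c)
{
    return c->completed != c->cur;
}

/* Notice a new batch; returns nonzero if a quiescent state is still owed. */
static int demo_check_batch(struct demo_cpu *cpu, const struct demo_ctrl *c)
{
    if (cpu->quiescbatch != c->cur) {
        cpu->quiescbatch = c->cur;
        cpu->qs_pending = 1;
    }
    return cpu->qs_pending;
}

/* Record that this cpu passed through a quiescent state for the batch. */
static void demo_report_quiescent(struct demo_cpu *cpu)
{
    assert(cpu->qs_pending);
    cpu->qs_pending = 0;
}
```
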
      Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Move saved_command_line to init/main.c · b884e838
      Andrew Morton authored
      From: Rusty Russell <rusty@rustcorp.com.au>
      
      Currently every arch declares its own char saved_command_line[].  Make sure
      every arch defines COMMAND_LINE_SIZE in asm/setup.h, and declare
      saved_command_line in linux/init.h (init/main.c contains the definition).
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] jbd needs to wait for locked buffers · 4d4f4cc4
      Andrew Morton authored
      From: Chris Mason <mason@suse.com>
      
      jbd needs to wait for any io to complete on the buffer before changing the
      end_io function.  Using set_buffer_locked means that it can change the
      end_io function while the page is in the middle of writeback, and the
      writeback bit on the page will never get cleared.
      
      Since we set the buffer dirty earlier on, if the page was previously dirty,
      pdflush or memory pressure might trigger a writepage call, which will race
      with jbd's set_buffer_locked.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Allow i386 to reenable interrupts on lock contention · 36f9f209
      Andrew Morton authored
      From: Zwane Mwaikambo <zwane@linuxpower.ca>
      
      Following up on Keith's code, I adapted the i386 code to allow enabling
      interrupts during contested locks, depending on the previous interrupt
      enable status.  There is a text size increase (only when CONFIG_SPINLINE
      is not set), although it doesn't seem so bad.  There will also be an
      increased exit latency when we attempt a lock acquisition after spinning,
      due to the extra instructions.  How much this will affect performance I'm
      not sure yet, as I haven't had time to micro-benchmark.
      
         text    data     bss     dec     hex filename
      2628024  921731       0 3549755  362a3b vmlinux-after
      2621369  921731       0 3543100  36103c vmlinux-before
      2618313  919222       0 3537535  35fa7f vmlinux-spinline
      
      The code has been stress tested on a 16x NUMAQ (courtesy OSDL).
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  2. 23 Jun, 2004 4 commits
  3. 22 Jun, 2004 9 commits
    • [PATCH] ppc32: Support for new Apple laptop models · f4897eb3
      Jesse Barnes authored
      This adds sound support for some of the newer PowerBooks.  It appears
      that this chip supports the AWACS sample rates, but has a snapper-style
      mixer.  Tested and works on my PowerBook5,4. 
      Signed-off-by: Jesse Barnes <jbarnes@sgi.com>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Handle altivec assist exception properly · 7a08473b
      Paul Mackerras authored
      This is the PPC64 counterpart of the PPC32 Altivec assist exception
      handler that went in recently.
      
      On PPC64 machines with Altivec (i.e.  machines that use the PPC970 chip,
      such as the G5 powermac), the altivec floating-point instructions can
      operate in two modes: one where denormalized inputs or outputs are
      truncated to zero, and one where they aren't.  In the latter mode the
      processor can take an exception when it encounters denormalized
      floating-point inputs or outputs rather than dealing with them in
      hardware.
      
      This patch adds code to deal properly with the exception, by emulating
      the instruction that caused the exception.  Previously the kernel just
      switched the altivec unit into the truncate-to-zero mode, which works
      but is a bit gross.  Fortunately there are only a limited set of altivec
      instructions which can generate the assist exception, so we don't have
      to emulate the whole altivec instruction set.
      
      Note that Altivec is Motorola's name for the PowerPC vector/SIMD
      instructions; IBM calls the same thing VMX, and currently only IBM makes
      64-bit PowerPC CPU chips.  Nevertheless, I have used the term Altivec in
      the PPC64 code for consistency with the PPC32 code.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] radeonfb: Fix panel detection on some laptops · 6340e7ba
      Benjamin Herrenschmidt authored
      The code in radeonfb looking for the BIOS image currently uses the BIOS
      ROM if any, and falls back to the RAM image if not found.  This is
      unfortunately not correct for a bunch of laptops where the real panel
      data are only present in the RAM image.
      
      This works around this problem by preferring the RAM image on mobility
      chipsets.  This is definitely not the best workaround, we need some arch
      support for linking the RAM image to the PCI ID (preferably by having
      the arch snapshot it during boot, isolating us completely from the
      details of where this image is in memory).  I'll see how we can get such
      an improvement later.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] ppc32: Support for new Apple laptop models · ca216b8a
      Benjamin Herrenschmidt authored
      This adds support for newer Apple laptop models.  It adds the basic
      identification for the new motherboards and the cpufreq support for
      models using the new 7447A CPU from Motorola.
      
      This is mostly the work of John Steele Scott <toojays@toojays.net> with
      some bits from Sebastian Henschel <linux@kodeaffe.de> and some rework by
      myself.  Please apply,
      Signed-off-by: John Steele Scott <toojays@toojays.net>
      Signed-off-by: Sebastian Henschel <linux@kodeaffe.de>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] ppc32: oprofile support · e5603f99
      Benjamin Herrenschmidt authored
      This adds basic oprofile support to ppc32.  Originally from Anton
      Blanchard, I just re-diffed it against current kernels.
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] ppc32: Cleanups & warning fixes of traps.c · b62102f6
      Benjamin Herrenschmidt authored
      This cleans up arch/ppc/kernel/traps.c and vecemu.c to use the same
      formatting style for all functions, and fixes 2 warnings in the altivec
      floating point emulation code.  No functional change. 
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • Merge bk://gkernel.bkbits.net/libata-2.6 · bd67d886
      Linus Torvalds authored
      into ppc970.osdl.org:/home/torvalds/v2.6/linux
    • [libata sata_sil] Re-fix mod15write bug · 48c1a573
      Jeff Garzik authored
      Certain early SATA drives have problems with write requests whose
      length satisfies the equation "sectors % 15 == 1", on the SiI 3112.
      Other drives, and other SiI controllers, are not affected.
      
      The fix for this problem is to avoid such requests, in one of three
      ways, for the affected drive+controller combos:
      1) Limit all writes to 15 sectors
      2) Use block layer features to avoid creating requests whose
         length satisfies the above equation.
      3) When a request satisfies the above equation, split the request
         into two writes, neither of which satisfies the equation.
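
For illustration, the condition and the first option from the list above can be expressed as a small sketch.  The demo_ names are hypothetical and this is not the sata_sil driver code; it only encodes the equation as stated and a 15-sector cap on the maximum request size.  A one-sector write also satisfies the literal equation; the text does not say how that case is treated, so the sketch does not special-case it.

```c
#include <assert.h>

/* Cap used by option 1 (value taken from the text above). */
#define DEMO_MOD15_MAX_SECTORS 15u

/* Nonzero if a write of this many sectors matches "sectors % 15 == 1". */
static int demo_hits_mod15(unsigned int sectors)
{
    return sectors % 15u == 1u;
}

/* Option 1: never let the maximum request size exceed 15 sectors. */
static unsigned int demo_clamp_max_sectors(unsigned int requested)
{
    assert(requested > 0);
    return requested > DEMO_MOD15_MAX_SECTORS ? DEMO_MOD15_MAX_SECTORS
                                              : requested;
}
```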
      
      I chose fix #1, the most simple to implement.  After discussion with
      Silicon Image and others regarding the impact of this fix, I have
      decided to remain with fix #1, and will not be implementing a
      "better fix".  This means that the affected SATA drives will see
      decreased performance, but the set of affected drives is small and will
      never grow larger.
      
      Further, the complexity of implementing solution #2 or
      solution #3 is rather large.
      
      When implementing lba48 'large request' support, I unintentionally
      broke the fix for these affected drives.  Kudos to Ricky Beam for
      noticing this.
      
      This change restores the fix, by adding a flag ATA_DFLAG_LOCK_SECTORS
      to indicate that the max_sectors value set by the low-level driver
      should never be changed.
    • Merge bk://bk.arm.linux.org.uk/linux-2.6-rmk · 30c0d5b0
      Linus Torvalds authored
      into ppc970.osdl.org:/home/torvalds/v2.6/linux
  4. 23 Jun, 2004 5 commits
  5. 22 Jun, 2004 3 commits
  6. 23 Jun, 2004 2 commits
  7. 22 Jun, 2004 5 commits