1. 21 Sep, 2003 22 commits
    • [PATCH] Move slab objects to the end of the real allocation · e0c22e53
      Andrew Morton authored
      From: Manfred Spraul <manfred@colorfullife.com>
      
      The real memory allocation is usually larger than the actual object size:
      either due to L1 cache line padding, or due to page padding with
      CONFIG_DEBUG_PAGEALLOC.  Right now objects are placed to the beginning of
      the real allocation, but to trigger bugs it's better to move objects to the
      end of the real allocation: that way accesses behind the end of the
      allocation have a larger chance of hitting the (unmapped) next page.  The
      attached patch moves the objects to align them with the end of the real
      allocation.
      
      Actually it contains 4 separate changes:
      
      - Do not page-pad allocations that are <= SMP_CACHE_LINE_SIZE.  This
        crashes.  Right now the limit is hardcoded to 128 bytes, but sooner or
        later an arch will appear with 256 byte cache lines.
      
      - cleanup: redzone bytes are now accessed with inline helper functions,
        instead of magic offsets scattered throughout slab.c
      
      - main change: move objects to the end of the allocation - trivial after
        the cleanup.
      
      - Print old redzone value if a redzone mismatch happens: This makes it
        simpler to figure out what happened [single bit error, wrong redzone
        code, overwritten]
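
      A minimal sketch of the main change, not the code from slab.c: given an
      object size and the alignment the real allocation is rounded up to,
      compute the offset that places the object flush against the end.  The
      helper names are hypothetical, the redzone words are ignored, and the
      alignment is assumed to be a power of two.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: size of the real allocation backing an object. */
static size_t real_alloc_size(size_t obj_size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0); /* power of two */
    return (obj_size + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper: offset of the object inside the real allocation
 * when it is moved to the end.  The padding now precedes the object, so
 * the first byte past the object is also the first byte past the
 * allocation.
 */
static size_t obj_offset_at_end(size_t obj_size, size_t align)
{
    return real_alloc_size(obj_size, align) - obj_size;
}
```

      With a 4000-byte object padded to a 4096-byte page, the object would
      start at offset 96, so even a one-byte overrun touches the next page.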
    • [PATCH] might_sleep diagnostics · d6dbfa23
      Andrew Morton authored
      might_sleep() can be triggered by either local interrupts being disabled or
      by elevated preempt count.  Disambiguate them.
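
      A small sketch of the distinction, assuming the two conditions are
      available as plain integers.  The enum and function names are
      hypothetical and do not come from the patch.

```c
/* Bit flags naming the two conditions the commit message distinguishes. */
enum sleep_violation {
    SLEEP_OK            = 0,
    SLEEP_IRQS_DISABLED = 1 << 0,
    SLEEP_PREEMPT_COUNT = 1 << 1
};

/* Hypothetical classifier: report every condition that forbids sleeping. */
static int classify_might_sleep(int irqs_disabled, int preempt_count)
{
    int why = SLEEP_OK;

    if (irqs_disabled)
        why |= SLEEP_IRQS_DISABLED;
    if (preempt_count > 0)
        why |= SLEEP_PREEMPT_COUNT;
    return why;
}
```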
    • [PATCH] CPU scheduler interactivity changes · 2cf13d58
      Andrew Morton authored
      From: Con Kolivas <kernel@kolivas.org>
      
      Interactivity scheduler tweaks on top of Ingo's A3 interactivity patch.
      
      Interactive credit added to task struct to find truly interactive tasks and
      treat them differently.
      
      Extra #defines included as helpers for conversion to/from nanosecond timing,
      to work out an average timeslice for nice 0 tasks, and the effective dynamic
      priority bonuses that will be given to tasks.
      
      MAX_SLEEP_AVG modified to change dynamic priority by one for a nice 0 task
      sleeping or running for one full timeslice.
      
      CREDIT_LIMIT is the number of times a task earns sleep_avg over MAX_SLEEP_AVG
      before it is considered HIGH_CREDIT (truly interactive); and -CREDIT_LIMIT is
      LOW_CREDIT
      
      TIMESLICE_GRANULARITY is modified to be more frequent for more
      interactive tasks (10 ms for top 2 dynamic priorities and then halving each
      priority below that) and less frequent per extra cpu.
      
      JUST_INTERACTIVE_SLEEP logic created to be a sleep_avg consistent with giving
      a task enough dynamic priority to remain on the active array.
      
      Task preemption of equal priority tasks is dropped as requeuing with
      TIMESLICE_GRANULARITY makes this unnecessary.
      
      Dynamic priority bonus simplified.
      
      User tasks that sleep a long time and are not waking from uninterruptible
      sleep are sought and categorised as idle. Their sleep avg is limited in its
      rise to prevent them becoming high priority and suddenly turning into cpu hogs.
      
      Bonus for sleeping is proportionately higher the lower the dynamic priority of
      a task is; this allows for very rapid escalation to interactive status.
      
      Tasks that are LOW_CREDIT are limited in rise per sleep to one priority level.
      
      Non HIGH_CREDIT tasks waking from uninterruptible sleep are sought to detect
      cpu hogs waiting on I/O and their sleep_avg rise is limited to just
      interactive state to prevent cpu bound tasks from becoming interactive during
      I/O wait.
      
      Tasks that earn sleep_avg over MAX_SLEEP_AVG get interactive credits.
      
      On runqueue bonus is not given to non HIGH_CREDIT tasks waking from
      uninterruptible sleep.
      
      Forked tasks and their parents get sleep_avg limited to the minimum necessary
      to maintain their effective dynamic priority, thus preventing repeated forking
      from being a way to get highly interactive, without penalising them noticeably
      otherwise.
      
      CAN_MIGRATE_TASK cleaned up and modified to work with nanosecond timestamps.
      
      Reverted Ingo's A3 Starvation limit change - it was making interactive tasks
      suffer more under increasing load. If a cpu is grossly overloaded and
      everyone is going to starve it may as well run interactive tasks
      preferentially.
      
      Task requeuing is limited to interactive tasks only (cpu bound tasks don't need
      low latency and derive benefit from longer timeslices), and they must have at
      least TIMESLICE_GRANULARITY remaining.
      
      HIGH_CREDIT tasks get penalised less sleep_avg the more interactive they are
      thus keeping them interactive for bursts but if they become sustained cpu hogs
      they will slide increasingly rapidly down the dynamic priority scale.
      
      Tasks that run out of sleep_avg, are still using up cpu time and are not high
      or low credit yet get penalised interactive credits to determine LOW_CREDIT
      tasks (cpu bound ones).
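
      The credit bookkeeping described above can be sketched roughly as
      follows.  This is an illustration under assumptions, not the patch
      itself: the struct, the function names and the CREDIT_LIMIT value of
      100 are made up for the example, and the real code tracks far more
      state.

```c
/* All names and values below are illustrative assumptions. */
#define CREDIT_LIMIT 100

struct credit_task {
    long interactive_credit;
    unsigned long sleep_avg;
};

#define HIGH_CREDIT(p) ((p)->interactive_credit > CREDIT_LIMIT)
#define LOW_CREDIT(p)  ((p)->interactive_credit < -CREDIT_LIMIT)

/* Credit sleep time; earning more than max_sleep_avg gains one credit. */
static void account_sleep(struct credit_task *p, unsigned long slept,
                          unsigned long max_sleep_avg)
{
    p->sleep_avg += slept;
    if (p->sleep_avg > max_sleep_avg) {
        p->sleep_avg = max_sleep_avg;
        if (!HIGH_CREDIT(p))
            p->interactive_credit++;
    }
}

/* Charge run time; running with sleep_avg used up costs one credit. */
static void account_run(struct credit_task *p, unsigned long ran)
{
    if (p->sleep_avg > ran) {
        p->sleep_avg -= ran;
        return;
    }
    p->sleep_avg = 0;
    if (!HIGH_CREDIT(p) && !LOW_CREDIT(p))
        p->interactive_credit--;
}
```

      In this sketch a task becomes HIGH_CREDIT after exceeding the maximum
      101 times and LOW_CREDIT after 101 exhausted runs; once either state is
      reached the counter stops moving in that direction.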
    • [PATCH] CPU scheduler balancing fix · 875ee1e1
      Andrew Morton authored
      From: Nick Piggin <piggin@cyberone.com.au>
      
      The patch changes the imbalance required before a balance to 25% from 50% -
      as the comments intend.  It also changes a case where the balancing
      wouldn't be done if the imbalance was >= 25% but only 1 task difference.
      
      The downside of the second change is that one task may bounce from one cpu
      to another for some loads.  This will only bounce once every 200ms, so it
      shouldn't be a big problem.
      
      (Benchmarking results are basically a wash - SDET is increased maybe 0.5%)
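
      A rough sketch of the two changes, assuming the 25% is measured as the
      difference in runnable tasks relative to the busiest queue.  The
      function names are hypothetical and the real balancing code considers
      more than these two counters.

```c
/* Assumed rule: balance when the gap is at least 25% of the busiest load. */
static int needs_balance(unsigned int local_nr, unsigned int busiest_nr)
{
    if (busiest_nr <= local_nr)
        return 0;
    return (busiest_nr - local_nr) * 4 >= busiest_nr;
}

/*
 * Move half the difference, rounded up, so that a one-task difference
 * still moves a task once the 25% threshold is met.
 */
static unsigned int tasks_to_move(unsigned int local_nr,
                                  unsigned int busiest_nr)
{
    if (!needs_balance(local_nr, busiest_nr))
        return 0;
    return (busiest_nr - local_nr + 1) / 2;
}
```

      With 3 tasks locally and 4 on the busiest cpu this moves one task,
      after which the roles are reversed: that is the bouncing the commit
      message accepts as a downside.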
    • [PATCH] sched_clock() for ppc, ppc64, x86_64 and sparc64 · 2b7e8ff7
      Andrew Morton authored
      Ingo's CPU scheduler update (in -mm kernels) needs a new sched_clock()
      function which returns nanoseconds.
      
      The patch provides implementations for ppc, ppc64, x86_64 and sparc64.
      
      The x86_64 version could have overflow issues: the calculation is done in
      32 bits only with a multiply.  But I hope it's good enough for the scheduler.
      
      The ppc64 version needs scaling: it's only accurate for 1GHz CPUs.
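
      To illustrate both caveats, here is a sketch of a scaled
      cycles-to-nanoseconds conversion.  It is not taken from the patch: the
      function names, the fixed-point shift and the use of a kHz clock value
      are assumptions made for the example.

```c
#include <stdint.h>

#define NS_SCALE_SHIFT 10   /* assumed fixed-point precision */

/* ns per cycle, scaled by 2^NS_SCALE_SHIFT; cpu_khz must be nonzero. */
static uint32_t cyc2ns_scale(uint32_t cpu_khz)
{
    return (uint32_t)((1000000ULL << NS_SCALE_SHIFT) / cpu_khz);
}

/* 64-bit multiply: wraps much later than the 32-bit variant below. */
static uint64_t cycles_to_ns(uint64_t cycles, uint32_t scale)
{
    return (cycles * scale) >> NS_SCALE_SHIFT;
}

/* Same calculation truncated to 32 bits, to show where it wraps. */
static uint32_t cycles_to_ns_32(uint32_t cycles, uint32_t scale)
{
    return (uint32_t)(cycles * scale) >> NS_SCALE_SHIFT;
}
```

      At 1GHz the scale is exactly 1024, so cycles and nanoseconds coincide;
      at any other frequency the scale factor does the correction that the
      ppc64 note says is missing.  The 32-bit variant already gives a wrong
      answer for 8,000,000 cycles at that scale.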
    • [PATCH] scheduler infrastructure · f221af36
      Andrew Morton authored
      From: Ingo Molnar <mingo@elte.hu>
      
      the attached scheduler patch (against test2-mm2) adds the scheduling
      infrastructure items discussed on lkml. I got good feedback - and while i
      don't expect it to solve all problems, it does solve a number of bad ones:
      
       - test_starve.c code from David Mosberger
      
       - thud.c making the system unusable due to unfairness
      
       - fair/accurate sleep average based on a finegrained clock
      
       - audio skipping way too easily
      
      other changes in sched-test2-mm2-A3:
      
       - ia64 sched_clock() code, from David Mosberger.
      
       - migration thread startup without relying on implicit scheduling
         behavior. While the current 2.6 code is correct (due to the cpu-up code
         adding CPUs one by one), it's also fragile - and this code cannot
         be carried over into the 2.4 backports. So adding this method would
         clean up the startup and would make it easier to have 2.4 backports.
      
      and here's the original changelog for the scheduler changes:
      
       - cycle accuracy (nanosec resolution) timekeeping within the scheduler.
         This fixes a number of audio artifacts (skipping) i've reproduced. I
         don't think we can get away without going cycle accuracy - reading the
         cycle counter adds some overhead, but it's acceptable. The first
         nanosec-accuracy patch was done by Mike Galbraith - this patch is
         different but similar in nature. I went further in also changing the
         sleep_avg to be of nanosec resolution.
      
       - more finegrained timeslices: there's now a timeslice 'sub unit' of 50
         usecs (TIMESLICE_GRANULARITY) - CPU hogs on the same priority level
         will roundrobin with this unit. This change is intended to make gaming
         latencies shorter.
      
       - include scheduling latency in sleep bonus calculation. This change
         extends the sleep-average calculation to the period of time a task
         spends on the runqueue but doesn't get scheduled yet, right after
         wakeup. Note that tasks that were preempted (ie. not woken up) and are
         still on the runqueue do not get this benefit. This change closes one
         of the last holes in the dynamic priority estimation, it should result
         in interactive tasks getting more priority under heavy load. This
         change also fixes the test-starve.c testcase from David Mosberger.
      
      
      The TSC-based scheduler clock is disabled on ia32 NUMA platforms.  (ie. 
      platforms that have unsynched TSC for sure.) Those platforms should provide
      the proper code to rely on the TSC in a global way.  (no such infrastructure
      exists at the moment - the monotonic TSC-based clock doesn't deal with TSC
      offsets either, as far as i can tell.)
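
      The "scheduling latency in sleep bonus" item can be sketched as below.
      This is a simplified model under assumptions: the struct, its fields
      and the three hook functions are hypothetical, and times are plain
      nanosecond counters.

```c
#include <stdint.h>

struct bonus_task {
    uint64_t timestamp;   /* time of the last state change, in ns */
    uint64_t sleep_avg;   /* accumulated sleep credit, in ns */
    int woken;            /* on the runqueue because of a wakeup? */
};

/* Add delta to sleep_avg, clamped to max_sleep_avg. */
static void credit_sleep(struct bonus_task *p, uint64_t delta,
                         uint64_t max_sleep_avg)
{
    uint64_t sum = p->sleep_avg + delta;

    p->sleep_avg = sum > max_sleep_avg ? max_sleep_avg : sum;
}

/* Wakeup: credit the time spent asleep. */
static void on_wakeup(struct bonus_task *p, uint64_t now, uint64_t max)
{
    credit_sleep(p, now - p->timestamp, max);
    p->timestamp = now;
    p->woken = 1;
}

/* Preemption: the task stays runnable and earns nothing while waiting. */
static void on_preempt(struct bonus_task *p, uint64_t now)
{
    p->timestamp = now;
    p->woken = 0;
}

/* Picked to run: credit the runqueue wait only if it followed a wakeup. */
static void on_schedule_in(struct bonus_task *p, uint64_t now, uint64_t max)
{
    if (p->woken)
        credit_sleep(p, now - p->timestamp, max);
    p->woken = 0;
    p->timestamp = now;
}
```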
    • [PATCH] NLS: remove emacs metadata · 1dffaaf7
      Andrew Morton authored
      From: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      
      This elisp is obsolete with recent versions of emacs's cc-mode, and such
      settings should be a personal choice.
    • [PATCH] NLS: Remove the nls modules for only alias · 644f9658
      Andrew Morton authored
      From: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      
      This does the following,
      
      1) This removes the nls modules that exist only as aliases. For backward
         compatibility, this adds ->alias, which provides the alias of the charset.
      
      2) For autoloading the module by the alias, use MODULE_ALIAS mechanism.
      
      3) From changelog of module-init-tools, looks like MODULE_ALIAS needs
         module-init-tools 0.9.10 or later. So change the "Documentation/Changes".
    • [PATCH] mtrr warning fix w/o proc_fs · 03329f13
      Andrew Morton authored
      From: Stephen Hemminger <shemminger@osdl.org>
      
      Get rid of warnings (and dead code) if MTRR is compiled without /proc
    • [PATCH] Overflow check for i386 assign_irq_vector · f9e416d3
      Andrew Morton authored
      From: James Cleverdon <jamesclv@us.ibm.com>
      
      Some very large systems overflow the array and corrupt memory.  A BUG_ON will 
      at least flag the problem until dynamic irq_vector allocation is added.
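
      A userspace sketch of the idea, with assert() standing in for BUG_ON().
      The table size and all names are assumptions made for the example; the
      point is only that an out-of-range irq stops execution instead of
      writing past the array.

```c
#include <assert.h>

#define NR_IRQ_VECTORS_SKETCH 8   /* assumed table size, for illustration */

static int irq_vector_sketch[NR_IRQ_VECTORS_SKETCH];

/* Nonzero when irq is a valid index into the table. */
static int irq_in_range(int irq)
{
    return irq >= 0 && irq < NR_IRQ_VECTORS_SKETCH;
}

/* Record a vector for irq; abort rather than corrupt memory. */
static int assign_vector_sketch(int irq, int vector)
{
    assert(irq_in_range(irq));   /* userspace stand-in for BUG_ON() */
    irq_vector_sketch[irq] = vector;
    return vector;
}
```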
    • [PATCH] reiserfs: large file 32/64-bit truncation fix · 9edad7f8
      Andrew Morton authored
      From: Oleg Drokin <green@namesys.com>
      
      Fix truncation-induced large file corruption in reiserfs.
    • [PATCH] Fix setpgid and threads · feaecce4
      Andrew Morton authored
      From: Jeremy Fitzhardinge <jeremy@goop.org>
      
      I'm resending my patch to fix this problem.  To recap: every task_struct
      has its own copy of the thread group's pgrp.  Only the thread group
      leader is allowed to change the tgrp's pgrp, but it only updates its own
      copy of pgrp, while all the other threads in the tgrp use the old value
      they inherited on creation.
      
      This patch simply updates all the other threads' pgrp when the tgrp
      leader changes pgrp.  Ulrich has already expressed reservations about
      this patch since it is (1) incomplete (it doesn't cover the case of
      other ids which have similar problems), (2) racy (it doesn't synchronize
      with other threads looking at the task pgrp, so they could see an
      inconsistent view) and (3) slow (it takes linear time with respect to
      the number of threads in the tgrp).
      
      My reaction is that (1) it fixes the actual bug I'm encountering in a
      real program.  (2) doesn't really matter for pgrp, since it is mostly an
      issue with respect to the terminal job-control code (which is even more
      broken without this patch).  Regarding (3), I think there are very few
      programs which have a large number of threads which change process group
      id on a regular basis (a heavily multi-threaded job-control shell?).
      
      Ulrich also said he has a (proposed?) much better fix, which I've been
      looking forward to.  I'm submitting this patch as a stop-gap fix for a
      real bug, and perhaps to prompt the improved patch.
      
      An alternative fix, at least for pgrp, is to change all references to
      ->pgrp to group_leader->pgrp.  This may be sufficient on its own, but it
      would be a reasonably intrusive patch (I count 95 instances in 32 files
      in the 2.6.0-test3-mm3 tree).
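
      Both approaches can be sketched on a toy thread group.  The struct and
      function names below are hypothetical, and the group is modelled as a
      circular singly linked list, which is an assumption of the example
      rather than the kernel's data structure.

```c
struct thread_sketch {
    int pgrp;                            /* per-thread copy of the pgrp */
    struct thread_sketch *group_leader;  /* leader of the thread group */
    struct thread_sketch *next;          /* circular list of the group */
};

/* Stop-gap fix: walk the whole group, linear in the number of threads. */
static void set_group_pgrp(struct thread_sketch *leader, int pgrp)
{
    struct thread_sketch *t = leader;

    do {
        t->pgrp = pgrp;
        t = t->next;
    } while (t != leader);
}

/* Alternative: every reader goes through the leader's copy instead. */
static int pgrp_via_leader(const struct thread_sketch *t)
{
    return t->group_leader->pgrp;
}
```

      The first function has to touch every thread on each change; the second
      keeps a single copy but requires changing every reader, which is why
      the message calls it intrusive.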
    • [PATCH] real-time enhanced page allocator and throttling · 55b50278
      Andrew Morton authored
      From: Robert Love <rml@tech9.net>
      
      - Let real-time tasks dip further into the reserves than usual in
        __alloc_pages().  There are a lot of ways to special case this.  This
        patch just cuts z->pages_low in half, before doing the incremental min
        thing, for real-time tasks.  I do not do anything in the low memory slow
        path.  We can be a _lot_ more aggressive if we want.  Right now, we just
        give real-time tasks a little help.
      
      - Never ever call balance_dirty_pages() on a real-time task.  Where and
        how exactly we handle this is up for debate.  We could, for example,
        special case real-time tasks inside balance_dirty_pages().  This would
        allow us to perform some of the work (say, waking up pdflush) but not
        other work (say, the active throttling).  As it stands now, we do the
        per-processor accounting in balance_dirty_pages_ratelimited() but we
        never call balance_dirty_pages().  Lots of approaches work.  What we want
        to do is never engage the real-time task in forced writeback.
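
      A simplified sketch of both points.  The struct and function names are
      hypothetical, the "incremental min" step is left out, and the
      throttling decision is reduced to a single comparison.

```c
struct zone_sketch {
    unsigned long free_pages;
    unsigned long pages_low;
};

/* Free pages that must remain; halved for real-time tasks. */
static unsigned long alloc_threshold(const struct zone_sketch *z,
                                     int is_rt_task)
{
    unsigned long min = z->pages_low;

    if (is_rt_task)
        min /= 2;
    return min;
}

/* Allocation from this zone is allowed while free pages exceed the floor. */
static int can_allocate(const struct zone_sketch *z, int is_rt_task)
{
    return z->free_pages > alloc_threshold(z, is_rt_task);
}

/* Real-time tasks are never throttled into forced writeback. */
static int should_throttle(int is_rt_task, unsigned long dirty,
                           unsigned long limit)
{
    return !is_rt_task && dirty > limit;
}
```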
    • [PATCH] ECC support · 5fc4d839
      Andrew Morton authored
      From: "Nakajima, Jun" <jun.nakajima@intel.com>
      
      Split the increasingly messy compiler.h file into per-compiler files and also
      add support for non-gcc compilers.  
      
      With the current implementation:
      
        include/linux/compiler.h defines the compiler-dependent abstractions
        which can be overwritten by per-compiler definitions.
      
        include/linux/compiler-gcc.h contains the common definitions for all gcc
        versions.
      
        include/linux/compiler-gcc[2,3,+].h contains gcc major version specific
        definitions.
      
        include/linux/compiler-intel.h contains intel compiler specific
        definitions.
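
      The override scheme can be sketched in a single file.  The macro names
      are hypothetical and the per-compiler section stands in for what the
      kernel keeps in separate headers; the order of the checks is an
      assumption of the example.

```c
/* What a per-compiler header might provide. */
#if defined(__INTEL_COMPILER)
# define SKETCH_COMPILER "intel"
#elif defined(__GNUC__)
# define SKETCH_COMPILER "gcc"
# define sketch_likely(x) __builtin_expect(!!(x), 1)
#endif

/* Generic fallbacks, used only where nothing above defined the name. */
#ifndef SKETCH_COMPILER
# define SKETCH_COMPILER "unknown"
#endif
#ifndef sketch_likely
# define sketch_likely(x) (x)
#endif

/* Callers use the abstraction without knowing which compiler is active. */
static int is_positive(int v)
{
    return sketch_likely(v > 0) ? 1 : 0;
}
```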
    • [PATCH] procfs build fix for older gcc · 0bfc934b
      Andrew Morton authored
      - declarations come first
      
      - fix bizarre coding style.
    • Merge bk://bk.arm.linux.org.uk/linux-2.6-pcmcia · 05ea2914
      Linus Torvalds authored
      into home.osdl.org:/home/torvalds/v2.5/linux
    • [PCMCIA] Fix deadlocks caused between PCMCIA card fix and device model · 1d921834
      Russell King authored
      The problem was that the semaphore which prevents ds interfering with
      the sleepy card initialisation (skt_sem in pccardd) is blocking insmod
      of the socket driver.  However, the socket driver is being called with
      the PCI bus semaphore held by the driver model.
      
      pccardd in turn discovered a cardbus card (with skt_sem held), so it
      is trying to add the PCI devices to the PCI bus, and this requires the
      driver model to grab the PCI bus semaphore, but it's already locked.
      
      We move the class device register into pccardd so we get a natural
      ordering between the ds socket initialisation and pccardd trying to
      detect inserted cards.
      
      We also fix a potential use-after-free caused by rmmod'ing the socket
      driver before ds has shut down.
    • Merge bk://bk.arm.linux.org.uk/linux-2.6-rmk · ba8a0415
      Linus Torvalds authored
      into home.osdl.org:/home/torvalds/v2.5/linux
    • [ARM] Avoid using clone syscall from kernel_thread() · 384151a9
      Russell King authored
      Don't issue a system call from kernel_thread(), but call do_fork()
      directly.  This avoids all the unnecessary syscall overhead.
    • [PATCH] fix for hidden-task problem · 01660410
      Albert Cahalan authored
      It's bad to make (CLONE_THREAD | CLONE_DETACHED) tasks
      be _completely_ hidden. Resource consumption is hard
      to track down if a user can hide a task from /bin/ps.
      
      This patch, supported by the procps-3.1.13 release,
      gives admins the ability to search for such tasks.
      The top-level /proc directory remains uncontaminated.
    • [PATCH] Move EISA_bus · 972b4a74
      Matthew Wilcox authored
      When I change the setting of CONFIG_EISA, everything rebuilds.  This is
      because EISA_bus is declared in <asm/processor.h> which is implicitly
      included by just about everything.  This is a silly place to declare it,
      so this patch moves it to include/linux/eisa.h.
      
      While I'm at it, I also move the variable definition to
      drivers/eisa/eisa-bus.c.  The rest of this patch is fixing up the fallout
      from having to include <linux/eisa.h> if you use EISA_bus.
  2. 20 Sep, 2003 18 commits