1. 19 Nov, 2006 23 commits
  2. 04 Nov, 2006 17 commits
    • Linux 2.6.18.2 · b4d85466
      Chris Wright authored
      b4d85466
    • [PATCH] usbfs: private mutex for open, release, and remove · 108d51a5
      Alan Stern authored
      The usbfs code doesn't provide sufficient mutual exclusion among open,
      release, and remove.  Release vs. remove is okay because they both
      acquire the device lock, but open is not exclusive with either one.  All
      three routines modify the udev->filelist linked list, so they must not
      run concurrently.
      
      Apparently someone gave this a minimum amount of thought in the past by
      explicitly acquiring the BKL at the start of the usbdev_open routine.
      Oddly enough, there's a comment pointing out that locking is unnecessary
      because chrdev_open already has acquired the BKL.
      
      But this ignores the point that the files in /proc/bus/usb/* are not
      char device files; they are regular files and so they don't get any
      special locking.  Furthermore it's necessary to acquire the same lock in
      the release and remove routines, which the code does not do.
      
      Yet another problem arises because the same file_operations structure is
      accessible through both the /proc/bus/usb/* and /dev/usb/usbdev* file
      nodes.  Even when one of them has been removed, it's still possible for
      userspace to open the other.  So simple locking around the individual
      remove routines is insufficient; we need to lock the entire
      usb_notify_remove_device notifier chain.
      
      Rather than rely on the BKL, this patch (as723) introduces a new private
      mutex for the purpose.  Holding the BKL while invoking a notifier chain
      doesn't seem like a good idea.
      
      Cc: Dave Jones <davej@redhat.com>
      [https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=212952]
      Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Chris Wright <chrisw@sous-sol.org>
      108d51a5
    • [PATCH] md: check bio address after mapping through partitions. · 3b076a94
      NeilBrown authored
      Partitions are not limited to live within a device.  So
      we should range check after partition mapping.
      
      Note that 'maxsector' was being used for two different things.  I have
      split off the second usage into 'old_sector' so that maxsector can
      still be used for its primary usage later in the function.
      
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Chris Wright <chrisw@sous-sol.org>
      3b076a94
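      The ordering described in the md commit above (remap through the
      partition first, then range check against the whole device) can be
      sketched as follows.  This is a minimal illustration, not the kernel
      code: struct partition, in_range() and remap_and_check() are
      hypothetical names, and overflow of the remapped sector is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t sector_t;

/* Hypothetical partition descriptor: offset and length in sectors. */
struct partition {
    sector_t start;
    sector_t nr_sects;
};

/* True if [sector, sector + nr) fits inside a device of maxsector sectors. */
bool in_range(sector_t sector, sector_t nr, sector_t maxsector)
{
    return nr <= maxsector && sector <= maxsector - nr;
}

/*
 * Remap a partition-relative sector to a whole-device sector, then check
 * the result against the size of the underlying device.  Checking only
 * before the remap would miss a partition that extends past the device.
 */
bool remap_and_check(const struct partition *p, sector_t disk_sectors,
                     sector_t *sector, sector_t nr)
{
    assert(p != NULL && sector != NULL);
    *sector += p->start;                       /* partition remap */
    return in_range(*sector, nr, disk_sectors); /* check after the remap */
}
```

      With a 1000-sector device and a partition starting at sector 900, a
      request at partition sector 50 passes while one at sector 95 with
      length 8 is rejected, since it would end past the device.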
    • [PATCH] IPV6: fix lockup via /proc/net/ip6_flowlabel [CVE-2006-5619] · d0239f35
      James Morris authored
      There's a bug in the seqfile handling for /proc/net/ip6_flowlabel, where,
      after finding a flowlabel, the code will loop forever not finding any
      further flowlabels, first traversing the rest of the hash bucket then just
      looping.
      
      This patch fixes the problem by breaking after the hash bucket has been
      traversed.
      
      Note that this bug can cause lockups and oopses, and is trivially invoked
      by an unprivileged user.
      Signed-off-by: James Morris <jmorris@namei.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Chris Wright <chrisw@sous-sol.org>
      d0239f35
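      The traversal problem described in the ip6_flowlabel commit above can
      be illustrated with a small hash-table iterator.  This is a sketch
      under assumed names (struct node, struct iter, NBUCKETS, get_first(),
      get_next() and count_all() are all hypothetical), not the kernel's
      seqfile code.  The point is the explicit break once the last bucket
      has been passed; without it the loop would spin forever looking for a
      further entry.

```c
#include <assert.h>
#include <stddef.h>

#define NBUCKETS 4

struct node {
    int id;
    struct node *next;
};

struct iter {
    int bucket; /* index of the bucket currently being walked */
};

/* Return the first entry in the table, or NULL if every bucket is empty. */
struct node *get_first(struct node *table[NBUCKETS], struct iter *it)
{
    for (it->bucket = 0; it->bucket < NBUCKETS; it->bucket++)
        if (table[it->bucket])
            return table[it->bucket];
    return NULL;
}

/* Return the entry after cur, moving on to later buckets as needed. */
struct node *get_next(struct node *table[NBUCKETS], struct iter *it,
                      struct node *cur)
{
    assert(cur != NULL);
    cur = cur->next;
    while (!cur) {
        if (++it->bucket < NBUCKETS)
            cur = table[it->bucket];
        else
            break; /* all buckets traversed: stop instead of looping */
    }
    return cur;
}

/* Walk the whole table once and count the entries seen. */
int count_all(struct node *table[NBUCKETS])
{
    struct iter it;
    struct node *n;
    int count = 0;

    for (n = get_first(table, &it); n; n = get_next(table, &it, n))
        count++;
    return count;
}
```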
    • [PATCH] tcp: cubic scaling error · f3fcd7f6
      Stephen Hemminger authored
      Doug Leith observed a discrepancy between the version of CUBIC described
      in the papers and the version in 2.6.18. A math error related to scaling
      causes Cubic to grow too slowly.
      
      Patch is from "Sangtae Ha" <sha2@ncsu.edu>. I validated that
      it does fix the problems.
      
      See the following to show behavior over 500ms 100 Mbit link.
      
      Sender (2.6.19-rc3) ---  Bridge (2.6.18-rt7) ------- Receiver (2.6.19-rc3)
                          1G      [netem]           100M
      
      	http://developer.osdl.org/shemminger/tcp/2.6.19-rc3/cubic-orig.png
      	http://developer.osdl.org/shemminger/tcp/2.6.19-rc3/cubic-fix.png
      Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
      Signed-off-by: Chris Wright <chrisw@sous-sol.org>
      f3fcd7f6
    • [PATCH] JMB 368 PATA detection · a5f1d1d1
      Alan Cox authored
      The Jmicron JMB368 is PATA only, so it has the PATA on function zero.
      Therefore, don't skip function zero on this device when probing.
      Signed-off-by: Alan Cox <alan@redhat.com>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      Signed-off-by: Chris Wright <chrisw@sous-sol.org>
      a5f1d1d1
    • [PATCH] fill_tgid: fix task_struct leak and possible oops · e17f8851
      Oleg Nesterov authored
      1. fill_tgid() forgets to do put_task_struct(first).
      
      2. release_task(first) can happen after fill_tgid() drops tasklist_lock,
         it is unsafe to dereference first->signal.
      
      This is a temporary fix, imho the locking should be reworked.
      Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: Shailabh Nagar <nagar@watson.ibm.com>
      Cc: Balbir Singh <balbir@in.ibm.com>
      Cc: Jay Lan <jlan@sgi.com>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      Signed-off-by: Chris Wright <chrisw@sous-sol.org>
      e17f8851
    • [PATCH] Use min of two prio settings in calculating distress for reclaim · 1406fd4e
      Martin Bligh authored
      If try_to_free_pages / balance_pgdat are called with a gfp_mask specifying
      GFP_IO and/or GFP_FS, they will reclaim the requisite number of pages, and then
      reset prev_priority to DEF_PRIORITY (or to some other high (ie: unurgent)
      value).
      
      However, another reclaimer without those gfp_mask flags set (say, GFP_NOIO)
      may still be struggling to reclaim pages.  The concurrent overwrite of
      zone->prev_priority will cause this GFP_NOIO thread to unexpectedly cease
      deactivating mapped pages, thus causing reclaim difficulties.
      
      The fix is to key the distress calculation not only off zone->prev_priority,
      but also to take into account the local caller's priority by using
      min(zone->prev_priority, sc->priority).
      Signed-off-by: Martin J. Bligh <mbligh@google.com>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      Signed-off-by: Chris Wright <chrisw@sous-sol.org>
      1406fd4e
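      The effect of taking the minimum of the two priorities in the commit
      above can be sketched as follows.  The formula is an assumption made
      for illustration: distress is modelled as 100 shifted right by the
      priority, with DEF_PRIORITY taken to be 12 (lowest urgency) and 0 the
      most urgent.  reclaim_distress() and min_int() are hypothetical names.

```c
#include <assert.h>

#define DEF_PRIORITY 12 /* assumed value: lowest urgency */

static int min_int(int a, int b)
{
    return a < b ? a : b;
}

/*
 * Distress on a 0..100 scale.  Using the smaller (more urgent) of the
 * zone's recorded priority and the caller's own priority means that a
 * concurrent reset of the zone value cannot hide the caller's urgency.
 */
int reclaim_distress(int zone_prev_priority, int caller_priority)
{
    assert(zone_prev_priority >= 0 && zone_prev_priority <= DEF_PRIORITY);
    assert(caller_priority >= 0 && caller_priority <= DEF_PRIORITY);
    return 100 >> min_int(zone_prev_priority, caller_priority);
}
```

      In this model, a caller that has reached priority 0 still sees full
      distress even after another reclaimer has reset the zone's value to
      DEF_PRIORITY.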
    • [PATCH] vmscan: Fix temp_priority race · 252287f4
      Martin Bligh authored
      The temp_priority field in zone is racy, as we can walk through a reclaim
      path, and just before we copy it into prev_priority, it can be overwritten
      (say with DEF_PRIORITY) by another reclaimer.
      
      The same bug is contained in both try_to_free_pages and balance_pgdat, but
      it is fixed slightly differently.  In balance_pgdat, we keep a separate
      priority record per zone in a local array.  In try_to_free_pages there is
      no need to do this, as the priority level is the same for all zones that we
      reclaim from.
      
      Impact of this bug is that temp_priority is copied into prev_priority, and
      setting this artificially high causes reclaimers to set distress
      artificially low.  They then fail to reclaim mapped pages, when they are,
      in fact, under severe memory pressure (their priority may be as low as 0).
      This causes the OOM killer to fire incorrectly.
      
      From: Andrew Morton <akpm@osdl.org>
      
      __zone_reclaim() isn't modifying zone->prev_priority.  But zone->prev_priority
      is used in the decision whether or not to bring mapped pages onto the inactive
      list.  Hence there's a risk here that __zone_reclaim() will fail because
      zone->prev_priority is large (ie: low urgency) and lots of mapped pages end up
      stuck on the active list.
      
      Fix that up by decreasing (ie making more urgent) zone->prev_priority as
      __zone_reclaim() scans the zone's pages.
      
      This bug perhaps explains why ZONE_RECLAIM_PRIORITY was created.  It should be
      possible to remove that now, and to just start out at DEF_PRIORITY?
      
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Cc: Christoph Lameter <clameter@engr.sgi.com>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      Signed-off-by: Chris Wright <chrisw@sous-sol.org>
      [chrisw: minor wiggle to fit -stable]
      252287f4
    • [PATCH] NFS: nfs_lookup - don't hash dentry when optimising away the lookup · 0f899fb7
      Trond Myklebust authored
      If the open intents tell us that a given lookup is going to result in an
      exclusive create, we currently optimize away the lookup call itself. The
      reason is that the lookup would not be atomic with the create RPC call, so
      why do it in the first place?
      
      A problem occurs, however, if the VFS aborts the exclusive create operation
      after the lookup, but before the call to create the file/directory: in this
      case we will end up with a hashed negative dentry in the dcache that has
      never been looked up.
      Fix this by only actually hashing the dentry once the create operation has
      been successfully completed.
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Signed-off-by: Chris Wright <chrisw@sous-sol.org>
      0f899fb7
    • [PATCH] Reintroduce NODES_SPAN_OTHER_NODES for powerpc · d940c78f
      Andy Whitcroft authored
      Revert "[PATCH] Remove SPAN_OTHER_NODES config definition"
          This reverts commit f62859bb.
      Revert "[PATCH] mm: remove arch independent NODES_SPAN_OTHER_NODES"
          This reverts commit a94b3ab7.
      
      Also update the comments to indicate that this is still required
      and where it's used.
      Signed-off-by: Andy Whitcroft <apw@shadowen.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Kravetz <kravetz@us.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Acked-by: Mel Gorman <mel@csn.ul.ie>
      Acked-by: Will Schmidt <will_schmidt@vnet.ibm.com>
      Cc: Christoph Lameter <clameter@sgi.com>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      Signed-off-by: Chris Wright <chrisw@sous-sol.org>
      d940c78f
    • [PATCH] PCI: Remove quirk_via_abnormal_poweroff · 53f916eb
      Karsten Wiese authored
      My K8T800 mobo resumes fine from suspend to ram with and without patch
      applied against 2.6.18.
      
      quirk_via_abnormal_poweroff makes some boards not boot 2.6.18, so IMO patch
      should go to head, 2.6.18.2 and everywhere "ACPI: ACPICA 20060623" has been
      applied.
      
      Remove quirk_via_abnormal_poweroff
      
      Obsoleted by "ACPI: ACPICA 20060623":
      <snip>
          Implemented support for "ignored" bits in the ACPI
          registers.  According to the ACPI specification, these
          bits should be preserved when writing the registers via
          a read/modify/write cycle. There are 3 bits preserved
          in this manner: PM1_CONTROL[0] (SCI_EN), PM1_CONTROL[9],
          and PM1_STATUS[11].
          http://bugzilla.kernel.org/show_bug.cgi?id=3691
      </snip>
      Signed-off-by: Karsten Wiese <fzu@wemgehoertderstaat.de>
      Cc: Bob Moore <robert.moore@intel.com>
      Acked-by: Len Brown <len.brown@intel.com>
      Acked-by: Dave Jones <davej@redhat.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Chris Wright <chrisw@sous-sol.org>
      53f916eb
    • [PATCH] SPARC64: Fix PCI memory space root resource on Hummingbird. · 54a17702
      David Miller authored
      For Hummingbird PCI controllers, we should create the root
      PCI memory space resource as the full 4GB area, and then
      allocate the IOMMU DMA translation window out of there.
      
      The old code just assumed that the region from the IOMMU DMA translation
      base to the top of the 4GB area was unusable.  This is not true on
      many systems such as SB100 and SB150, where the IOMMU DMA
      translation window sits at 0xc0000000->0xdfffffff.
      
      So what would happen is that any device mapped by the firmware
      at the top section 0xe0000000->0xffffffff would get remapped
      by Linux somewhere else leading to all kinds of problems and
      boot failures.
      
      While we're here, report more cases of OBP resource assignment
      conflicts.  The only truly valid ones are ROM resource conflicts.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Chris Wright <chrisw@sous-sol.org>
      54a17702
    • [PATCH] ISDN: fix drivers, by handling errors thrown by ->readstat() · 97b60140
      Jeff Garzik authored
      This is a particularly ugly on-failure bug, possibly security, since the
      lack of error handling here is covering up another class of bug: failure to
      handle copy_to_user() return values.
      
      The I4L API function ->readstat() returns an integer, and by looking at
      several existing driver implementations, it is clear that a negative return
      value was meant to indicate an error.
      
      Given that several drivers already return a negative value indicating an
      errno-style error, the current code would blindly accept that [negative]
      value as a valid amount of bytes read.  Obvious damage ensues.
      
      Correcting ->readstat() handling to properly notice errors fixes the
      existing code to work correctly on error, and enables future patches to
      more easily indicate errors during operation.
      Signed-off-by: Jeff Garzik <jeff@garzik.org>
      Cc: Karsten Keil <kkeil@suse.de>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      Signed-off-by: Chris Wright <chrisw@sous-sol.org>
      97b60140
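      The error handling described in the ->readstat() commit above can be
      sketched in plain C.  This is an illustration under assumed names, not
      the I4L code: readstat_fn, status_read() and the two stub callbacks
      are hypothetical.  The point is that a negative return value is passed
      back as an error instead of being added to the position as if it were
      a byte count.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical driver callback: bytes read, or a negative errno value. */
typedef int (*readstat_fn)(char *buf, int len);

/* Stub driver that always fails, as if a user-space copy had faulted. */
int failing_readstat(char *buf, int len)
{
    (void)buf;
    (void)len;
    return -EFAULT;
}

/* Stub driver that delivers up to four bytes of status data. */
int working_readstat(char *buf, int len)
{
    int n = len < 4 ? len : 4;

    memset(buf, 'x', (size_t)n);
    return n;
}

/* Caller side: only a non-negative result counts as an amount read. */
int status_read(readstat_fn readstat, char *buf, int len, long *pos)
{
    int n;

    assert(len >= 0);
    n = readstat(buf, len);
    if (n < 0)
        return n; /* propagate the error, leave *pos untouched */
    *pos += n;
    return n;
}
```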
    • [PATCH] ISDN: check for userspace copy faults · ee8a3629
      Jeff Garzik authored
      Most of the ISDN ->readstat() implementations needed to check
      copy_to_user() and put_user() return values.
      Signed-off-by: Jeff Garzik <jeff@garzik.org>
      Cc: Karsten Keil <kkeil@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      Signed-off-by: Chris Wright <chrisw@sous-sol.org>
      ee8a3629
    • [PATCH] rtc-max6902: month conversion fix · a9258b48
      Francisco Larramendi authored
      Fix October-only BCD-to-binary conversion bug:
      
      	0x08 -> 7
      	0x09 -> 8
      	0x10 -> 15 (!)
      	0x11 -> 19
      
      Fixes http://bugzilla.kernel.org/show_bug.cgi?id=7361
      
      Cc: Raphael Assenat <raph@raphnet.net>
      Cc: Alessandro Zummo <a.zummo@towertech.it>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Chris Wright <chrisw@sous-sol.org>
      a9258b48
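      One plausible way an October-only error like the one in the
      rtc-max6902 commit above can arise is subtracting 1 from the raw BCD
      register value before converting it, instead of converting first.
      The sketch below assumes that cause; bcd2bin(), month_buggy() and
      month_fixed() are hypothetical names, and months are zero-based.  It
      reproduces the 0x08, 0x09 and 0x10 rows listed in the commit message.

```c
#include <assert.h>

/* Convert one packed-BCD byte (two decimal digits) to binary. */
static unsigned int bcd2bin(unsigned int v)
{
    return (v & 0x0f) + (v >> 4) * 10;
}

/* Assumed buggy order: subtract in BCD space, then convert. */
int month_buggy(unsigned int reg)
{
    assert(reg >= 0x01 && reg <= 0x12);
    return (int)bcd2bin(reg - 1);
}

/* Corrected order: convert to binary, then make the month zero-based. */
int month_fixed(unsigned int reg)
{
    assert(reg >= 0x01 && reg <= 0x12);
    return (int)bcd2bin(reg) - 1;
}
```

      For October the register holds 0x10; subtracting first gives 0x0f,
      which is not valid BCD and converts to 15, while converting first
      gives 10 and then the expected zero-based month 9.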
    • [PATCH] posix-cpu-timers: prevent signal delivery starvation · 9140a25c
      Thomas Gleixner authored
      The integer divisions in the timer accounting code can round the result
      down to 0.  Adding 0 is without effect and the signal delivery stops.
      
      Clamp the division result to minimum 1 to avoid this.
      
      Problem was reported by Seongbae Park <spark@google.com>, who also
      provided an initial patch.
      
      Roland sayeth:
      
        I have had some more time to think about the problem, and to reproduce it
        using Toyo's test case.  For the record, if my understanding of the problem
        is correct, this happens only in one very particular case.  First, the
        expiry time has to be so soon that in cputime_t units (usually 1s/HZ ticks)
        it's < nthreads so the division yields zero.  Second, it only affects each
        thread that is so new that its CPU time accumulation is zero so now+0 is
        still zero and ->it_*_expires winds up staying zero.  For the VIRT and PROF
        clocks when cputime_t is tick granularity (or the SCHED clock on
        configurations where sched_clock's value only advances on clock ticks), this
        is not hard to arrange with new threads starting up and blocking before they
        accumulate a whole tick of CPU time.  That's what happens in Toyo's test
        case.
      
        Note that in general it is fine for that division to round down to zero,
        and set each thread's expiry time to its "now" time.  The problem only
        arises with threads whose "now" value is still zero, so that now+0 winds up
        0 and is interpreted as "not set" instead of ">= now".  So it would be a
        sufficient and more precise fix to just use max(ticks, 1) inside the loop
        when setting each it_*_expires value.
      
        But, it does no harm to round the division up to one and always advance
        every thread's expiry time.  If the thread didn't already fire timers for
        the expiry time of "now", there is no expectation that it will do so before
        the next tick anyway.  So I followed Thomas's patch in lifting the max out
        of the loops.
      
        This patch also covers the reload cases, which are harder to write a test
        for (and I didn't try).  I've tested it with Toyo's case and it fixes that.
      
      [toyoa@mvista.com: fix: min_t -> max_t]
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Roland McGrath <roland@redhat.com>
      Cc: Daniel Walker <dwalker@mvista.com>
      Cc: Toyo Abe <toyoa@mvista.com>
      Cc: john stultz <johnstul@us.ibm.com>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Cc: Seongbae Park <spark@google.com>
      Cc: Peter Mattis <pmattis@google.com>
      Cc: Rohit Seth <rohitseth@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Chris Wright <chrisw@sous-sol.org>
      9140a25c
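      The clamping described in the posix-cpu-timers commit above can be
      sketched as follows.  The names per_thread_ticks() and thread_expiry()
      are hypothetical and the types are simplified to unsigned long; the
      kernel's cputime_t handling is not reproduced here.  An expiry of 0 is
      taken to mean "not set", as in the commit message.

```c
#include <assert.h>

/*
 * Split the remaining time across the threads of a process.  Integer
 * division rounds down, so the share can be 0 when left < nthreads;
 * clamp it to at least 1 tick so the expiry always advances.
 */
unsigned long per_thread_ticks(unsigned long left, unsigned int nthreads)
{
    unsigned long ticks;

    assert(nthreads > 0);
    ticks = left / nthreads;
    return ticks > 1 ? ticks : 1; /* max(ticks, 1) */
}

/* Expiry for one thread whose accumulated CPU time is "now". */
unsigned long thread_expiry(unsigned long now, unsigned long left,
                            unsigned int nthreads)
{
    return now + per_thread_ticks(left, nthreads);
}
```

      For a brand-new thread with now == 0, three ticks left and eight
      threads, the unclamped share would be 0 and the expiry would stay at
      0; with the clamp it becomes 1.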