    ACPI: created a dedicated workqueue for notify() execution · 88db5e14
    Alexey Starikovskiy authored
    HP nx6125/nx6325/... machines have a _GPE handler with an infinite
    loop sending Notify() events to different ACPI subsystems.
    
    A Notify handler in an ACPI driver is a C routine, which may call the
    ACPI interpreter again to get access to some ACPI variables
    (acpi_evaluate_xxx).
    On these HP machines such an evaluation changes the state of some
    variable and lets the loop above break.
    
    In the current ACPI implementation Notify requests are deferred
    to the same kacpid workqueue on which the above GPE handler with its
    infinite loop is executing. Thus we have a deadlock -- the loop will
    continue to spin, sending notify events, and at the same time
    preventing these notify events from being run on the workqueue. All
    notify events are deferred, thus we see the increase in memory
    consumption noticed by the author of the thread. Also, as GPE handling
    is blocked, machines overheat. Eventually an external poll of the same
    acpi_evaluate releases kacpid and all the queued notify events are
    free to run, hence 100% CPU utilization by kacpid for several seconds
    or more.
    
    To prevent all these horrors, notify events must not be put on the
    kacpid workqueue; they have to be either executed immediately or put
    on some other thread. It is dangerous to execute notify events in
    place, as that puts several ACPI interpreter stacks on top of each
    other (at least 4 in the case of nx6125), causing kernel stack
    overflow.
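
    The difference between a shared queue and a separate one can be
    illustrated with a toy, single-threaded model. This is only a sketch,
    not kernel code and not the patch itself: struct queue, run_gpe_loop(),
    notify_handler(), loop_exit_flag and MAX_STEPS are all hypothetical
    names, and the "second worker" is simulated by draining the notify
    queue between loop iterations.

```c
#include <assert.h>
#include <stdbool.h>

#define QUEUE_CAP 64
#define MAX_STEPS 32	/* model only: give up after this many iterations */

/* A tiny FIFO of pending Notify() events (each one is just an id). */
struct queue {
	int items[QUEUE_CAP];
	int head;
	int tail;
};

static void queue_push(struct queue *q, int v)
{
	assert(q->tail < QUEUE_CAP);
	q->items[q->tail++] = v;
}

static bool queue_pop(struct queue *q, int *v)
{
	if (q->head == q->tail)
		return false;
	*v = q->items[q->head++];
	return true;
}

/* Stand-in for the ACPI variable that lets the _GPE loop break. */
static bool loop_exit_flag;

/* Stand-in for a driver's notify handler calling acpi_evaluate_xxx. */
static void notify_handler(int event)
{
	(void)event;
	loop_exit_flag = true;
}

/*
 * Model the _GPE method: keep sending Notify() until the flag changes.
 *
 * separate_queue == false: notify work shares the worker that is busy
 * running this loop, so nothing is dequeued while the loop spins.
 * separate_queue == true: another worker (simulated inline) handles one
 * pending notify event between iterations.
 *
 * Returns the number of iterations needed to break the loop, or -1 if
 * the loop was still spinning after MAX_STEPS iterations.
 */
static int run_gpe_loop(bool separate_queue)
{
	struct queue notify_q = { {0}, 0, 0 };
	int iterations = 0;
	int event;

	loop_exit_flag = false;
	while (!loop_exit_flag) {
		if (iterations == MAX_STEPS)
			return -1;
		queue_push(&notify_q, iterations);	/* Notify() is deferred */
		iterations++;
		if (separate_queue && queue_pop(&notify_q, &event))
			notify_handler(event);
	}
	return iterations;
}
```

    In this model run_gpe_loop(false) returns -1, because the queued
    events pile up and the flag is never set, while run_gpe_loop(true)
    returns 1, because the first notify event is handled right after it
    is queued.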
    
    The first attempt to create a new thread was made by Peter Wainwright.
    He created a bunch of threads, which were stealing work from the kacpid
    workqueue.
    This patch appeared in the 2.6.15 kernel shipped with Ubuntu 6.06 LTS.
    
    The second attempt was mine: I created a new thread for each Notify
    event. This worked OK on HP nx machines, but broke Linus' Compaq
    n620c by producing threads so fast that they stopped the machine
    completely. Thus this patch was reverted from 18-rc2, as I remember.
    I re-made the patch to create a second workqueue just for notify
    events, hoping it would not break Linus' machine. The patch was tested
    on the same HP nx machines in #5534 and #7122, but I did not receive a
    reply from Linus on a test patch sent to him.
    The patch went to 19-rc and was rejected with much fanfare again.
    There was a 4th patch, which inserted schedule_timeout(1) into deferred
    execution of kacpid if we had any notify requests pending, but Linus
    decided that it was too complex (it involved either changes to
    workqueue to see if it's empty or atomic inc/dec).
    Now you see the last variant, which adds yield() to every GPE execution.
    
    http://bugzilla.kernel.org/show_bug.cgi?id=5534
    http://bugzilla.kernel.org/show_bug.cgi?id=8385

    Signed-off-by: Alexey Starikovskiy <alexey.y.starikovskiy@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>