1. 21 Nov, 2011 14 commits
    • Tejun Heo's avatar
      freezer: make freezing() test freeze conditions in effect instead of TIF_FREEZE · a3201227
      Tejun Heo authored
      Using TIF_FREEZE for freezing worked when there was only single
      freezing condition (the PM one); however, now there is also the
      cgroup_freezer and single bit flag is getting clumsy.
      thaw_processes() is already testing whether cgroup freezing in in
      effect to avoid thawing tasks which were frozen by both PM and cgroup
      freezers.
      
      This is racy (nothing prevents race against cgroup freezing) and
      fragile.  A much simpler way is to test actual freeze conditions from
      freezing() - ie. directly test whether PM or cgroup freezing is in
      effect.
      
      This patch adds variables to indicate whether and what type of
      freezing conditions are in effect and reimplements freezing() such
      that it directly tests whether any of the two freezing conditions is
      active and the task should freeze.  On fast path, freezing() is still
      very cheap - it only tests system_freezing_cnt.
      
      This makes the clumsy dancing aroung TIF_FREEZE unnecessary and
      freeze/thaw operations more usual - updating state variables for the
      new state and nudging target tasks so that they notice the new state
      and comply.  As long as the nudging happens after state update, it's
      race-free.
      
      * This allows use of freezing() in freeze_task().  Replace the open
        coded tests with freezing().
      
      * p != current test is added to warning printing conditions in
        try_to_freeze_tasks() failure path.  This is necessary as freezing()
        is now true for the task which initiated freezing too.
      
      -v2: Oleg pointed out that re-freezing FROZEN cgroup could increment
           system_freezing_cnt.  Fixed.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Acked-by: Paul Menage <paul@paulmenage.org>  (for the cgroup portions)
      a3201227
    • Tejun Heo's avatar
      cgroup_freezer: prepare for removal of TIF_FREEZE · 22b4e111
      Tejun Heo authored
      TIF_FREEZE will be removed soon and freezing() will directly test
      whether any freezing condition is in effect.  Make the following
      changes in preparation.
      
      * Rename cgroup_freezing_or_frozen() to cgroup_freezing() and make it
        return bool.
      
      * Make cgroup_freezing() access task_freezer() under rcu read lock
        instead of task_lock().  This makes the state dereferencing racy
        against task moving to another cgroup; however, it was already racy
        without this change as ->state dereference wasn't synchronized.
        This will be later dealt with using attach hooks.
      
      * freezer->state is now set before trying to push tasks into the
        target state.
      
      -v2: Oleg pointed out that freeze_change_state() was setting
           freeze->state incorrectly to CGROUP_FROZEN instead of
           CGROUP_FREEZING.  Fixed.
      
      -v3: Matt pointed out that setting CGROUP_FROZEN used to always invoke
           try_to_freeze_cgroup() regardless of the current state.  Patch
           updated such that the actual freeze/thaw operations are always
           performed on invocation.  This shouldn't make any difference
           unless something is broken.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Acked-by: default avatarPaul Menage <paul@paulmenage.org>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      22b4e111
    • Tejun Heo's avatar
      freezer: clean up freeze_processes() failure path · 03afed8b
      Tejun Heo authored
      freeze_processes() failure path is rather messy.  Freezing is canceled
      for workqueues and tasks which aren't frozen yet but frozen tasks are
      left alone and should be thawed by the caller and of course some
      callers (xen and kexec) didn't do it.
      
      This patch updates __thaw_task() to handle cancelation correctly and
      makes freeze_processes() and freeze_kernel_threads() call
      thaw_processes() on failure instead so that the system is fully thawed
      on failure.  Unnecessary [suspend_]thaw_processes() calls are removed
      from kernel/power/hibernate.c, suspend.c and user.c.
      
      While at it, restructure error checking if clause in suspend_prepare()
      to be less weird.
      
      -v2: Srivatsa spotted missing removal of suspend_thaw_processes() in
           suspend_prepare() and error in commit message.  Updated.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Acked-by: default avatarSrivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      03afed8b
    • Tejun Heo's avatar
      freezer: kill PF_FREEZING · 376fede8
      Tejun Heo authored
      With the previous changes, there's no meaningful difference between
      PF_FREEZING and PF_FROZEN.  Remove PF_FREEZING and use PF_FROZEN
      instead in task_contributes_to_load().
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      376fede8
    • Tejun Heo's avatar
      freezer: test freezable conditions while holding freezer_lock · 85f1d476
      Tejun Heo authored
      try_to_freeze_tasks() and thaw_processes() use freezable() and
      frozen() as preliminary tests before initiating operations on a task.
      These are done without any synchronization and hinder with
      synchronization cleanup without any real performance benefits.
      
      In try_to_freeze_tasks(), open code self test and move PF_NOFREEZE and
      frozen() tests inside freezer_lock in freeze_task().
      
      thaw_processes() can simply drop freezable() test as frozen() test in
      __thaw_task() is enough.
      
      Note: This used to be a part of larger patch to fix set_freezable()
            race.  Separated out to satisfy ordering among dependent fixes.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      85f1d476
    • Tejun Heo's avatar
      freezer: make freezing indicate freeze condition in effect · 6907483b
      Tejun Heo authored
      Currently freezing (TIF_FREEZE) and frozen (PF_FROZEN) states are
      interlocked - freezing is set to request freeze and when the task
      actually freezes, it clears freezing and sets frozen.
      
      This interlocking makes things more complex than necessary - freezing
      doesn't mean there's freezing condition in effect and frozen doesn't
      match the task actually entering and leaving frozen state (it's
      cleared by the thawing task).
      
      This patch makes freezing indicate that freeze condition is in effect.
      A task enters and stays frozen if freezing.  This makes PF_FROZEN
      manipulation done only by the task itself and prevents wakeup from
      __thaw_task() leaking outside of refrigerator.
      
      The only place which needs to tell freezing && !frozen is
      try_to_freeze_task() to whine about tasks which don't enter frozen.
      It's updated to test the condition explicitly.
      
      With the change, frozen() state my linger after __thaw_task() until
      the task wakes up and exits fridge.  This can trigger BUG_ON() in
      update_if_frozen().  Work it around by testing freezing() && frozen()
      instead of frozen().
      
      -v2: Oleg pointed out missing re-check of freezing() when trying to
           clear FROZEN and possible spurious BUG_ON() trigger in
           update_if_frozen().  Both fixed.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Paul Menage <paul@paulmenage.org>
      6907483b
    • Tejun Heo's avatar
      freezer: use dedicated lock instead of task_lock() + memory barrier · 0c9af092
      Tejun Heo authored
      Freezer synchronization is needlessly complicated - it's by no means a
      hot path and the priority is staying unintrusive and safe.  This patch
      makes it simply use a dedicated lock instead of piggy-backing on
      task_lock() and playing with memory barriers.
      
      On the failure path of try_to_freeze_tasks(), locking is moved from it
      to cancel_freezing().  This makes the frozen() test racy but the race
      here is a non-issue as the warning is printed for tasks which failed
      to enter frozen for 20 seconds and race on PF_FROZEN at the last
      moment doesn't change anything.
      
      This simplifies freezer implementation and eases further changes
      including some race fixes.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      0c9af092
    • Tejun Heo's avatar
      freezer: don't distinguish nosig tasks on thaw · 6cd8dedc
      Tejun Heo authored
      There's no point in thawing nosig tasks before others.  There's no
      ordering requirement between the two groups on thaw, which the staged
      thawing can't guarantee anyway.  Simplify thaw_processes() by removing
      the distinction and collapsing thaw_tasks() into thaw_processes().
      This will help further updates to freezer.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      6cd8dedc
    • Tejun Heo's avatar
      freezer: remove racy clear_freeze_flag() and set PF_NOFREEZE on dead tasks · a585042f
      Tejun Heo authored
      clear_freeze_flag() in exit_mm() is racy.  Freezing can start
      afterwards.  Remove it.  Skipping freezer for exiting task will be
      properly implemented later.
      
      Also, freezable() was testing exit_state directly to make system
      freezer ignore dead tasks.  Let the exiting task set PF_NOFREEZE after
      entering TASK_DEAD instead.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      a585042f
    • Tejun Heo's avatar
      freezer: rename thaw_process() to __thaw_task() and simplify the implementation · a5be2d0d
      Tejun Heo authored
      thaw_process() now has only internal users - system and cgroup
      freezers.  Remove the unnecessary return value, rename, unexport and
      collapse __thaw_process() into it.  This will help further updates to
      the freezer code.
      
      -v3: oom_kill grew a use of thaw_process() while this patch was
           pending.  Convert it to use __thaw_task() for now.  In the longer
           term, this should be handled by allowing tasks to die if killed
           even if it's frozen.
      
      -v2: minor style update as suggested by Matt.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Paul Menage <menage@google.com>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      a5be2d0d
    • Tejun Heo's avatar
      freezer: implement and use kthread_freezable_should_stop() · 8a32c441
      Tejun Heo authored
      Writeback and thinkpad_acpi have been using thaw_process() to prevent
      deadlock between the freezer and kthread_stop(); unfortunately, this
      is inherently racy - nothing prevents freezing from happening between
      thaw_process() and kthread_stop().
      
      This patch implements kthread_freezable_should_stop() which enters
      refrigerator if necessary but is guaranteed to return if
      kthread_stop() is invoked.  Both thaw_process() users are converted to
      use the new function.
      
      Note that this deadlock condition exists for many of freezable
      kthreads.  They need to be converted to use the new should_stop or
      freezable workqueue.
      
      Tested with synthetic test case.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Acked-by: default avatarHenrique de Moraes Holschuh <ibm-acpi@hmh.eng.br>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Oleg Nesterov <oleg@redhat.com>
      8a32c441
    • Tejun Heo's avatar
      freezer: unexport refrigerator() and update try_to_freeze() slightly · a0acae0e
      Tejun Heo authored
      There is no reason to export two functions for entering the
      refrigerator.  Calling refrigerator() instead of try_to_freeze()
      doesn't save anything noticeable or removes any race condition.
      
      * Rename refrigerator() to __refrigerator() and make it return bool
        indicating whether it scheduled out for freezing.
      
      * Update try_to_freeze() to return bool and relay the return value of
        __refrigerator() if freezing().
      
      * Convert all refrigerator() users to try_to_freeze().
      
      * Update documentation accordingly.
      
      * While at it, add might_sleep() to try_to_freeze().
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Samuel Ortiz <samuel@sortiz.org>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp>
      Cc: Christoph Hellwig <hch@infradead.org>
      a0acae0e
    • Tejun Heo's avatar
      freezer: don't unnecessarily set PF_NOFREEZE explicitly · 3a7cbd50
      Tejun Heo authored
      Some drivers set PF_NOFREEZE in their kthread functions which is
      completely unnecessary and racy - some part of freezer code doesn't
      consider cases where PF_NOFREEZE is set asynchronous to freezer
      operations.
      
      In general, there's no reason to allow setting PF_NOFREEZE explicitly.
      Remove them and change the documentation to note that setting
      PF_NOFREEZE directly isn't allowed.
      
      -v2: Dropped change to twl4030-irq.c as it no longer uses PF_NOFREEZE.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Acked-by: default avatar"Gustavo F. Padovan" <padovan@profusion.mobi>
      Acked-by: default avatarSamuel Ortiz <sameo@linux.intel.com>
      Cc: Marcel Holtmann <marcel@holtmann.org>
      Cc: wwang <wei_wang@realsil.com.cn>
      3a7cbd50
    • Tejun Heo's avatar
      freezer: fix current->state restoration race in refrigerator() · 50fb4f7f
      Tejun Heo authored
      refrigerator() saves current->state before entering frozen state and
      restores it before returning using __set_current_state(); however,
      this is racy, for example, please consider the following sequence.
      
      	set_current_state(TASK_INTERRUPTIBLE);
      	try_to_freeze();
      	if (kthread_should_stop())
      		break;
      	schedule();
      
      If kthread_stop() races with ->state restoration, the restoration can
      restore ->state to TASK_INTERRUPTIBLE after kthread_stop() sets it to
      TASK_RUNNING but kthread_should_stop() may still see zero
      ->should_stop because there's no memory barrier between restoring
      TASK_INTERRUPTIBLE and kthread_should_stop() test.
      
      This isn't restricted to kthread_should_stop().  current->state is
      often used in memory barrier based synchronization and silently
      restoring it w/o mb breaks them.
      
      Use set_current_state() instead.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      50fb4f7f
  2. 20 Nov, 2011 5 commits
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 6fe4c6d4
      Linus Torvalds authored
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (86 commits)
        ipv4: fix redirect handling
        ping: dont increment ICMP_MIB_INERRORS
        sky2: fix hang in napi_disable
        sky2: enforce minimum ring size
        bonding: Don't allow mode change via sysfs with slaves present
        f_phonet: fix page offset of first received fragment
        stmmac: fix pm functions avoiding sleep on spinlock
        stmmac: remove spin_lock in stmmac_ioctl.
        stmmac: parameters auto-tuning through HW cap reg
        stmmac: fix advertising 1000Base capabilties for non GMII iface
        stmmac: use mdelay on timeout of sw reset
        sky2: version 1.30
        sky2: used fixed RSS key
        sky2: reduce default Tx ring size
        sky2: rename up/down functions
        sky2: pci posting issues
        sky2: fix hang on shutdown (and other irq issues)
        r6040: fix check against MCRO_HASHEN bit in r6040_multicast_list
        MAINTAINERS: change email address for shemminger
        pch_gbe: Move #include of module.h
        ...
      6fe4c6d4
    • Linus Torvalds's avatar
      Merge branch 'kvm-updates/3.2' of git://git.kernel.org/pub/scm/virt/kvm/kvm · a4cc3889
      Linus Torvalds authored
      * 'kvm-updates/3.2' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM guest: prevent tracing recursion with kvmclock
        Revert "KVM: PPC: Add support for explicit HIOR setting"
        KVM: VMX: Check for automatic switch msr table overflow
        KVM: VMX: Add support for guest/host-only profiling
        KVM: VMX: add support for switching of PERF_GLOBAL_CTRL
        KVM: s390: announce SYNC_MMU
        KVM: s390: Fix tprot locking
        KVM: s390: handle SIGP sense running intercepts
        KVM: s390: Fix RUNNING flag misinterpretation
      a4cc3889
    • Linus Torvalds's avatar
      Merge branch 'fixes' of http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-arm · bb893d15
      Linus Torvalds authored
      * 'fixes' of http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-arm:
        ARM: wire up process_vm_writev and process_vm_readv syscalls
        ARM: 7160/1: setup: avoid overflowing {elf,arch}_name from proc_info_list
        ARM: 7158/1: add new MFP implement for NUC900
        ARM: 7157/1: fix a building WARNING for nuc900
        ARM: 7156/1: l2x0: fix compile error on !CONFIG_USE_OF
        ARM: 7155/1: arch.h: Declare 'pt_regs' locally
        ARM: 7154/1: mach-bcmring: fix build error in dma.c
        ARM: 7153/1: mach-bcmring: fix build error in core.c
        ARM: 7152/1: distclean: Remove generated .dtb files
        ARM: 7150/1: Allow kernel unaligned accesses on ARMv6+ processors
        ARM: 7149/1: spi/pl022: Enable clock in probe
        Revert "ARM: 7098/1: kdump: copy kernel relocation code at the kexec prepare stage"
      bb893d15
    • Linus Torvalds's avatar
      Merge branch 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 2d360fcb
      Linus Torvalds authored
      * 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        PM / Suspend: Fix bug in suspend statistics update
        PM / Hibernate: Fix the early termination of test modes
        PM / shmobile: Fix build of sh7372_pm_init() for CONFIG_PM unset
        PM Sleep: Do not extend wakeup paths to devices with ignore_children set
        PM / driver core: disable device's runtime PM during shutdown
        PM / devfreq: correct Kconfig dependency
        PM / devfreq: fix use after free in devfreq_remove_device
        PM / shmobile: Avoid restoring the INTCS state during initialization
        PM / devfreq: Remove compiler error after irq.h update
        PM / QoS: Properly use the WARN() macro in dev_pm_qos_add_request()
        PM / Clocks: Only disable enabled clocks in pm_clk_suspend()
        ARM: mach-shmobile: sh7372 A3SP no_suspend_console fix
        PM / shmobile: Don't skip debugging output in pd_power_up()
      2d360fcb
    • Avi Kivity's avatar
      KVM guest: prevent tracing recursion with kvmclock · 95ef1e52
      Avi Kivity authored
      Prevent tracing of preempt_disable() in get_cpu_var() in
      kvm_clock_read(). When CONFIG_DEBUG_PREEMPT is enabled,
      preempt_disable/enable() are traced and this causes the function_graph
      tracer to go into an infinite recursion. By open coding the
      preempt_disable() around the get_cpu_var(), we can use the notrace
      version which prevents preempt_disable/enable() from being traced and
      prevents the recursion.
      
      Based on a similar patch for Xen from Jeremy Fitzhardinge.
      Tested-by: default avatarGleb Natapov <gleb@redhat.com>
      Acked-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: default avatarAvi Kivity <avi@redhat.com>
      95ef1e52
  3. 19 Nov, 2011 10 commits
  4. 18 Nov, 2011 11 commits