1. 26 Oct, 2012 1 commit
    • freezer: change ptrace_stop/do_signal_stop to use freezable_schedule() · 5d8f72b5
      Oleg Nesterov authored
      try_to_freeze_tasks() and cgroup_freezer rely on scheduler locks
      to ensure that a task doing STOPPED/TRACED -> RUNNING transition
      can't escape freezing. This mostly works, but ptrace_stop() does
      not necessarily call schedule(), it can change task->state back to
      RUNNING and check freezing() without any lock/barrier in between.
      
      We could add the necessary barrier, but this patch changes
      ptrace_stop() and do_signal_stop() to use freezable_schedule().
      This fixes the race, because freezer_count() and freezer_should_skip()
      carefully avoid it.
      
      And this simplifies the code, try_to_freeze_tasks/update_if_frozen
      no longer need to use task_is_stopped_or_traced() checks with their
      non-trivial assumptions. We can rely on the mechanism which was
      specially designed to mark the sleeping task as "frozen enough".
      
      v2: As Tejun pointed out, we can also change get_signal_to_deliver()
      and move try_to_freeze() up before 'relock' label.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
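The bracketing that freezable_schedule() provides can be sketched in userspace C. This is a minimal single-threaded model under stated assumptions, not the kernel code: every `model_` name, the `struct model_task` fields, and the `sleep_fn` callback standing in for schedule() are hypothetical. It illustrates why such a sleeper can be treated as "frozen enough": the skip mark is set for the whole sleep, and the freezing condition is re-checked before the task resumes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a task; the kernel keeps this in task->flags bits. */
struct model_task {
    bool freezer_skip;  /* models PF_FREEZER_SKIP */
    bool frozen;        /* set once the task has frozen itself */
};

/* Models the global "freezing is in progress" condition. */
static bool model_freezing;

/* Tell the freezer it may treat this task as frozen while it sleeps. */
static void model_freezer_do_not_count(struct model_task *t)
{
    t->freezer_skip = true;
}

/* Freeze if a freeze was requested; returns true if the task is frozen. */
static bool model_try_to_freeze(struct model_task *t)
{
    if (model_freezing)
        t->frozen = true;
    return t->frozen;
}

/* Stop being skipped, then immediately re-check the freezing condition. */
static void model_freezer_count(struct model_task *t)
{
    t->freezer_skip = false;
    model_try_to_freeze(t);
}

/* Models freezable_schedule(); sleep_fn stands in for schedule(). */
static void model_freezable_schedule(struct model_task *t, void (*sleep_fn)(void))
{
    assert(t != NULL);
    model_freezer_do_not_count(t);
    if (sleep_fn != NULL)
        sleep_fn();
    model_freezer_count(t);
}

/* Stand-in for "a freeze request arrives while the task is asleep". */
static void model_freeze_requested_during_sleep(void)
{
    model_freezing = true;
}
```

Calling `model_freezable_schedule(&t, model_freeze_requested_during_sleep)` leaves the task frozen even though the request arrived mid-sleep, which is the property the commit message relies on. The model ignores memory ordering between CPUs entirely.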
  2. 20 Oct, 2012 3 commits
    • cgroup_freezer: don't use cgroup_lock_live_group() · ead5c473
      Tejun Heo authored
      freezer_read/write() used cgroup_lock_live_group() to synchronize
      against task migration into and out of the target cgroup.
      cgroup_lock_live_group() grabs the internal cgroup lock and using it
      from outside cgroup core leads to complex and fragile locking
      dependency issues which are difficult to resolve.
      
      Now that freezer_can_attach() is replaced with freezer_attach() and
      update_if_frozen() updated, nothing requires excluding migration
      against freezer state reads and changes.
      
      This patch removes cgroup_lock_live_group() and the matching
      cgroup_unlock() usages.  The prone-to-bitrot, already outdated and
      unnecessary global lock hierarchy documentation is replaced with
      documentation in local scope.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: Li Zefan <lizefan@huawei.com>
    • cgroup_freezer: prepare update_if_frozen() for locking change · b4d18311
      Tejun Heo authored
      Locking will change such that migration can happen while
      freezer_read/write() is in progress.  This means that
      update_if_frozen() can no longer assume that all tasks in the cgroup
      conform to the current freezer state - newly migrated tasks which
      haven't finished freezer_attach() yet might be in any state.
      
      This patch updates update_if_frozen() such that it no longer verifies
      task states against freezer state.  It now simply decides whether
      FREEZING stage is complete.
      
      This removal of verification makes it meaningless to call from
      freezer_change_state().  Drop it and move the fast exit test from
      freezer_read() - the only remaining caller - to update_if_frozen().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: Li Zefan <lizefan@huawei.com>
    • cgroup_freezer: allow moving tasks in and out of a frozen cgroup · 8755ade6
      Tejun Heo authored
      cgroup_freezer is one of the few users of cgroup_subsys->can_attach()
      and uses it to prevent tasks from being migrated into or out of a
      frozen cgroup.  This makes cgroup_freezer cumbersome to use especially
      when co-mounted with other controllers.
      
      ->can_attach() is problematic in general as it can make co-mounting
      multiple cgroups difficult - migrating tasks may fail for reasons
      completely irrelevant for other controllers.  freezer_can_attach() in
      particular is more problematic because it messes with cgroup internal
      locking to ensure that the state verification performed at
      freezer_can_attach() stays valid until migration is complete.
      
      This patch replaces freezer_can_attach() with freezer_attach() so that
      tasks are always allowed to migrate - they are nudged into the
      conforming state from freezer_attach().  This means that there can be
      tasks which are being migrated which don't conform to the current
      cgroup_freezer state until freezer_attach() is complete.  Under the
      current locking scheme, the only such place is freezer_fork() which is
      updated to handle such a window.
      
      While this patch doesn't remove the use of internal cgroup locking
      from freezer_read/write() paths, it removes the requirement to keep
      the freezer state constant while migrating and enables such change.
      
      Note that this creates a userland visible behavior change - FROZEN
      cgroup can no longer be used to lock migrations in and out of the
      cgroup.  This behavior change is intended.  I don't think the feature
      is necessary - userland should coordinate accesses to cgroup fs anyway
      - and even if the feature is needed cgroup_freezer is the completely
      wrong place to implement it.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      LKML-Reference: <1350426526-14254-1-git-send-email-tj@kernel.org>
      Cc: Matt Helsley <matthltc@linux.vnet.ibm.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: Li Zefan <lizefan@huawei.com>
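The difference between vetoing a migration and nudging the migrated task can be sketched as follows. This is a minimal model, assuming a single boolean per cgroup and per task; the `model_` names and the return convention are hypothetical, and the real freezer_attach() asks tasks to freeze or thaw rather than setting a flag directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_task  { bool frozen; };
struct model_group { bool frozen; };

/* Old behaviour: veto any migration that touches a frozen group.
 * Returns 0 when the move is allowed and -1 when it is refused. */
static int model_can_attach(const struct model_group *src,
                            const struct model_group *dst)
{
    assert(src != NULL && dst != NULL);
    return (src->frozen || dst->frozen) ? -1 : 0;
}

/* New behaviour: the move always succeeds, and the task is then
 * brought into line with the destination group's state. */
static void model_attach(const struct model_group *dst, struct model_task *t)
{
    assert(dst != NULL && t != NULL);
    t->frozen = dst->frozen;
}
```

With the second form, a controller co-mounted with the freezer never sees a migration fail because of freezer state. The cost is the short window described above in which a task does not yet conform.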
  3. 16 Oct, 2012 4 commits
    • cgroup_freezer: don't stall transition to FROZEN for PF_NOFREEZE or PF_FREEZER_SKIP tasks · 3c426d5e
      Tejun Heo authored
      cgroup_freezer doesn't transition from FREEZING to FROZEN if the
      cgroup contains PF_NOFREEZE tasks or tasks sleeping with
      PF_FREEZER_SKIP set.
      
      Only kernel tasks can be non-freezable (PF_NOFREEZE) and there's
      nothing cgroup_freezer or userland can do about or to it.  It's
      pointless to stall the transition for PF_NOFREEZE tasks.
      
      PF_FREEZER_SKIP indicates that the task can be skipped when
      determining whether frozen state is reached.  A task with
      PF_FREEZER_SKIP is guaranteed to perform try_to_freeze() after it
      wakes up and can be considered frozen much like stopped or traced
      tasks.  Note that a vfork parent uses PF_FREEZER_SKIP while waiting
      for the child.
      
      This updates update_if_frozen() such that it only considers freezable
      tasks and treats %true freezer_should_skip() tasks as frozen.
      
      This allows cgroups with kthreads and vfork parents to successfully reach
      the FROZEN state.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Rafael J. Wysocki <rjw@sisk.pl>
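The completion test described in this commit can be sketched as a small predicate. This is a model under stated assumptions, not the body of update_if_frozen(): the three booleans in `struct model_task` are hypothetical stand-ins for PF_NOFREEZE, frozen() and a true freezer_should_skip().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-task facts; the kernel derives these from task flags. */
struct model_task {
    bool nofreeze;     /* models PF_NOFREEZE */
    bool frozen;       /* models frozen(task) */
    bool should_skip;  /* models a true freezer_should_skip(task) */
};

/* Returns true when the FREEZING -> FROZEN transition may complete. */
static bool model_group_is_frozen(const struct model_task *tasks, size_t n)
{
    assert(tasks != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        const struct model_task *t = &tasks[i];

        if (t->nofreeze)
            continue;               /* non-freezable: never stalls the group */
        if (!t->frozen && !t->should_skip)
            return false;           /* a freezable task is still running */
    }
    return true;
}
```

A group holding one kthread-like task, one frozen task and one skippable sleeper passes the test. A single freezable task that is neither frozen nor skippable keeps the group in FREEZING.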
    • cgroup_freezer: make it official that writes to freezer.state don't fail · 51f246ed
      Tejun Heo authored
      try_to_freeze_cgroup() has condition checks which are intended to fail
      the write operation to freezer.state if there are tasks which can't be
      frozen.  The condition checks have been broken for quite some time
      now.  freeze_task() returns %false if the target task can't be frozen,
      so num_cant_freeze_now is never incremented.
      
      In addition, strangely, cgroup freezing proceeds even after the write
      has failed, which is rather broken.
      
      This patch rips out the non-working code intended to fail the write to
      freezer.state when the cgroup contains non-freezable tasks and makes
      it official that writes to freezer.state succeed whether there are
      non-freezable tasks in the cgroup or not.
      
      This leaves is_task_frozen_enough() with only one user -
      update_if_frozen().  Collapse it into the caller.  Note that this
      removes an extra call to freezing().
      
      This doesn't cause any userland behavior changes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Rafael J. Wysocki <rjw@sisk.pl>
    • freezer: add missing mb's to freezer_count() and freezer_should_skip() · dd67d32d
      Tejun Heo authored
      A task is considered frozen enough between freezer_do_not_count() and
      freezer_count() and freezers use freezer_should_skip() to test this
      condition.  This supposedly works because freezer_count() always calls
      try_to_freeze() after clearing %PF_FREEZER_SKIP.
      
      However, there currently is nothing which guarantees that
      freezer_count() sees %true freezing() after clearing %PF_FREEZER_SKIP
      when freezing is in progress, and vice-versa.  A task can escape the
      freezing condition in effect by freezer_count() seeing !freezing() and
      freezer_should_skip() seeing %PF_FREEZER_SKIP.
      
      This patch adds smp_mb()'s to freezer_count() and
      freezer_should_skip() such that either %true freezing() is visible to
      freezer_count() or !PF_FREEZER_SKIP is visible to
      freezer_should_skip().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: stable@vger.kernel.org
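The guarantee described above is the classic "store, full barrier, load" pairing on both sides. A minimal sketch with C11 atomics follows, assuming a sequentially consistent fence as the userspace counterpart of smp_mb(). The `model_` names are hypothetical, and for brevity the freezer side folds the "request freezing" store and the freezer_should_skip() style check into one function.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool model_skip = true;      /* models PF_FREEZER_SKIP on a sleeper */
static atomic_bool model_freezing = false; /* models the freezing() condition */

/* Task side, modelled on freezer_count(): clear the flag, fence, test freezing.
 * Returns what the task observed for the freezing condition. */
static bool model_task_wakes_and_sees_freezing(void)
{
    atomic_store_explicit(&model_skip, false, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);  /* plays the role of smp_mb() */
    return atomic_load_explicit(&model_freezing, memory_order_relaxed);
}

/* Freezer side: request freezing, fence, then test the skip flag.
 * Returns what the freezer observed for the skip flag. */
static bool model_freezer_sees_skip(void)
{
    atomic_store_explicit(&model_freezing, true, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);  /* plays the role of smp_mb() */
    return atomic_load_explicit(&model_skip, memory_order_relaxed);
}
```

When the two functions run on different threads, the fences rule out the one bad outcome: the task observing no freeze request while the freezer still observes the skip flag. Without the fences, both relaxed loads could return the stale values, which is the escape the commit describes.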
    • cgroup: cgroup_subsys->fork() should be called after the task is added to css_set · 5edee61e
      Tejun Heo authored
      cgroup core has a bug which violates a basic rule about event
      notifications - when a new entity needs to be added, you add that to
      the notification list first and then make the new entity conform to
      the current state.  If done in the reverse order, an event happening
      in between will be lost.
      
      cgroup_subsys->fork() is invoked way before the new task is added to
      the css_set.  Currently, cgroup_freezer is the only user of ->fork()
      and uses it to make new tasks conform to the current state of the
      freezer.  If FROZEN state is requested while fork is in progress
      between cgroup_fork_callbacks() and cgroup_post_fork(), the child
      could escape freezing - the cgroup isn't frozen when ->fork() is
      called and the freezer couldn't see the new task on the css_set.
      
      This patch moves cgroup_subsys->fork() invocation to
      cgroup_post_fork() after the new task is added to the css_set.
      cgroup_fork_callbacks() is removed.
      
      Because now a task may be migrated during cgroup_subsys->fork(),
      freezer_fork() is updated so that it adheres to the usual RCU locking
      and the rather pointless comment on why locking can be different there
      is removed (if it doesn't make anything simpler, why even bother?).
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: stable@vger.kernel.org
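The ordering rule can be illustrated with a small single-threaded model in which the freeze request is placed by hand into the window between the two steps. Everything here is an assumption made for illustration: the `model_` names, the fixed-size member array standing in for the css_set task list, and the direct flag writes standing in for actual freezing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_TASKS 8

struct model_task { bool frozen; };

struct model_group {
    bool freezing;                               /* currently requested state */
    struct model_task *members[MODEL_MAX_TASKS]; /* models the css_set task list */
    size_t nr_members;
};

/* The "event": freeze every task currently visible on the list. */
static void model_freeze_group(struct model_group *g)
{
    g->freezing = true;
    for (size_t i = 0; i < g->nr_members; i++)
        g->members[i]->frozen = true;
}

/* Make the new task visible to later events. */
static void model_link(struct model_group *g, struct model_task *t)
{
    assert(g->nr_members < MODEL_MAX_TASKS);
    g->members[g->nr_members++] = t;
}

/* Models the ->fork() callback: conform the new task to the group's state. */
static void model_conform(const struct model_group *g, struct model_task *t)
{
    if (g->freezing)
        t->frozen = true;
}

/* Old order: conform, (event), link. Returns whether the child ended up frozen. */
static bool model_fork_conform_then_link(struct model_group *g,
                                         struct model_task *child)
{
    model_conform(g, child);
    model_freeze_group(g);   /* freeze request lands in the window */
    model_link(g, child);
    return child->frozen;
}

/* Fixed order: link, (event), conform. */
static bool model_fork_link_then_conform(struct model_group *g,
                                         struct model_task *child)
{
    model_link(g, child);
    model_freeze_group(g);   /* same window, but the child is already listed */
    model_conform(g, child);
    return child->frozen;
}
```

In the old order the child misses the event twice: the group was not yet freezing when it conformed, and it was not yet on the list when the freezer walked it. In the fixed order either step is enough to catch it.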
  4. 14 Oct, 2012 6 commits
    • Linux 3.7-rc1 · ddffeb8c
      Linus Torvalds authored
    • Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus · a5ef3f7d
      Linus Torvalds authored
      Pull MIPS update from Ralf Baechle:
       "Cleanups and fixes for breakage that occurred earlier during this merge
        phase.  Also a few patches that didn't make the first pull request.
        Of those is the Alchemy work that merges code for many of the SOCs and
        evaluation boards thus among other code shrinkage, reduces the number
        of MIPS defconfigs by 5."
      
      * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (22 commits)
        MIPS: SNI: Switch RM400 serial to SCCNXP driver
        MIPS: Remove unused empty_bad_pmd_table[] declaration.
        MIPS: MT: Remove kspd.
        MIPS: Malta: Fix section mismatch.
        MIPS: asm-offset.c: Delete unused irq_cpustat_t struct offsets.
        MIPS: Alchemy: Merge PB1100/1500 support into DB1000 code.
        MIPS: Alchemy: merge PB1550 support into DB1550 code
        MIPS: Alchemy: Single kernel for DB1200/1300/1550
        MIPS: Optimize TLB refill for RI/XI configurations.
        MIPS: proc: Cleanup printing of ASEs.
        MIPS: Hardwire detection of DSP ASE Rev 2 for systems, as required.
        MIPS: Add detection of DSP ASE Revision 2.
        MIPS: Optimize pgd_init and pmd_init
        MIPS: perf: Add perf functionality for BMIPS5000
        MIPS: perf: Split the Kconfig option CONFIG_MIPS_MT_SMP
        MIPS: perf: Remove unnecessary #ifdef
        MIPS: perf: Add cpu feature bit for PCI (performance counter interrupt)
        MIPS: perf: Change the "mips_perf_event" table unsupported indicator.
        MIPS: Align swapper_pg_dir to 64K for better TLB Refill code.
        vmlinux.lds.h: Allow architectures to add sections to the front of .bss
        ...
    • Merge branch 'modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux · d25282d1
      Linus Torvalds authored
      Pull module signing support from Rusty Russell:
       "module signing is the highlight, but it's an all-over David Howells frenzy..."
      
      Hmm "Magrathea: Glacier signing key". Somebody has been reading too much HHGTTG.
      
      * 'modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (37 commits)
        X.509: Fix indefinite length element skip error handling
        X.509: Convert some printk calls to pr_devel
        asymmetric keys: fix printk format warning
        MODSIGN: Fix 32-bit overflow in X.509 certificate validity date checking
        MODSIGN: Make mrproper should remove generated files.
        MODSIGN: Use utf8 strings in signer's name in autogenerated X.509 certs
        MODSIGN: Use the same digest for the autogen key sig as for the module sig
        MODSIGN: Sign modules during the build process
        MODSIGN: Provide a script for generating a key ID from an X.509 cert
        MODSIGN: Implement module signature checking
        MODSIGN: Provide module signing public keys to the kernel
        MODSIGN: Automatically generate module signing keys if missing
        MODSIGN: Provide Kconfig options
        MODSIGN: Provide gitignore and make clean rules for extra files
        MODSIGN: Add FIPS policy
        module: signature checking hook
        X.509: Add a crypto key parser for binary (DER) X.509 certificates
        MPILIB: Provide a function to read raw data into an MPI
        X.509: Add an ASN.1 decoder
        X.509: Add simple ASN.1 grammar compiler
        ...
    • x86, boot: Explicitly include autoconf.h for hostprogs · b6eea87f
      Matt Fleming authored
      The hostprogs need access to the CONFIG_* symbols found in
      include/generated/autoconf.h.  But commit abbf1590 ("UAPI: Partition
      the header include path sets and add uapi/ header directories") replaced
      $(LINUXINCLUDE) with $(USERINCLUDE) which doesn't contain the necessary
      include paths.
      
      This has the undesirable effect of breaking the EFI boot stub because
      the #ifdef CONFIG_EFI_STUB code in arch/x86/boot/tools/build.c is
      never compiled.
      
      It should also be noted that because $(USERINCLUDE) isn't exported by
      the top-level Makefile it's actually empty in arch/x86/boot/Makefile.
      
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Acked-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Matt Fleming <matt.fleming@intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • perf: Fix UAPI fallout · 7d380c8f
      Ingo Molnar authored
      The UAPI commits forgot to test tooling builds such as tools/perf/,
      and this fixes the fallout.
      
      Manual conversion.
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge branch 'late-for-linus' of git://git.linaro.org/people/rmk/linux-arm · 3d6ee36d
      Linus Torvalds authored
      Pull ARM update from Russell King:
       "This is the final round of stuff for ARM, left until the end of the
        merge window to reduce the number of conflicts.  This set contains the
        ARM part of David Howells' UAPI changes, and a fix to the ordering of
        'select' statements in ARM Kconfig files (see the appropriate commit
        for why this happened - thanks to Andrew Morton for pointing out the
        problem.)
      
        I've left this as long as I dare for this window to avoid conflicts,
        and I regenerated the config patch yesterday, posting it to our
        mailing list for review and testing.  I have several acks which
        include successful test reports for it.
      
        However, today I notice we've got new conflicts with previously unseen
        code...  though that conflict should be trivial (it's my changes vs a
        one liner.)"
      
      * 'late-for-linus' of git://git.linaro.org/people/rmk/linux-arm:
        ARM: config: make sure that platforms are ordered by option string
        ARM: config: sort select statements alphanumerically
        UAPI: (Scripted) Disintegrate arch/arm/include/asm
      
      Fix up fairly trivial conflict in arch/arm/Kconfig (the select re-organization
      vs recent addition of GENERIC_KERNEL_EXECVE)
  5. 13 Oct, 2012 26 commits