1. 25 Apr, 2014 4 commits
  2. 23 Apr, 2014 13 commits
    • cgroup: implement dynamic subtree controller enable/disable on the default hierarchy · f8f22e53
      Tejun Heo authored
      cgroup is switching away from multiple hierarchies and will use one
      unified default hierarchy where controllers can be dynamically enabled
      and disabled per subtree.  The default hierarchy will serve as the
      unified hierarchy to which all controllers are attached and a css on
      the default hierarchy would need to also serve the tasks of descendant
      cgroups which don't have the controller enabled - i.e. the tree may be
      collapsed from leaf towards root when viewed from specific
      controllers.  This has been implemented through effective css in the
      previous patches.
      
      This patch finally implements dynamic subtree controller
      enable/disable on the default hierarchy via a new knob -
      "cgroup.subtree_control" which controls which controllers are enabled
      on the child cgroups.  Let's assume a hierarchy like the following.
      
        root - A - B - C
                     \ D
      
      root's "cgroup.subtree_control" determines which controllers are
      enabled on A.  A's on B.  B's on C and D.  This coincides with the
      fact that controllers on the immediate sub-level are used to
      distribute the resources of the parent.  In fact, it's natural to
      assume that resource control knobs of a child belong to its parent.
      Enabling a controller in "cgroup.subtree_control" declares that
      distribution of the respective resources of the cgroup will be
      controlled.  Note that this means that controller enable states are
      shared among siblings.
      
      The default hierarchy has an extra restriction - only cgroups which
      don't contain any task may have controllers enabled in
      "cgroup.subtree_control".  Combined with the other properties of the
      default hierarchy, this guarantees that, from the view point of
      controllers, tasks are only on the leaf cgroups.  In other words, only
      leaf csses may contain tasks.  This rules out situations where child
      cgroups compete against internal tasks of the parent, which is a
      competition between two different types of entities without any clear
      way to determine resource distribution between the two.  Different
      controllers handle it differently and all the implemented behaviors
      are ambiguous, ad-hoc, cumbersome and/or just wrong.  Having these
      structural constraints imposed from cgroup core removes the burden
      from controller implementations and enables showing one consistent
      behavior across all controllers.
      
      When a controller is enabled or disabled, css associations for the
      controller in the subtrees of each child should be updated.  After
      enabling, the whole subtree of a child should point to the new css of
      the child.  After disabling, the whole subtree of a child should point
      to the cgroup's css.  This is implemented by first updating cgroup
      states such that cgroup_e_css() result points to the appropriate css
      and then invoking cgroup_update_dfl_csses() which migrates all tasks
      in the affected subtrees to the self cgroup on the default hierarchy.
      
      * When read, "cgroup.subtree_control" lists all the currently enabled
        controllers on the children of the cgroup.
      
      * A whitespace-separated list of controller names, each prefixed with
        either '+' or '-', can be written to "cgroup.subtree_control".
        Controllers prefixed with '+' are enabled and those prefixed with
        '-' are disabled.
      
      * A controller can be enabled iff the parent's
        "cgroup.subtree_control" enables it and disabled iff no child's
        "cgroup.subtree_control" has it enabled.
      
      * If a cgroup has tasks, no controller can be enabled via
        "cgroup.subtree_control".  Likewise, if "cgroup.subtree_control" has
        some controllers enabled, tasks can't be migrated into the cgroup.
      
      * All controllers which aren't bound on other hierarchies are
        automatically associated with the root cgroup of the default
        hierarchy.  All the controllers which are bound to the default
        hierarchy are listed in the read-only file "cgroup.controllers" in
        the root directory.
      
      * "cgroup.controllers" in all non-root cgroups is a read-only file whose
        content is equal to that of "cgroup.subtree_control" of the parent.
        This indicates which controllers can be used in the cgroup's
        "cgroup.subtree_control".
      
      This is still experimental and there are some holes, one of which is
      that ->can_attach() failure during cgroup_update_dfl_csses() may leave
      the cgroups in an undefined state.  The issues will be addressed by
      future patches.
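
The rules in the list above can be modelled in a few lines of userspace Python. This is only a sketch of the semantics as the changelog states them, not the kernel implementation: the `Cgroup` class, its method names, the `bound` parameter and the use of `ValueError` for every rejected write are all assumptions made for illustration.

```python
class Cgroup:
    """Toy model of one cgroup on the default hierarchy (not kernel code)."""

    def __init__(self, name, parent=None, bound=()):
        self.name = name
        self.parent = parent
        self.children = []
        self.tasks = set()
        self.subtree_control = set()
        self._bound = set(bound)  # root only: controllers bound to the hierarchy
        if parent is not None:
            parent.children.append(self)

    @property
    def controllers(self):
        """What "cgroup.controllers" would list: controllers usable here."""
        if self.parent is None:
            return set(self._bound)
        return set(self.parent.subtree_control)

    def read_subtree_control(self):
        return " ".join(sorted(self.subtree_control))

    def write_subtree_control(self, text):
        """Apply e.g. "+memory -cpu"; nothing changes if any check fails."""
        enable, disable = set(), set()
        for token in text.split():
            sign, name = token[:1], token[1:]
            if sign not in ("+", "-") or not name:
                raise ValueError(f"malformed token {token!r}")
            # A later token for the same controller overrides an earlier one.
            (enable if sign == "+" else disable).add(name)
            (disable if sign == "+" else enable).discard(name)
        enable -= self.subtree_control
        disable &= self.subtree_control
        for name in enable:
            if name not in self.controllers:
                raise ValueError(f"{name} is not available in {self.name}")
        for name in disable:
            if any(name in child.subtree_control for child in self.children):
                raise ValueError(f"{name} is still enabled below {self.name}")
        if enable and self.tasks:
            raise ValueError(f"{self.name} has tasks")
        self.subtree_control |= enable
        self.subtree_control -= disable

    def attach_task(self, pid):
        """Tasks may only enter cgroups with an empty subtree_control."""
        if self.subtree_control:
            raise ValueError(f"{self.name} has controllers enabled")
        self.tasks.add(pid)
```

With a root - A - B chain, writing "+cpu +memory" to the root and "+memory" to A leaves B able to enable only "memory", and the root cannot drop "memory" again until A has disabled it.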
      
      v2: Non-root cgroups now also have "cgroup.controllers".
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
    • cgroup: prepare migration path for unified hierarchy · f817de98
      Tejun Heo authored
      Unified hierarchy implementation would require re-migrating tasks onto
      the same cgroup on the default hierarchy to reflect updated effective
      csses.  Update cgroup_migrate_prepare_dst() so that it accepts NULL as
      the destination cgrp.  When NULL is specified, the destination is
      considered to be the cgroup on the default hierarchy associated with
      each css_set.
      
      After this change, the identity check in cgroup_migrate_add_src()
      isn't sufficient for noop detection as the associated csses may change
      without any cgroup association changing.  The only way to tell whether
      a migration is noop or not is testing whether the source and
      destination csets are identical.  The noop check in
      cgroup_migrate_add_src() is removed and cset identity test is added to
      cgroup_migrate_prepare_dst().  If it's detected that source and
      destination csets are identical, the cset is removed from
      @preloaded_csets and all the migration nodes are cleared which makes
      cgroup_migrate() ignore the cset.
      
      Also, make the function append the destination css_sets to
      @preloaded_csets so that destination css_sets always come after source
      css_sets.
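
The noop pruning and list ordering described above can be sketched as follows. This is a simplified userspace model under stated assumptions: `CssSet`, `migrate_prepare_dst()` and the `find_dst` callback are hypothetical names, and the real code works on kernel linked lists and also clears migration nodes.

```python
class CssSet:
    """Stand-in for a css_set; two csets match only if they are the same object."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return f"CssSet({self.name})"


def migrate_prepare_dst(preloaded_csets, find_dst):
    """Resolve each source cset's destination and prune noop migrations.

    `find_dst` is a hypothetical callback mapping a source cset to its
    destination cset.  The list is updated in place so that the surviving
    sources come first, followed by their destinations.
    """
    sources, destinations = [], []
    for src in preloaded_csets:
        dst = find_dst(src)
        if dst is src:
            # Same cset on both ends: nothing to migrate, drop it.
            continue
        sources.append(src)
        if not any(dst is d for d in destinations):
            destinations.append(dst)
    preloaded_csets[:] = sources + destinations
```

The identity test stands in for the cset comparison that replaces the old cgroup identity check: a source whose destination is itself never reaches the migration step.
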
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
    • cgroup: update subsystem rebind restrictions · 7fd8c565
      Tejun Heo authored
      Because the default root couldn't have any non-root csses attached to
      it, rebinding away from it was always allowed; however, the default
      hierarchy will soon host the unified hierarchy and have non-root csses
      so the rebind restrictions need to be updated accordingly.
      
      Instead of special casing rebinding from the default hierarchy and
      then checking whether the source hierarchy has children cgroups, which
      implies non-root csses for !dfl hierarchies, simply check whether the
      source hierarchy has non-root csses for the subsystem using
      css_next_child().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
    • cgroup: add css_set->dfl_cgrp · 6803c006
      Tejun Heo authored
      To implement the unified hierarchy behavior, we'll need to be able to
      determine the associated cgroup on the default hierarchy from css_set.
      Let's add css_set->dfl_cgrp so that it can be accessed conveniently
      and efficiently.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
    • cgroup: allow cgroup creation and suppress automatic css creation in the unified hierarchy · bd53d617
      Tejun Heo authored
      Now that effective css handling has been added and iterators updated
      accordingly, it's safe to allow cgroup creation in the default
      hierarchy.  Unblock cgroup creation in the default hierarchy.
      
      As the default hierarchy will implement explicit enabling and
      disabling of controllers on each cgroup, suppress automatic css
      enabling on cgroup creation.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
    • cgroup: cgroup->subsys[] should be cleared after the css is offlined · e3297803
      Tejun Heo authored
      After a css finishes offlining, offline_css() mistakenly performs
      RCU_INIT_POINTER(css->cgroup->subsys[ss->id], css) which just sets the
      cgroup->subsys[] pointer to the current value.  The intention was to
      clear it after offline is complete, not reassign the same value.
      
      Update it to assign NULL instead of the current value.  This makes
      cgroup_css() return NULL once offline is complete.  All the
      existing users of the function either can handle NULL return already
      or guarantee that the css doesn't get offlined.
      
      While this is a bugfix, as css lifetime is currently tied to the
      cgroup it belongs to, this bug doesn't cause any actual problems.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
    • cgroup: teach css_task_iter about effective csses · 3ebb2b6e
      Tejun Heo authored
      Currently, css_task_iter iterates tasks associated with a css by
      visiting each css_set associated with the owning cgroup and walking
      tasks of each of them.  This works fine for !unified hierarchies as
      each cgroup has its own css for each associated subsystem on the
      hierarchy; however, on the planned unified hierarchy, a cgroup may not
      have csses associated and its tasks would be considered associated
      with the matching css of the nearest ancestor which has the subsystem
      enabled.
      
      This means that on the default unified hierarchy, just walking all
      tasks associated with a cgroup isn't enough to walk all tasks which
      are associated with the specified css.  If any of its children doesn't
      have the matching css enabled, task iteration should also include all
      tasks from the subtree.  We already added cgroup->e_csets[] to list
      all css_sets effectively associated with a given css and walk css_sets
      on that list instead to achieve such iteration.
      
      This patch updates css_task_iter iteration such that it walks css_sets
      on cgroup->e_csets[] instead of cgroup->cset_links if iteration is
      requested on a non-dummy css.  Thanks to the previous iteration
      update, this change can be achieved with the addition of
      css_task_iter->ss and minimal updates to css_advance_task_iter() and
      css_task_iter_start().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
    • cgroup: reorganize css_task_iter · 0f0a2b4f
      Tejun Heo authored
      This patch reorganizes css_task_iter so that adding effective css
      support is easier.
      
      * s/->cset_link/->cset_pos/ and s/->task/->task_pos/ for consistency
      
      * ->origin_css is used to determine whether the iteration reached the
        last css_set.  Replace it with explicit ->cset_head so that
        css_advance_task_iter() doesn't have to know the termination
        condition directly.
      
      * css_task_iter_next() currently assumes that it's walking list of
        cgrp_cset_link and reaches into the current cset through the current
        link to determine the termination conditions for task walking.  As
        this won't always be true for effective css walking, add
        ->tasks_head and ->mg_tasks_head and use them to control task
        walking so that css_task_iter_next() doesn't have to know how
        css_sets are being walked.
      
      This patch doesn't make any behavior changes.  The iteration logic
      stays unchanged after the patch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
    • cgroup: make css_next_child() skip missing csses · 3b281afb
      Tejun Heo authored
      css_next_child() walks the children of the specified css.  It does
      this by finding the next cgroup and then returning the requested css.
      On the default unified hierarchy, a cgroup may not have a css
      associated with it even if the hierarchy has the subsystem enabled.
      This patch updates css_next_child() so that it skips children without
      the requested css associated.
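
As a rough illustration of the skipping behavior, here is a userspace sketch. The `Cgroup` dataclass and `css_children()` are hypothetical stand-ins, and csses are represented by plain strings.

```python
from dataclasses import dataclass, field


@dataclass
class Cgroup:
    name: str
    subsys: dict = field(default_factory=dict)    # subsystem name -> css
    children: list = field(default_factory=list)  # child cgroups, in order


def css_children(parent, ss):
    """Yield the `ss` css of each child of `parent`, skipping children without one."""
    for child in parent.children:
        css = child.subsys.get(ss)
        if css is not None:
            yield css
```

A child cgroup that has no css for the requested subsystem is passed over, so callers only ever see children that actually carry that css.
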
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
    • cgroup: implement cgroup->e_csets[] · 2d8f243a
      Tejun Heo authored
      On the default unified hierarchy, a cgroup may be associated with
      csses of its ancestors, which means that a css of a given cgroup may
      be associated with css_sets of descendant cgroups.  This means that we
      can't walk all tasks associated with a css by iterating the css_sets
      associated with the cgroup as there are css_sets which are pointing to
      the css but linked on the descendants.
      
      This patch adds per-subsystem list heads cgroup->e_csets[].  Any
      css_set which is pointing to a css is linked to
      css->cgroup->e_csets[$SUBSYS_ID] through
      css_set->e_cset_node[$SUBSYS_ID].  The lists are protected by
      css_set_rwsem and will allow us to walk all css_sets associated with a
      given css so that we can find out all associated tasks.
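
A minimal sketch of the linkage described above, assuming simplified Python stand-ins for the kernel structures. The class names mirror the kernel's for readability only, and `link_css_set()` and `tasks_of_css()` are hypothetical helpers. Locking is omitted.

```python
from collections import defaultdict


class Cgroup:
    def __init__(self, name):
        self.name = name
        # subsystem id -> css_sets whose effective css lives in this cgroup
        self.e_csets = defaultdict(list)


class Css:
    def __init__(self, cgroup, ss):
        self.cgroup = cgroup
        self.ss = ss


class CssSet:
    def __init__(self, cgroup, subsys, tasks):
        self.cgroup = cgroup      # cgroup the tasks actually sit in
        self.subsys = subsys      # subsystem id -> effective css
        self.tasks = list(tasks)


def link_css_set(cset):
    """Link `cset` onto the e_csets list of every css it points to."""
    for ss, css in cset.subsys.items():
        css.cgroup.e_csets[ss].append(cset)


def tasks_of_css(css):
    """Walk every task effectively associated with `css`."""
    for cset in css.cgroup.e_csets[css.ss]:
        yield from cset.tasks
```

A css_set that belongs to a descendant cgroup but points at the ancestor's css is reachable from the ancestor's list, which is what walking only the cgroup's own css_sets would miss.
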
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
    • cgroup: introduce effective cgroup_subsys_state · aec3dfcb
      Tejun Heo authored
      In the planned default unified hierarchy, controllers may get
      dynamically attached to and detached from a cgroup and a cgroup may
      not have csses for all the controllers associated with the hierarchy.
      
      When a cgroup doesn't have its own css for a given controller, the css
      of the nearest ancestor with the controller enabled will be used,
      which is called the effective css.  This patch introduces
      cgroup_e_css() and for_each_e_css() to access the effective csses and
      convert compare_css_sets(), find_existing_css_set() and
      cgroup_migrate() to use the effective csses so that they can handle
      cgroups with partial csses correctly.
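
The nearest-ancestor lookup can be sketched in userspace as below. This is a toy model rather than the kernel code: the functions borrow the names cgroup_css() and cgroup_e_css() for readability, the `Cgroup` class is hypothetical, and csses are represented by plain strings.

```python
class Cgroup:
    def __init__(self, name, parent=None, subsys=None):
        self.name = name
        self.parent = parent
        self.subsys = dict(subsys or {})  # controller name -> this cgroup's own css


def cgroup_css(cgrp, ss):
    """Return the cgroup's own css for `ss`, or None if it has none."""
    return cgrp.subsys.get(ss)


def cgroup_e_css(cgrp, ss):
    """Return the css of the nearest cgroup (self or ancestor) that has `ss`."""
    while cgrp is not None:
        css = cgrp.subsys.get(ss)
        if css is not None:
            return css
        cgrp = cgrp.parent
    return None
```

When every cgroup has its own css for the controller, the loop returns on the first iteration and the two functions agree.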
      
      This means that for two css_sets to be considered identical, they
      should have both matching csses and cgroups.  compare_css_sets()
      already compares both, not for correctness but for optimization.  As
      this now becomes a matter of correctness, update the comments
      accordingly.
      
      For all !default hierarchies, cgroup_e_css() always equals
      cgroup_css(), so this patch doesn't change behavior.
      
      While at it, fix incorrect locking comment for for_each_css().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
    • cgroup: update cgroup->subsys_mask to ->child_subsys_mask and restore cgroup_root->subsys_mask · f392e51c
      Tejun Heo authored
      94419627 ("cgroup: move ->subsys_mask from cgroupfs_root to
      cgroup") moved ->subsys_mask from cgroup_root to cgroup to prepare for
      the unified hierarchy; however, it turns out that carrying the
      subsys_mask of the children in the parent, instead of itself, is a lot
      more natural.  This patch restores cgroup_root->subsys_mask and morphs
      cgroup->subsys_mask into cgroup->child_subsys_mask.
      
      * Uses of root->cgrp.subsys_mask are restored to root->subsys_mask.
      
      * Remove automatic setting and clearing of cgrp->subsys_mask and
        instead just inherit ->child_subsys_mask from the parent during
        cgroup creation.  Note that this doesn't affect any current
        behaviors.
      
      * Undo __kill_css() separation.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
    • cgroup: cgroup_apply_cftypes() shouldn't skip the default hierarchy · ea8fd3b4
      Tejun Heo authored
      cgroup_apply_cftypes() skips creating or removing files if the
      subsystem is attached to the default hierarchy, which led to missing
      files in the root of the default hierarchy.
      
      Skipping made sense when the default hierarchy was dummy; however, now
      that the default hierarchy is fully functional and planned to be used
      as the unified hierarchy, it shouldn't be skipped over.
      Reported-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
  3. 20 Apr, 2014 5 commits
  4. 19 Apr, 2014 12 commits
    • perf tools: Improve error reporting · ffa91880
      Adrien BAK authored
      In the current version, when using perf record, if something goes
      wrong in tools/perf/builtin-record.c:375
        session = perf_session__new(file, false, NULL);
      
      The error message:
      "Not enough memory for reading per file header"
      
      is issued. This error message seems to be outdated and is not very
      helpful. This patch proposes to replace this error message by
      "Perf session creation failed"
      
      I believe this issue has been brought to lkml:
      https://lkml.org/lkml/2014/2/24/458
      although this patch only tackles a (small) part of the issue.
      
      Additionally, this patch improves error reporting in
      tools/perf/util/data.c open_file_write.
      
      Currently, if the call to open fails, the user is unaware of it.
      This patch logs the error, before returning the error code to
      the caller.
      Reported-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Adrien BAK <adrien.bak@metascale.org>
      Link: http://lkml.kernel.org/r/1397786443.3093.4.camel@beast
      [ Reorganize the changelog into paragraphs ]
      [ Added empty line after fd declaration in open_file_write ]
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
    • perf tools: Adjust symbols in VDSO · 922d0e4d
      Vladimir Nikulichev authored
      perf-report doesn't resolve function names in VDSO:
      
      $ perf report --stdio -g flat,0.0,15,callee --sort pid
      ...
                  8.76%
                     0x7fff6b1fe861
                     __gettimeofday
                     ACE_OS::gettimeofday()
      ...
      
      In this case symbol values should be adjusted the same way as for executables,
      relocatable objects and prelinked libraries.
      
      After fix:
      
      $ perf report --stdio -g flat,0.0,15,callee --sort pid
      ...
                  8.76%
                     __vdso_gettimeofday
                     __gettimeofday
                     ACE_OS::gettimeofday()
      Signed-off-by: Vladimir Nikulichev <nvs@tbricks.com>
      Tested-by: Namhyung Kim <namhyung@kernel.org>
      Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
      Link: http://lkml.kernel.org/r/969812.163009436-sendEmail@nvs
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
    • perf kvm: Fix 'Min time' counting in report command · acb61fc8
      Alexander Yarygin authored
      Every event in the perf-kvm has a 'stats' structure, which contains
      max/min/average/etc times of handling this event.
      The problem is that the 'perf-kvm stat report' command always shows
      that 'min time' is 0us for every event. Example:
      
       # perf kvm stat report
      
       Analyze events for all VCPUs:
      
          VM-EXIT    Samples  Samples%     Time%   Min Time   Max Time Avg time
        [..]
        0xB2 MSCH         12     0.07%     0.00%        0us        8us 7.31us ( +-   2.11% )
        0xB2 CHSC         12     0.07%     0.00%        0us       18us 9.39us ( +-   9.49% )
        0xB2 STPX          8     0.05%     0.00%        0us        2us 1.88us ( +-   7.18% )
        0xB2 STSI          7     0.04%     0.00%        0us       44us 16.49us ( +-  38.20% )
        [..]
      
      This happens because the 'stats' structure is not initialized and
      stats->min equals 0. Let's initialize the structure for every
      event after its allocation using the init_stats() function. This
      initializes stats->min to -1 and makes 'Min time' statistics
      counting work:
      
       # perf kvm stat report
      
       Analyze events for all VCPUs:
      
          VM-EXIT    Samples  Samples%     Time%   Min Time   Max Time Avg time
        [..]
        0xB2 MSCH         12     0.07%     0.00%        6us        8us 7.31us ( +-   2.11% )
        0xB2 CHSC         12     0.07%     0.00%        7us       18us 9.39us ( +-   9.49% )
        0xB2 STPX          8     0.05%     0.00%        1us        2us 1.88us ( +-   7.18% )
        0xB2 STSI          7     0.04%     0.00%        1us       44us 16.49us ( +-  38.20% )
        [..]
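
Why the initial value matters can be seen in a small sketch of a running min/max tracker. It assumes that the uninitialized structure was zero-filled and that -1 stored in an unsigned 64-bit field reads back as the largest representable value. The `Stats` class and its `initialised` flag are illustrative, not perf's API.

```python
U64_MAX = 2**64 - 1  # what -1 looks like in an unsigned 64-bit field


class Stats:
    """Running min/max tracker, loosely modelled on the 'stats' structure."""

    def __init__(self, initialised=True):
        self.n = 0
        self.max = 0
        # Without initialisation the field is assumed to start at 0,
        # which no positive sample can ever undercut.
        self.min = U64_MAX if initialised else 0

    def update(self, val):
        self.n += 1
        if val > self.max:
            self.max = val
        if val < self.min:
            self.min = val
```

Feeding the same positive samples into both variants leaves the zero-initialized minimum stuck at 0, while the one starting at the maximum value converges to the smallest sample.
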
      Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Link: http://lkml.kernel.org/r/1397053319-2130-3-git-send-email-borntraeger@de.ibm.com
      [ Fixing the perf examples changelog output ]
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
    • coredump: fix va_list corruption · 404ca80e
      Eric Dumazet authored
      A va_list needs to be copied in case it needs to be used twice.
      
      Thanks to Hugh for debugging this issue, leading to various panics.
      
      Tested:
      
        lpq84:~# echo "|/foobar12345 %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h" >/proc/sys/kernel/core_pattern
      
      'produce_core' is simply : main() { *(int *)0 = 1;}
      
        lpq84:~# ./produce_core
        Segmentation fault (core dumped)
        lpq84:~# dmesg | tail -1
        [  614.352947] Core dump to |/foobar12345 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 (null) pipe failed
      
      Notice the last argument was replaced by a NULL (we were lucky enough to
      not crash, but do not try this on your production machine !)
      
      After fix :
      
        lpq83:~# echo "|/foobar12345 %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h" >/proc/sys/kernel/core_pattern
        lpq83:~# ./produce_core
        Segmentation fault
        lpq83:~# dmesg | tail -1
        [  740.800441] Core dump to |/foobar12345 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 pipe failed
      
      Fixes: 5fe9d8ca ("coredump: cn_vprintf() has no reason to call vsnprintf() twice")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Diagnosed-by: Hugh Dickins <hughd@google.com>
      Acked-by: Oleg Nesterov <oleg@redhat.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: stable@vger.kernel.org # 3.11+
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 6d459690
      Linus Torvalds authored
      Pull x86 fix from Ingo Molnar:
       "This fixes the preemption-count imbalance crash reported by Owen
        Kibel"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/mce: Fix CMCI preemption bugs
    • Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 8f98f6f5
      Linus Torvalds authored
      Pull scheduler fixes from Ingo Molnar:
       "Two fixes:
      
         - a SCHED_DEADLINE task selection fix
         - a sched/numa related lockdep splat fix"
      
      * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched: Check for stop task appearance when balancing happens
        sched/numa: Fix task_numa_free() lockdep splat
    • Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 8de3f7a7
      Linus Torvalds authored
      Pull perf fixes from Ingo Molnar:
       "Two kernel side fixes:
      
         - an Intel uncore PMU driver potential crash fix
         - a kprobes/perf-call-graph interaction fix"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf/x86/intel: Use rdmsrl_safe() when initializing RAPL PMU
        kprobes/x86: Fix page-fault handling logic
    • Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · b9312420
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Unfortunately this contains no easter eggs, its a bit larger than I'd
        like, but I included a patch that just moves code from one file to
        another and I'd like to avoid merge conflicts with that later, so it
        makes it seem worse than it is,
      
        Otherwise:
         - radeon: fixes to use new microcode to stabilise some cards, use
           some common displayport code, some runtime pm fixes, pll regression
           fixes
         - i915: fix for some context oopses, a warn in a used path, backlight
           fixes
         - nouveau: regression fix
         - omap: a bunch of fixes"
      
      * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (51 commits)
        drm: bochs: drop unused struct fields
        drm: bochs: add power management support
        drm: cirrus: add power management support
        drm: Split out drm_probe_helper.c from drm_crtc_helper.c
        drm/plane-helper: Don't fake-implement primary plane disabling
        drm/ast: fix value check in cbr_scan2
        drm/nouveau/bios: fix a bit shift error introduced by 457e77b2
        drm/radeon/ci: make sure mc ucode is loaded before checking the size
        drm/radeon/si: make sure mc ucode is loaded before checking the size
        drm/radeon: improve PLL params if we don't match exactly v2
        drm/radeon: memory leak on bo reservation failure. v2
        drm/radeon: fix VCE fence command
        drm/radeon: re-enable mclk dpm on R7 260X asics
        drm/radeon: add support for newer mc ucode on CI (v2)
        drm/radeon: add support for newer mc ucode on SI (v2)
        drm/radeon: apply more strict limits for PLL params v2
        drm/radeon: update CI DPM powertune settings
        drm/radeon: fix runpm handling on APUs (v4)
        drm/radeon: disable mclk dpm on R7 260X
        drm/tegra: Remove gratuitous pad field
        ...
    • Merge branch 'drm-next-3.15-wip' of git://people.freedesktop.org/~deathsimple/linux into drm-next · a42892ed
      Dave Airlie authored
      Some i2c fixes over DisplayPort.
      
      * 'drm-next-3.15-wip' of git://people.freedesktop.org/~deathsimple/linux:
        drm/radeon: Improve vramlimit module param documentation
        drm/radeon: fix audio pin counts for DCE6+ (v2)
        drm/radeon/dp: switch to the common i2c over aux code
        drm/dp/i2c: Update comments about common i2c over dp assumptions (v3)
        drm/dp/i2c: send bare addresses to properly reset i2c connections (v4)
        drm/radeon/dp: handle zero sized i2c over aux transactions (v2)
        drm/i915: support address only i2c-over-aux transactions
        drm/tegra: dp: Support address-only I2C-over-AUX transactions
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · ebfc45ee
      Linus Torvalds authored
      Pull more networking fixes from David Miller:
      
       1) Fix mlx4_en_netpoll implementation, it needs to schedule a NAPI
          context, not synchronize it.  From Chris Mason.
      
       2) Ipv4 flow input interface should never be zero, it should be
          LOOPBACK_IFINDEX instead.  From Cong Wang and Julian Anastasov.
      
       3) Properly configure MAC to PHY connection in mvneta devices, from
          Thomas Petazzoni.
      
       4) sys_recv should use SYSCALL_DEFINE.  From Jan Glauber.
      
       5) Tunnel driver ioctls do not use the correct namespace, fix from
          Nicolas Dichtel.
      
       6) Fix memory leak on seccomp filter attach, from Kees Cook.
      
       7) Fix lockdep warning for nested vlans, from Ding Tianhong.
      
       8) Crashes can happen in SCTP due to how the auth_enable value is
          managed, fix from Vlad Yasevich.
      
       9) Wireless fixes from John W Linville and co.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (45 commits)
        net: sctp: cache auth_enable per endpoint
        tg3: update rx_jumbo_pending ring param only when jumbo frames are enabled
        vlan: Fix lockdep warning when vlan dev handle notification
        seccomp: fix memory leak on filter attach
        isdn: icn: buffer overflow in icn_command()
        ip6_tunnel: use the right netns in ioctl handler
        sit: use the right netns in ioctl handler
        ip_tunnel: use the right netns in ioctl handler
        net: use SYSCALL_DEFINEx for sys_recv
        net: mdio-gpio: Add support for separate MDI and MDO gpio pins
        net: mdio-gpio: Add support for active low gpio pins
        net: mdio-gpio: Use devm_ functions where possible
        ipv4, route: pass 0 instead of LOOPBACK_IFINDEX to fib_validate_source()
        ipv4, fib: pass LOOPBACK_IFINDEX instead of 0 to flowi4_iif
        mlx4_en: don't use napi_synchronize inside mlx4_en_netpoll
        net: mvneta: properly configure the MAC <-> PHY connection in all situations
        net: phy: add minimal support for QSGMII PHY
        sfc:On MCDI timeout, issue an FLR (and mark MCDI to fail-fast)
        mwifiex: fix hung task on command timeout
        mwifiex: process event before command response
        ...
    • Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6 · 6e66d5da
      Linus Torvalds authored
      Pull cifs fixes from Steve French:
       "A set of 5 small cifs fixes"
      
      * 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
        cif: fix dead code
        cifs: fix error handling cifs_user_readv
        fs: cifs: remove unused variable.
        Return correct error on query of xattr on file with empty xattrs
        cifs: Wait for writebacks to complete before attempting write.
    • Merge tag 'char-misc-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · 25bfe4f5
      Linus Torvalds authored
      Pull char/misc driver fixes from Greg KH:
       "Here are a few driver fixes for char/misc drivers that resolve
        reported issues.
      
        All have been in linux-next successfully for a few days"
      
      * tag 'char-misc-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
        Drivers: hv: vmbus: Negotiate version 3.0 when running on ws2012r2 hosts
        Tools: hv: Handle the case when the target file exists correctly
        vme_tsi148: Utilize to_pci_dev() macro
        vme_tsi148: Fix PCI address mapping assumption
        vme_tsi148: Fix typo in tsi148_slave_get()
        w1: avoid recursive device_add
        w1: fix netlink refcnt leak on error path
        misc: Grammar s/addition/additional/
        drivers: mcb: fix memory leak in chameleon_parse_cells() error path
        mei: ignore client writing state during cb completion
        mei: me: do not load the driver if the FW doesn't support MEI interface
        GenWQE: Increase driver version number
        GenWQE: Fix multithreading problems
        GenWQE: Ensure rc is not returning an uninitialized value
        GenWQE: Add wmb before DDCB is started
        GenWQE: Enable access to VPD flash area
  5. 18 Apr, 2014 6 commits
    • Merge tag 'driver-core-3.15-rc2' of... · 60fbf2bd
      Linus Torvalds authored
      Merge tag 'driver-core-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
      
      Pull driver core fixes from Greg KH:
       "Here are some driver core fixes for 3.15-rc2.  Also in here are some
        documentation updates, as well as an API removal that had to wait for
        after -rc1 due to the cleanups coming into you from multiple developer
        trees (this one and the PPC tree.)
      
        All have been in linux-next successfully"
      
      * tag 'driver-core-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
        drivers/base/dd.c incorrect pr_debug() parameters
        Documentation: Update stable address in Chinese and Japanese translations
        topology: Fix compilation warning when not in SMP
        Chinese: add translation of io_ordering.txt
        stable_kernel_rules: spelling/word usage
        sysfs, driver-core: remove unused {sysfs|device}_schedule_callback_owner()
        kernfs: protect lazy kernfs_iattrs allocation with mutex
        fs: Don't return 0 from get_anon_bdev
    • Merge tag 'staging-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging · 8cb652bb
      Linus Torvalds authored
      Pull staging driver fixes from Greg KH:
       "Here are a few staging driver fixes for issues that have been reported
        for 3.15-rc2.
      
        Also dominating the diffstat for the pull request is the removal of
        the rtl8187se driver.  It's no longer needed in staging as a "real"
        driver for this hardware is now merged in the tree in the "correct"
        location in drivers/net/
      
        All of these patches have been tested in linux-next"
      
      * tag 'staging-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
        staging: r8188eu: Fix case where ethtype was never obtained and always be checked against 0
        staging: r8712u: Fix case where ethtype was never obtained and always be checked against 0
        staging: r8188eu: Calling rtw_get_stainfo() with a NULL sta_addr will return NULL
        staging: comedi: fix circular locking dependency in comedi_mmap()
        staging: r8723au: Add missing initialization of change_inx in sort algorithm
        Staging: unisys: use after free in list_for_each()
        staging: unisys: use after free in error messages
        staging: speakup: fix misuse of kstrtol() in handle_goto()
        staging: goldfish: Call free_irq in error path
        staging: delete rtl8187se wireless driver
        staging: rtl8723au: Fix buffer overflow in rtw_get_wfd_ie()
        staging: gs_fpgaboot: remove __TIMESTAMP__ macro
        staging: vme: fix memory leak in vme_user_probe()
        staging: fpgaboot: clean up Makefile
        staging/usbip: fix store_attach() sscanf return value check
        staging/usbip: userspace - fix usbipd SIGSEGV from refresh_exported_devices()
        staging: rtl8188eu: remove spaces, correct counts to unbreak P2P ioctls
        staging/rtl8821ae: Fix OOM handling in _rtl_init_deferred_work()
    • Merge tag 'tty-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · 575a2929
      Linus Torvalds authored
      Pull tty/serial driver fixes from Greg KH:
       "Here are a number of small tty/serial driver fixes for 3.15-rc2.  Also
        in here are some Documentation file removals for drivers that we
        removed a long time ago; there is no need to keep them around any longer.
      
        All of these have been in linux-next for a bit"
      
      * tag 'tty-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
        Revert "serial: 8250, disable "too much work" messages"
        serial: amba-pl011: fix regression, causing an Oops on rmmod
        tty: Fix help text of SYNCLINK_CS
        tty: fix memleak in alloc_pid
        ttyprintk: Allow built as a module
        ttyprintk: Fix wrong tty_unregister_driver() call in the error path
        serial: 8250, disable "too much work" messages
        Documentation/serial: Delete obsolete driver documentation
        serial: omap: Fix missing pm_runtime_resume handling by simplifying code
        serial_core: Fix pm imbalance on unbind
        serial: pl011: change Rx burst size to half of trigger level
        serial: timberdale: Depend on X86_32
        serial: st-asc: Fix SysRq char handling
        Revert "serial: clps711x: Give a chance to perform useful tasks during wait loop"
        serial_core: Fix conditional start_tx on ring buffer not empty
        serial: efm32: use $vendor,$device scheme for compatible string
        serial: omap: free the wakeup settings in remove
    • Merge tag 'usb-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · 7e55f81e
      Linus Torvalds authored
      Pull USB fixes from Greg KH:
       "Here are a number of tiny USB fixes and new device ids for 3.15-rc2.
        Nothing major, just issues some people have reported.
      
        All of these have been in linux-next"
      
      * tag 'usb-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
        uas: fix deadlocky memory allocations
        uas: fix error handling during scsi_scan()
        uas: fix GFP_NOIO under spinlock
        uwb: adds missing error handling
        USB: cdc-acm: Remove Motorola/Telit H24 serial interfaces from ACM driver
        USB: ohci-jz4740: FEAT_POWER is a port feature, not a hub feature
        USB: ohci-jz4740: Fix uninitialized variable warning
        USB: EHCI: tegra: set txfill_tuning
        usb: ehci-platform: Return immediately from suspend if ehci_suspend fails
        usb: ehci-exynos: Return immediately from suspend if ehci_suspend fails
        USB: fix crash during hotplug of PCI USB controller card
        USB: cdc-acm: fix double usb_autopm_put_interface() in acm_port_activate()
        usb: usb-common: fix typo for usb_state_string
        USB: usb_wwan: fix handling of missing bulk endpoints
        USB: pl2303: add ids for Hewlett-Packard HP POS pole displays
        USB: cp210x: Add 8281 (Nanotec Plug & Drive)
        usb: option driver, add support for Telit UE910v2
        Revert "USB: serial: add usbid for dell wwan card to sierra.c"
        USB: serial: ftdi_sio: add id for Brainboxes serial cards
    • Merge branch 'akpm' (incoming from Andrew) · ea2388f2
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "13 fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        thp: close race between split and zap huge pages
        mm: fix new kernel-doc warning in filemap.c
        mm: fix CONFIG_DEBUG_VM_RB description
        mm: use paravirt friendly ops for NUMA hinting ptes
        mips: export flush_icache_range
        mm/hugetlb.c: add cond_resched_lock() in return_unused_surplus_pages()
        wait: explain the shadowing and type inconsistencies
        Shiraz has moved
        Documentation/vm/numa_memory_policy.txt: fix wrong document in numa_memory_policy.txt
        powerpc/mm: fix ".__node_distance" undefined
        kernel/watchdog.c:touch_softlockup_watchdog(): use raw_cpu_write()
        init/Kconfig: move the trusted keyring config option to general setup
        vmscan: reclaim_clean_pages_from_list() must use mod_zone_page_state()
    • thp: close race between split and zap huge pages · b5a8cad3
      Kirill A. Shutemov authored
      Sasha Levin has reported two THP BUGs[1][2].  I believe both of them
      have the same root cause.  Let's look at them one by one.
      
      The first bug[1] is "kernel BUG at mm/huge_memory.c:1829!".  It's
      BUG_ON(mapcount != page_mapcount(page)) in __split_huge_page().  From my
      testing I see that page_mapcount() is higher than mapcount here.
      
      I think it happens due to a race between zap_huge_pmd() and
      page_check_address_pmd().  page_check_address_pmd() misses a PMD which
      is being zapped:
      
      	CPU0						CPU1
      						zap_huge_pmd()
      						  pmdp_get_and_clear()
      __split_huge_page()
        anon_vma_interval_tree_foreach()
          __split_huge_page_splitting()
            page_check_address_pmd()
              mm_find_pmd()
      	  /*
      	   * We check if PMD present without taking ptl: no
      	   * serialization against zap_huge_pmd(). We miss this PMD,
      	   * it's not accounted to 'mapcount' in __split_huge_page().
      	   */
      	  pmd_present(pmd) == 0
      
        BUG_ON(mapcount != page_mapcount(page)) // CRASH!!!
      
      						  page_remove_rmap(page)
      						    atomic_add_negative(-1, &page->_mapcount)
      
      The second bug[2] is "kernel BUG at mm/huge_memory.c:1371!".
      It's VM_BUG_ON_PAGE(!PageHead(page), page) in zap_huge_pmd().
      
      This happens in a similar way:
      
      	CPU0						CPU1
      						zap_huge_pmd()
      						  pmdp_get_and_clear()
      						  page_remove_rmap(page)
      						    atomic_add_negative(-1, &page->_mapcount)
      __split_huge_page()
        anon_vma_interval_tree_foreach()
          __split_huge_page_splitting()
            page_check_address_pmd()
              mm_find_pmd()
      	  pmd_present(pmd) == 0	/* The same comment as above */
        /*
         * No crash this time since we already decremented page->_mapcount in
         * zap_huge_pmd().
         */
        BUG_ON(mapcount != page_mapcount(page))
      
        /*
         * We split the compound page here into small pages without
         * serialization against zap_huge_pmd()
         */
        __split_huge_page_refcount()
      						VM_BUG_ON_PAGE(!PageHead(page), page); // CRASH!!!
      
      So my understanding is that the problem is the pmd_present() check in
      mm_find_pmd(), which is done without taking the page table lock.
      
      The bug was introduced by my commit 117b0791. Sorry for
      that. :(
      
      Let's open code mm_find_pmd() in page_check_address_pmd() and do the
      check under the page table lock.
      
      Note that __page_check_address() does the same for PTE entries
      if sync != 0.
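
The idea of doing the presence check under the lock can be sketched in userspace C. This is only an analogy, not kernel code: `struct slot`, `slot_zap()`, `slot_check_locked()` and `slot_unlock()` are hypothetical names, and a pthread mutex stands in for the page table lock. The checker takes the lock before testing presence and returns with the lock still held, so the zapper cannot change the state between the check and its use.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for one PMD entry and the page it maps. */
struct slot {
	pthread_mutex_t lock;	/* plays the role of the page table lock */
	bool present;		/* plays the role of pmd_present() */
	int mapcount;		/* plays the role of the page's map count */
};

/* Analogue of the zap side: clear the entry and drop the mapping,
 * both while holding the lock. */
static void slot_zap(struct slot *s)
{
	pthread_mutex_lock(&s->lock);
	if (s->present) {
		assert(s->mapcount > 0);
		s->present = false;
		s->mapcount--;
	}
	pthread_mutex_unlock(&s->lock);
}

/* Analogue of the fixed check: take the lock first, then test presence.
 * Returns true with the lock held (caller must call slot_unlock()),
 * or false with the lock already released. */
static bool slot_check_locked(struct slot *s)
{
	pthread_mutex_lock(&s->lock);
	if (s->present)
		return true;
	pthread_mutex_unlock(&s->lock);
	return false;
}

/* Release the lock taken by a successful slot_check_locked(). */
static void slot_unlock(struct slot *s)
{
	pthread_mutex_unlock(&s->lock);
}
```

With this shape, a caller that counts mappings sees either the state before the zap (entry present, count not yet decremented) or the state after it (entry absent, count decremented), never the half-done state shown in the diagrams above.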
      
      I've stress tested the split and zap code paths for 36+ hours by now and
      don't see crashes with the patch applied. Before, it took <20 min to
      trigger the first bug and a few hours for the second one (if we ignore
      the first).
      
      [1] https://lkml.kernel.org/g/<53440991.9090001@oracle.com>
      [2] https://lkml.kernel.org/g/<5310C56C.60709@oracle.com>
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Reported-by: Sasha Levin <sasha.levin@oracle.com>
      Tested-by: Sasha Levin <sasha.levin@oracle.com>
      Cc: Bob Liu <lliubbo@gmail.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: <stable@vger.kernel.org>	[3.13+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>