1. 22 Nov, 2013 4 commits
    • Tejun Heo's avatar
      memcg: remove cgroup_event->cft · 347c4a87
      Tejun Heo authored
      The only use of cgroup_event->cft is distinguishing "usage_in_bytes"
      and "memsw.usgae_in_bytes" for mem_cgroup_usage_[un]register_event(),
      which can be done by adding an explicit argument to the function and
      implementing two wrappers so that the two cases can be distinguished
      from the function alone.
      
      Remove cgroup_event->cft and the related code including
      [un]register_events() methods.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Acked-by: default avatarLi Zefan <lizefan@huawei.com>
      Acked-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: default avatarMichal Hocko <mhocko@suse.cz>
      347c4a87
    • Tejun Heo's avatar
      cgroup, memcg: move cgroup->event_list[_lock] and event callbacks into memcg · fba94807
      Tejun Heo authored
      cgroup_event is being moved from cgroup core to memcg and the
      implementation is already moved by the previous patch.  This patch
      moves the data fields and callbacks.
      
      * cgroup->event_list[_lock] are moved to mem_cgroup.
      
      * cftype->[un]register_event() are moved to cgroup_event.  This makes
        it impossible for individual cftype definitions to specify their
        event callbacks.  This is worked around by simply hard-coding
        filename to event callback mapping in cgroup_write_event_control().
        This is awkward and inflexible, which is actually desirable given
        that we don't want to grow more usages of this feature.
      
      * eventfd_ctx declaration is removed from cgroup.h, which makes
        vmpressure.h miss eventfd_ctx declaration.  Include eventfd.h from
        vmpressure.h.
      
      v2: Use file name from dentry instead of cftype.  This will allow
          removing all cftype handling in the function.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Acked-by: default avatarLi Zefan <lizefan@huawei.com>
      Acked-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: default avatarMichal Hocko <mhocko@suse.cz>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Balbir Singh <bsingharora@gmail.com>
      fba94807
    • Tejun Heo's avatar
      memcg: cgroup_write_event_control() now knows @css is for memcg · b5557c4c
      Tejun Heo authored
      @css for cgroup_write_event_control() is now always for memcg and the
      target file should be a memcg file too.  Drop code which assumes @css
      is dummy_css and the target file may belong to different subsystems.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Acked-by: default avatarLi Zefan <lizefan@huawei.com>
      Acked-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      b5557c4c
    • Tejun Heo's avatar
      cgroup, memcg: move cgroup_event implementation to memcg · 79bd9814
      Tejun Heo authored
      cgroup_event is way over-designed and tries to build a generic
      flexible event mechanism into cgroup - fully customizable event
      specification for each user of the interface.  This is utterly
      unnecessary and overboard especially in the light of the planned
      unified hierarchy as there's gonna be single agent.  Simply generating
      events at fixed points, or if that's too restrictive, configureable
      cadence or single set of configureable points should be enough.
      
      Thankfully, memcg is the only user and gets to keep it.  Replacing it
      with something simpler on sane_behavior is strongly recommended.
      
      This patch moves cgroup_event and "cgroup.event_control"
      implementation to mm/memcontrol.c.  Clearing of events on cgroup
      destruction is moved from cgroup_destroy_locked() to
      mem_cgroup_css_offline(), which shouldn't make any noticeable
      difference.
      
      cgroup_css() and __file_cft() are exported to enable the move;
      however, this will soon be reverted once the event code is updated to
      be memcg specific.
      
      Note that "cgroup.event_control" will now exist only on the hierarchy
      with memcg attached to it.  While this change is visible to userland,
      it is unlikely to be noticeable as the file has never been meaningful
      outside memcg.
      
      Aside from the above change, this is pure code relocation.
      
      v2: Per Li Zefan's comments, init/Kconfig updated accordingly and
          poll.h inclusion moved from cgroup.c to memcontrol.c.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Acked-by: default avatarLi Zefan <lizefan@huawei.com>
      Acked-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: default avatarMichal Hocko <mhocko@suse.cz>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Balbir Singh <bsingharora@gmail.com>
      79bd9814
  2. 03 Nov, 2013 3 commits
  3. 02 Nov, 2013 2 commits
  4. 01 Nov, 2013 20 commits
  5. 31 Oct, 2013 11 commits
    • Linus Torvalds's avatar
      Merge branch 'akpm' (fixes from Andrew Morton) · 4f794ee8
      Linus Torvalds authored
      Merge four more fixes from Andrew Morton.
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        lib/scatterlist.c: don't flush_kernel_dcache_page on slab page
        mm: memcg: fix test for child groups
        mm: memcg: lockdep annotation for memcg OOM lock
        mm: memcg: use proper memcg in limit bypass
      4f794ee8
    • Ming Lei's avatar
      lib/scatterlist.c: don't flush_kernel_dcache_page on slab page · 3d77b50c
      Ming Lei authored
      Commit b1adaf65 ("[SCSI] block: add sg buffer copy helper
      functions") introduces two sg buffer copy helpers, and calls
      flush_kernel_dcache_page() on pages in SG list after these pages are
      written to.
      
      Unfortunately, the commit may introduce a potential bug:
      
       - Before sending some SCSI commands, kmalloc() buffer may be passed to
         block layper, so flush_kernel_dcache_page() can see a slab page
         finally
      
       - According to cachetlb.txt, flush_kernel_dcache_page() is only called
         on "a user page", which surely can't be a slab page.
      
       - ARCH's implementation of flush_kernel_dcache_page() may use page
         mapping information to do optimization so page_mapping() will see the
         slab page, then VM_BUG_ON() is triggered.
      
      Aaro Koskinen reported the bug on ARM/kirkwood when DEBUG_VM is enabled,
      and this patch fixes the bug by adding test of '!PageSlab(miter->page)'
      before calling flush_kernel_dcache_page().
      Signed-off-by: default avatarMing Lei <ming.lei@canonical.com>
      Reported-by: default avatarAaro Koskinen <aaro.koskinen@iki.fi>
      Tested-by: default avatarSimon Baatz <gmbnomis@gmail.com>
      Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Aaro Koskinen <aaro.koskinen@iki.fi>
      Acked-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: <stable@vger.kernel.org>	[3.2+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      3d77b50c
    • Johannes Weiner's avatar
      mm: memcg: fix test for child groups · 696ac172
      Johannes Weiner authored
      When memcg code needs to know whether any given memcg has children, it
      uses the cgroup child iteration primitives and returns true/false
      depending on whether the iteration loop is executed at least once or
      not.
      
      Because a cgroup's list of children is RCU protected, these primitives
      require the RCU read-lock to be held, which is not the case for all
      memcg callers.  This results in the following splat when e.g.  enabling
      hierarchy mode:
      
        WARNING: CPU: 3 PID: 1 at kernel/cgroup.c:3043 css_next_child+0xa3/0x160()
        CPU: 3 PID: 1 Comm: systemd Not tainted 3.12.0-rc5-00117-g83f11a9c-dirty #18
        Hardware name: LENOVO 3680B56/3680B56, BIOS 6QET69WW (1.39 ) 04/26/2012
        Call Trace:
          dump_stack+0x54/0x74
          warn_slowpath_common+0x78/0xa0
          warn_slowpath_null+0x1a/0x20
          css_next_child+0xa3/0x160
          mem_cgroup_hierarchy_write+0x5b/0xa0
          cgroup_file_write+0x108/0x2a0
          vfs_write+0xbd/0x1e0
          SyS_write+0x4c/0xa0
          system_call_fastpath+0x16/0x1b
      
      In the memcg case, we only care about children when we are attempting to
      modify inheritable attributes interactively.  Racing with deletion could
      mean a spurious -EBUSY, no problem.  Racing with addition is handled
      just fine as well through the memcg_create_mutex: if the child group is
      not on the list after the mutex is acquired, it won't be initialized
      from the parent's attributes until after the unlock.
      Signed-off-by: default avatarJohannes Weiner <hannes@cmpxchg.org>
      Acked-by: default avatarMichal Hocko <mhocko@suse.cz>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      696ac172
    • Johannes Weiner's avatar
      mm: memcg: lockdep annotation for memcg OOM lock · 0056f4e6
      Johannes Weiner authored
      The memcg OOM lock is a mutex-type lock that is open-coded due to
      memcg's special needs.  Add annotations for lockdep coverage.
      Signed-off-by: default avatarJohannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      0056f4e6
    • Johannes Weiner's avatar
      mm: memcg: use proper memcg in limit bypass · 3168ecbe
      Johannes Weiner authored
      Commit 84235de3 ("fs: buffer: move allocation failure loop into the
      allocator") allowed __GFP_NOFAIL allocations to bypass the limit if they
      fail to reclaim enough memory for the charge.  But because the main test
      case was on a 3.2-based system, the patch missed the fact that on newer
      kernels the charge function needs to return root_mem_cgroup when
      bypassing the limit, and not NULL.  This will corrupt whatever memory is
      at NULL + percpu pointer offset.  Fix this quickly before problems are
      reported.
      Signed-off-by: default avatarJohannes Weiner <hannes@cmpxchg.org>
      Acked-by: default avatarMichal Hocko <mhocko@suse.cz>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      3168ecbe
    • Linus Torvalds's avatar
      vfs: decrapify dput(), fix cache behavior under normal load · 358eec18
      Linus Torvalds authored
      We do not want to dirty the dentry->d_flags cacheline in dput() just to
      set the DCACHE_REFERENCED flag when it is already set in the common case
      anyway.  This way the first cacheline of the dentry (which contains the
      RCU lookup information etc) can stay shared among multiple CPU's.
      
      This finishes off some of the details of all the scalability patches
      merged during the merge window.
      
      Also don't mark dentry_kill() for inlining, since it's the uncommon path
      and inlining it just makes the common path slower due to extra function
      entry/exit overhead.
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      358eec18
    • Linus Torvalds's avatar
      i915: fix compiler warning · 0baab4fd
      Linus Torvalds authored
      The last i915 drm update brought with it this annoying warning
      
        drivers/gpu/drm/i915/intel_crt.c: In function ‘intel_crt_get_config’:
        drivers/gpu/drm/i915/intel_crt.c:110:21: warning: unused variable ‘dev’ [-Wunused-variable]
          struct drm_device *dev = encoder->base.dev;
                             ^
      
      introduced by commit 7195a50b ("drm/i915: Add HSW CRT output readout
      support").
      
      Remove the offending pointless variable.
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      0baab4fd
    • Linus Torvalds's avatar
      Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 52469b4f
      Linus Torvalds authored
      Pull NUMA balancing memory corruption fixes from Ingo Molnar:
       "So these fixes are definitely not something I'd like to sit on, but as
        I said to Mel at the KS the timing is quite tight, with Linus planning
        v3.12-final within a week.
      
        Fedora-19 is affected:
      
         comet:~> grep NUMA_BALANCING /boot/config-3.11.3-201.fc19.x86_64
      
         CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y
         CONFIG_NUMA_BALANCING_DEFAULT_ENABLED=y
         CONFIG_NUMA_BALANCING=y
      
        AFAICS Ubuntu will be affected as well, once it updates the kernel:
      
         hubble:~> grep NUMA_BALANCING /boot/config-3.8.0-32-generic
      
         CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y
         CONFIG_NUMA_BALANCING_DEFAULT_ENABLED=y
         CONFIG_NUMA_BALANCING=y
      
        These 6 commits are a minimalized set of cherry-picks needed to fix
        the memory corruption bugs.  All commits are fixes, except "mm: numa:
        Sanitize task_numa_fault() callsites" which is a cleanup that made two
        followup fixes simpler.
      
        I've done targeted testing with just this SHA1 to try to make sure
        there are no cherry-picking artifacts.  The original non-cherry-picked
        set of fixes were exposed to linux-next for a couple of weeks"
      
      * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        mm: Account for a THP NUMA hinting update as one PTE update
        mm: Close races between THP migration and PMD numa clearing
        mm: numa: Sanitize task_numa_fault() callsites
        mm: Prevent parallel splits during THP migration
        mm: Wait for THP migrations to complete during NUMA hinting faults
        mm: numa: Do not account for a hinting fault if we raced
      52469b4f
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · 026f8f61
      Linus Torvalds authored
      Pull input updates from Dmitry Torokhov:
       "A bit later than I would want, but the changes are very minor - a few
        new device IDs for new hardware in existing drivers, fix for battery
        in Wacom devices not be considered system battery and cause emergency
        hibernations, and a couple of other bug fixes"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
        Input: ALPS - add support for model found on Dell XT2
        Input: wacom - add support for ISDv4 0x10E sensor
        Input: wacom - add support for ISDv4 0x10F sensor
        Input: wacom - export battery scope
        Input: cm109 - convert high volume dev_err() to dev_err_ratelimited()
        Input: move name/timer init to input_alloc_dev()
        Input: i8042 - i8042_flush fix for a full 8042 buffer
        Input: pxa27x_keypad - fix NULL pointer dereference
      026f8f61
    • Linus Torvalds's avatar
      Merge tag 'pm+acpi-3.12-late' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · e7647027
      Linus Torvalds authored
      Pull ACPI and power management fixes from Rafael J Wysocki:
       "Last-minute ACPI and power management fixes for 3.12
      
         - Revert epoll and select commits related to the freezer, introduced
           during the 3.11 cycle, that cause mysterious user space breakage to
           occur during resume from suspend to RAM for multiple users of
           32-bit x86 systems.  Material for 3.11.y stable kernels.
      
         - Revert a recent ACPI-based PCI hotplug (ACPIPHP) commit that was
           part of boot problem fixes for one machine, but turns out to cause
           issues with hotplug on Thunderbolt chains with multiple devices.
           It also turns out to be unnecessary after another fix in the same
           area that went in later.  From Mika Westerberg"
      
      * tag 'pm+acpi-3.12-late' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        Revert "ACPI / hotplug / PCI: Avoid doing too much for spurious notifies"
        Revert "select: use freezable blocking call"
        Revert "epoll: use freezable blocking call"
      e7647027
    • Russell King's avatar
      ALSA: fix oops in snd_pcm_info() caused by ASoC DPCM · a4461f41
      Russell King authored
      Unable to handle kernel NULL pointer dereference at virtual address 00000008
      pgd = d5300000
      [00000008] *pgd=0d265831, *pte=00000000, *ppte=00000000
      Internal error: Oops: 17 [#1] PREEMPT ARM
      CPU: 0 PID: 2295 Comm: vlc Not tainted 3.11.0+ #755
      task: dee74800 ti: e213c000 task.ti: e213c000
      PC is at snd_pcm_info+0xc8/0xd8
      LR is at 0x30232065
      pc : [<c031b52c>]    lr : [<30232065>]    psr: a0070013
      sp : e213dea8  ip : d81cb0d0  fp : c05f7678
      r10: c05f7770  r9 : fffffdfd  r8 : 00000000
      r7 : d8a968a8  r6 : d8a96800  r5 : d8a96200  r4 : d81cb000
      r3 : 00000000  r2 : d81cb000  r1 : 00000001  r0 : d8a96200
      Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
      Control: 10c5387d  Table: 15300019  DAC: 00000015
      Process vlc (pid: 2295, stack limit = 0xe213c248)
      [<c031b52c>] (snd_pcm_info) from [<c031b570>] (snd_pcm_info_user+0x34/0x9c)
      [<c031b570>] (snd_pcm_info_user) from [<c03164a4>] (snd_pcm_control_ioctl+0x274/0x280)
      [<c03164a4>] (snd_pcm_control_ioctl) from [<c0311458>] (snd_ctl_ioctl+0xc0/0x55c)
      [<c0311458>] (snd_ctl_ioctl) from [<c00eca84>] (do_vfs_ioctl+0x80/0x31c)
      [<c00eca84>] (do_vfs_ioctl) from [<c00ecd5c>] (SyS_ioctl+0x3c/0x60)
      [<c00ecd5c>] (SyS_ioctl) from [<c000e500>] (ret_fast_syscall+0x0/0x48)
      Code: e1a00005 e59530dc e3a01001 e1a02004 (e5933008)
      ---[ end trace cb3d9bdb8dfefb3c ]---
      
      This is provoked when the ASoC front end is open along with its backend,
      (which causes the backend to have a runtime assigned to it) and then the
      SNDRV_CTL_IOCTL_PCM_INFO is requested for the (visible) backend device.
      
      Resolve this by ensuring that ASoC internal backend devices are not
      visible to userspace, just as the commentry for snd_pcm_new_internal()
      says it should be.
      Signed-off-by: default avatarRussell King <rmk+kernel@arm.linux.org.uk>
      Acked-by: default avatarMark Brown <broonie@linaro.org>
      Cc: <stable@vger.kernel.org> [v3.4+]
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      a4461f41