1. 13 Feb, 2014 1 commit
    • Tejun Heo's avatar
      Revert "cgroup: use an ordered workqueue for cgroup destruction" · 1a11533f
      Tejun Heo authored
      This reverts commit ab3f5faa.
      Explanation from Hugh:
      
        It's because more thorough testing, by others here, found that it
        wasn't always solving the problem: so I asked Tejun privately to
        hold off from sending it in, until we'd worked out why not.
      
        Most of our testing being on a v3,11-based kernel, it was perfectly
        possible that the problem was merely our own e.g. missing Tejun's
        8a2b7538 ("workqueue: fix ordered workqueues in NUMA setups").
      
        But that turned out not to be enough to fix it either. Then Filipe
        pointed out how percpu_ref_kill_and_confirm() uses call_rcu_sched()
        before we ever get to put the offline on to the workqueue: by the
        time we get to the workqueue, the ordering has already been lost.
      
        So, thanks for the Acks, but I'm afraid that this ordered workqueue
        solution is just not good enough: we should simply forget that patch
        and provide a different answer."
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Hugh Dickins <hughd@google.com>
      1a11533f
  2. 11 Feb, 2014 1 commit
    • Li Zefan's avatar
      cgroup: protect modifications to cgroup_idr with cgroup_mutex · 0ab02ca8
      Li Zefan authored
      Setup cgroupfs like this:
        # mount -t cgroup -o cpuacct xxx /cgroup
        # mkdir /cgroup/sub1
        # mkdir /cgroup/sub2
      
      Then run these two commands:
        # for ((; ;)) { mkdir /cgroup/sub1/tmp && rmdir /mnt/sub1/tmp; } &
        # for ((; ;)) { mkdir /cgroup/sub2/tmp && rmdir /mnt/sub2/tmp; } &
      
      After seconds you may see this warning:
      
      ------------[ cut here ]------------
      WARNING: CPU: 1 PID: 25243 at lib/idr.c:527 sub_remove+0x87/0x1b0()
      idr_remove called for id=6 which is not allocated.
      ...
      Call Trace:
       [<ffffffff8156063c>] dump_stack+0x7a/0x96
       [<ffffffff810591ac>] warn_slowpath_common+0x8c/0xc0
       [<ffffffff81059296>] warn_slowpath_fmt+0x46/0x50
       [<ffffffff81300aa7>] sub_remove+0x87/0x1b0
       [<ffffffff810f3f02>] ? css_killed_work_fn+0x32/0x1b0
       [<ffffffff81300bf5>] idr_remove+0x25/0xd0
       [<ffffffff810f2bab>] cgroup_destroy_css_killed+0x5b/0xc0
       [<ffffffff810f4000>] css_killed_work_fn+0x130/0x1b0
       [<ffffffff8107cdbc>] process_one_work+0x26c/0x550
       [<ffffffff8107eefe>] worker_thread+0x12e/0x3b0
       [<ffffffff81085f96>] kthread+0xe6/0xf0
       [<ffffffff81570bac>] ret_from_fork+0x7c/0xb0
      ---[ end trace 2d1577ec10cf80d0 ]---
      
      It's because allocating/removing cgroup ID is not properly synchronized.
      
      The bug was introduced when we converted cgroup_ida to cgroup_idr.
      While synchronization is already done inside ida_simple_{get,remove}(),
      users are responsible for concurrent calls to idr_{alloc,remove}().
      
      tj: Refreshed on top of b58c8998 ("cgroup: fix error return from
      cgroup_create()").
      
      Fixes: 4e96ee8e ("cgroup: convert cgroup_ida to cgroup_idr")
      Cc: <stable@vger.kernel.org> #3.12+
      Reported-by: default avatarMichal Hocko <mhocko@suse.cz>
      Signed-off-by: default avatarLi Zefan <lizefan@huawei.com>
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      0ab02ca8
  3. 08 Feb, 2014 3 commits
    • Tejun Heo's avatar
      cgroup: fix locking in cgroup_cfts_commit() · 48573a89
      Tejun Heo authored
      cgroup_cfts_commit() walks the cgroup hierarchy that the target
      subsystem is attached to and tries to apply the file changes.  Due to
      the convolution with inode locking, it can't keep cgroup_mutex locked
      while iterating.  It currently holds only RCU read lock around the
      actual iteration and then pins the found cgroup using dget().
      
      Unfortunately, this is incorrect.  Although the iteration does check
      cgroup_is_dead() before invoking dget(), there's nothing which
      prevents the dentry from going away inbetween.  Note that this is
      different from the usual css iterations where css_tryget() is used to
      pin the css - css_tryget() tests whether the css can be pinned and
      fails if not.
      
      The problem can be solved by simply holding cgroup_mutex instead of
      RCU read lock around the iteration, which actually reduces LOC.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Acked-by: default avatarLi Zefan <lizefan@huawei.com>
      Cc: stable@vger.kernel.org
      48573a89
    • Tejun Heo's avatar
      cgroup: fix error return from cgroup_create() · b58c8998
      Tejun Heo authored
      cgroup_create() was returning 0 after allocation failures.  Fix it.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Acked-by: default avatarLi Zefan <lizefan@huawei.com>
      Cc: stable@vger.kernel.org
      b58c8998
    • Tejun Heo's avatar
      cgroup: fix error return value in cgroup_mount() · eb46bf89
      Tejun Heo authored
      When cgroup_mount() fails to allocate an id for the root, it didn't
      set ret before jumping to unlock_drop ending up returning 0 after a
      failure.  Fix it.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Acked-by: default avatarLi Zefan <lizefan@huawei.com>
      Cc: stable@vger.kernel.org
      eb46bf89
  4. 07 Feb, 2014 1 commit
    • Hugh Dickins's avatar
      cgroup: use an ordered workqueue for cgroup destruction · ab3f5faa
      Hugh Dickins authored
      Sometimes the cleanup after memcg hierarchy testing gets stuck in
      mem_cgroup_reparent_charges(), unable to bring non-kmem usage down to 0.
      
      There may turn out to be several causes, but a major cause is this: the
      workitem to offline parent can get run before workitem to offline child;
      parent's mem_cgroup_reparent_charges() circles around waiting for the
      child's pages to be reparented to its lrus, but it's holding cgroup_mutex
      which prevents the child from reaching its mem_cgroup_reparent_charges().
      
      Just use an ordered workqueue for cgroup_destroy_wq.
      
      tj: Committing as the temporary fix until the reverse dependency can
          be removed from memcg.  Comment updated accordingly.
      
      Fixes: e5fca243 ("cgroup: use a dedicated workqueue for cgroup destruction")
      Suggested-by: default avatarFilipe Brandenburger <filbranden@google.com>
      Signed-off-by: default avatarHugh Dickins <hughd@google.com>
      Cc: stable@vger.kernel.org # 3.10+
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      ab3f5faa
  5. 03 Feb, 2014 7 commits
    • Tejun Heo's avatar
      nfs: include xattr.h from fs/nfs/nfs3proc.c · 0a6be655
      Tejun Heo authored
      fs/nfs/nfs3proc.c is making use of xattr but was getting linux/xattr.h
      indirectly through linux/cgroup.h, which will soon drop the inclusion
      of xattr.h.  Explicitly include linux/xattr.h from nfs3proc.c so that
      compilation doesn't fail when linux/cgroup.h drops linux/xattr.h.
      
      As the following cgroup changes will depend on these changes, it
      probably would be easier to route this through cgroup branch.  Would
      that be okay?
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Acked-by: default avatarTrond Myklebust <trond.myklebust@primarydata.com>
      Cc: linux-nfs@vger.kernel.org
      0a6be655
    • Li Zefan's avatar
      cpuset: update MAINTAINERS entry · 230579d7
      Li Zefan authored
      Add mailing list and tree tag to the entry.
      Signed-off-by: default avatarLi Zefan <lizefan@huawei.com>
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      230579d7
    • Tejun Heo's avatar
      arm, pm, vmpressure: add missing slab.h includes · 1ff6bbfd
      Tejun Heo authored
      arch/arm/mach-tegra/pm.c, kernel/power/console.c and mm/vmpressure.c
      were somehow getting slab.h indirectly through cgroup.h which in turn
      was getting it indirectly through xattr.h.  A scheduled cgroup change
      drops xattr.h inclusion from cgroup.h and breaks compilation of these
      three files.  Add explicit slab.h includes to the three files.
      
      A pending cgroup patch depends on this change and it'd be great if
      this can be routed through cgroup/for-3.14-fixes branch.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Acked-by: default avatarStephen Warren <swarren@wwwdotorg.org>
      Cc: Thierry Reding <thierry.reding@gmail.com>
      Cc: linux-tegra@vger.kernel.org
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: linux-pm@vger.kernel.org
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Balbir Singh <bsingharora@gmail.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: cgroups@vger.kernel.org
      1ff6bbfd
    • Linus Torvalds's avatar
      Linus 3.14-rc1 · 38dbfb59
      Linus Torvalds authored
      38dbfb59
    • Linus Torvalds's avatar
      Merge branch 'parisc-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux · 69048e01
      Linus Torvalds authored
      Pull parisc updates from Helge Deller:
       "The three major changes in this patchset is a implementation for
        flexible userspace memory maps, cache-flushing fixes (again), and a
        long-discussed ABI change to make EWOULDBLOCK the same value as
        EAGAIN.
      
        parisc has been the only platform where we had EWOULDBLOCK != EAGAIN
        to keep HP-UX compatibility.  Since we will probably never implement
        full HP-UX support, we prefer to drop this compatibility to make it
        easier for us with Linux userspace programs which mostly never checked
        for both values.  We don't expect major fall-outs because of this
        change, and if we face some, we will simply rebuild the necessary
        applications in the debian archives"
      
      * 'parisc-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
        parisc: add flexible mmap memory layout support
        parisc: Make EWOULDBLOCK be equal to EAGAIN on parisc
        parisc: convert uapi/asm/stat.h to use native types only
        parisc: wire up sched_setattr and sched_getattr
        parisc: fix cache-flushing
        parisc/sti_console: prefer Linux fonts over built-in ROM fonts
      69048e01
    • Mikulas Patocka's avatar
      hpfs: optimize quad buffer loading · 1c0b8a7a
      Mikulas Patocka authored
      HPFS needs to load 4 consecutive 512-byte sectors when accessing the
      directory nodes or bitmaps.  We can't switch to 2048-byte block size
      because files are allocated in the units of 512-byte sectors.
      
      Previously, the driver would allocate a 2048-byte area using kmalloc,
      copy the data from four buffers to this area and eventually copy them
      back if they were modified.
      
      In the current implementation of the buffer cache, buffers are allocated
      in the pagecache.  That means that 4 consecutive 512-byte buffers are
      stored in consecutive areas in the kernel address space.  So, we don't
      need to allocate extra memory and copy the content of the buffers there.
      
      This patch optimizes the code to avoid copying the buffers.  It checks
      if the four buffers are stored in contiguous memory - if they are not,
      it falls back to allocating a 2048-byte area and copying data there.
      Signed-off-by: default avatarMikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      1c0b8a7a
    • Mikulas Patocka's avatar
      hpfs: remember free space · 2cbe5c76
      Mikulas Patocka authored
      Previously, hpfs scanned all bitmaps each time the user asked for free
      space using statfs.  This patch changes it so that hpfs scans the
      bitmaps only once, remembes the free space and on next invocation of
      statfs it returns the value instantly.
      
      New versions of wine are hammering on the statfs syscall very heavily,
      making some games unplayable when they're stored on hpfs, with load
      times in minutes.
      
      This should be backported to the stable kernels because it fixes
      user-visible problem (excessive level load times in wine).
      Signed-off-by: default avatarMikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      2cbe5c76
  6. 02 Feb, 2014 12 commits
  7. 01 Feb, 2014 12 commits
  8. 31 Jan, 2014 3 commits
    • Linus Torvalds's avatar
      Merge tag 'nfs-for-3.14-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · 8a1f006a
      Linus Torvalds authored
      Pull NFS client bugfixes from Trond Myklebust:
       "Highlights:
      
         - Fix several races in nfs_revalidate_mapping
         - NFSv4.1 slot leakage in the pNFS files driver
         - Stable fix for a slot leak in nfs40_sequence_done
         - Don't reject NFSv4 servers that support ACLs with only ALLOW aces"
      
      * tag 'nfs-for-3.14-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
        nfs: initialize the ACL support bits to zero.
        NFSv4.1: Cleanup
        NFSv4.1: Clean up nfs41_sequence_done
        NFSv4: Fix a slot leak in nfs40_sequence_done
        NFSv4.1 free slot before resending I/O to MDS
        nfs: add memory barriers around NFS_INO_INVALID_DATA and NFS_INO_INVALIDATING
        NFS: Fix races in nfs_revalidate_mapping
        sunrpc: turn warn_gssd() log message into a dprintk()
        NFS: fix the handling of NFS_INO_INVALID_DATA flag in nfs_revalidate_mapping
        nfs: handle servers that support only ALLOW ACE type.
      8a1f006a
    • Linus Torvalds's avatar
      Merge tag 'sound-fix-3.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 14864a52
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "The big chunks here are the updates for oxygen driver for Xonar DG
        devices, which were slipped from the previous pull request.  They are
        device-specific and thus not too dangerous.
      
        Other than that, all patches are small bug fixes, mainly for Samsung
        build fixes, a few HD-audio enhancements, and other misc ASoC fixes.
        (And this time ASoC merge is less than Octopus, lucky seven :)"
      
      * tag 'sound-fix-3.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (42 commits)
        ALSA: hda/hdmi - allow PIN_OUT to be dynamically enabled
        ALSA: hda - add headset mic detect quirks for another Dell laptop
        ALSA: oxygen: Xonar DG(X): cleanup and minor changes
        ALSA: oxygen: Xonar DG(X): modify high-pass filter control
        ALSA: oxygen: Xonar DG(X): modify input select functions
        ALSA: oxygen: Xonar DG(X): modify capture volume functions
        ALSA: oxygen: Xonar DG(X): use headphone volume control
        ALSA: oxygen: Xonar DG(X): modify playback output select
        ALSA: oxygen: Xonar DG(X): capture from I2S channel 1, not 2
        ALSA: oxygen: Xonar DG(X): move the mixer code into another file
        ALSA: oxygen: modify CS4245 register dumping function
        ALSA: oxygen: modify adjust_dg_dac_routing function
        ALSA: oxygen: Xonar DG(X): modify DAC/ADC parameters function
        ALSA: oxygen: Xonar DG(X): modify initialization functions
        ALSA: oxygen: Xonar DG(X): add new CS4245 SPI functions
        ALSA: oxygen: additional definitions for the Xonar DG/DGX card
        ALSA: oxygen: change description of the xonar_dg.c file
        ALSA: oxygen: export oxygen_update_dac_routing symbol
        ALSA: oxygen: add mute mask for the OXYGEN_PLAY_ROUTING register
        ALSA: oxygen: modify the SPI writing function
        ...
      14864a52
    • Linus Torvalds's avatar
      Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending · 4e13c5d0
      Linus Torvalds authored
      Pull SCSI target updates from Nicholas Bellinger:
       "The highlights this round include:
      
        - add support for SCSI Referrals (Hannes)
        - add support for T10 DIF into target core (nab + mkp)
        - add support for T10 DIF emulation in FILEIO + RAMDISK backends (Sagi + nab)
        - add support for T10 DIF -> bio_integrity passthrough in IBLOCK backend (nab)
        - prep changes to iser-target for >= v3.15 T10 DIF support (Sagi)
        - add support for qla2xxx N_Port ID Virtualization - NPIV (Saurav + Quinn)
        - allow percpu_ida_alloc() to receive task state bitmask (Kent)
        - fix >= v3.12 iscsi-target session reset hung task regression (nab)
        - fix >= v3.13 percpu_ref se_lun->lun_ref_active race (nab)
        - fix a long-standing network portal creation race (Andy)"
      
      * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (51 commits)
        target: Fix percpu_ref_put race in transport_lun_remove_cmd
        target/iscsi: Fix network portal creation race
        target: Report bad sector in sense data for DIF errors
        iscsi-target: Convert gfp_t parameter to task state bitmask
        iscsi-target: Fix connection reset hang with percpu_ida_alloc
        percpu_ida: Make percpu_ida_alloc + callers accept task state bitmask
        iscsi-target: Pre-allocate more tags to avoid ack starvation
        qla2xxx: Configure NPIV fc_vport via tcm_qla2xxx_npiv_make_lport
        qla2xxx: Enhancements to enable NPIV support for QLOGIC ISPs with TCM/LIO.
        qla2xxx: Fix scsi_host leak on qlt_lport_register callback failure
        IB/isert: pass scatterlist instead of cmd to fast_reg_mr routine
        IB/isert: Move fastreg descriptor creation to a function
        IB/isert: Avoid frwr notation, user fastreg
        IB/isert: seperate connection protection domains and dma MRs
        tcm_loop: Enable DIF/DIX modes in SCSI host LLD
        target/rd: Add DIF protection into rd_execute_rw
        target/rd: Add support for protection SGL setup + release
        target/rd: Refactor rd_build_device_space + rd_release_device_space
        target/file: Add DIF protection support to fd_execute_rw
        target/file: Add DIF protection init/format support
        ...
      4e13c5d0