1. 11 Dec, 2014 7 commits
    • Al Viro's avatar
      fs/namei.c: new helper (path_cleanup()) · 893b7775
      Al Viro authored
      All callers of path_init() proceed to do the identical cleanup when
      they are done with nameidata.  Don't open-code it...
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      893b7775
    • Al Viro's avatar
    • Al Viro's avatar
      make default ->i_fop have ->open() fail with ENXIO · bd9b51e7
      Al Viro authored
      As it is, default ->i_fop has NULL ->open() (along with all other methods).
      The only case where it matters is reopening (via procfs symlink) a file that
      didn't get its ->f_op from ->i_fop - anything else will have ->i_fop assigned
      to something sane (default would fail on read/write/ioctl/etc.).
      
      	Unfortunately, such case exists - alloc_file() users, especially
      anon_get_file() ones.  There we have tons of opened files of very different
      kinds sharing the same inode.  As the result, attempt to reopen those via
      procfs succeeds and you get a descriptor you can't do anything with.
      
      	Moreover, in case of sockets we set ->i_fop that will only be used
      on such reopen attempts - and put a failing ->open() into it to make sure
      those do not succeed.
      
      	It would be simpler to put such ->open() into default ->i_fop and leave
      it unchanged both for anon inode (as we do anyway) and for socket ones.  Result:
      	* everything going through do_dentry_open() works as it used to
      	* sock_no_open() kludge is gone
      	* attempts to reopen anon-inode files fail as they really ought to
      	* ditto for aio_private_file()
      	* ditto for perfmon - this one actually tried to imitate sock_no_open()
      trick, but failed to set ->i_fop, so in the current tree reopens succeed and
      yield completely useless descriptor.  Intent clearly had been to fail with
      -ENXIO on such reopens; now it actually does.
      	* everything else that used alloc_file() keeps working - it has ->i_fop
      set for its inodes anyway
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      bd9b51e7
    • Al Viro's avatar
      1f55a6ec
    • Al Viro's avatar
      Merge branch 'nsfs' into for-next · 707c5960
      Al Viro authored
      707c5960
    • Al Viro's avatar
      kill proc_ns completely · 3d3d35b1
      Al Viro authored
      procfs inodes need only the ns_ops part; nsfs inodes don't need it at all
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      3d3d35b1
    • Al Viro's avatar
      take the targets of /proc/*/ns/* symlinks to separate fs · e149ed2b
      Al Viro authored
      New pseudo-filesystem: nsfs.  Targets of /proc/*/ns/* live there now.
      It's not mountable (not even registered, so it's not in /proc/filesystems,
      etc.).  Files on it *are* bindable - we explicitly permit that in do_loopback().
      
      This stuff lives in fs/nsfs.c now; proc_ns_fget() moved there as well.
      get_proc_ns() is a macro now (it's simply returning ->i_private; would
      have been an inline, if not for header ordering headache).
      proc_ns_inode() is an ex-parrot.  The interface used in procfs is
      ns_get_path(path, task, ops) and ns_get_name(buf, size, task, ops).
      
      Dentries and inodes are never hashed; a non-counting reference to dentry
      is stashed in ns_common (removed by ->d_prune()) and reused by ns_get_path()
      if present.  See ns_get_path()/ns_prune_dentry/nsfs_evict() for details
      of that mechanism.
      
      As the result, proc_ns_follow_link() has stopped poking in nd->path.mnt;
      it does nd_jump_link() on a consistent <vfsmount,dentry> pair it gets
      from ns_get_path().
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      e149ed2b
  2. 09 Dec, 2014 5 commits
  3. 04 Dec, 2014 8 commits
  4. 27 Nov, 2014 9 commits
  5. 23 Nov, 2014 11 commits
    • Linus Torvalds's avatar
      Linux 3.18-rc6 · 5d01410f
      Linus Torvalds authored
      5d01410f
    • Andy Lutomirski's avatar
      uprobes, x86: Fix _TIF_UPROBE vs _TIF_NOTIFY_RESUME · 82975bc6
      Andy Lutomirski authored
      x86 call do_notify_resume on paranoid returns if TIF_UPROBE is set but
      not on non-paranoid returns.  I suspect that this is a mistake and that
      the code only works because int3 is paranoid.
      
      Setting _TIF_NOTIFY_RESUME in the uprobe code was probably a workaround
      for the x86 bug.  With that bug fixed, we can remove _TIF_NOTIFY_RESUME
      from the uprobes code.
      Reported-by: default avatarOleg Nesterov <oleg@redhat.com>
      Acked-by: default avatarSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      Acked-by: default avatarBorislav Petkov <bp@suse.de>
      Signed-off-by: default avatarAndy Lutomirski <luto@amacapital.net>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      82975bc6
    • Thomas Gleixner's avatar
      sched: Provide update_curr callbacks for stop/idle scheduling classes · 90e362f4
      Thomas Gleixner authored
      Chris bisected a NULL pointer deference in task_sched_runtime() to
      commit 6e998916 'sched/cputime: Fix clock_nanosleep()/clock_gettime()
      inconsistency'.
      
      Chris observed crashes in atop or other /proc walking programs when he
      started fork bombs on his machine.  He assumed that this is a new exit
      race, but that does not make any sense when looking at that commit.
      
      What's interesting is that, the commit provides update_curr callbacks
      for all scheduling classes except stop_task and idle_task.
      
      While nothing can ever hit that via the clock_nanosleep() and
      clock_gettime() interfaces, which have been the target of the commit in
      question, the author obviously forgot that there are other code paths
      which invoke task_sched_runtime()
      
      do_task_stat(()
       thread_group_cputime_adjusted()
         thread_group_cputime()
           task_cputime()
             task_sched_runtime()
              if (task_current(rq, p) && task_on_rq_queued(p)) {
                update_rq_clock(rq);
                up->sched_class->update_curr(rq);
              }
      
      If the stats are read for a stomp machine task, aka 'migration/N' and
      that task is current on its cpu, this will happily call the NULL pointer
      of stop_task->update_curr.  Ooops.
      
      Chris observation that this happens faster when he runs the fork bomb
      makes sense as the fork bomb will kick migration threads more often so
      the probability to hit the issue will increase.
      
      Add the missing update_curr callbacks to the scheduler classes stop_task
      and idle_task.  While idle tasks cannot be monitored via /proc we have
      other means to hit the idle case.
      
      Fixes: 6e998916 'sched/cputime: Fix clock_nanosleep()/clock_gettime() inconsistency'
      Reported-by: default avatarChris Mason <clm@fb.com>
      Reported-and-tested-by: default avatarBorislav Petkov <bp@alien8.de>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Stanislaw Gruszka <sgruszka@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      90e362f4
    • Linus Torvalds's avatar
      Merge branch 'x86-traps' (trap handling from Andy Lutomirski) · 00c89b2f
      Linus Torvalds authored
      Merge x86-64 iret fixes from Andy Lutomirski:
       "This addresses the following issues:
      
         - an unrecoverable double-fault triggerable with modify_ldt.
         - invalid stack usage in espfix64 failed IRET recovery from IST
           context.
         - invalid stack usage in non-espfix64 failed IRET recovery from IST
           context.
      
        It also makes a good but IMO scary change: non-espfix64 failed IRET
        will now report the correct error.  Hopefully nothing depended on the
        old incorrect behavior, but maybe Wine will get confused in some
        obscure corner case"
      
      * emailed patches from Andy Lutomirski <luto@amacapital.net>:
        x86_64, traps: Rework bad_iret
        x86_64, traps: Stop using IST for #SS
        x86_64, traps: Fix the espfix64 #DF fixup and rewrite it in C
      00c89b2f
    • Andy Lutomirski's avatar
      x86_64, traps: Rework bad_iret · b645af2d
      Andy Lutomirski authored
      It's possible for iretq to userspace to fail.  This can happen because
      of a bad CS, SS, or RIP.
      
      Historically, we've handled it by fixing up an exception from iretq to
      land at bad_iret, which pretends that the failed iret frame was really
      the hardware part of #GP(0) from userspace.  To make this work, there's
      an extra fixup to fudge the gs base into a usable state.
      
      This is suboptimal because it loses the original exception.  It's also
      buggy because there's no guarantee that we were on the kernel stack to
      begin with.  For example, if the failing iret happened on return from an
      NMI, then we'll end up executing general_protection on the NMI stack.
      This is bad for several reasons, the most immediate of which is that
      general_protection, as a non-paranoid idtentry, will try to deliver
      signals and/or schedule from the wrong stack.
      
      This patch throws out bad_iret entirely.  As a replacement, it augments
      the existing swapgs fudge into a full-blown iret fixup, mostly written
      in C.  It's should be clearer and more correct.
      Signed-off-by: default avatarAndy Lutomirski <luto@amacapital.net>
      Reviewed-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      b645af2d
    • Andy Lutomirski's avatar
      x86_64, traps: Stop using IST for #SS · 6f442be2
      Andy Lutomirski authored
      On a 32-bit kernel, this has no effect, since there are no IST stacks.
      
      On a 64-bit kernel, #SS can only happen in user code, on a failed iret
      to user space, a canonical violation on access via RSP or RBP, or a
      genuine stack segment violation in 32-bit kernel code.  The first two
      cases don't need IST, and the latter two cases are unlikely fatal bugs,
      and promoting them to double faults would be fine.
      
      This fixes a bug in which the espfix64 code mishandles a stack segment
      violation.
      
      This saves 4k of memory per CPU and a tiny bit of code.
      Signed-off-by: default avatarAndy Lutomirski <luto@amacapital.net>
      Reviewed-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      6f442be2
    • Andy Lutomirski's avatar
      x86_64, traps: Fix the espfix64 #DF fixup and rewrite it in C · af726f21
      Andy Lutomirski authored
      There's nothing special enough about the espfix64 double fault fixup to
      justify writing it in assembly.  Move it to C.
      
      This also fixes a bug: if the double fault came from an IST stack, the
      old asm code would return to a partially uninitialized stack frame.
      
      Fixes: 3891a04aSigned-off-by: default avatarAndy Lutomirski <luto@amacapital.net>
      Reviewed-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      af726f21
    • Linus Torvalds's avatar
      Merge tag 'armsoc-for-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · 27946315
      Linus Torvalds authored
      Pull ARM SoC fixes from Olof Johansson:
       "A collection of fixes this week:
      
         - A set of clock fixes for shmobile platforms
         - A fix for tegra that moves serial port labels to be per board.
           We're choosing to merge this for 3.18 because the labels will start
           being parsed in 3.19, and without this change serial port numbers
           that used to be stable since the dawn of time will change numbers.
         - A few other DT tweaks for Tegra.
         - A fix for multi_v7_defconfig that makes it stop spewing cpufreq
           errors on Arndale (Exynos)"
      
      * tag 'armsoc-for-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
        ARM: multi_v7_defconfig: fix failure setting CPU voltage by enabling dependent I2C controller
        ARM: tegra: roth: Fix SD card VDD_IO regulator
        ARM: tegra: Remove eMMC vmmc property for roth/tn7
        ARM: dts: tegra: move serial aliases to per-board
        ARM: tegra: Add serial port labels to Tegra124 DT
        ARM: shmobile: kzm9g legacy: Set i2c clks_per_count to 2
        ARM: shmobile: r8a7740 dtsi: Correct IIC0 parent clock
        ARM: shmobile: r8a7790: Fix SD3CKCR address to device tree
        ARM: shmobile: r8a7740 legacy: Correct IIC0 parent clock
        ARM: shmobile: r8a7740 legacy: Add missing INTCA clock for irqpin module
        ARM: shmobile: r8a7790: Fix SD3CKCR address
        ARM: dts: sun6i: Re-parent ahb1_mux to pll6 as required by dma controller
      27946315
    • Linus Torvalds's avatar
      Merge branch 'for-3.18-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu · 9f2e0f63
      Linus Torvalds authored
      Pull percpu fix from Tejun Heo:
       "This contains one patch to fix a race condition which can lead to
        percpu_ref using a percpu pointer which is corrupted with a set DEAD
        bit.  The bug was introduced while separating out the ATOMIC mode flag
        from the DEAD flag.  The fix is pretty straight forward.
      
        I just committed the patch to the percpu tree but am sending out the
        pull request early as I'll be on vacation for a week.  The patch
        should be fairly safe and while the latency will be higher I'll be
        checking emails"
      
      * 'for-3.18-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
        percpu-ref: fix DEAD flag contamination of percpu pointer
      9f2e0f63
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs · d038a63a
      Linus Torvalds authored
      Pull btrfs deadlock fix from Chris Mason:
       "This has a fix for a long standing deadlock that we've been trying to
        nail down for a while.  It ended up being a bad interaction with the
        fair reader/writer locks and the order btrfs reacquires locks in the
        btree"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
        btrfs: fix lockups from btrfs_clear_path_blocking
      d038a63a
    • Tejun Heo's avatar
      percpu-ref: fix DEAD flag contamination of percpu pointer · 4aab3b5b
      Tejun Heo authored
      While decoupling ATOMIC and DEAD flags, f47ad457 ("percpu_ref:
      decouple switching to percpu mode and reinit") updated
      __ref_is_percpu() so that it only tests ATOMIC flag to determine
      whether the ref is in percpu mode or not; however, while DEAD implies
      ATOMIC, the two flags are set separately during percpu_ref_kill() and
      if __ref_is_percpu() races percpu_ref_kill(), it may see DEAD w/o
      ATOMIC.  Because __ref_is_percpu() returns @ref->percpu_count_ptr
      value verbatim as the percpu pointer after testing ATOMIC, the pointer
      may now be contaminated with the DEAD flag.
      
      This can be fixed by clearing the flag bits before returning the
      pointer which was the fix proposed by Shaohua; however, as DEAD
      implies ATOMIC, we can just test for both flags at once and avoid the
      explicit masking.
      
      Update __ref_is_percpu() so that it tests that both ATOMIC and DEAD
      are clear before returning @ref->percpu_count_ptr as the percpu
      pointer.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Reported-and-Reviewed-by: default avatarShaohua Li <shli@kernel.org>
      Link: http://lkml.kernel.org/r/995deb699f5b873c45d667df4add3b06f73c2c25.1416638887.git.shli@kernel.org
      Fixes: f47ad457 ("percpu_ref: decouple switching to percpu mode and reinit")
      4aab3b5b