1. 04 Sep, 2015 22 commits
    • scripts: decode_stacktrace: fix ARM architecture decoding · e260fe01
      Robert Jarzmik authored
      Fix the stack decoder for the ARM architecture.
      An ARM stack trace is formatted as:
      
      [   81.547704] [<c023eb04>] (bucket_find_contain) from [<c023ec88>] (check_sync+0x40/0x4f8)
      [   81.559668] [<c023ec88>] (check_sync) from [<c023f8c4>] (debug_dma_sync_sg_for_cpu+0x128/0x194)
      [   81.571583] [<c023f8c4>] (debug_dma_sync_sg_for_cpu) from [<c0327dec>] (__videobuf_s
      
      The current script doesn't expect the symbols to be enclosed in
      parentheses, and triggers the following errors:
      
        awk: cmd. line:1: error: Unmatched ( or \(: / (check_sync$/
        [   81.547704] (bucket_find_contain) from (check_sync+0x40/0x4f8)
      
      Fix it by stripping the leading and trailing parentheses from each
      symbol name.
      
      As a side note, this probably comes from the function
      dump_backtrace_entry(), which is implemented differently for each
      architecture.  That makes a single decoding script a bit of a challenge.
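
The parenthesis stripping described above can be sketched in C. This is only an illustration of the idea, not the script's actual change (the real script is a shell script); the function name strip_parens is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy `sym` into `out` (capacity `outsz`), dropping one leading '(' and
 * one trailing ')' if present.  Returns the number of characters written,
 * not counting the terminating NUL.  Hypothetical helper for illustration.
 */
size_t strip_parens(const char *sym, char *out, size_t outsz)
{
	size_t len = strlen(sym);

	if (outsz == 0)
		return 0;
	if (len > 0 && sym[0] == '(') {
		sym++;
		len--;
	}
	if (len > 0 && sym[len - 1] == ')')
		len--;
	if (len >= outsz)
		len = outsz - 1;	/* truncate rather than overflow */
	memcpy(out, sym, len);
	out[len] = '\0';
	return len;
}
```

With this helper, "(check_sync+0x40/0x4f8)" becomes "check_sync+0x40/0x4f8", which no longer contains an unmatched parenthesis when used inside a regular expression.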
      Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Russell King <rmk+kernel@arm.linux.org.uk>
      Cc: Michal Marek <mmarek@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e260fe01
    • scripts/Lindent: handle missing indent gracefully · fa70900e
      Jean Delvare authored
      If indent is not found, bail out immediately instead of spitting random
      shell script error messages.
      Signed-off-by: Jean Delvare <jdelvare@suse.de>
      Cc: Joe Perches <joe@perches.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      fa70900e
    • kerneldoc: Convert error messages to GNU error message format · d40e1e65
      Bart Van Assche authored
      Editors like emacs and vi recognize a number of error message formats.
      The format used by the kerneldoc tool is not recognized by emacs.
      
      Change the kerneldoc error message format to the GNU style such that the
      emacs prev-error and next-error commands can be used to navigate through
      kerneldoc error messages.  For more information about the GNU error
      message format, see also
        https://www.gnu.org/prep/standards/html_node/Errors.html.
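
As a rough illustration of the target format, the sketch below builds a message of the form "file:line: severity: text" in C. kernel-doc itself is a Perl script; the function name format_gnu_diag and its parameters are hypothetical.

```c
#include <stdio.h>

/*
 * Format a diagnostic in the GNU style "FILE:LINE: SEVERITY: MESSAGE".
 * Returns the value of snprintf(), i.e. the length the full message needs.
 * Hypothetical helper for illustration only.
 */
int format_gnu_diag(char *buf, size_t bufsz, const char *file,
		    unsigned int line, const char *severity, const char *msg)
{
	return snprintf(buf, bufsz, "%s:%u: %s: %s", file, line, severity, msg);
}
```

Editors that recognize this layout can jump straight to the file and line that precede the first two colons.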
      
      This patch has been generated via the following sed command:
      
        sed -i.orig 's/Error(\${file}:\$.):/\${file}:\$.: error:/g;s/Warning(\${file}:\$.):/\${file}:\$.: warning:/g;s/Warning(\${file}):/\${file}:1: warning:/g;s/Info(\${file}:\$.):/\${file}:\$.: info:/g' scripts/kernel-doc
      Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
      Cc: Johannes Berg <johannes.berg@intel.com>
      Acked-by: Randy Dunlap <rdunlap@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d40e1e65
    • scripts/spelling.txt: spelling of uninitialized · c22b6ae6
      Sudip Mukherjee authored
      I just misspelled uninitialized as unintialized.  Fortunately I noticed
      it in my final review.
      Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c22b6ae6
    • scripts/spelling.txt: add misspelled words for check · 779a6ce8
      Maninder Singh authored
      Misspelled words for check:
       chcek
       chck
       cehck
      
      I made these spelling mistakes myself in patch changelogs, so I
      suggest adding them to spelling.txt so that checkpatch.pl warns about
      them earlier.  References:
      
      ./arch/powerpc/kernel/exceptions-64e.S:456: . . . make sure you chcek
      https://lkml.org/lkml/2015/6/25/289
      ./arch/x86/mm/pageattr.c:1368: * No need to cehck in that case
      
      [akpm@linux-foundation.org: add whcih->which, whcih I always get wrong]
      Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
      Acked-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      779a6ce8
    • fsnotify: get rid of fsnotify_destroy_mark_locked() · 4712e722
      Jan Kara authored
      fsnotify_destroy_mark_locked() is subtle to use because it temporarily
      releases group->mark_mutex.  To avoid future problems with this
      function, split it into two.
      
      fsnotify_detach_mark() is the part that needs group->mark_mutex and
      fsnotify_free_mark() is the part that must be called outside of
      group->mark_mutex.  This way it's much clearer what's going on and we
      also avoid some pointless acquisitions of group->mark_mutex.
      Signed-off-by: Jan Kara <jack@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      4712e722
    • fsnotify: remove mark->free_list · 925d1132
      Jan Kara authored
      The free list is used when all marks on a given inode / mount should be
      destroyed because the inode / mount is going away.  However, with some
      care we can free all of the marks without using a special list.
      Signed-off-by: Jan Kara <jack@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      925d1132
    • fsnotify: document mark locking · 1e39fc01
      Jan Kara authored
      Signed-off-by: Jan Kara <jack@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1e39fc01
    • fsnotify: fix check in inotify fdinfo printing · 3c53e514
      Jan Kara authored
      A check in inotify_fdinfo() checking whether mark is valid was always
      true due to a bug.  Luckily we can never get to invalidated marks since
      we hold mark_mutex and invalidated marks get removed from the group list
      when they are invalidated under that mutex.
      
      Anyway, fix the check to make the code more future proof.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      3c53e514
    • fs/notify: optimize inotify/fsnotify code for unwatched files · 7c49b861
      Dave Hansen authored
      I have a _tiny_ microbenchmark that sits in a loop and writes single
      bytes to a file.  Writing one byte to a tmpfs file is around 2x slower
      than reading one byte from a file, which is a _bit_ more than I expected.
      This is a dumb benchmark, but I think it's hard to deny that write() is
      a hot path and we should avoid unnecessary overhead there.
      
      I did a 'perf record' of 30-second samples of read and write.  The top
      item in a diffprofile is srcu_read_lock() from fsnotify().  There are
      active inotify fd's from systemd, but nothing is actually listening to
      the file or its part of the filesystem.
      
      I *think* we can avoid taking the srcu_read_lock() for the common case
      where there are no actual marks on the file.  This means that there will
      both be nothing to notify for *and* implies that there is no need for
      clearing the ignore mask.
      
      This patch gave a 13.1% speedup in writes/second on my test, which is an
      improvement from the 10.8% that I saw with the last version.
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Reviewed-by: Jan Kara <jack@suse.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: John McCutchan <john@johnmccutchan.com>
      Cc: Robert Love <rlove@rlove.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      7c49b861
    • drivers/video/console: add negative dependency for VGA_CONSOLE on ARC · 031e29b5
      Yuriy Kolerov authored
      Architectures which support VGA console must define the screen_info
      structure from "uapi/linux/screen_info.h".  Otherwise an undefined
      symbol error occurs.  Usually it's defined in "setup.c" for each
      architecture.
      
      If an architecture does not support VGA console (ARC's case) there are 2
      ways: define a dummy instance of screen_info or add a negative
      dependency for VGA_CONSOLE to prevent selecting this option.
      
      I've implemented the second way.  However the best solution is to add
      HAVE_VGA_CONSOLE option for targets which support VGA console.  Then
      turn off VGA_CONSOLE by default and add dependency to HAVE_VGA_CONSOLE.
      But right now it's better to just add a negative dependency for ARC and
      then consider how to collaborate about this issue with maintainers of
      other architectures.
      Signed-off-by: Yuriy Kolerov <yuriy.kolerov@synopsys.com>
      Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
      Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
      Cc: Jaya Kumar <jayalk@intworks.biz>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      031e29b5
    • capabilities: add a securebit to disable PR_CAP_AMBIENT_RAISE · 746bf6d6
      Andy Lutomirski authored
      Per Andrew Morgan's request, add a securebit to allow admins to disable
      PR_CAP_AMBIENT_RAISE.  This securebit will prevent processes from adding
      capabilities to their ambient set.
      
      For simplicity, this disables PR_CAP_AMBIENT_RAISE entirely rather than
      just disabling setting previously cleared bits.
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Acked-by: Andrew G. Morgan <morgan@kernel.org>
      Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Serge Hallyn <serge.hallyn@canonical.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Aaron Jones <aaronmdjones@gmail.com>
      Cc: Ted Ts'o <tytso@mit.edu>
      Cc: Andrew G. Morgan <morgan@kernel.org>
      Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
      Cc: Austin S Hemmelgarn <ahferroin7@gmail.com>
      Cc: Markku Savela <msa@moth.iki.fi>
      Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: James Morris <james.l.morris@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      746bf6d6
    • selftests/capabilities: Add tests for capability evolution · 32ae976e
      Andy Lutomirski authored
      This test focuses on ambient capabilities.  It requires either root or
      the ability to create user namespaces.  Some of the test cases will be
      skipped for nonroot users.
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Acked-by: Kees Cook <keescook@chromium.org>
      Cc: Christoph Lameter <cl@linux.com> # Original author
      Cc: Serge E. Hallyn <serge.hallyn@ubuntu.com>
      Cc: James Morris <james.l.morris@oracle.com>
      Cc: Shuah Khan <shuahkh@osg.samsung.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      32ae976e
    • capabilities: ambient capabilities · 58319057
      Andy Lutomirski authored
      Credit where credit is due: this idea comes from Christoph Lameter with
      a lot of valuable input from Serge Hallyn.  This patch is heavily based
      on Christoph's patch.
      
      ===== The status quo =====
      
      On Linux, there are a number of capabilities defined by the kernel.  To
      perform various privileged tasks, processes can wield capabilities that
      they hold.
      
      Each task has four capability masks: effective (pE), permitted (pP),
      inheritable (pI), and a bounding set (X).  When the kernel checks for a
      capability, it checks pE.  The other capability masks serve to modify
      what capabilities can be in pE.
      
      Any task can remove capabilities from pE, pP, or pI at any time.  If a
      task has a capability in pP, it can add that capability to pE and/or pI.
      If a task has CAP_SETPCAP, then it can add any capability to pI, and it
      can remove capabilities from X.
      
      Tasks are not the only things that can have capabilities; files can also
      have capabilities.  A file can have no capability information at all [1].
      If a file has capability information, then it has a permitted mask (fP)
      and an inheritable mask (fI) as well as a single effective bit (fE) [2].
      File capabilities modify the capabilities of tasks that execve(2) them.
      
      A task that successfully calls execve has its capabilities modified for
      the file ultimately being executed (i.e.  the binary itself if that
      binary is ELF or for the interpreter if the binary is a script.) [3] In
      the capability evolution rules, for each mask Z, pZ represents the old
      value and pZ' represents the new value.  The rules are:
      
        pP' = (X & fP) | (pI & fI)
        pI' = pI
        pE' = (fE ? pP' : 0)
        X is unchanged
      
      For setuid binaries, fP, fI, and fE are modified by a moderately
      complicated set of rules that emulate POSIX behavior.  Similarly, if
      euid == 0 or ruid == 0, then fP, fI, and fE are modified differently
      (primarily, fP and fI usually end up being the full set).  For nonroot
      users executing binaries with neither setuid nor file caps, fI and fP
      are empty and fE is false.
      
      As an extra complication, if you execute a process as nonroot and fE is
      set, then the "secure exec" rules are in effect: AT_SECURE gets set,
      LD_PRELOAD doesn't work, etc.
      
      This is rather messy.  We've learned that making any changes is
      dangerous, though: if a new kernel version allows an unprivileged
      program to change its security state in a way that persists across
      execution of a setuid program or a program with file caps, this
      persistent state is surprisingly likely to allow setuid or file-capped
      programs to be exploited for privilege escalation.
      
      ===== The problem =====
      
      Capability inheritance is basically useless.
      
      If you aren't root and you execute an ordinary binary, fI is zero, so
      your capabilities have no effect whatsoever on pP'.  This means that you
      can't usefully execute a helper process or a shell command with elevated
      capabilities if you aren't root.
      
      On current kernels, you can sort of work around this by setting fI to
      the full set for most or all non-setuid executable files.  This causes
      pP' = pI for nonroot, and inheritance works.  No one does this because
      it's a PITA and it isn't even supported on most filesystems.
      
      If you try this, you'll discover that every nonroot program ends up with
      secure exec rules, breaking many things.
      
      This is a problem that has bitten many people who have tried to use
      capabilities for anything useful.
      
      ===== The proposed change =====
      
      This patch adds a fifth capability mask called the ambient mask (pA).
      pA does what most people expect pI to do.
      
      pA obeys the invariant that no bit can ever be set in pA if it is not
      set in both pP and pI.  Dropping a bit from pP or pI drops that bit from
      pA.  This ensures that existing programs that try to drop capabilities
      still do so, with a complication.  Because capability inheritance is so
      broken, setting KEEPCAPS, using setresuid to switch to nonroot uids, and
      then calling execve effectively drops capabilities.  Therefore,
      setresuid from root to nonroot conditionally clears pA unless
      SECBIT_NO_SETUID_FIXUP is set.  Processes that don't like this can
      re-add bits to pA afterwards.
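
The pA invariant can be modelled with plain bitmasks. The sketch below is a userspace illustration only, not kernel code; the struct and function names are hypothetical, and one bit per capability in a 64-bit mask is an assumption made for brevity.

```c
#include <stdint.h>

/* Hypothetical model of the three masks involved in the invariant. */
struct task_masks {
	uint64_t permitted;	/* pP */
	uint64_t inheritable;	/* pI */
	uint64_t ambient;	/* pA */
};

/* Raising a bit in pA is refused unless it is set in both pP and pI. */
int raise_ambient(struct task_masks *t, uint64_t bits)
{
	if ((t->permitted & t->inheritable & bits) != bits)
		return -1;
	t->ambient |= bits;
	return 0;
}

/* Dropping bits from pP also drops them from pA. */
void drop_permitted(struct task_masks *t, uint64_t bits)
{
	t->permitted &= ~bits;
	t->ambient &= t->permitted & t->inheritable;
}

/* Dropping bits from pI also drops them from pA. */
void drop_inheritable(struct task_masks *t, uint64_t bits)
{
	t->inheritable &= ~bits;
	t->ambient &= t->permitted & t->inheritable;
}
```

Because every drop re-intersects pA with pP and pI, a program that drops a capability the traditional way cannot leave it behind in pA.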
      
      The capability evolution rules are changed:
      
        pA' = (file caps or setuid or setgid ? 0 : pA)
        pP' = (X & fP) | (pI & fI) | pA'
        pI' = pI
        pE' = (fE ? pP' : pA')
        X is unchanged
      
      If you are nonroot but you have a capability, you can add it to pA.  If
      you do so, your children get that capability in pA, pP, and pE.  For
      example, you can set pA = CAP_NET_BIND_SERVICE, and your children can
      automatically bind low-numbered ports.  Hallelujah!
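
The changed evolution rules can be written out as a small C function. This is a sketch of the formulas above using plain 64-bit masks, not the kernel implementation; the struct layouts, the function name caps_after_exec, and the single "privileged" flag standing in for "file caps or setuid or setgid" are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a task's capability masks. */
struct task_caps {
	uint64_t permitted;	/* pP */
	uint64_t inheritable;	/* pI */
	uint64_t effective;	/* pE */
	uint64_t ambient;	/* pA */
	uint64_t bounding;	/* X  */
};

/* Hypothetical model of the file being executed. */
struct file_caps {
	uint64_t permitted;	/* fP */
	uint64_t inheritable;	/* fI */
	bool effective;		/* fE, a single bit */
	bool privileged;	/* file caps or setuid or setgid */
};

/* Apply the post-patch rules: compute pA', pP', pI', pE'; X is unchanged. */
struct task_caps caps_after_exec(const struct task_caps *old,
				 const struct file_caps *f)
{
	struct task_caps next;

	next.ambient = f->privileged ? 0 : old->ambient;
	next.permitted = (old->bounding & f->permitted) |
			 (old->inheritable & f->inheritable) |
			 next.ambient;
	next.inheritable = old->inheritable;
	next.effective = f->effective ? next.permitted : next.ambient;
	next.bounding = old->bounding;
	return next;
}
```

For a task with bit 10 (the value of CAP_NET_BIND_SERVICE) set in pP, pI, and pA, executing an ordinary binary leaves that bit set in pA', pP', and pE', while executing a setuid or file-capped binary clears pA' and falls back to the old rules.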
      
      Unprivileged users can create user namespaces, map themselves to a
      nonzero uid, and create both privileged (relative to their namespace)
      and unprivileged process trees.  This is currently more or less
      impossible.  Hallelujah!
      
      You cannot use pA to try to subvert a setuid, setgid, or file-capped
      program: if you execute any such program, pA gets cleared and the
      resulting evolution rules are unchanged by this patch.
      
      Users with nonzero pA are unlikely to unintentionally leak that
      capability.  If they run programs that try to drop privileges, dropping
      privileges will still work.
      
      It's worth noting that the degree of paranoia in this patch could
      possibly be reduced without causing serious problems.  Specifically, if
      we allowed pA to persist across executing non-pA-aware setuid binaries
      and across setresuid, then, naively, the only capabilities that could
      leak as a result would be the capabilities in pA, and any attacker
      *already* has those capabilities.  This would make me nervous, though --
      setuid binaries that tried to privilege-separate might fail to do so,
      and putting CAP_DAC_READ_SEARCH or CAP_DAC_OVERRIDE into pA could have
      unexpected side effects.  (Whether these unexpected side effects would
      be exploitable is an open question.) I've therefore taken the more
      paranoid route.  We can revisit this later.
      
      An alternative would be to require PR_SET_NO_NEW_PRIVS before setting
      ambient capabilities.  I think that this would be annoying and would
      make granting otherwise unprivileged users minor ambient capabilities
      (CAP_NET_BIND_SERVICE or CAP_NET_RAW for example) much less useful than
      it is with this patch.
      
      ===== Footnotes =====
      
      [1] Files that are missing the "security.capability" xattr or that have
      unrecognized values for that xattr end up with has_cap set to false.
      The code that does that appears to be complicated for no good reason.
      
      [2] The libcap capability mask parsers and formatters are dangerously
      misleading and the documentation is flat-out wrong.  fE is *not* a mask;
      it's a single bit.  This has probably confused every single person who
      has tried to use file capabilities.
      
      [3] Linux very confusingly processes both the script and the interpreter
      if applicable, for reasons that elude me.  The results from thinking
      about a script's file capabilities and/or setuid bits are mostly
      discarded.
      
      Preliminary userspace code is here, but it needs updating:
      https://git.kernel.org/cgit/linux/kernel/git/luto/util-linux-playground.git/commit/?h=cap_ambient&id=7f5afbd175d2
      
      Here is a test program that can be used to verify the functionality
      (from Christoph):
      
      /*
       * Test program for the ambient capabilities. This program spawns a shell
       * that allows running processes with a defined set of capabilities.
       *
       * (C) 2015 Christoph Lameter <cl@linux.com>
       * Released under: GPL v3 or later.
       *
       *
       * Compile using:
       *
       *	gcc -o ambient_test ambient_test.o -lcap-ng
       *
       * This program must have the following capabilities to run properly:
       * Permissions for CAP_NET_RAW, CAP_NET_ADMIN, CAP_SYS_NICE
       *
       * A command to equip the binary with the right caps is:
       *
       *	setcap cap_net_raw,cap_net_admin,cap_sys_nice+p ambient_test
       *
       *
       * To get a shell with additional caps that can be inherited by other processes:
       *
       *	./ambient_test /bin/bash
       *
       *
       * Verifying that it works:
       *
       * From the bash spawned by ambient_test run
       *
       *	cat /proc/$$/status
       *
       * and have a look at the capabilities.
       */
      
      #include <stdlib.h>
      #include <stdio.h>
      #include <errno.h>
      #include <unistd.h>	/* execv() */
      #include <cap-ng.h>
      #include <sys/prctl.h>
      #include <linux/capability.h>
      
      /*
       * Definitions from the kernel header files. These are going to be removed
       * when the /usr/include files have these defined.
       */
      #define PR_CAP_AMBIENT 47
      #define PR_CAP_AMBIENT_IS_SET 1
      #define PR_CAP_AMBIENT_RAISE 2
      #define PR_CAP_AMBIENT_LOWER 3
      #define PR_CAP_AMBIENT_CLEAR_ALL 4
      
      static void set_ambient_cap(int cap)
      {
      	int rc;
      
      	capng_get_caps_process();
      	rc = capng_update(CAPNG_ADD, CAPNG_INHERITABLE, cap);
      	if (rc) {
      		printf("Cannot add inheritable cap\n");
      		exit(2);
      	}
      	capng_apply(CAPNG_SELECT_CAPS);
      
      	/* Note the two 0s at the end. Kernel checks for these */
      	if (prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, cap, 0, 0)) {
      		perror("Cannot set cap");
      		exit(1);
      	}
      }
      
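      int main(int argc, char **argv)
      {
      	/* A program to execute is required as the first argument */
      	if (argc < 2) {
      		fprintf(stderr, "Usage: %s <program> [args...]\n", argv[0]);
      		return 1;
      	}
      
      	set_ambient_cap(CAP_NET_RAW);
      	set_ambient_cap(CAP_NET_ADMIN);
      	set_ambient_cap(CAP_SYS_NICE);
      
      	printf("Ambient_test forking shell\n");
      	if (execv(argv[1], argv + 1))
      		perror("Cannot exec");
      
      	/* execv() only returns on failure */
      	return 1;
      }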
      
      Signed-off-by: Christoph Lameter <cl@linux.com> # Original author
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
      Acked-by: Kees Cook <keescook@chromium.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Aaron Jones <aaronmdjones@gmail.com>
      Cc: Ted Ts'o <tytso@mit.edu>
      Cc: Andrew G. Morgan <morgan@kernel.org>
      Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
      Cc: Austin S Hemmelgarn <ahferroin7@gmail.com>
      Cc: Markku Savela <msa@moth.iki.fi>
      Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: James Morris <james.l.morris@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      58319057
    • kernel/kthread.c:kthread_create_on_node(): clarify documentation · e9f06986
      Andrew Morton authored
      - Make it clear that the `node' arg refers to memory allocations only:
        kthread_create_on_node() does not pin the new thread to that node's
        CPUs.
      
      - Encourage the use of NUMA_NO_NODE.
      
      [nzimmer@sgi.com: use NUMA_NO_NODE in kthread_create() also]
      Cc: Nathan Zimmer <nzimmer@sgi.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e9f06986
    • mm: check if section present during memory block registering · 04697858
      Yinghai Lu authored
      Tony Luck found that, on his setup, a memory block size of 512M causes
      a crash during boot.
      
        BUG: unable to handle kernel paging request at ffffea0074000020
        IP: get_nid_for_pfn+0x17/0x40
        PGD 128ffcb067 PUD 128ffc9067 PMD 0
        Oops: 0000 [#1] SMP
        Modules linked in:
        CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.2.0-rc8 #1
        ...
        Call Trace:
           ? register_mem_sect_under_node+0x66/0xe0
           register_one_node+0x17b/0x240
           ? pci_iommu_alloc+0x6e/0x6e
           topology_init+0x3c/0x95
           do_one_initcall+0xcd/0x1f0
      
      The system has non-contiguous RAM address ranges:
       BIOS-e820: [mem 0x0000001300000000-0x0000001cffffffff] usable
       BIOS-e820: [mem 0x0000001d70000000-0x0000001ec7ffefff] usable
       BIOS-e820: [mem 0x0000001f00000000-0x0000002bffffffff] usable
       BIOS-e820: [mem 0x0000002c18000000-0x0000002d6fffefff] usable
       BIOS-e820: [mem 0x0000002e00000000-0x00000039ffffffff] usable
      
      So the sections at the start of a memory block may not be present.  For example:
      
          memory block : [0x2c18000000, 0x2c20000000) 512M
      
      The first three sections are not present.
      
      The current register_mem_sect_under_node() assumes the first section is
      present, but the memory block's section number range [start_section_nr,
      end_section_nr] can include sections that are not present.
      
      On architectures that support vmemmap, we don't set up the memmap
      (struct page area) for sections that are not present.
      
      So skip the pfn ranges that belong to absent sections.
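
The skipping logic can be illustrated with a small userspace model. This is a sketch, not the kernel change itself: the function name, the `present` array, and the section size (128 MiB sections of 4 KiB pages, giving 32768 pages per section) are assumptions chosen for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed for illustration: 128 MiB sections made of 4 KiB pages. */
#define PAGES_PER_SECTION 32768ULL

/*
 * Walk the pfn range [start_pfn, end_pfn] and count the pfns that lie in
 * present sections.  `present[n]` says whether section n is present; the
 * caller must size it to cover the whole range.  When a pfn falls in an
 * absent section, jump to the last pfn of that section so the loop
 * increment lands on the first pfn of the next section.
 */
uint64_t count_present_pfns(uint64_t start_pfn, uint64_t end_pfn,
			    const bool *present)
{
	uint64_t pfn, visited = 0;

	for (pfn = start_pfn; pfn <= end_pfn; pfn++) {
		uint64_t section = pfn / PAGES_PER_SECTION;

		if (!present[section]) {
			pfn = (section + 1) * PAGES_PER_SECTION - 1;
			continue;
		}
		visited++;	/* a real walker would inspect the page here */
	}
	return visited;
}
```

For a four-section block whose first three sections are absent, only the pages of the last section are visited, so no struct page in an unmapped memmap area is ever touched.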
      
      [akpm@linux-foundation.org: simplification]
      [rientjes@google.com: more simplification]
      Fixes: bdee237c ("x86: mm: Use 2GB memory block size on large memory x86-64 systems")
      Fixes: 982792c7 ("x86, mm: probe memory block size for generic x86 64bit")
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: David Rientjes <rientjes@google.com>
      Reported-by: Tony Luck <tony.luck@intel.com>
      Tested-by: Tony Luck <tony.luck@intel.com>
      Cc: Greg KH <greg@kroah.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Tested-by: David Rientjes <rientjes@google.com>
      Cc: <stable@vger.kernel.org>	[3.15+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      04697858
    • ocfs2: direct write will call ocfs2_rw_unlock() twice when doing aio+dio · aa1057b3
      Ryan Ding authored
      ocfs2_file_write_iter() is using the wrong return value ('written').  This
      will cause ocfs2_rw_unlock() to be called both in write_iter & end_io,
      triggering a BUG_ON.
      
      This issue was introduced by commit 7da839c4 ("ocfs2: use
      __generic_file_write_iter()").
      
      Orabug: 21612107
      Fixes: 7da839c4 ("ocfs2: use __generic_file_write_iter()")
      Signed-off-by: Ryan Ding <ryan.ding@oracle.com>
      Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      aa1057b3
    • memory-hotplug: add hot-added memory ranges to memblock before allocate node_data for a node. · 7f36e3e5
      Tang Chen authored
      Commit f9126ab9 ("memory-hotplug: fix wrong edge when hot add a new
      node") added the hot-added memory range to memblock after creating the
      pgdat for the new node.
      
      But there is a problem:
      
        add_memory()
        |--> hotadd_new_pgdat()
             |--> free_area_init_node()
                  |--> get_pfn_range_for_nid()
                       |--> find start_pfn and end_pfn in memblock
        |--> ......
        |--> memblock_add_node(start, size, nid)    --------    Here, just too late.
      
      get_pfn_range_for_nid() will find that start_pfn and end_pfn are both 0.
      As a result, when adding memory, dmesg will give the following wrong
      message.
      
        Initmem setup node 5 [mem 0x0000000000000000-0xffffffffffffffff]
        On node 5 totalpages: 0
        Built 5 zonelists in Node order, mobility grouping on.  Total pages: 32588823
        Policy zone: Normal
        init_memory_mapping: [mem 0x60000000000-0x607ffffffff]
      
      The solution is simple: add the memory range to memblock a little
      earlier, before hotadd_new_pgdat().
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
      Cc: Xishi Qiu <qiuxishi@huawei.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
      Cc: Gu Zheng <guz.fnst@cn.fujitsu.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: David Rientjes <rientjes@google.com>
      Cc: <stable@vger.kernel.org>	[4.2.x]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      7f36e3e5
    • Merge tag 'pinctrl-v4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · 88a99886
      Linus Torvalds authored
      Pull pin control updates from Linus Walleij:
       "This is the bulk of pin control changes for the v4.3 development
        cycle.
      
        Like with GPIO it's a lot of stuff.  If my subsystems are any sign of
        the overall tempo of the kernel v4.3 will be a gigantic diff.
      
      [ It looks like 4.3 is calmer than 4.2 in most other subsystems, but
        we'll see - Linus ]
      
        Core changes:
      
         - It is possible to configure groups in debugfs.
      
         - Consolidation of chained IRQ handler install/remove replacing all
           call sites where irq_set_handler_data() and
           irq_set_chained_handler() were done in succession with a combined
           call to irq_set_chained_handler_and_data().  This series was
           created by Thomas Gleixner after the problem was observed by
           Russell King.
      
         - Tglx also made another series of patches switching
           __irq_set_handler_locked() for irq_set_handler_locked() which is
           way cleaner.
      
         - Tglx also wrote a good bunch of patches to make use of
           irq_desc_get_xxx() accessors and avoid looking up irq_descs from
           IRQ numbers.  The goal is to get rid of the irq number from the
           handlers in the IRQ flow which is nice.
      
        Driver feature enhancements:
      
         - Power management support for the SiRF SoC Atlas 7.
      
         - Power down support for the Qualcomm driver.
      
         - Intel Cherryview and Baytrail: switch drivers to use raw spinlocks
           in IRQ handlers to play nice with the realtime patch set.
      
         - Rework and new modes handling for Qualcomm SPMI-MPP.
      
         - Pinconf power source config for SH PFC.
      
        New drivers and subdrivers:
      
         - A new driver for Conexant Digicolor CX92755.
      
         - A new driver for UniPhier PH1-LD4, PH1-Pro4, PH1-sLD8, PH1-Pro5,
           ProXtream2 and PH1-LD6b SoC pin control support.
      
         - Reverse-engineered the S/PDIF settings for the Allwinner sun4i
           driver.
      
         - Support for Qualcomm Technologies QDF2xxx ARM64 SoCs
      
         - A new Freescale i.mx6ul subdriver.
      
        Cleanup:
      
         - Remove platform data support in a number of SH PFC subdrivers"
      
      * tag 'pinctrl-v4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (95 commits)
        pinctrl: at91: fix null pointer dereference
        pinctrl: mediatek: Implement wake handler and suspend resume
        pinctrl: mediatek: Fix multiple registration issue.
        pinctrl: sh-pfc: r8a7794: add USB pin groups
        pinctrl: at91: Use generic irq_{request,release}_resources()
        pinctrl: cherryview: Use raw_spinlock for locking
        pinctrl: baytrail: Use raw_spinlock for locking
        pinctrl: imx6ul: Remove .owner field
        pinctrl: zynq: Fix typos in smc0_nand_grp and smc0_nor_grp
        pinctrl: sh-pfc: Implement pinconf power-source param for voltage switching
        clk: rockchip: add pclk_pd_pmu to the list of rk3288 critical clocks
        pinctrl: sun4i: add spdif to pin description.
        pinctrl: atlas7: clear ugly branch statements for pull and drivestrength
        pinctrl: baytrail: Serialize all register access
        pinctrl: baytrail: Drop FSF mailing address
        pinctrl: rockchip: only enable gpio clock when it setting
        pinctrl/mediatek: fix spelling mistake in dev_err error message
        pinctrl: cherryview: Serialize all register access
        pinctrl: UniPhier: PH1-Pro5: add I2C ch6 pin-mux setting
        pinctrl: nomadik: reflect current input value
        ...
      88a99886
    • Linus Torvalds's avatar
      Merge tag 'gpio-v4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio · 8d2faea6
      Linus Torvalds authored
      Pull GPIO updates from Linus Walleij:
       "This is the bulk of GPIO changes for the v4.3 kernel cycle.
      
        There is quite a lot going on in the GPIO subsystem this merge window,
        so the main matter is described below.
      
        The hits in other subsystems when making the GPIO flags optional are
        all ACKed by their respective subsystem maintainers.
      
        Core changes:
      
         - Root out the wrapper devm_gpiod_get() and gpiod_get() etc versions
           of the descriptor calls that did not use the flags argument on the
           end.  This was around for too long and eventually Uwe Kleine-König
           took the time to clean it out and the last users are removed along
           with the macros in this tag.  In several cases the use of flags
           simplifies the code.  For this reason we have (ACKed) patches
           hitting in DRM, IIO, media, NFC, USB+PHY up until we hammer in the
           nail with removing the macros.
      
         - Add a fat document describing how much ready-made GPIO stuff we
           have in the kernel to discourage people from reinventing a square
           wheel in userspace, as so often happens.
      
         - Create a separate lockdep class for each instance of a GPIO IRQ
           chip instead of using one class for all chips, as the current code
           will not work with systems with several GPIO chips doing lockdep
           debugging.
      
         - Protect against driver unloading also when a GPIO line is only used
           as IRQ for the GPIOLIB_IRQCHIP helpers.
      
         - If the GPIO chip has no designated owner, assign the parent device
           driver owner as owner.
      
         - Consolidation of chained IRQ handler install/remove replacing all
           call sites where irq_set_handler_data() and
           irq_set_chained_handler() were done in succession with a combined
           call to irq_set_chained_handler_and_data().
      
           This series was created by Thomas Gleixner after the problem was
           observed by Russell King.
      
         - Tglx also made another series of patches switching
           __irq_set_handler_locked() for irq_set_handler_locked() which is
           way cleaner.
      
         - Tglx and Jiang Liu wrote a good bunch of patches to make use of
           irq_desc_get_xxx() accessors and avoid looking up irq_descs from
           IRQ numbers.  The goal is to get rid of the irq number from the
           handlers in the IRQ flow which is nice.
      
         - Rob Herring killed off the set_irq_flags() for all GPIO drivers.
           This was an ARM specific function that is replaced with the generic
           irq_modify_status() where special flags are actually needed.
      
         - When an OF node has a pin range for its GPIOs, return -EPROBE_DEFER
           if the pin controller isn't available.  Pretty logical, yet needed
           to be fixed.
      
         - If a driver using GPIOLIB_IRQCHIP has its own irq_*_resources call
           back, then call these instead of the defaults provided by the
           GPIOLIB.
      
         - Fix an undocumented ABI hole: named GPIOs were not properly
           documented.
      
        Driver improvements:
      
         - Add get_direction() support to the generic GPIO driver, it's
           strange that we didn't have that before.
      
         - Make it possible to have input-only GPIO chips using the generic
           GPIO driver.
      
         - Clean out platform data support from the Emma Mobile (EM) driver
      
         - Finegrained runtime PM support for the RCAR driver.
      
         - Support r8a7795 (R-car H3) in the RCAR driver.
      
         - Support interrupts on GPIOs 16 thru 31 in the DaVinci driver.
      
         - Some consolidation and new support in the MPC8xxx driver, we now
           support MPC5125.
      
         - Preempt-RT-friendly patches: the OMAP and MPC8xxx drivers now use
           raw spinlocks, making them work better with the realtime patches.
      
         - Interrupt support for the ETRAXFS GPIO driver.
      
         - Make the ETRAXFS GPIO driver support also ARTPEC-3.
      
         - Interrupt and wakeup support for the BRCMSTB driver, also for
           wakeup from S5 cold boot.
      
         - Mask MXC IRQs during suspend.
      
         - Improve OMAP2 GPIO set_debounce() to work according to spec.
      
         - The VF610 driver handles IRQs properly.
      
        New drivers:
      
         - ZTE ZX GPIO driver"
      
      * tag 'gpio-v4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (87 commits)
        Revert "gpio: extraxfs: fix returnvar.cocci warnings"
        gpio: tc3589x: use static container helper
        gpio: xlp: fix error return code
        gpio: vf610: handle level IRQ's properly
        gpio: max732x: Fix error handling in probe()
        gpio: omap: fix clk_prepare/unprepare usage
        gpio: omap: protect regs access in omap_gpio_irq_handler
        gpio: omap: fix omap2_set_gpio_debounce
        gpio: omap: switch to use platform_get_irq
        gpio: omap: remove wrong irq_domain_remove usage in probe
        gpiolib: add description for gpio irqchip fields in struct gpio_chip
        gpio: extraxfs: fix returnvar.cocci warnings
        gpiolib: irqchip: use different lockdep class for each gpio irqchip
        gpio/grgpio: fix deadlock in grgpio_irq_unmap()
        Documentation: gpio: consumer: describe active low property
        gpio: mxc: fix section mismatch warning
        gpio/mxc: mask gpio interrupts in suspend
        gpio: omap: Fix missing raw locks conversion
        gpio: brcmstb: support wakeup from S5 cold boot
        gpio: brcmstb: Add interrupt and wakeup source support
        ...
      8d2faea6
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile · 02cf1da2
      Linus Torvalds authored
      Pull tile updates from Chris Metcalf:
       "This includes secure computing support as well as miscellaneous minor
        improvements"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
        tile: correct some typos in opcode type names
        tile/vdso: emit a GNU hash as well
        tile: Remove finish_arch_switch
        tile: enable full SECCOMP support
        tile/time: Migrate to new 'set-state' interface
      02cf1da2
    • Linus Torvalds's avatar
      Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · a4fdb2a4
      Linus Torvalds authored
      Pull arm64 updates from Will Deacon:
      
       - Support for new architectural features introduced in ARMv8.1:
         * Privileged Access Never (PAN) to catch user pointer dereferences in
           the kernel
         * Large System Extension (LSE) for building scalable atomics and locks
           (depends on locking/arch-atomic from tip, which is included here)
         * Hardware Dirty Bit Management (DBM) for updating clean PTEs
           automatically
      
       - Move our PSCI implementation out into drivers/firmware/, where it can
         be shared with arch/arm/. RMK has also pulled this component branch
         and has additional patches moving arch/arm/ over. MAINTAINERS is
         updated accordingly.
      
       - Better BUG implementation based on the BRK instruction for trapping
      
       - Leaf TLB invalidation for unmapping user pages
      
       - Support for PROBE_ONLY PCI configurations
      
       - Various cleanups and non-critical fixes, including:
         * Always flush FP/SIMD state over exec()
         * Restrict memblock additions based on range of linear mapping
         * Ensure *(LIST_POISON) generates a fatal fault
         * Context-tracking syscall return no longer corrupts return value when
           not forced on.
         * Alternatives patching synchronisation/stability improvements
         * Signed sub-word cmpxchg compare fix (tickled by HAVE_CMPXCHG_LOCAL)
         * Force SMP=y
         * Hide direct DCC access from userspace
         * Fix EFI stub memory allocation when DRAM starts at 0x0
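
      The signed sub-word cmpxchg fix listed above can be illustrated with
      a minimal userspace model. This is only a sketch, not the arm64
      implementation: it is not atomic, and the names load_byte(),
      cmpxchg8_sign_extended() and cmpxchg8_truncated() are hypothetical.
      It assumes the failure mode is a sign-extended "old" value being
      compared against a zero-extended byte load.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model of a byte-wide load: the value is zero-extended to 64 bits. */
static uint64_t load_byte(const uint8_t *p)
{
	return *p;
}

/*
 * Hypothetical buggy variant: 'old' comes from a signed 8-bit type and is
 * sign-extended to register width, so -1 becomes 0xffffffffffffffff and
 * never matches the zero-extended 0xff that was loaded.
 */
bool cmpxchg8_sign_extended(uint8_t *p, int8_t old, int8_t newval)
{
	assert(p);
	uint64_t old_reg = (uint64_t)(int64_t)old;

	if (load_byte(p) != old_reg)
		return false;
	*p = (uint8_t)newval;
	return true;
}

/*
 * Hypothetical fixed variant: truncate 'old' to the access width before
 * comparing, so only the low 8 bits take part in the comparison.
 */
bool cmpxchg8_truncated(uint8_t *p, int8_t old, int8_t newval)
{
	assert(p);
	uint64_t old_reg = (uint8_t)old;

	if (load_byte(p) != old_reg)
		return false;
	*p = (uint8_t)newval;
	return true;
}
```

      With a byte holding 0xff and an expected value of -1, the first
      variant reports a mismatch while the second performs the exchange;
      for non-negative expected values the two behave the same.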
      
      * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (92 commits)
        arm64: flush FP/SIMD state correctly after execve()
        arm64: makefile: fix perf_callchain.o kconfig dependency
        arm64: set MAX_MEMBLOCK_ADDR according to linear region size
        of/fdt: make memblock maximum physical address arch configurable
        arm64: Fix source code file path in comments
        arm64: entry: always restore x0 from the stack on syscall return
        arm64: mdscr_el1: avoid exposing DCC to userspace
        arm64: kconfig: Move LIST_POISON to a safe value
        arm64: Add __exception_irq_entry definition for function graph
        arm64: mm: ensure patched kernel text is fetched from PoU
        arm64: alternatives: ensure secondary CPUs execute ISB after patching
        arm64: make ll/sc __cmpxchg_case_##name asm consistent
        arm64: dma-mapping: Simplify pgprot handling
        arm64: restore cpu suspend/resume functionality
        ARM64: PCI: do not enable resources on PROBE_ONLY systems
        arm64: cmpxchg: truncate sub-word signed types before comparison
        arm64: alternative: put secondary CPUs into polling loop during patch
        arm64/Documentation: clarify wording regarding memory below the Image
        arm64: lse: fix lse cmpxchg code indentation
        arm64: remove redundant object file list
        ...
      a4fdb2a4
  2. 03 Sep, 2015 18 commits
    • Linus Torvalds's avatar
      Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus · 807249d3
      Linus Torvalds authored
      Pull MIPS updates from Ralf Baechle:
       "This is the main pull request for 4.3 for MIPS.  Here's the summary:
      
        Three fixes that didn't make 4.2-stable:
      
         - a -Os build might compile the kernel using the MIPS16 instruction
           set but the R2 optimized inline functions in <uapi/asm/swab.h> are
           implemented using 32-bit wide instructions which is invalid.
      
         - a build error in pgtable-bits.h for a particular kernel
           configuration.
      
         - accessing registers of the CM GCR might have been compiled to use
           64 bit accesses but these registers are only 32 bits wide.
      
        And also a few new bits:
      
         - move the ATH79 GPIO driver to drivers/gpio
      
         - the definition of IRQCHIP_DECLARE has moved to linux/irqchip.h,
           change ATH79 accordingly.
      
         - fix definition of pgprot_writecombine
      
         - add an implementation of dma_map_ops.mmap
      
         - fix alignment of quiet build output for vmlinuz link
      
         - BCM47xx: Use kmemdup rather than duplicating its implementation
      
         - Netlogic: Fix 0x0x prefixes of constants.
      
         - merge Bjorn Helgaas' series to remove most of the weak keywords
           from function declarations.
      
         - CP0 and CP1 registers are best treated as unsigned values to
           keep large values from becoming negative values.
      
         - improve support for the MIPS GIC timer.
      
         - enable common clock framework for Malta and SEAD3.
      
         - a number of improvements and fixes to dump_tlb().
      
         - document the MIPS TLB dump functionality in Magic SysRq.
      
         - Cavium Octeon CN68XX improvements.
      
         - NetLogic improvements.
      
         - irq: Use access helper irq_data_get_affinity_mask.
      
         - handle MSA unaligned accesses.
      
         - a number of R6-related math-emu fixes.
      
         - support for I6400.
      
         - improvements to MSA support.
      
         - add uprobes support.
      
         - move from deprecated __initcall to arch_initcall.
      
         - remove finish_arch_switch().
      
         - IRQ cleanups by Thomas Gleixner.
      
         - migrate to new 'set-state' interface.
      
         - random small cleanups"
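
      The point above about treating CP0 and CP1 register values as
      unsigned can be shown with a small generic C sketch. It is not taken
      from the MIPS code; read_as_signed() and read_as_unsigned() are
      hypothetical helpers that widen the same 32 raw bits in two ways.

```c
#include <stdint.h>
#include <string.h>

/*
 * Interpret 32 raw register bits as a signed value and widen to 64 bits.
 * A value with the top bit set is sign-extended and turns negative.
 */
int64_t read_as_signed(uint32_t raw)
{
	int32_t v;

	memcpy(&v, &raw, sizeof(v));
	return v;
}

/*
 * Interpret the same bits as unsigned and widen to 64 bits.
 * The value is zero-extended and stays a large positive number.
 */
uint64_t read_as_unsigned(uint32_t raw)
{
	return raw;
}
```

      For raw bits 0x80000000 the signed reading is negative, while the
      unsigned reading keeps the value 0x80000000; values with the top bit
      clear are identical in both readings.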
      
      * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (148 commits)
        MIPS: UAPI: Fix unrecognized opcode WSBH/DSBH/DSHD when using MIPS16.
        MIPS: Fix alignment of quiet build output for vmlinuz link
        MIPS: math-emu: Remove unused handle_dsemul function declaration
        MIPS: math-emu: Add support for the MIPS R6 MAX{, A} FPU instruction
        MIPS: math-emu: Add support for the MIPS R6 MIN{, A} FPU instruction
        MIPS: math-emu: Add support for the MIPS R6 CLASS FPU instruction
        MIPS: math-emu: Add support for the MIPS R6 RINT FPU instruction
        MIPS: math-emu: Add support for the MIPS R6 MSUBF FPU instruction
        MIPS: math-emu: Add support for the MIPS R6 MADDF FPU instruction
        MIPS: math-emu: Add support for the MIPS R6 SELNEZ FPU instruction
        MIPS: math-emu: Add support for the MIPS R6 SELEQZ FPU instruction
        MIPS: math-emu: Add support for the CMP.condn.fmt R6 instruction
        MIPS: inst.h: Add new MIPS R6 FPU opcodes
        MIPS: Octeon: Fix management port MII address on Kontron S1901
        MIPS: BCM47xx: Use kmemdup rather than duplicating its implementation
        STAGING: Octeon: Use common helpers for determining interface and port
        MIPS: Octeon: Support interfaces 4 and 5
        MIPS: Octeon: Set up 1:1 mapping between CN68XX PKO queues and ports
        MIPS: Octeon: Initialize CN68XX PKO
        STAGING: Octeon: Support CN68XX style WQE
        ...
      807249d3
    • Linus Torvalds's avatar
      Merge tag 'powerpc-4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · ff474e8c
      Linus Torvalds authored
      Pull powerpc updates from Michael Ellerman:
      
       - support "hybrid" iommu/direct DMA ops for coherent_mask < dma_mask
         from Benjamin Herrenschmidt
      
       - EEH fixes for SRIOV from Gavin
      
       - introduce rtas_get_sensor_fast() for IRQ handlers from Thomas Huth
      
       - use hardware RNG for arch_get_random_seed_* not arch_get_random_*
         from Paul Mackerras
      
       - seccomp filter support from Michael Ellerman
      
       - opal_cec_reboot2() handling for HMIs & machine checks from Mahesh
         Salgaonkar
      
       - add powerpc timebase as a trace clock source from Naveen N. Rao
      
       - misc cleanups in the xmon, signal & SLB code from Anshuman Khandual
      
       - add an inline function to update POWER8 HID0 from Gautham R. Shenoy
      
       - fix pte_pagesize_index() crash on 4K w/64K hash from Michael Ellerman
      
       - drop support for 64K local store on 4K kernels from Michael Ellerman
      
       - move dma_get_required_mask() from pnv_phb to pci_controller_ops from
         Andrew Donnellan
      
       - initialize distance lookup table from drconf path from Nikunj A
         Dadhania
      
       - enable RTC class support from Vaibhav Jain
      
       - disable automatically blocked PCI config from Gavin Shan
      
       - add LEDs driver for PowerNV platform from Vasant Hegde
      
       - fix endianness issues in the HVSI driver from Laurent Dufour
      
       - kexec endian fixes from Samuel Mendoza-Jonas
      
       - fix corrupted pdn list from Gavin Shan
      
       - fix fenced PHB caused by eeh_slot_error_detail() from Gavin Shan
      
       - Freescale updates from Scott: Highlights include 32-bit memcpy/memset
         optimizations, checksum optimizations, 85xx config fragments and
         updates, device tree updates, e6500 fixes for non-SMP, and misc
         cleanup and minor fixes.
      
       - a ton of cxl updates & fixes:
          - add explicit precision specifiers from Rasmus Villemoes
          - use more common format specifier from Rasmus Villemoes
          - destroy cxl_adapter_idr on module_exit from Johannes Thumshirn
          - destroy afu->contexts_idr on release of an afu from Johannes
            Thumshirn
          - compile with -Werror from Daniel Axtens
          - EEH support from Daniel Axtens
          - plug irq_bitmap getting leaked in cxl_context from Vaibhav Jain
          - add alternate MMIO error handling from Ian Munsie
          - allow release of contexts which have been OPENED but not STARTED
            from Andrew Donnellan
          - remove use of macro DEFINE_PCI_DEVICE_TABLE from Vaishali Thakkar
          - release irqs if memory allocation fails from Vaibhav Jain
          - remove racy attempt to force EEH invocation in reset from Daniel
            Axtens
          - fix + cleanup error paths in cxl_dev_context_init from Ian Munsie
          - fix force unmapping mmaps of contexts allocated through the kernel
            api from Ian Munsie
          - set up and enable PSL Timebase from Philippe Bergheaud
      
      * tag 'powerpc-4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (140 commits)
        cxl: Set up and enable PSL Timebase
        cxl: Fix force unmapping mmaps of contexts allocated through the kernel api
        cxl: Fix + cleanup error paths in cxl_dev_context_init
        powerpc/eeh: Fix fenced PHB caused by eeh_slot_error_detail()
        powerpc/pseries: Cleanup on pci_dn_reconfig_notifier()
        powerpc/pseries: Fix corrupted pdn list
        powerpc/powernv: Enable LEDS support
        powerpc/iommu: Set default DMA offset in dma_dev_setup
        cxl: Remove racy attempt to force EEH invocation in reset
        cxl: Release irqs if memory allocation fails
        cxl: Remove use of macro DEFINE_PCI_DEVICE_TABLE
        powerpc/powernv: Fix mis-merge of OPAL support for LEDS driver
        powerpc/powernv: Reset HILE before kexec_sequence()
        powerpc/kexec: Reset secondary cpu endianness before kexec
        powerpc/hvsi: Fix endianness issues in the HVSI driver
        leds/powernv: Add driver for PowerNV platform
        powerpc/powernv: Create LED platform device
        powerpc/powernv: Add OPAL interfaces for accessing and modifying system LED states
        powerpc/powernv: Fix the log message when disabling VF
        cxl: Allow release of contexts which have been OPENED but not STARTED
        ...
      ff474e8c
    • Linus Torvalds's avatar
      Merge branch 'pcmcia' of git://ftp.arm.linux.org.uk/~rmk/linux-arm · 4c92b5bb
      Linus Torvalds authored
      Pull ARM pcmcia updates from Russell King:
       "A series of changes updating the PXA and SA11x0 PCMCIA code to use
        devm_* APIs, and resolve some resource leaks in doing so.  This
        results in a few small cleanups which are included in this set.
      
        FYI, the recommit of these today is to add Robert Jarzmik's
        reviewed-by tags, which I'd forgotten to add from mid-July"
      
      * 'pcmcia' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
        pcmcia: soc_common: remove skt_dev_info's clk pointer
        pcmcia: sa11xx_base.c: remove useless init/exit functions
        pcmcia: sa1111: simplify clk handing in sa1111_pcmcia_add()
        pcmcia: sa1111: update socket driver to use devm_clk_get() API
        pcmcia: pxa2xx: convert memory allocation to devm_* API
        pcmcia: pxa2xx: update socket driver to use devm_clk_get() API
        pcmcia: sa11x0: convert memory allocation to devm_* API
        pcmcia: sa11x0: fix missing clk_put() in sa11x0 socket drivers
      4c92b5bb
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm · c706c7eb
      Linus Torvalds authored
      Pull ARM development updates from Russell King:
       "Included in this update:
      
         - moving PSCI code from ARM64/ARM to drivers/
      
         - removal of some architecture internals from global kernel view
      
         - addition of software based "privileged no access" support using the
           old domains register to turn off the ability for kernel
           loads/stores to access userspace.  Only the proper accessors will
           be usable.
      
         - addition of early fixup support for early console
      
         - re-addition (and reimplementation) of OMAP special interconnect
           barrier
      
         - removal of finish_arch_switch()
      
         - only expose cpuX/online in sysfs if hotpluggable
      
         - a number of code cleanups"
      
      * 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: (41 commits)
        ARM: software-based priviledged-no-access support
        ARM: entry: provide uaccess assembly macro hooks
        ARM: entry: get rid of multiple macro definitions
        ARM: 8421/1: smp: Collapse arch_cpu_idle_dead() into cpu_die()
        ARM: uaccess: provide uaccess_save_and_enable() and uaccess_restore()
        ARM: mm: improve do_ldrd_abort macro
        ARM: entry: ensure that IRQs are enabled when calling syscall_trace_exit()
        ARM: entry: efficiency cleanups
        ARM: entry: get rid of asm_trace_hardirqs_on_cond
        ARM: uaccess: simplify user access assembly
        ARM: domains: remove DOMAIN_TABLE
        ARM: domains: keep vectors in separate domain
        ARM: domains: get rid of manager mode for user domain
        ARM: domains: move initial domain setting value to asm/domains.h
        ARM: domains: provide domain_mask()
        ARM: domains: switch to keeping domain value in register
        ARM: 8419/1: dma-mapping: harmonize definition of DMA_ERROR_CODE
        ARM: 8417/1: refactor bitops functions with BIT_MASK() and BIT_WORD()
        ARM: 8416/1: Feroceon: use of_iomap() to map register base
        ARM: 8415/1: early fixmap support for earlycon
        ...
      c706c7eb
    • Linus Torvalds's avatar
      Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 79b0691d
      Linus Torvalds authored
      Pull perf fixes from Ingo Molnar:
       "Tooling fixes plus a handful of late arriving tooling changes"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf tools: Fix link time error with sample_reg_masks on non x86
        perf build: Fix Intel PT instruction decoder dependency problem
        perf dwarf: Fix potential array out of bounds access
        perf record: Add ability to name registers to record
        perf/x86: Add list of register names
        perf script: Enable printing of interrupted machine state
        perf evlist: Open event on evsel cpus and threads
        bpf tools: New API to get name from a BPF object
        perf tools: Fix build on powerpc broken by pt/bts
      79b0691d
    • Linus Torvalds's avatar
      Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · ca520cab
      Linus Torvalds authored
      Pull locking and atomic updates from Ingo Molnar:
       "Main changes in this cycle are:
      
         - Extend atomic primitives with coherent logic op primitives
           (atomic_{or,and,xor}()) and deprecate the old partial APIs
           (atomic_{set,clear}_mask())
      
           The old ops were incoherent with incompatible signatures across
           architectures and with incomplete support.  Now every architecture
           supports the primitives consistently (by Peter Zijlstra)
      
         - Generic support for 'relaxed atomics':
      
             - _acquire/release/relaxed() flavours of xchg(), cmpxchg() and {add,sub}_return()
             - atomic_read_acquire()
             - atomic_set_release()
      
           This came out of porting qrwlock code to arm64 (by Will Deacon)
      
         - Clean up the fragile static_key APIs that were causing repeat bugs,
           by introducing a new one:
      
             DEFINE_STATIC_KEY_TRUE(name);
             DEFINE_STATIC_KEY_FALSE(name);
      
           which define a key of different types with an initial true/false
           value.
      
           Then allow:
      
             static_branch_likely()
             static_branch_unlikely()
      
           to take a key of either type and emit the right instruction for the
           case.  To be able to know the 'type' of the static key we encode it
           in the jump entry (by Peter Zijlstra)
      
         - Static key self-tests (by Jason Baron)
      
         - qrwlock optimizations (by Waiman Long)
      
         - small futex enhancements (by Davidlohr Bueso)
      
         - ... and misc other changes"
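
      The acquire/release/relaxed flavours described above follow the same
      ordering model as C11 atomics. The following is a minimal userspace
      sketch using C11 <stdatomic.h>, not the kernel's atomic_t API; the
      names payload, ready, publish(), consume() and set_bits_relaxed()
      are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical shared state: a plain payload guarded by an atomic flag. */
static int payload;
static atomic_int ready;

/*
 * Writer: store the payload, then set the flag with release ordering so
 * the payload store cannot be reordered after the flag store.
 * Loosely analogous to atomic_set_release().
 */
void publish(int value)
{
	payload = value;
	atomic_store_explicit(&ready, 1, memory_order_release);
}

/*
 * Reader: load the flag with acquire ordering; if it is set, the payload
 * written before the matching release store is visible.
 * Loosely analogous to atomic_read_acquire().
 */
bool consume(int *out)
{
	assert(out);
	if (!atomic_load_explicit(&ready, memory_order_acquire))
		return false;
	*out = payload;
	return true;
}

/*
 * Atomic OR with relaxed ordering: the update is atomic but imposes no
 * ordering on surrounding accesses. C11's fetch-or returns the old value.
 */
int set_bits_relaxed(atomic_int *v, int mask)
{
	return atomic_fetch_or_explicit(v, mask, memory_order_relaxed);
}
```

      The relaxed variant suits counters and flag words where only
      atomicity matters; the acquire/release pair is the cheaper
      alternative to a full barrier when one side publishes data and the
      other consumes it.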
      
      * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (63 commits)
        jump_label/x86: Work around asm build bug on older/backported GCCs
        locking, ARM, atomics: Define our SMP atomics in terms of _relaxed() operations
        locking, include/llist: Use linux/atomic.h instead of asm/cmpxchg.h
        locking/qrwlock: Make use of _{acquire|release|relaxed}() atomics
        locking/qrwlock: Implement queue_write_unlock() using smp_store_release()
        locking/lockref: Remove homebrew cmpxchg64_relaxed() macro definition
        locking, asm-generic: Add _{relaxed|acquire|release}() variants for 'atomic_long_t'
        locking, asm-generic: Rework atomic-long.h to avoid bulk code duplication
        locking/atomics: Add _{acquire|release|relaxed}() variants of some atomic operations
        locking, compiler.h: Cast away attributes in the WRITE_ONCE() magic
        locking/static_keys: Make verify_keys() static
        jump label, locking/static_keys: Update docs
        locking/static_keys: Provide a selftest
        jump_label: Provide a self-test
        s390/uaccess, locking/static_keys: employ static_branch_likely()
        x86, tsc, locking/static_keys: Employ static_branch_likely()
        locking/static_keys: Add selftest
        locking/static_keys: Add a new static_key interface
        locking/static_keys: Rework update logic
        locking/static_keys: Add static_key_{en,dis}able() helpers
        ...
      ca520cab
    • Linus Torvalds's avatar
      Merge tag 'for-f2fs-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs · 4c12ab7e
      Linus Torvalds authored
      Pull f2fs updates from Jaegeuk Kim:
       "The major work includes fixing and enhancing the existing extent_cache
        feature, which has settled down well so far and accordingly now
        becomes a default mount option.
      
        Also, this version newly registers a f2fs memory shrinker to reclaim
        several objects consumed by a couple of data structures in order to
        avoid memory pressures.
      
        Another new feature is ioctl(F2FS_GARBAGE_COLLECT), which lets users
        trigger a cleaning job explicitly.
      
        Most of the other patches fix bugs that occurred in corner cases
        across the whole code area"
      
      * tag 'for-f2fs-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (85 commits)
        f2fs: upset segment_info repair
        f2fs: avoid accessing NULL pointer in f2fs_drop_largest_extent
        f2fs: update extent tree in batches
        f2fs: fix to release inode correctly
        f2fs: handle f2fs_truncate error correctly
        f2fs: avoid unneeded initializing when converting inline dentry
        f2fs: atomically set inode->i_flags
        f2fs: fix wrong pointer access during try_to_free_nids
        f2fs: use __GFP_NOFAIL to avoid infinite loop
        f2fs: lookup neighbor extent nodes for merging later
        f2fs: split __insert_extent_tree_ret for readability
        f2fs: kill dead code in __insert_extent_tree
        f2fs: adjust showing of extent cache stat
        f2fs: add largest/cached stat in extent cache
        f2fs: fix incorrect mapping for bmap
        f2fs: add annotation for space utilization of regular/inline dentry
        f2fs: fix to update cached_en of extent tree properly
        f2fs: fix typo
        f2fs: check the node block address of newly allocated nid
        f2fs: go out for insert_inode_locked failure
        ...
      4c12ab7e
    • Linus Torvalds's avatar
      Merge tag 'dlm-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm · 9cbf22b3
      Linus Torvalds authored
      Pull dlm updates from David Teigland:
       "This set mainly includes a change to the way the dlm uses the SCTP API
        in the kernel, removing the direct dependency on the sctp module.
        Other odd SCTP-related fixes are also included.
      
        The other notable fix is for a long standing regression in the
        behavior of lock value blocks for user space locks"
      
      * tag 'dlm-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
        dlm: print error from kernel_sendpage
        dlm: fix lvb copy for user locks
        dlm: sctp_accept_from_sock() can be static
        dlm: fix reconnecting but not sending data
        dlm: replace BUG_ON with a less severe handling
        dlm: use sctp 1-to-1 API
        dlm: fix not reconnecting on connecting error handling
        dlm: fix race while closing connections
        dlm: fix connection stealing if using SCTP
      9cbf22b3
    • Linus Torvalds's avatar
      Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · ea814ab9
      Linus Torvalds authored
      Pull ext4 updates from Ted Ts'o:
       "Pretty much all bug fixes and clean ups for 4.3, after a lot of
        features and other churn going into 4.2"
      
      * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
        Revert "ext4: remove block_device_ejected"
        ext4: ratelimit the file system mounted message
        ext4: silence a format string false positive
        ext4: simplify some code in read_mmp_block()
        ext4: don't manipulate recovery flag when freezing no-journal fs
        jbd2: limit number of reserved credits
        ext4 crypto: remove duplicate header file
        ext4: update c/mtime on truncate up
        jbd2: avoid infinite loop when destroying aborted journal
        ext4, jbd2: add REQ_FUA flag when recording an error in the superblock
        ext4 crypto: fix spelling typo in comment
        ext4 crypto: exit cleanly if ext4_derive_key_aes() fails
        ext4: reject journal options for ext2 mounts
        ext4: implement cgroup writeback support
        ext4: replace ext4_io_submit->io_op with ->io_wbc
        ext4 crypto: check for too-short encrypted file names
        ext4 crypto: use a jbd2 transaction when adding a crypto policy
        jbd2: speedup jbd2_journal_dirty_metadata()
      ea814ab9
    • Linus Torvalds's avatar
      Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs · e31fb9e0
      Linus Torvalds authored
      Pull ext3 removal, quota & udf fixes from Jan Kara:
       "The biggest change in the pull is the removal of ext3 filesystem
        driver (~28k lines removed).  Ext4 driver is a full featured
        replacement these days and both RH and SUSE have used it for several
        years without issues.  Also there are some workarounds in VM & block
        layer mainly for ext3 which we could eventually get rid of.
      
        The other larger change is the addition of proper error handling for
        dquot_initialize().  The rest is small fixes and cleanups"
      
      [ I wasn't convinced about the ext3 removal and worried about things
        falling through the cracks for legacy users, but ext4 maintainers
        piped up and were all unanimously in favor of removal, and maintaining
        all legacy ext3 support inside ext4.   - Linus ]
      
      * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
        udf: Don't modify filesystem for read-only mounts
        quota: remove an unneeded condition
        ext4: memory leak on error in ext4_symlink()
        mm/Kconfig: NEED_BOUNCE_POOL: clean-up condition
        ext4: Improve ext4 Kconfig test
        block: Remove forced page bouncing under IO
        fs: Remove ext3 filesystem driver
        doc: Update doc about journalling layer
        jfs: Handle error from dquot_initialize()
        reiserfs: Handle error from dquot_initialize()
        ocfs2: Handle error from dquot_initialize()
        ext4: Handle error from dquot_initialize()
        ext2: Handle error from dquot_initialize()
        quota: Propagate error from ->acquire_dquot()
      e31fb9e0
    • Linus Torvalds's avatar
      Merge branch 'hpfs' (patches from Mikulas) · 824b005c
      Linus Torvalds authored
      Merge hpfs update from Mikulas Patocka.
      
      * emailed patches from Mikulas Patocka <mikulas@twibright.com>:
        hpfs: update ctime and mtime on directory modification
        hpfs: support hotfixes
      824b005c
    • Mikulas Patocka's avatar
      hpfs: update ctime and mtime on directory modification · f49a26e7
      Mikulas Patocka authored
      Update ctime and mtime when a directory is modified (though OS/2 doesn't
      update them anyway).
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Cc: stable@kernel.org	# v3.3+
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f49a26e7
    • Mikulas Patocka's avatar
      hpfs: support hotfixes · a64eefaa
      Mikulas Patocka authored
      When the OS/2 driver hits a disk write error, it writes the sector to
      another location and adds the sector mapping to the hotfix map.
      
      This patch makes the hpfs driver understand the hotfix map and remap
      accesses according to it.
      Signed-off-by: Mikulas Patocka <mikulas@twibright.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a64eefaa
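
The remapping described in the "hpfs: support hotfixes" message above can be illustrated with a minimal sketch. This is not the hpfs driver's code: the `HotfixMap` class, its methods, and `read_sector` are hypothetical names chosen for illustration, and the on-disk layout of the real hotfix map is not modeled. It only shows the idea of redirecting an access to a replacement sector when a mapping exists.

```python
class HotfixMap:
    """Hypothetical in-memory hotfix map: original sector -> replacement sector."""

    def __init__(self):
        self._remap = {}

    def add(self, bad_sector, replacement_sector):
        # Record that data meant for bad_sector now lives at replacement_sector.
        self._remap[bad_sector] = replacement_sector

    def resolve(self, sector):
        # Return the replacement sector if one is recorded, else the sector itself.
        return self._remap.get(sector, sector)


def read_sector(disk, hotfixes, sector):
    # 'disk' is a plain dict standing in for a block device (an assumption
    # made for this sketch); every access goes through the hotfix map first.
    return disk[hotfixes.resolve(sector)]
```

A reader that consults the map on every access sees the relocated data transparently, while sectors without an entry are read from their original location.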
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next · dd5cdb48
      Linus Torvalds authored
      Pull networking updates from David Miller:
       "Another merge window, another set of networking changes.  I've heard
        rumblings that the lightweight tunnels infrastructure has been voted
        networking change of the year.  But what do I know?
      
         1) Add conntrack support to openvswitch, from Joe Stringer.
      
         2) Initial support for VRF (Virtual Routing and Forwarding), which
            allows the segmentation of routing paths without using multiple
            devices.  There are some semantic kinks to work out still, but
            this is a reasonably strong foundation.  From David Ahern.
      
         3) Remove spinlock from act_bpf fast path, from Alexei Starovoitov.
      
         4) Ignore route nexthops with a link down state in ipv6, just like
            ipv4.  From Andy Gospodarek.
      
         5) Remove spinlock from fast path of act_gact and act_mirred, from
            Eric Dumazet.
      
         6) Document the DSA layer, from Florian Fainelli.
      
         7) Add netconsole support to bcmgenet, systemport, and DSA.  Also
            from Florian Fainelli.
      
         8) Add Mellanox Switch Driver and core infrastructure, from Jiri
            Pirko.
      
         9) Add support for "light weight tunnels", which allow for
            encapsulation and decapsulation without bearing the overhead of a
            full blown netdevice.  From Thomas Graf, Jiri Benc, and a cast of
            others.
      
        10) Add Identifier Locator Addressing support for ipv6, from Tom
            Herbert.
      
        11) Support fragmented SKBs in iwlwifi, from Johannes Berg.
      
        12) Allow perf PMUs to be accessed from eBPF programs, from Kaixu Xia.
      
        13) Add BQL support to 3c59x driver, from Loganaden Velvindron.
      
        14) Stop using a zero TX queue length to mean that a device shouldn't
            have a qdisc attached, use an explicit flag instead.  From Phil
            Sutter.
      
        15) Use generic geneve netdevice infrastructure in openvswitch, from
            Pravin B Shelar.
      
        16) Add infrastructure to avoid re-forwarding a packet in software
            that was already forwarded by a hardware switch.  From Scott
            Feldman.
      
        17) Allow AF_PACKET fanout function to be implemented in a bpf
            program, from Willem de Bruijn"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1458 commits)
        netfilter: nf_conntrack: make nf_ct_zone_dflt built-in
        netfilter: nf_dup{4, 6}: fix build error when nf_conntrack disabled
        net: fec: clear receive interrupts before processing a packet
        ipv6: fix exthdrs offload registration in out_rt path
        xen-netback: add support for multicast control
        bgmac: Update fixed_phy_register()
        sock, diag: fix panic in sock_diag_put_filterinfo
        flow_dissector: Use 'const' where possible.
        flow_dissector: Fix function argument ordering dependency
        ixgbe: Resolve "initialized field overwritten" warnings
        ixgbe: Remove bimodal SR-IOV disabling
        ixgbe: Add support for reporting 2.5G link speed
        ixgbe: fix bounds checking in ixgbe_setup_tc for 82598
        ixgbe: support for ethtool set_rxfh
        ixgbe: Avoid needless PHY access on copper phys
        ixgbe: cleanup to use cached mask value
        ixgbe: Remove second instance of lan_id variable
        ixgbe: use kzalloc for allocating one thing
        flow: Move __get_hash_from_flowi{4,6} into flow_dissector.c
        ixgbe: Remove unused PCI bus types
        ...
      dd5cdb48
    • Russell King's avatar
      pcmcia: soc_common: remove skt_dev_info's clk pointer · fca8b807
      Russell King authored
      We no longer need to store the clk pointer in struct skt_dev_info as we
      no longer need to remember the clk pointer for the cleanup paths.
      Reviewed-by: Robert Jarzmik <robert.jarzmik@free.fr>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      fca8b807
    • Russell King's avatar
      pcmcia: sa11xx_base.c: remove useless init/exit functions · c3eb700c
      Russell King authored
      A library module is not required to have module init/exit functions.
      Get rid of these unnecessary functions.
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      c3eb700c
    • Russell King's avatar
      pcmcia: sa1111: simplify clk handling in sa1111_pcmcia_add() · 321ae964
      Russell King authored
      clk_get(dev, NULL) will always refer to the same clock, so it's
      pointless calling this multiple times for the same device.  As we no
      longer have to worry about the cleanup (via use of devm_clk_get()) we
      can simplify sa1111_pcmcia_add() too.
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      321ae964
    • Russell King's avatar
      pcmcia: sa1111: update socket driver to use devm_clk_get() API · 924e5ea2
      Russell King authored
      Update the pxa2xx socket driver to use the devm_clk_get() API so that
      the cleanup paths are simplified.
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      924e5ea2