1. 15 Jan, 2018 15 commits
    • jfs: Define usercopy region in jfs_ip slab cache · 8d2704d3
      David Windsor authored
      The jfs symlink pathnames, stored in struct jfs_inode_info.i_inline and
      therefore contained in the jfs_ip slab cache, need to be copied to/from
      userspace.
      
      cache object allocation:
          fs/jfs/super.c:
              jfs_alloc_inode(...):
                  ...
                  jfs_inode = kmem_cache_alloc(jfs_inode_cachep, GFP_NOFS);
                  ...
                  return &jfs_inode->vfs_inode;
      
          fs/jfs/jfs_incore.h:
              JFS_IP(struct inode *inode):
                  return container_of(inode, struct jfs_inode_info, vfs_inode);
      
          fs/jfs/inode.c:
              jfs_iget(...):
                  ...
                  inode->i_link = JFS_IP(inode)->i_inline;
      
      example usage trace:
          readlink_copy+0x43/0x70
          vfs_readlink+0x62/0x110
          SyS_readlinkat+0x100/0x130
      
          fs/namei.c:
              readlink_copy(..., link):
                  ...
                  copy_to_user(..., link, len);
      
              (inlined in vfs_readlink)
              generic_readlink(dentry, ...):
                  struct inode *inode = d_inode(dentry);
                  const char *link = inode->i_link;
                  ...
                  readlink_copy(..., link);
      
      In support of usercopy hardening, this patch defines a region in the
      jfs_ip slab cache in which userspace copy operations are allowed.
      
      This region is known as the slab cache's usercopy region. Slab caches
      can now check that each dynamically sized copy operation involving
      cache-managed memory falls entirely within the slab's usercopy region.
      
      This patch is modified from Brad Spengler/PaX Team's PAX_USERCOPY
      whitelisting code in the last public patch of grsecurity/PaX based on my
      understanding of the code. Changes or omissions from the original code are
      mine and don't reflect the original grsecurity/PaX code.
      Signed-off-by: David Windsor <dave@nullcore.net>
      [kees: adjust commit log, provide usage trace]
      Cc: Dave Kleikamp <shaggy@kernel.org>
      Cc: jfs-discussion@lists.sourceforge.net
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Acked-by: Dave Kleikamp <dave.kleikamp@oracle.com>
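The jfs entry above depends on two facts: the enclosing cache object can be recovered from the embedded inode, and `i_link` points back into that same object. The following userspace sketch illustrates that relationship under simplifying assumptions. Every `demo_`/`DEMO_` name, the field layout, and the 128-byte buffer size are hypothetical and are not taken from the jfs sources.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's container_of(). */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical generic inode: only the symlink pointer is modelled. */
struct demo_inode {
	const char *i_link;
};

/* Hypothetical filesystem-private object that embeds the generic inode. */
struct demo_inode_info {
	int mode;
	char i_inline[128];          /* inline symlink target */
	struct demo_inode vfs_inode; /* what the VFS layer sees */
};

/* Recover the enclosing object from the embedded inode. */
static struct demo_inode_info *DEMO_IP(struct demo_inode *inode)
{
	return demo_container_of(inode, struct demo_inode_info, vfs_inode);
}

/* Offset of the symlink text within the enclosing cache object. */
static size_t demo_link_offset(struct demo_inode *inode)
{
	struct demo_inode_info *ip = DEMO_IP(inode);

	assert(inode->i_link != NULL);
	return (size_t)(inode->i_link - (const char *)ip);
}
```

In this model, a copy that starts at `inode->i_link` starts at the offset of `i_inline` inside the object, which is the kind of span a usercopy region is meant to describe.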
    • ext2: Define usercopy region in ext2_inode_cache slab cache · 85212d4e
      David Windsor authored
      The ext2 symlink pathnames, stored in struct ext2_inode_info.i_data and
      therefore contained in the ext2_inode_cache slab cache, need to be copied
      to/from userspace.
      
      cache object allocation:
          fs/ext2/super.c:
              ext2_alloc_inode(...):
                  struct ext2_inode_info *ei;
                  ...
                  ei = kmem_cache_alloc(ext2_inode_cachep, GFP_NOFS);
                  ...
                  return &ei->vfs_inode;
      
          fs/ext2/ext2.h:
              EXT2_I(struct inode *inode):
                  return container_of(inode, struct ext2_inode_info, vfs_inode);
      
          fs/ext2/namei.c:
              ext2_symlink(...):
                  ...
                  inode->i_link = (char *)&EXT2_I(inode)->i_data;
      
      example usage trace:
          readlink_copy+0x43/0x70
          vfs_readlink+0x62/0x110
          SyS_readlinkat+0x100/0x130
      
          fs/namei.c:
              readlink_copy(..., link):
                  ...
                  copy_to_user(..., link, len);
      
              (inlined into vfs_readlink)
              generic_readlink(dentry, ...):
                  struct inode *inode = d_inode(dentry);
                  const char *link = inode->i_link;
                  ...
                  readlink_copy(..., link);
      
      In support of usercopy hardening, this patch defines a region in the
      ext2_inode_cache slab cache in which userspace copy operations are
      allowed.
      
      This region is known as the slab cache's usercopy region. Slab caches
      can now check that each dynamically sized copy operation involving
      cache-managed memory falls entirely within the slab's usercopy region.
      
      This patch is modified from Brad Spengler/PaX Team's PAX_USERCOPY
      whitelisting code in the last public patch of grsecurity/PaX based on my
      understanding of the code. Changes or omissions from the original code are
      mine and don't reflect the original grsecurity/PaX code.
      Signed-off-by: David Windsor <dave@nullcore.net>
      [kees: adjust commit log, provide usage trace]
      Cc: Jan Kara <jack@suse.com>
      Cc: linux-ext4@vger.kernel.org
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Acked-by: Jan Kara <jack@suse.cz>
    • ext4: Define usercopy region in ext4_inode_cache slab cache · f8dd7c70
      David Windsor authored
      The ext4 symlink pathnames, stored in struct ext4_inode_info.i_data
      and therefore contained in the ext4_inode_cache slab cache, need
      to be copied to/from userspace.
      
      cache object allocation:
          fs/ext4/super.c:
              ext4_alloc_inode(...):
                  struct ext4_inode_info *ei;
                  ...
                  ei = kmem_cache_alloc(ext4_inode_cachep, GFP_NOFS);
                  ...
                  return &ei->vfs_inode;
      
          include/trace/events/ext4.h:
                  #define EXT4_I(inode) \
                      (container_of(inode, struct ext4_inode_info, vfs_inode))
      
          fs/ext4/namei.c:
              ext4_symlink(...):
                  ...
                  inode->i_link = (char *)&EXT4_I(inode)->i_data;
      
      example usage trace:
          readlink_copy+0x43/0x70
          vfs_readlink+0x62/0x110
          SyS_readlinkat+0x100/0x130
      
          fs/namei.c:
              readlink_copy(..., link):
                  ...
                  copy_to_user(..., link, len);
      
              (inlined into vfs_readlink)
              generic_readlink(dentry, ...):
                  struct inode *inode = d_inode(dentry);
                  const char *link = inode->i_link;
                  ...
                  readlink_copy(..., link);
      
      In support of usercopy hardening, this patch defines a region in the
      ext4_inode_cache slab cache in which userspace copy operations are
      allowed.
      
      This region is known as the slab cache's usercopy region. Slab caches
      can now check that each dynamically sized copy operation involving
      cache-managed memory falls entirely within the slab's usercopy region.
      
      This patch is modified from Brad Spengler/PaX Team's PAX_USERCOPY
      whitelisting code in the last public patch of grsecurity/PaX based on my
      understanding of the code. Changes or omissions from the original code are
      mine and don't reflect the original grsecurity/PaX code.
      Signed-off-by: David Windsor <dave@nullcore.net>
      [kees: adjust commit log, provide usage trace]
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Andreas Dilger <adilger.kernel@dilger.ca>
      Cc: linux-ext4@vger.kernel.org
      Signed-off-by: Kees Cook <keescook@chromium.org>
    • vfs: Copy struct mount.mnt_id to userspace using put_user() · 6391af6f
      David Windsor authored
      The mnt_id field can be copied with put_user(), so there is no need to
      use copy_to_user(). In both cases, hardened usercopy is being bypassed
      since the size is constant, and not open to runtime manipulation.
      
      This patch is verbatim from Brad Spengler/PaX Team's PAX_USERCOPY
      whitelisting code in the last public patch of grsecurity/PaX based on my
      understanding of the code. Changes or omissions from the original code are
      mine and don't reflect the original grsecurity/PaX code.
      Signed-off-by: David Windsor <dave@nullcore.net>
      [kees: adjust commit log]
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: linux-fsdevel@vger.kernel.org
      Signed-off-by: Kees Cook <keescook@chromium.org>
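The entry above notes that copies with a constant size bypass hardened usercopy. One way such a bypass can be expressed is with the GCC/Clang builtin `__builtin_constant_p()`. The sketch below is a userspace model of that idea only, not the kernel's code: all `demo_` names are hypothetical, and the counter exists purely so the effect can be observed.

```c
#include <assert.h>
#include <stddef.h>

static unsigned int demo_runtime_checks; /* how many copies were inspected */

/* Stand-in for the runtime inspection of a dynamically sized copy. */
static void demo_runtime_check(size_t n)
{
	assert(n > 0); /* the sketch only models non-empty copies */
	demo_runtime_checks++;
}

/* Only sizes the compiler cannot prove constant reach the runtime check. */
#define demo_check_object_size(n)			\
	do {						\
		if (!__builtin_constant_p(n))		\
			demo_runtime_check(n);		\
	} while (0)
```

A call such as `demo_check_object_size(sizeof(int))` skips the runtime check, while a size read from a `volatile` variable always reaches it. A plain local variable may or may not be treated as constant, depending on the optimization level.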
    • vfs: Define usercopy region in names_cache slab caches · 6a9b8820
      David Windsor authored
      VFS pathnames are stored in the names_cache slab cache, either inline
      or across an entire allocation entry (when approaching PATH_MAX). These
      are copied to/from userspace, so they must be entirely whitelisted.
      
      cache object allocation:
          include/linux/fs.h:
              #define __getname()    kmem_cache_alloc(names_cachep, GFP_KERNEL)
      
      example usage trace:
          strncpy_from_user+0x4d/0x170
          getname_flags+0x6f/0x1f0
          user_path_at_empty+0x23/0x40
          do_mount+0x69/0xda0
          SyS_mount+0x83/0xd0
      
          fs/namei.c:
              getname_flags(...):
                  ...
                  result = __getname();
                  ...
                  kname = (char *)result->iname;
                  result->name = kname;
                  len = strncpy_from_user(kname, filename, EMBEDDED_NAME_MAX);
                  ...
                  if (unlikely(len == EMBEDDED_NAME_MAX)) {
                      const size_t size = offsetof(struct filename, iname[1]);
                      kname = (char *)result;
      
                      result = kzalloc(size, GFP_KERNEL);
                      ...
                      result->name = kname;
                      len = strncpy_from_user(kname, filename, PATH_MAX);
      
      In support of usercopy hardening, this patch defines the entire cache
      object in the names_cache slab cache as whitelisted, since it may entirely
      hold name strings to be copied to/from userspace.
      
      This patch is verbatim from Brad Spengler/PaX Team's PAX_USERCOPY
      whitelisting code in the last public patch of grsecurity/PaX based on my
      understanding of the code. Changes or omissions from the original code are
      mine and don't reflect the original grsecurity/PaX code.
      Signed-off-by: David Windsor <dave@nullcore.net>
      [kees: adjust commit log, add usage trace]
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: linux-fsdevel@vger.kernel.org
      Signed-off-by: Kees Cook <keescook@chromium.org>
    • dcache: Define usercopy region in dentry_cache slab cache · 80344266
      David Windsor authored
      When a dentry name is short enough, it can be stored directly in the
      dentry itself (instead of in a separate kmalloc allocation). These dentry
      short names, stored in struct dentry.d_iname and therefore contained in
      the dentry_cache slab cache, need to be copied to userspace.
      
      cache object allocation:
          fs/dcache.c:
              __d_alloc(...):
                  ...
                  dentry = kmem_cache_alloc(dentry_cache, ...);
                  ...
                  dentry->d_name.name = dentry->d_iname;
      
      example usage trace:
          filldir+0xb0/0x140
          dcache_readdir+0x82/0x170
          iterate_dir+0x142/0x1b0
          SyS_getdents+0xb5/0x160
      
          fs/readdir.c:
              (called via ctx.actor by dir_emit)
              filldir(..., const char *name, ...):
                  ...
                  copy_to_user(..., name, namlen);
      
          fs/libfs.c:
              dcache_readdir(...):
                  ...
                  next = next_positive(dentry, p, 1);
                  ...
                  dir_emit(..., next->d_name.name, ...);
      
      In support of usercopy hardening, this patch defines a region in the
      dentry_cache slab cache in which userspace copy operations are allowed.
      
      This region is known as the slab cache's usercopy region. Slab caches can
      now check that each dynamic copy operation involving cache-managed memory
      falls entirely within the slab's usercopy region.
      
      This patch is modified from Brad Spengler/PaX Team's PAX_USERCOPY
      whitelisting code in the last public patch of grsecurity/PaX based on my
      understanding of the code. Changes or omissions from the original code are
      mine and don't reflect the original grsecurity/PaX code.
      Signed-off-by: David Windsor <dave@nullcore.net>
      [kees: adjust hunks for kmalloc-specific things moved later]
      [kees: adjust commit log, provide usage trace]
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: linux-fsdevel@vger.kernel.org
      Signed-off-by: Kees Cook <keescook@chromium.org>
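The dcache entry above distinguishes short names stored inside the dentry from longer names kept in a separate allocation. The userspace sketch below models only that choice. The `demo_` names and the 32-byte inline limit are hypothetical, and `malloc()` stands in for the kernel allocator.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_INLINE_LEN 32 /* hypothetical inline capacity, including the NUL */

struct demo_dentry {
	const char *name;            /* points at iname or at a heap buffer */
	char iname[DEMO_INLINE_LEN]; /* inline storage for short names */
};

/* Store src inline when it fits, otherwise in a separate allocation.
 * Returns 0 on success, -1 if the separate allocation fails. */
static int demo_set_name(struct demo_dentry *d, const char *src)
{
	size_t len;
	char *dst;

	assert(d != NULL && src != NULL);
	len = strlen(src);
	dst = d->iname;
	if (len >= DEMO_INLINE_LEN) {
		dst = malloc(len + 1);
		if (!dst)
			return -1;
	}
	memcpy(dst, src, len + 1); /* copy the terminating NUL as well */
	d->name = dst;
	return 0;
}

/* Release the separate allocation, if one was made. */
static void demo_free_name(struct demo_dentry *d)
{
	if (d->name != d->iname)
		free((void *)d->name);
	d->name = NULL;
}
```

Only the inline case places the name inside the cache object itself, which is why the entry whitelists `d_iname` within the dentry rather than the whole structure.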
    • usercopy: Mark kmalloc caches as usercopy caches · 6c0c21ad
      David Windsor authored
      Mark the kmalloc slab caches as entirely whitelisted. These caches
      are frequently used to fulfill kernel allocations that contain data
      to be copied to/from userspace. Internal-only uses are also common,
      but are scattered in the kernel. For now, mark all the kmalloc caches
      as whitelisted.
      
      This patch is modified from Brad Spengler/PaX Team's PAX_USERCOPY
      whitelisting code in the last public patch of grsecurity/PaX based on my
      understanding of the code. Changes or omissions from the original code are
      mine and don't reflect the original grsecurity/PaX code.
      Signed-off-by: David Windsor <dave@nullcore.net>
      [kees: merged in moved kmalloc hunks, adjust commit log]
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: linux-mm@kvack.org
      Cc: linux-xfs@vger.kernel.org
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Acked-by: Christoph Lameter <cl@linux.com>
    • usercopy: Allow strict enforcement of whitelists · 2d891fbc
      Kees Cook authored
      This introduces CONFIG_HARDENED_USERCOPY_FALLBACK to control the
      behavior of hardened usercopy whitelist violations. By default, whitelist
      violations will continue to WARN() so that any bad or missing usercopy
      whitelists can be discovered without being too disruptive.
      
      If this config is disabled at build time or a system is booted with
      "slab_common.usercopy_fallback=0", usercopy whitelists will BUG() instead
      of WARN(). This is useful for admins that want to use usercopy whitelists
      immediately.
      Suggested-by: Matthew Garrett <mjg59@google.com>
      Signed-off-by: Kees Cook <keescook@chromium.org>
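The policy described in the entry above can be summarised as a small decision function. This is a sketch of the behaviour as the commit message states it, not of the kernel implementation. The enum and function names are hypothetical, and the real code may take further conditions into account.

```c
#include <assert.h>
#include <stdbool.h>

/* Possible outcomes for a dynamically sized copy (hypothetical names). */
enum demo_action {
	DEMO_ALLOW, /* copy lies inside the whitelist */
	DEMO_WARN,  /* violation reported, copy still proceeds */
	DEMO_BUG    /* violation is fatal */
};

/* Fallback enabled: violations warn. Fallback disabled: violations are fatal. */
static enum demo_action demo_whitelist_action(bool inside_whitelist,
					      bool fallback_enabled)
{
	enum demo_action action;

	if (inside_whitelist)
		return DEMO_ALLOW;

	action = fallback_enabled ? DEMO_WARN : DEMO_BUG;
	/* A violation is never silently allowed. */
	assert(action != DEMO_ALLOW);
	return action;
}
```

Under this model, building without the fallback option or booting with "slab_common.usercopy_fallback=0" corresponds to passing `fallback_enabled = false`.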
    • usercopy: WARN() on slab cache usercopy region violations · afcc90f8
      Kees Cook authored
      This patch adds checking of usercopy cache whitelisting, and is modified
      from Brad Spengler/PaX Team's PAX_USERCOPY whitelisting code in the
      last public patch of grsecurity/PaX based on my understanding of the
      code. Changes or omissions from the original code are mine and don't
      reflect the original grsecurity/PaX code.
      
      The SLAB and SLUB allocators are modified to WARN() on all copy operations
      in which the kernel heap memory being modified falls outside of the cache's
      defined usercopy region.
      
      Based on an earlier patch from David Windsor.
      
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Laura Abbott <labbott@redhat.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: linux-mm@kvack.org
      Cc: linux-xfs@vger.kernel.org
      Signed-off-by: Kees Cook <keescook@chromium.org>
    • usercopy: Prepare for usercopy whitelisting · 8eb8284b
      David Windsor authored
      This patch prepares the slab allocator to handle caches having annotations
      (useroffset and usersize) defining usercopy regions.
      
      This patch is modified from Brad Spengler/PaX Team's PAX_USERCOPY
      whitelisting code in the last public patch of grsecurity/PaX based on
      my understanding of the code. Changes or omissions from the original
      code are mine and don't reflect the original grsecurity/PaX code.
      
      Currently, hardened usercopy performs dynamic bounds checking on slab
      cache objects. This is good, but still leaves a lot of kernel memory
      available to be copied to/from userspace in the face of bugs. To further
      restrict what memory is available for copying, this creates a way to
      whitelist specific areas of a given slab cache object for copying to/from
      userspace, allowing much finer granularity of access control. Slab caches
      that are never exposed to userspace can declare no whitelist for their
      objects, thereby keeping them unavailable to userspace via dynamic copy
      operations. (Note, an implicit form of whitelisting is the use of constant
      sizes in usercopy operations and get_user()/put_user(); these bypass
      hardened usercopy checks since these sizes cannot change at runtime.)
      
      To support this whitelist annotation, usercopy region offset and size
      members are added to struct kmem_cache. The slab allocator receives a
      new function, kmem_cache_create_usercopy(), that creates a new cache
      with a usercopy region defined, suitable for declaring spans of fields
      within the objects that get copied to/from userspace.
      
      In this patch, the default kmem_cache_create() marks the entire allocation
      as whitelisted, leaving it semantically unchanged. Once all fine-grained
      whitelists have been added (in subsequent patches), this will be changed
      to a usersize of 0, making caches created with kmem_cache_create() not
      copyable to/from userspace.
      
      After the entire usercopy whitelist series is applied, less than 15%
      of the slab cache memory remains exposed to potential usercopy bugs
      after a fresh boot:
      
      Total Slab Memory:           48074720
      Usercopyable Memory:          6367532  13.2%
               task_struct                    0.2%         4480/1630720
               RAW                            0.3%            300/96000
               RAWv6                          2.1%           1408/64768
               ext4_inode_cache               3.0%       269760/8740224
               dentry                        11.1%       585984/5273856
               mm_struct                     29.1%         54912/188448
               kmalloc-8                    100.0%          24576/24576
               kmalloc-16                   100.0%          28672/28672
               kmalloc-32                   100.0%          81920/81920
               kmalloc-192                  100.0%          96768/96768
               kmalloc-128                  100.0%        143360/143360
               names_cache                  100.0%        163840/163840
               kmalloc-64                   100.0%        167936/167936
               kmalloc-256                  100.0%        339968/339968
               kmalloc-512                  100.0%        350720/350720
               kmalloc-96                   100.0%        455616/455616
               kmalloc-8192                 100.0%        655360/655360
               kmalloc-1024                 100.0%        812032/812032
               kmalloc-4096                 100.0%        819200/819200
               kmalloc-2048                 100.0%      1310720/1310720
      
      After some kernel build workloads, the percentage (mainly driven by
      dentry and inode caches expanding) drops under 10%:
      
      Total Slab Memory:           95516184
      Usercopyable Memory:          8497452   8.8%
               task_struct                    0.2%         4000/1456000
               RAW                            0.3%            300/96000
               RAWv6                          2.1%           1408/64768
               ext4_inode_cache               3.0%     1217280/39439872
               dentry                        11.1%     1623200/14608800
               mm_struct                     29.1%         73216/251264
               kmalloc-8                    100.0%          24576/24576
               kmalloc-16                   100.0%          28672/28672
               kmalloc-32                   100.0%          94208/94208
               kmalloc-192                  100.0%          96768/96768
               kmalloc-128                  100.0%        143360/143360
               names_cache                  100.0%        163840/163840
               kmalloc-64                   100.0%        245760/245760
               kmalloc-256                  100.0%        339968/339968
               kmalloc-512                  100.0%        350720/350720
               kmalloc-96                   100.0%        563520/563520
               kmalloc-8192                 100.0%        655360/655360
               kmalloc-1024                 100.0%        794624/794624
               kmalloc-4096                 100.0%        819200/819200
               kmalloc-2048                 100.0%      1257472/1257472
      Signed-off-by: David Windsor <dave@nullcore.net>
      [kees: adjust commit log, split out a few extra kmalloc hunks]
      [kees: add field names to function declarations]
      [kees: convert BUGs to WARNs and fail closed]
      [kees: add attack surface reduction analysis to commit log]
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: linux-mm@kvack.org
      Cc: linux-xfs@vger.kernel.org
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Acked-by: Christoph Lameter <cl@linux.com>
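The entry above describes a per-cache usercopy region given by an offset and a size. The userspace sketch below shows how a bounds check against such a region might look. `struct demo_cache` and `demo_usercopy_ok()` are hypothetical stand-ins; only the member names `useroffset` and `usersize` come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the region members added to struct kmem_cache. */
struct demo_cache {
	size_t object_size;
	size_t useroffset; /* start of the whitelisted span within an object */
	size_t usersize;   /* length of the span; 0 means nothing is copyable */
};

/* True when [offset, offset + n) lies entirely inside the whitelisted span. */
static bool demo_usercopy_ok(const struct demo_cache *c, size_t offset, size_t n)
{
	assert(c != NULL);
	if (offset < c->useroffset)
		return false;
	if (offset - c->useroffset > c->usersize)
		return false;
	/* Written as a subtraction so that offset + n cannot overflow. */
	return n <= c->usersize - (offset - c->useroffset);
}
```

With `usersize` set to 0, every non-empty copy is rejected, which matches the stated end goal for caches created through plain kmem_cache_create().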
    • stddef.h: Introduce sizeof_field() · 4229a470
      Kees Cook authored
      The size of fields within a structure is needed in a few places in the
      kernel already, and will be needed for the usercopy whitelisting when
      declaring whitelist regions within structures. This creates a dedicated
      macro and redefines offsetofend() to use it.
      
      Existing usage, ignoring the 1200+ lustre assert uses:
      
      $ git grep -E 'sizeof\(\(\((struct )?[a-zA-Z_]+ \*\)0\)->' | \
      	grep -v staging/lustre | wc -l
      65
      Signed-off-by: Kees Cook <keescook@chromium.org>
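The entry above introduces sizeof_field() and rebuilds offsetofend() on top of it. The sketch below shows one plausible shape for the two macros, following the `sizeof(((TYPE *)0)->MEMBER)` idiom that the commit's grep pattern searches for. `struct demo_inode_info` and `demo_link_span()` are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Size of a struct member without needing an instance of the struct. */
#define sizeof_field(TYPE, MEMBER) sizeof((((TYPE *)0)->MEMBER))

/* Offset of the first byte past MEMBER, built on sizeof_field(). */
#define offsetofend(TYPE, MEMBER) \
	(offsetof(TYPE, MEMBER) + sizeof_field(TYPE, MEMBER))

/* Hypothetical structure, used only for illustration. */
struct demo_inode_info {
	unsigned long flags;
	char inline_link[60];
	int refcount;
};

/* Length of the span covered by the inline_link field. */
static size_t demo_link_span(void)
{
	size_t start = offsetof(struct demo_inode_info, inline_link);
	size_t end = offsetofend(struct demo_inode_info, inline_link);

	assert(end >= start);
	return end - start;
}
```

A whitelist for a single field could then be described by the pair `offsetof(TYPE, MEMBER)` and `sizeof_field(TYPE, MEMBER)`.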
    • lkdtm/usercopy: Adjust test to include an offset to check reporting · c7588686
      Kees Cook authored
      Instead of doubling the size, push the start position up by 16 bytes to
      still trigger an overflow. This allows verifying that offset reporting
      is working correctly.
      Signed-off-by: Kees Cook <keescook@chromium.org>
    • usercopy: Include offset in hardened usercopy report · f4e6e289
      Kees Cook authored
      This refactors the hardened usercopy code so that failure reporting can
      happen within the checking functions instead of at the top level. This
      simplifies the return value handling and allows more details and offsets
      to be included in the report. Having the offset can be much more helpful
      in understanding hardened usercopy bugs.
      Signed-off-by: Kees Cook <keescook@chromium.org>
    • usercopy: Enhance and rename report_usercopy() · b394d468
      Kees Cook authored
      In preparation for refactoring the usercopy checks to pass offset to
      the hardened usercopy report, this renames report_usercopy() to the
      more accurate usercopy_abort(), marks it as noreturn because it is,
      adds a hopefully helpful comment for anyone investigating such reports,
      makes the function available to the slab allocators, and adds new "detail"
      and "offset" arguments.
      Signed-off-by: Kees Cook <keescook@chromium.org>
    • usercopy: Remove pointer from overflow report · 4f5e8386
      Kees Cook authored
      Using %p was already mostly useless in the usercopy overflow reports,
      so this removes it entirely to avoid confusion now that %p-hashing
      is enabled.
      
      Fixes: ad67b74d ("printk: hash addresses printed with %p")
      Signed-off-by: Kees Cook <keescook@chromium.org>
  2. 03 Dec, 2017 5 commits
  3. 02 Dec, 2017 4 commits
    • Merge tag 'nfs-for-4.15-2' of git://git.linux-nfs.org/projects/anna/linux-nfs · 2db767d9
      Linus Torvalds authored
      Pull NFS client fixes from Anna Schumaker:
       "These patches fix a problem with compiling using an old version of
        gcc, and also fix up error handling in the SUNRPC layer.
      
         - NFSv4: Ensure gcc 4.4.4 can compile initialiser for
           "invalid_stateid"
      
         - SUNRPC: Allow connect to return EHOSTUNREACH
      
         - SUNRPC: Handle ENETDOWN errors"
      
      * tag 'nfs-for-4.15-2' of git://git.linux-nfs.org/projects/anna/linux-nfs:
        SUNRPC: Handle ENETDOWN errors
        SUNRPC: Allow connect to return EHOSTUNREACH
        NFSv4: Ensure gcc 4.4.4 can compile initialiser for "invalid_stateid"
    • Merge tag 'xfs-4.15-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · 788c1da0
      Linus Torvalds authored
      Pull xfs fixes from Darrick Wong:
       "Here are some bug fixes for 4.15-rc2.
      
         - fix memory leaks that appeared after removing ifork inline data
           buffer
      
         - recover deferred rmap update log items in correct order
      
         - fix memory leaks when buffer construction fails
      
         - fix memory leaks when bmbt is corrupt
      
         - fix some uninitialized variables and math problems in the quota
           scrubber
      
         - add some omitted attribution tags on the log replay commit
      
         - fix some UBSAN complaints about integer overflows with large sparse
           files
      
         - implement an effective inode mode check in online fsck
      
         - fix log's inability to retry quota item writeout due to transient
           errors"
      
      * tag 'xfs-4.15-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        xfs: Properly retry failed dquot items in case of error during buffer writeback
        xfs: scrub inode mode properly
        xfs: remove unused parameter from xfs_writepage_map
        xfs: ubsan fixes
        xfs: calculate correct offset in xfs_scrub_quota_item
        xfs: fix uninitialized variable in xfs_scrub_quota
        xfs: fix leaks on corruption errors in xfs_bmap.c
        xfs: fortify xfs_alloc_buftarg error handling
        xfs: log recovery should replay deferred ops in order
        xfs: always free inline data before resetting inode fork during ifree
    • Merge tag 'riscv-for-linus-4.15-rc2_cleanups' of... · e1ba1c99
      Linus Torvalds authored
      Merge tag 'riscv-for-linus-4.15-rc2_cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux
      
      Pull RISC-V cleanups and ABI fixes from Palmer Dabbelt:
       "This contains a handful of small cleanups that are a result of
        feedback that didn't make it into our original patch set, either
        because the feedback hadn't been given yet, I missed the original
        emails, or we weren't ready to submit the changes yet.
      
        I've been maintaining the various cleanup patch sets I have as their
        own branches, which I then merged together and signed. Each merge
        commit has a short summary of the changes, and each branch is based on
        your latest tag (4.15-rc1, in this case). If this isn't the right way
        to do this then feel free to suggest something else, but it seems sane
        to me.
      
        Here's a short summary of the changes, roughly in order of how
        interesting they are.
      
         - libgcc.h has been moved from include/lib, where it's the only
           member, to include/linux. This is meant to avoid tab completion
           conflicts.
      
         - VDSO entries for clock_get/gettimeofday/getcpu have been added.
           These are simple syscalls now, but we want to let glibc use them
           from the start so we can make them faster later.
      
         - A VDSO entry for instruction cache flushing has been added so
           userspace can flush the instruction cache.
      
         - The VDSO symbol versions for __vdso_cmpxchg{32,64} have been
           removed, as those VDSO entries don't actually exist.
      
         - __io_writes has been corrected to respect the given type.
      
         - A new READ_ONCE in arch_spin_is_locked().
      
         - __test_and_op_bit_ord() is now actually ordered.
      
         - Various small fixes throughout the tree to enable allmodconfig to
           build cleanly.
      
         - Removal of some dead code in our atomic support headers.
      
         - Improvements to various comments in our atomic support headers"
      
      * tag 'riscv-for-linus-4.15-rc2_cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux: (23 commits)
        RISC-V: __io_writes should respect the length argument
        move libgcc.h to include/linux
        RISC-V: Clean up an unused include
        RISC-V: Allow userspace to flush the instruction cache
        RISC-V: Flush I$ when making a dirty page executable
        RISC-V: Add missing include
        RISC-V: Use define for get_cycles like other architectures
        RISC-V: Provide stub of setup_profiling_timer()
        RISC-V: Export some expected symbols for modules
        RISC-V: move empty_zero_page definition to C and export it
        RISC-V: io.h: type fixes for warnings
        RISC-V: use RISCV_{INT,SHORT} instead of {INT,SHORT} for asm macros
        RISC-V: use generic serial.h
        RISC-V: remove spin_unlock_wait()
        RISC-V: `sfence.vma` orderes the instruction cache
        RISC-V: Add READ_ONCE in arch_spin_is_locked()
        RISC-V: __test_and_op_bit_ord should be strongly ordered
        RISC-V: Remove smb_mb__{before,after}_spinlock()
        RISC-V: Remove __smp_bp__{before,after}_atomic
        RISC-V: Comment on why {,cmp}xchg is ordered how it is
        ...
    • Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 4b1967c9
      Linus Torvalds authored
      Pull arm64 fixes from Will Deacon:
       "The critical one here is a fix for fpsimd register corruption across
        signals which was introduced by the SVE support code (the register
        files overlap), but the others are worth having as well.
      
        Summary:
      
         - Fix FP register corruption when SVE is not available or in use
      
         - Fix out-of-tree module build failure when CONFIG_ARM64_MODULE_PLTS=y
      
         - Missing 'const' generating errors with LTO builds
      
         - Remove unsupported events from Cortex-A73 PMU description
      
         - Removal of stale and incorrect comments"
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64: context: Fix comments and remove pointless smp_wmb()
        arm64: cpu_ops: Add missing 'const' qualifiers
        arm64: perf: remove unsupported events for Cortex-A73
        arm64: fpsimd: Fix failure to restore FPSIMD state after signals
        arm64: pgd: Mark pgd_cache as __ro_after_init
        arm64: ftrace: emit ftrace-mod.o contents through code
        arm64: module-plts: factor out PLT generation code for ftrace
        arm64: mm: cleanup stale AIVIVT references
      4b1967c9
  4. 01 Dec, 2017 16 commits
    • Palmer Dabbelt's avatar
      RISC-V: Fixes for clean allmodconfig build · 3b62de26
      Palmer Dabbelt authored
      Olaf said: Here's a short series of patches that produces a working
      allmodconfig. Would be nice to see them go in so we can add build
      coverage.
      
      I've dropped patches 8 and 10 from the original set:
      
      * [PATCH 08/10] (RISC-V: Set __ARCH_WANT_RENAMEAT to pick up generic
        version) has a better fix that I've sent out for review, we don't want
        renameat.
      * [PATCH 10/10] (input: joystick: riscv has get_cycles) has already been
        taken into Dmitry Torokhov's tree.
      3b62de26
    • Palmer Dabbelt's avatar
      move libgcc.h to include/linux · 185e788c
      Palmer Dabbelt authored
      185e788c
    • Palmer Dabbelt's avatar
      7382fbde
    • Palmer Dabbelt's avatar
      RISC-V: User-Visible Changes · 07f8ba74
      Palmer Dabbelt authored
      This merge contains the user-visible, ABI-breaking changes that we want
      to make sure we have in Linux before our first release. Highlights
      include:
      
      * VDSO entries for clock_get/gettimeofday/getcpu have been added.  These
        are simple syscalls now, but we want to let glibc use them from the
        start so we can make them faster later.
      * A VDSO entry for instruction cache flushing has been added so
        userspace can flush the instruction cache.
      * The VDSO symbol versions for __vdso_cmpxchg{32,64} have been removed,
        as those VDSO entries don't actually exist.
      
      Conflicts:
              arch/riscv/include/asm/tlbflush.h
      07f8ba74
    • Palmer Dabbelt's avatar
      RISC-V Atomic Cleanups · f8182f61
      Palmer Dabbelt authored
      This patch set is the result of some feedback that filtered through
      after our original patch set was reviewed, some of which was the result
      of me missing some email.  It contains:
      
      * A new READ_ONCE in arch_spin_is_locked()
      * __test_and_op_bit_ord() is now actually ordered
      * Improvements to various comments
      * Removal of some dead code
      f8182f61
    • Palmer Dabbelt's avatar
      RISC-V: __io_writes should respect the length argument · da894ff1
      Palmer Dabbelt authored
      Whoops -- I must have just been being an idiot again.  Thanks to Segher
      for finding the bug :).
      
      CC: Segher Boessenkool <segher@kernel.crashing.org>
      Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
      da894ff1
    • Christoph Hellwig's avatar
      move libgcc.h to include/linux · 4db2b604
      Christoph Hellwig authored
      Introducing a new include/lib directory just for this file totally
      messes up tab completion for include/linux, which is highly annoying.
      
      Move it to include/linux where we have headers for all kinds of other
      lib/ code as well.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
      4db2b604
    • Linus Torvalds's avatar
      Merge tag 'powerpc-4.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · a0651c7f
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
       "Two fixes for nasty kexec/kdump crashes in certain configurations.
      
        A couple of minor fixes for the new TIDR code.
      
        A fix for an oops in a CXL error handling path.
      
        Thanks to: Andrew Donnellan, Christophe Lombard, David Gibson, Mahesh
        Salgaonkar, Vaibhav Jain"
      
      * tag 'powerpc-4.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc: Do not assign thread.tidr if already assigned
        powerpc: Avoid signed to unsigned conversion in set_thread_tidr()
        powerpc/kexec: Fix kexec/kdump in P9 guest kernels
        powerpc/powernv: Fix kexec crashes caused by tlbie tracing
        cxl: Check if vphb exists before iterating over AFU devices
      a0651c7f
    • Linus Torvalds's avatar
      Merge tag 'afs-fixes-20171201' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs · ae753ee2
      Linus Torvalds authored
      Pull AFS fixes from David Howells:
       "Two fix patches for the AFS filesystem:
      
         - Fix the refcounting on permit caching.
      
         - AFS inode (afs_vnode) fields need resetting after allocation
           because they're only initialised when slab pages are obtained from
           the page allocator"
      
      * tag 'afs-fixes-20171201' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
        afs: Properly reset afs_vnode (inode) fields
        afs: Fix permit refcounting
      ae753ee2
    • Linus Torvalds's avatar
      Merge tag 'mmc-v4.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc · 3c1c4ddf
      Linus Torvalds authored
      Pull MMC fixes from Ulf Hansson:
       "MMC core:
         - Ensure that debugfs files are removed properly
         - Fix missing blk_put_request()
         - Deal with errors from blk_get_request()
         - Rewind mmc bus suspend operations at failures
         - Prepend '0x' to ocr and pre_eol_info in sysfs to identify as hex
      
        MMC host:
         - sdhci-msm: Make it optional to wait for signal level changes
         - sdhci: Avoid swiotlb buffer being full"
      
      * tag 'mmc-v4.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
        mmc: core: prepend 0x to OCR entry in sysfs
        mmc: core: prepend 0x to pre_eol_info entry in sysfs
        mmc: sdhci: Avoid swiotlb buffer being full
        mmc: sdhci-msm: Optionally wait for signal level changes
        mmc: block: Ensure that debugfs files are removed
        mmc: core: Do not leave the block driver in a suspended state
        mmc: block: Check return value of blk_get_request()
        mmc: block: Fix missing blk_put_request()
      3c1c4ddf
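
A minimal user-space sketch of why the "0x" prefix on the ocr and pre_eol_info entries matters to tools that read sysfs text. The helper names and the field width of 8 are assumptions for illustration, not the kernel's code. Without the prefix, a parser that auto-detects the base sees a leading zero and treats the value as octal.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Format a register value the way a sysfs "show" routine might, with or
 * without the "0x" prefix. The field width of 8 is an assumption.
 */
static int format_reg(char *buf, size_t len, unsigned int val, int with_prefix)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, with_prefix ? "0x%08x\n" : "%08x\n", val);
}

/* Parse the text as a tool would when it lets strtoul() detect the base. */
static unsigned long parse_auto_base(const char *text)
{
    assert(text != NULL);
    return strtoul(text, NULL, 0);
}
```

With the prefix, parse_auto_base() recovers the original value. Without it, the same digits are read as octal and parsing stops at the first non-octal character.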
    • Linus Torvalds's avatar
      Merge tag 'drm-fixes-for-v4.15-rc2' of git://people.freedesktop.org/~airlied/linux · 5dc9cbc4
      Linus Torvalds authored
      Pull drm fixes and cleanups from Dave Airlie:
       "The main thing are a bunch of fixes for the new amd display code, a
        bunch of smatch fixes.
      
        core:
         - Atomic helper regression fix.
         - Deferred fbdev fallout regression fix.
      
        amdgpu:
         - New display code (dc) dpms, suspend/resume and smatch fixes, along
           with some others
         - Some regression fixes for amdkfd/radeon.
         - Fix a ttm regression for swiotlb disabled
      
        bridge:
         - A bunch of fixes for the tc358767 bridge
      
        mali-dp + hdlcd:
         - some fixes and internal API catchups.
      
        imx-drm:
         - regression fix in atomic code.
      
        omapdrm:
         - platform detection regression fixes"
      
      * tag 'drm-fixes-for-v4.15-rc2' of git://people.freedesktop.org/~airlied/linux: (76 commits)
        drm/imx: always call wait_for_flip_done in commit_tail
        omapdrm: hdmi4_cec: signedness bug in hdmi4_cec_init()
        drm: omapdrm: Fix DPI on platforms using the DSI VDDS
        omapdrm: hdmi4: Correct the SoC revision matching
        drm/omap: displays: panel-dpi: add backlight dependency
        drm/omap: Fix error handling path in 'omap_dmm_probe()'
        drm/i915: Disable THP until we have a GPU read BW W/A
        drm/bridge: tc358767: fix 1-lane behavior
        drm/bridge: tc358767: fix AUXDATAn registers access
        drm/bridge: tc358767: fix timing calculations
        drm/bridge: tc358767: fix DP0_MISC register set
        drm/bridge: tc358767: filter out too high modes
        drm/bridge: tc358767: do no fail on hi-res displays
        drm/bridge: Fix lvds-encoder since the panel_bridge rework.
        drm/bridge: synopsys/dw-hdmi: Enable cec clock
        drm/bridge: adv7511/33: Fix adv7511_cec_init() failure handling
        drm/radeon: remove init of CIK VMIDs 8-16 for amdkfd
        drm/ttm: fix populate_and_map() functions once more
        drm/fb_helper: Disable all crtc's when initial setup fails.
        drm/atomic: make drm_atomic_helper_wait_for_vblanks more agressive
        ...
      5dc9cbc4
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.dk/linux-block · 75f64f68
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "A selection of fixes/changes that should make it into this series.
        This contains:
      
         - NVMe, two merges, containing:
              - pci-e, rdma, and fc fixes
              - Device quirks
      
         - Fix for a badblocks leak in null_blk
      
         - bcache fix from Rui Hua for a race condition regression where
           -EINTR was returned to upper layers that didn't expect it.
      
         - Regression fix for blktrace for a bug introduced in this series.
      
         - blktrace cleanup for cgroup id.
      
         - bdi registration error handling.
      
         - Small series with cleanups for blk-wbt.
      
         - Various little fixes for typos and the like.
      
        Nothing earth shattering, most important are the NVMe and bcache fixes"
      
      * 'for-linus' of git://git.kernel.dk/linux-block: (34 commits)
        nvme-pci: fix NULL pointer dereference in nvme_free_host_mem()
        nvme-rdma: fix memory leak during queue allocation
        blktrace: fix trace mutex deadlock
        nvme-rdma: Use mr pool
        nvme-rdma: Check remotely invalidated rkey matches our expected rkey
        nvme-rdma: wait for local invalidation before completing a request
        nvme-rdma: don't complete requests before a send work request has completed
        nvme-rdma: don't suppress send completions
        bcache: check return value of register_shrinker
        bcache: recover data from backing when data is clean
        bcache: Fix building error on MIPS
        bcache: add a comment in journal bucket reading
        nvme-fc: don't use bit masks for set/test_bit() numbers
        blk-wbt: fix comments typo
        blk-wbt: move wbt_clear_stat to common place in wbt_done
        blk-sysfs: remove NULL pointer checking in queue_wb_lat_store
        blk-wbt: remove duplicated setting in wbt_init
        nvme-pci: add quirk for delay before CHK RDY for WDC SN200
        block: remove useless assignment in bio_split
        null_blk: fix dev->badblocks leak
        ...
      75f64f68
    • Will Deacon's avatar
      arm64: context: Fix comments and remove pointless smp_wmb() · 3a33c760
      Will Deacon authored
      The comments in the ASID allocator incorrectly hint at an MP-style idiom
      using the asid_generation and the active_asids array. In fact, the
      synchronisation is achieved using a combination of an xchg operation
      and a spinlock, so update the comments and remove the pointless smp_wmb().
      
      Cc: James Morse <james.morse@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      3a33c760
    • Yury Norov's avatar
      arm64: cpu_ops: Add missing 'const' qualifiers · 770ba060
      Yury Norov authored
      Building the kernel with an LTO-enabled GCC spits out the following "const"
      warning for the cpu_ops code:
      
        mm/percpu.c:2168:20: error: pcpu_fc_names causes a section type conflict
        with dt_supported_cpu_ops
        const char * const pcpu_fc_names[PCPU_FC_NR] __initconst = {
                ^
        arch/arm64/kernel/cpu_ops.c:34:37: note: ‘dt_supported_cpu_ops’ was declared here
        static const struct cpu_operations *dt_supported_cpu_ops[] __initconst = {
      
      Fix it by adding the missing const qualifiers.
      Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
      Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      770ba060
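
The distinction behind this kind of fix can be sketched in user space. The struct layout, table contents, and find_ops() helper below are hypothetical stand-ins, and the section attribute itself is left out. The point is only that an array of pointers to const objects still has writable slots, whereas an array of const pointers to const objects is read-only as a whole, which is what a read-only section expects.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's struct cpu_operations. */
struct cpu_operations {
    const char *name;
};

static const struct cpu_operations spin_table_ops = { .name = "spin-table" };
static const struct cpu_operations psci_ops = { .name = "psci" };

/*
 * Array of pointers to const objects: the structs are read-only, but the
 * array slots themselves are writable, so the array is not a const object.
 */
static const struct cpu_operations *writable_table[] = {
    &spin_table_ops,
    &psci_ops,
    NULL,
};

/*
 * Array of const pointers to const objects: both the structs and the slots
 * are read-only, so the whole array can be placed with other const data.
 */
static const struct cpu_operations *const readonly_table[] = {
    &spin_table_ops,
    &psci_ops,
    NULL,
};

/* Look up an entry by name in a NULL-terminated table. */
static const struct cpu_operations *
find_ops(const struct cpu_operations *const *table, const char *name)
{
    assert(table != NULL && name != NULL);
    for (; *table != NULL; table++) {
        if (strcmp((*table)->name, name) == 0)
            return *table;
    }
    return NULL;
}
```

Both tables behave the same for lookups. The second `const` only changes whether the array itself may be modified, and therefore where the toolchain may place it.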
    • Xu YiPing's avatar
      arm64: perf: remove unsupported events for Cortex-A73 · f8ada189
      Xu YiPing authored
      bus access read/write events are not supported in A73, based on the
      Cortex-A73 TRM r0p2, section 11.9 Events (pages 11-457 to 11-460).
      
      Fixes: 5561b6c5 "arm64: perf: add support for Cortex-A73"
      Acked-by: Julien Thierry <julien.thierry@arm.com>
      Signed-off-by: Xu YiPing <xuyiping@hisilicon.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      f8ada189
    • Dave Martin's avatar
      arm64: fpsimd: Fix failure to restore FPSIMD state after signals · 9de52a75
      Dave Martin authored
      The fpsimd_update_current_state() function is responsible for
      loading the FPSIMD state from the user signal frame into the
      current task during sigreturn.  When implementing support for SVE,
      conditional code was added to this function in order to handle the
      case where SVE state needs to be loaded for the task and merged with
      the FPSIMD data from the signal frame; however, the FPSIMD-only
      case was unintentionally dropped.
      
      As a result of this, sigreturn does not currently restore the
      FPSIMD state of the task, except in the case where the system
      supports SVE and the signal frame contains SVE state in addition to
      FPSIMD state.
      
      This patch fixes this bug by making the copy-in of the FPSIMD data
      from the signal frame to thread_struct unconditional.
      
      This remains a performance regression from v4.14, since the FPSIMD
      state is now copied into thread_struct and then loaded back,
      instead of _only_ being loaded into the CPU FPSIMD registers.
      However, it is essential to call task_fpsimd_load() here anyway in
      order to ensure that the SVE enable bit in CPACR_EL1 is set
      correctly before returning to userspace.  This could use some
      refactoring, but since sigreturn is not a fast path I have kept
      this patch as a pure fix and left the refactoring for later.
      
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Fixes: 8cd969d2 ("arm64/sve: Signal handling support")
      Reported-by: Alex Bennée <alex.bennee@linaro.org>
      Tested-by: Alex Bennée <alex.bennee@linaro.org>
      Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
      Signed-off-by: Dave Martin <Dave.Martin@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      9de52a75
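
The control-flow change described in this commit message can be modelled with a toy example. Every type and function name below is a hypothetical simplification, not the kernel's actual code. The sketch only contrasts a copy-in that happens on the SVE path alone with one that happens unconditionally before the state is reloaded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct fpsimd_model {
    unsigned long long vregs[4];
};

struct task_model {
    struct fpsimd_model saved;  /* plays the role of the thread_struct copy */
    struct fpsimd_model cpu;    /* plays the role of the live CPU registers */
    bool uses_sve;              /* task has SVE state to merge */
};

/* Behaviour described as broken: the copy-in happens only on the SVE path. */
static void update_state_buggy(struct task_model *t,
                               const struct fpsimd_model *frame)
{
    assert(t != NULL && frame != NULL);
    if (t->uses_sve) {
        t->saved = *frame;
        t->cpu = t->saved;
    }
}

/* Behaviour described by the fix: copy in unconditionally, then reload. */
static void update_state_fixed(struct task_model *t,
                               const struct fpsimd_model *frame)
{
    assert(t != NULL && frame != NULL);
    t->saved = *frame;  /* signal frame -> saved thread state */
    t->cpu = t->saved;  /* saved state -> registers, as a reload would do */
}
```

In this model, a task without SVE keeps its stale register contents after update_state_buggy(), while update_state_fixed() always ends with the frame's values in both the saved copy and the modelled registers.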