1. 08 Jun, 2017 23 commits
    • KEYS: Convert KEYCTL_DH_COMPUTE to use the crypto KPP API · 4099177c
      Mat Martineau authored
      The initial Diffie-Hellman computation made direct use of the MPI
      library because the crypto module did not support DH at the time. Now
      that KPP is implemented, KEYCTL_DH_COMPUTE should use it to get rid of
      duplicate code and leverage possible hardware acceleration.
      
      This fixes an issue whereby the input to the KDF computation would
      include additional uninitialized memory when the result of the
      Diffie-Hellman computation was shorter than the input prime number.
      Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
    • KEYS: DH: ensure the KDF counter is properly aligned · 87784748
      Eric Biggers authored
      Accessing a 'u8[4]' through a '__be32 *' violates alignment rules.  Just
      make the counter a __be32 instead.
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Stephan Mueller <smueller@chronox.de>
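
A minimal userspace sketch of the alignment point in the commit above, not the kernel patch itself. It assumes a hypothetical helper, put_counter_be32(), and uses htonl() as a stand-in for the kernel's cpu_to_be32(): the counter is kept in a naturally aligned 32-bit integer and copied out bytewise, so a byte array is never accessed through a 32-bit pointer.

```c
#include <assert.h>
#include <arpa/inet.h> /* htonl() */
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): store a KDF counter in
 * big-endian byte order without casting the byte buffer to a wider type.
 */
static void put_counter_be32(uint8_t out[4], uint32_t counter)
{
    uint32_t be = htonl(counter); /* userspace analogue of cpu_to_be32() */

    memcpy(out, &be, sizeof(be));
}

/* Usage example. */
void counter_example(void)
{
    uint8_t buf[4];

    put_counter_be32(buf, 1);
    assert(buf[0] == 0 && buf[1] == 0 && buf[2] == 0 && buf[3] == 1);
}
```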
    • KEYS: DH: don't feed uninitialized "otherinfo" into KDF · 744f05d3
      Eric Biggers authored
      If userspace called KEYCTL_DH_COMPUTE with kdf_params containing NULL
      otherinfo but nonzero otherinfolen, the kernel would allocate a buffer
      for the otherinfo, then feed it into the KDF without initializing it.
      Fix this by always doing the copy from userspace (which will fail with
      EFAULT in this scenario).
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Stephan Mueller <smueller@chronox.de>
    • KEYS: DH: forbid using digest_null as the KDF hash · 7265558d
      Eric Biggers authored
      Requesting "digest_null" in the keyctl_kdf_params caused an infinite
      loop in kdf_ctr() because the "null" hash has a digest size of 0.  Fix
      it by rejecting hash algorithms with a digest size of 0.
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Stephan Mueller <smueller@chronox.de>
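
The infinite loop described above can be illustrated with a simplified stand-in for a counter-mode KDF loop. The function kdf_rounds() is hypothetical and only counts hash invocations; it is a sketch of why a digest size of 0 must be rejected, not a copy of kdf_ctr().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a counter-mode KDF loop: returns how many hash
 * invocations are needed to produce outlen bytes. With digest_size == 0 the
 * loop below would never make progress, so that case is rejected up front.
 */
static long kdf_rounds(size_t digest_size, size_t outlen)
{
    long rounds = 0;

    if (digest_size == 0)
        return -EINVAL;

    while (outlen > 0) {
        size_t n = outlen < digest_size ? outlen : digest_size;

        outlen -= n; /* each round consumes at most one digest of output */
        rounds++;
    }
    return rounds;
}

/* Usage example. */
void kdf_example(void)
{
    assert(kdf_rounds(32, 64) == 2);
    assert(kdf_rounds(0, 64) == -EINVAL);
}
```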
    • KEYS: sanitize key structs before freeing · f1dcb268
      Eric Biggers authored
      While a 'struct key' itself normally does not contain sensitive
      information, Documentation/security/keys.txt actually encourages this:
      
           "Having a payload is not required; and the payload can, in fact,
           just be a value stored in the struct key itself."
      
      In case someone has taken this advice, or will take this advice in the
      future, zero the key structure before freeing it.  We might as well, and
      as a bonus this could make it a bit more difficult for an adversary to
      determine which keys have recently been in use.
      
      This is safe because the key_jar cache does not use a constructor.
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
    • KEYS: trusted: sanitize all key material · 6ee46e6b
      Eric Biggers authored
      As the previous patch did for encrypted-keys, zero any potentially
      sensitive data related to the "trusted" key type before it
      is freed.  Notably, we were not zeroing the tpm_buf structures in which
      the actual key is stored for TPM seal and unseal, nor were we zeroing
      the trusted_key_payload in certain error paths.
      
      Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
      Cc: David Safford <safford@us.ibm.com>
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
    • KEYS: encrypted: sanitize all key material · 86f21380
      Eric Biggers authored
      For keys of type "encrypted", consistently zero sensitive key material
      before freeing it.  This was already being done for the decrypted
      payloads of encrypted keys, but not for the master key and the keys
      derived from the master key.
      
      Out of an abundance of caution and because it is trivial to do so, also
      zero buffers containing the key payload in encrypted form, although
      depending on how the encrypted-keys feature is used such information
      does not necessarily need to be kept secret.
      
      Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
      Cc: David Safford <safford@us.ibm.com>
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
    • KEYS: user_defined: sanitize key payloads · 21b116d0
      Eric Biggers authored
      Zero the payloads of user and logon keys before freeing them.  This
      prevents sensitive key material from being kept around in the slab
      caches after a key is released.
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
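
A userspace sketch of the "zero before free" pattern used by the sanitizing commits above. The struct payload type and the helper names are hypothetical; the kernel uses its own helpers for this, and the volatile-pointer loop here is just one common way to keep a compiler from discarding the stores.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Zero through a volatile pointer so the stores are not dropped as dead writes. */
static void secure_zero(void *p, size_t len)
{
    volatile unsigned char *vp = p;

    while (len--)
        *vp++ = 0;
}

/* Hypothetical length-prefixed payload, for illustration only. */
struct payload {
    size_t datalen;
    unsigned char data[];
};

static struct payload *payload_new(const void *src, size_t len)
{
    struct payload *p = malloc(sizeof(*p) + len);

    if (!p)
        return NULL;
    p->datalen = len;
    memcpy(p->data, src, len);
    return p;
}

/* Wipe header and data before handing the memory back to the allocator. */
static void payload_free(struct payload *p)
{
    if (!p)
        return;
    secure_zero(p, sizeof(*p) + p->datalen);
    free(p);
}

/* Usage example. */
void payload_example(void)
{
    unsigned char secret[8];
    struct payload *p;

    memset(secret, 0xAA, sizeof(secret));
    p = payload_new(secret, sizeof(secret));
    assert(p != NULL && p->datalen == sizeof(secret));
    payload_free(p);

    secure_zero(secret, sizeof(secret)); /* also wipe the temporary copy */
    assert(secret[0] == 0 && secret[7] == 0);
}
```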
    • KEYS: sanitize add_key() and keyctl() key payloads · 7df036ef
      Eric Biggers authored
      Before returning from add_key() or one of the keyctl() commands that
      takes in a key payload, zero the temporary buffer that was allocated to
      hold the key payload copied from userspace.  This may contain sensitive
      key material that should not be kept around in the slab caches.
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
    • KEYS: fix freeing uninitialized memory in key_update() · dcedad31
      Eric Biggers authored
      key_update() freed the key_preparsed_payload even if it was not
      initialized first.  This would cause a crash if userspace called
      keyctl_update() on a key with type like "asymmetric" that has a
      ->preparse() method but not an ->update() method.  Possibly it could
      even be triggered for other key types by racing with keyctl_setperm() to
      make the KEY_NEED_WRITE check fail (the permission was already checked,
      so normally it wouldn't fail there).
      
      Reproducer with key type "asymmetric", given a valid cert.der:
      
      keyctl new_session
      keyid=$(keyctl padd asymmetric desc @s < cert.der)
      keyctl setperm $keyid 0x3f000000
      keyctl update $keyid data
      
      [  150.686666] BUG: unable to handle kernel NULL pointer dereference at 0000000000000001
      [  150.687601] IP: asymmetric_key_free_kids+0x12/0x30
      [  150.688139] PGD 38a3d067
      [  150.688141] PUD 3b3de067
      [  150.688447] PMD 0
      [  150.688745]
      [  150.689160] Oops: 0000 [#1] SMP
      [  150.689455] Modules linked in:
      [  150.689769] CPU: 1 PID: 2478 Comm: keyctl Not tainted 4.11.0-rc4-xfstests-00187-ga9f6b6b8 #742
      [  150.690916] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-20170228_101828-anatol 04/01/2014
      [  150.692199] task: ffff88003b30c480 task.stack: ffffc90000350000
      [  150.692952] RIP: 0010:asymmetric_key_free_kids+0x12/0x30
      [  150.693556] RSP: 0018:ffffc90000353e58 EFLAGS: 00010202
      [  150.694142] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000004
      [  150.694845] RDX: ffffffff81ee3920 RSI: ffff88003d4b0700 RDI: 0000000000000001
      [  150.697569] RBP: ffffc90000353e60 R08: ffff88003d5d2140 R09: 0000000000000000
      [  150.702483] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
      [  150.707393] R13: 0000000000000004 R14: ffff880038a4d2d8 R15: 000000000040411f
      [  150.709720] FS:  00007fcbcee35700(0000) GS:ffff88003fd00000(0000) knlGS:0000000000000000
      [  150.711504] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  150.712733] CR2: 0000000000000001 CR3: 0000000039eab000 CR4: 00000000003406e0
      [  150.714487] Call Trace:
      [  150.714975]  asymmetric_key_free_preparse+0x2f/0x40
      [  150.715907]  key_update+0xf7/0x140
      [  150.716560]  ? key_default_cmp+0x20/0x20
      [  150.717319]  keyctl_update_key+0xb0/0xe0
      [  150.718066]  SyS_keyctl+0x109/0x130
      [  150.718663]  entry_SYSCALL_64_fastpath+0x1f/0xc2
      [  150.719440] RIP: 0033:0x7fcbce75ff19
      [  150.719926] RSP: 002b:00007ffd5d167088 EFLAGS: 00000206 ORIG_RAX: 00000000000000fa
      [  150.720918] RAX: ffffffffffffffda RBX: 0000000000404d80 RCX: 00007fcbce75ff19
      [  150.721874] RDX: 00007ffd5d16785e RSI: 000000002866cd36 RDI: 0000000000000002
      [  150.722827] RBP: 0000000000000006 R08: 000000002866cd36 R09: 00007ffd5d16785e
      [  150.723781] R10: 0000000000000004 R11: 0000000000000206 R12: 0000000000404d80
      [  150.724650] R13: 00007ffd5d16784d R14: 00007ffd5d167238 R15: 000000000040411f
      [  150.725447] Code: 83 c4 08 31 c0 5b 41 5c 41 5d 41 5e 41 5f 5d c3 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 85 ff 74 23 55 48 89 e5 53 48 89 fb <48> 8b 3f e8 06 21 c5 ff 48 8b 7b 08 e8 fd 20 c5 ff 48 89 df e8
      [  150.727489] RIP: asymmetric_key_free_kids+0x12/0x30 RSP: ffffc90000353e58
      [  150.728117] CR2: 0000000000000001
      [  150.728430] ---[ end trace f7f8fe1da2d5ae8d ]---
      
      Fixes: 4d8c0250 ("KEYS: Call ->free_preparse() even after ->preparse() returns an error")
      Cc: stable@vger.kernel.org # 3.17+
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
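
One way to avoid the class of bug described above, sketched in userspace with hypothetical names (struct preparsed, free_preparse(), update_key()): early exits that happen before the structure is initialized return directly, and only paths that run after initialization go through the cleanup label. This is an illustration of the pattern, not the actual key_update() code.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical preparsed-payload structure. */
struct preparsed {
    void *data;
};

/* Cleanup helper: only safe to call on an initialized structure. */
static void free_preparse(struct preparsed *prep)
{
    free(prep->data);
}

static int update_key(int has_update_method)
{
    struct preparsed prep;
    int ret;

    /*
     * prep is still uninitialized here, so this early exit must return
     * directly rather than jump to the cleanup label.
     */
    if (!has_update_method)
        return -EOPNOTSUPP;

    memset(&prep, 0, sizeof(prep));
    prep.data = malloc(16);
    if (!prep.data) {
        ret = -ENOMEM;
        goto error;
    }
    ret = 0;

error:
    free_preparse(&prep); /* prep is always initialized at this point */
    return ret;
}

/* Usage example. */
void update_example(void)
{
    assert(update_key(0) == -EOPNOTSUPP);
    assert(update_key(1) == 0);
}
```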
    • KEYS: fix dereferencing NULL payload with nonzero length · d1743dd0
      Eric Biggers authored
      sys_add_key() and the KEYCTL_UPDATE operation of sys_keyctl() allowed a
      NULL payload with nonzero length to be passed to the key type's
      ->preparse(), ->instantiate(), and/or ->update() methods.  Various key
      types including asymmetric, cifs.idmap, cifs.spnego, and pkcs7_test did
      not handle this case, allowing an unprivileged user to trivially cause a
      NULL pointer dereference (kernel oops) if one of these key types was
      present.  Fix it by doing the copy_from_user() when 'plen' is nonzero
      rather than when '_payload' is non-NULL, causing the syscall to fail
      with EFAULT as expected when an invalid buffer is specified.
      
      Cc: stable@vger.kernel.org # 2.6.10+
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
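
The fix described above keys the copy on the length rather than on the pointer. The sketch below shows that decision in userspace; copy_in() is a hypothetical stand-in for copy_from_user() that simply fails on a NULL source, and load_payload() is likewise an assumed name.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for copy_from_user(): a NULL source is a fault. */
static int copy_in(void *dst, const void *src, size_t len)
{
    if (!src)
        return -EFAULT;
    memcpy(dst, src, len);
    return 0;
}

static int load_payload(const void *upayload, size_t plen, void **out)
{
    void *buf = NULL;

    *out = NULL;
    if (plen) { /* decide on the length, not on whether the pointer is NULL */
        buf = malloc(plen);
        if (!buf)
            return -ENOMEM;
        if (copy_in(buf, upayload, plen) != 0) {
            free(buf);
            return -EFAULT;
        }
    }
    *out = buf;
    return 0;
}

/* Usage example. */
void load_example(void)
{
    void *p;

    /* NULL payload with nonzero length now fails cleanly. */
    assert(load_payload(NULL, 4, &p) == -EFAULT && p == NULL);
    /* An empty payload is still accepted. */
    assert(load_payload(NULL, 0, &p) == 0 && p == NULL);
}
```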
    • KEYS: encrypted: use constant-time HMAC comparison · 25bba0ce
      Eric Biggers authored
      MACs should, in general, be compared using crypto_memneq() to prevent
      timing attacks.
      
      Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
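
A simplified illustration of what a constant-time comparison looks like. ct_memneq() is a hypothetical userspace function, not the kernel's crypto_memneq(), which takes additional care against compiler optimizations; the point is only that every byte is examined regardless of where the first difference occurs.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Returns nonzero if the buffers differ. Unlike memcmp(), the loop never
 * exits early, so the running time does not depend on the position of the
 * first mismatching byte.
 */
static int ct_memneq(const void *a, const void *b, size_t len)
{
    const unsigned char *pa = a;
    const unsigned char *pb = b;
    unsigned char diff = 0;
    size_t i;

    for (i = 0; i < len; i++)
        diff |= pa[i] ^ pb[i]; /* accumulate differences, no branching */

    return diff != 0;
}

/* Usage example. */
void memneq_example(void)
{
    const unsigned char mac1[4] = {1, 2, 3, 4};
    const unsigned char mac2[4] = {1, 2, 3, 5};

    assert(ct_memneq(mac1, mac1, sizeof(mac1)) == 0);
    assert(ct_memneq(mac1, mac2, sizeof(mac1)) == 1);
}
```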
    • KEYS: encrypted: fix race causing incorrect HMAC calculations · d76155ae
      Eric Biggers authored
      The encrypted-keys module was using a single global HMAC transform,
      which could be rekeyed by multiple threads concurrently operating on
      different keys, causing incorrect HMAC values to be calculated.  Fix
      this by allocating a new HMAC transform whenever we need to calculate a
      HMAC.  Also simplify things a bit by allocating the shash_desc's using
      SHASH_DESC_ON_STACK() for both the HMAC and unkeyed hashes.
      
      The following script reproduces the bug:
      
          keyctl new_session
          keyctl add user master "abcdefghijklmnop" @s
          for i in $(seq 2); do
              (
                  set -e
                  for j in $(seq 1000); do
                      keyid=$(keyctl add encrypted desc$i "new user:master 25" @s)
                      datablob="$(keyctl pipe $keyid)"
                      keyctl unlink $keyid > /dev/null
                      keyid=$(keyctl add encrypted desc$i "load $datablob" @s)
                      keyctl unlink $keyid > /dev/null
                  done
              ) &
          done
      
      Output with bug:
      
          [  439.691094] encrypted_key: bad hmac (-22)
          add_key: Invalid argument
          add_key: Invalid argument
      
      Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
    • KEYS: encrypted: fix buffer overread in valid_master_desc() · f2173234
      Eric Biggers authored
      With the 'encrypted' key type it was possible for userspace to provide a
      data blob ending with a master key description shorter than expected,
      e.g. 'keyctl add encrypted desc "new x" @s'.  When validating such a
      master key description, valid_master_desc() could read beyond the end
      of the buffer.  Fix this by using strncmp() instead of memcmp().  [Also
      clean up the code to deduplicate some logic.]
      
      Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
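
The difference between memcmp() and strncmp() that the commit above relies on can be sketched as follows. The helper has_prefix_and_name() is hypothetical and the "user:" and "trusted:" prefixes are used only as examples; strncmp() stops at the terminating NUL of a short description, whereas memcmp() with the prefix length would read that many bytes regardless.

```c
#include <assert.h>
#include <string.h>

/*
 * Returns 1 if desc starts with prefix and has at least one character
 * after it. strncmp() never reads past desc's terminating NUL, so a
 * description shorter than the prefix is rejected without an overread.
 */
static int has_prefix_and_name(const char *desc, const char *prefix)
{
    size_t plen = strlen(prefix);

    if (strncmp(desc, prefix, plen) != 0)
        return 0;
    /* A full prefix match guarantees desc[plen] is within the string. */
    return desc[plen] != '\0';
}

/* Usage example. */
void prefix_example(void)
{
    assert(has_prefix_and_name("user:master", "user:") == 1);
    assert(has_prefix_and_name("x", "user:") == 0); /* shorter than the prefix */
}
```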
    • KEYS: encrypted: avoid encrypting/decrypting stack buffers · e6235572
      Eric Biggers authored
      Since v4.9, the crypto API cannot (normally) be used to encrypt/decrypt
      stack buffers because the stack may be virtually mapped.  Fix this for
      the padding buffers in encrypted-keys by using ZERO_PAGE for the
      encryption padding and by allocating a temporary heap buffer for the
      decryption padding.
      
      Tested with CONFIG_DEBUG_SG=y:
      	keyctl new_session
      	keyctl add user master "abcdefghijklmnop" @s
      	keyid=$(keyctl add encrypted desc "new user:master 25" @s)
      	datablob="$(keyctl pipe $keyid)"
      	keyctl unlink $keyid
      	keyid=$(keyctl add encrypted desc "load $datablob" @s)
      	datablob2="$(keyctl pipe $keyid)"
      	[ "$datablob" = "$datablob2" ] && echo "Success!"
      
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
      Cc: stable@vger.kernel.org # 4.9+
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
    • KEYS: put keyring if install_session_keyring_to_cred() fails · 03ce2e0f
      Eric Biggers authored
      In join_session_keyring(), if install_session_keyring_to_cred() were to
      fail, we would leak the keyring reference, just like in the bug fixed by
      commit 23567fd0 ("KEYS: Fix keyring ref leak in
      join_session_keyring()").  Fortunately this cannot happen currently, but
      we really should be more careful.  Do this by adding and using a new
      error label at which the keyring reference is dropped.
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
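
A small model of the "error label that drops the reference" pattern described above. All names here (struct keyring, install(), join_session()) are hypothetical, and the sketch assumes that a successful install takes its own reference; it is meant to show the reference accounting, not to mirror join_session_keyring().

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical refcounted object. */
struct keyring {
    int refs;
    int serial;
};

static void keyring_get(struct keyring *k) { k->refs++; }
static void keyring_put(struct keyring *k) { k->refs--; }

/* Hypothetical install step: takes its own reference on success. */
static int install(struct keyring *k, int fail)
{
    if (fail)
        return -ENOMEM;
    keyring_get(k);
    return 0;
}

static int join_session(struct keyring *k, int fail)
{
    int ret;

    keyring_get(k); /* reference obtained by the lookup */

    ret = install(k, fail);
    if (ret < 0)
        goto error_put;

    ret = k->serial;

error_put:
    keyring_put(k); /* the lookup reference is dropped on every path */
    return ret;
}

/* Usage example. */
void join_example(void)
{
    struct keyring k = { 1, 42 };

    assert(join_session(&k, 1) == -ENOMEM);
    assert(k.refs == 1); /* no leak on the failure path */
}
```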
    • KEYS: Delete an error message for a failed memory allocation in get_derived_key() · c203f3b0
      Markus Elfring authored
      Omit an extra message for a memory allocation failure in this function.
      
      This issue was detected by using the Coccinelle software.
      
      Link: http://events.linuxfoundation.org/sites/events/files/slides/LCJ16-Refactor_Strings-WSang_0.pdf
      Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
      Signed-off-by: David Howells <dhowells@redhat.com>
    • X.509: Fix error code in x509_cert_parse() · 4df79702
      Dan Carpenter authored
      We forgot to set the error code on this path so it could result in
      returning NULL which leads to a NULL dereference.
      
      Fixes: db6c43bd ("crypto: KEYS: convert public key and digsig asym to the akcipher api")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
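
Why a missing error code can turn into a NULL return is easier to see with a userspace re-creation of the error-pointer convention. The helpers err_ptr(), ptr_err() and is_err() below are simplified, hypothetical analogues of the kernel's ERR_PTR(), PTR_ERR() and IS_ERR(), and parse_cert() is an invented function: an error code of 0 encodes to NULL, which an is_err()-style check does not catch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified userspace analogues of the kernel's error-pointer helpers. */
static void *err_ptr(long err) { return (void *)(intptr_t)err; }
static long ptr_err(const void *p) { return (long)(intptr_t)p; }
static int is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

static int cert_storage; /* stands in for an allocated certificate */

static void *parse_cert(int alloc_fails)
{
    long ret;

    ret = -ENOMEM; /* set the error code before the step that can fail */
    if (alloc_fails)
        goto error;
    return &cert_storage;

error:
    return err_ptr(ret);
}

/* Usage example. */
void parse_example(void)
{
    assert(is_err(parse_cert(1)));
    assert(ptr_err(parse_cert(1)) == -ENOMEM);
    assert(!is_err(parse_cert(0)));
    /* A zero "error" encodes to NULL, which is_err() does not flag. */
    assert(err_ptr(0) == NULL && !is_err(err_ptr(0)));
}
```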
    • KEYS: fix refcount_inc() on zero · 7f973ed9
      Mark Rutland authored
      If a key's refcount is dropped to zero between key_lookup() peeking at
      the refcount and subsequently attempting to increment it, refcount_inc()
      will see a zero refcount.  Here, refcount_inc() will WARN_ONCE(), and
      will *not* increment the refcount, which will remain zero.
      
      Once key_lookup() drops key_serial_lock, it is possible for the key to
      be freed behind our back.
      
      This patch uses refcount_inc_not_zero() to perform the peek and increment
      atomically.
      
      Fixes: fff29291 ("security, keys: convert key.usage from atomic_t to refcount_t")
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Cc: David Windsor <dwindsor@gmail.com>
      Cc: Elena Reshetova <elena.reshetova@intel.com>
      Cc: Hans Liljestrand <ishkamiel@gmail.com>
      Cc: James Morris <james.l.morris@oracle.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
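
The "peek and increment atomically" idea can be sketched with C11 atomics. ref_inc_not_zero() below is a hypothetical userspace analogue of refcount_inc_not_zero(), written as a compare-and-swap loop; it omits the saturation and warning behaviour of the kernel's refcount_t.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Increment the counter only if it is currently nonzero. The check and the
 * increment happen in one compare-and-swap, so a concurrent drop to zero
 * cannot slip in between them.
 */
static bool ref_inc_not_zero(atomic_uint *ref)
{
    unsigned int old = atomic_load(ref);

    do {
        if (old == 0)
            return false; /* object is already on its way to being freed */
    } while (!atomic_compare_exchange_weak(ref, &old, old + 1));

    return true;
}

/* Usage example. */
void refcount_example(void)
{
    atomic_uint ref;

    atomic_init(&ref, 1);
    assert(ref_inc_not_zero(&ref));
    assert(atomic_load(&ref) == 2);

    atomic_store(&ref, 0);
    assert(!ref_inc_not_zero(&ref));
    assert(atomic_load(&ref) == 0);
}
```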
    • security: use READ_ONCE instead of deprecated ACCESS_ONCE · 3169b64f
      Davidlohr Bueso authored
      With the new standardized functions, we can replace all ACCESS_ONCE()
      calls across relevant security/keyrings/.
      
      ACCESS_ONCE() does not work reliably on non-scalar types. For example
      gcc 4.6 and 4.7 might remove the volatile tag for such accesses during
      the SRA (scalar replacement of aggregates) step:
      
      https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145
      
      Update the new calls regardless of whether the type is scalar; this is
      cleaner than having three alternatives.
      Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
      Signed-off-by: David Howells <dhowells@redhat.com>
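
For readers unfamiliar with the idiom, the following is a deliberately simplified, scalar-only sketch of a "read exactly once" macro. READ_ONCE_SCALAR() is a hypothetical name and relies on the GCC/Clang __typeof__ extension; the kernel's READ_ONCE() is more involved and, unlike this sketch, also copes with non-scalar types.

```c
#include <assert.h>

/*
 * Scalar-only sketch: the volatile-qualified access tells the compiler to
 * perform the load exactly once and not to re-read or cache the value.
 */
#define READ_ONCE_SCALAR(x) (*(const volatile __typeof__(x) *)&(x))

static int shared_state; /* imagine another thread may update this */

/* Take one snapshot and make every later decision on that snapshot. */
static int snapshot_state(void)
{
    int seen = READ_ONCE_SCALAR(shared_state);

    return seen > 0 ? seen : 0;
}

/* Usage example. */
void read_once_example(void)
{
    shared_state = 7;
    assert(snapshot_state() == 7);
    shared_state = -3;
    assert(snapshot_state() == 0);
}
```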
    • security/keys: add CONFIG_KEYS_COMPAT to Kconfig · 9c4ca42f
      Bilal Amarni authored
      CONFIG_KEYS_COMPAT is defined in arch-specific Kconfigs and is missing for
      several 64-bit architectures: mips, parisc, tile.
      
      At the moment, on those architectures, calling the keyctl syscall from
      32-bit userspace returns an ENOSYS error.
      
      This patch moves the CONFIG_KEYS_COMPAT option to security/keys/Kconfig, to
      make sure the compatibility wrapper is registered by default for any 64-bit
      architecture as long as it is configured with CONFIG_COMPAT.
      
      [DH: Modified to remove arm64 compat enablement also as requested by Eric
       Biggers]
      Signed-off-by: Bilal Amarni <bilal.amarni@gmail.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Arnd Bergmann <arnd@arndb.de>
      cc: Eric Biggers <ebiggers3@gmail.com>
  2. 06 Jun, 2017 17 commits
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · b29794ec
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Made TCP congestion control documentation match current reality,
          from Anmol Sarma.
      
       2) Various build warning and failure fixes from Arnd Bergmann.
      
       3) Fix SKB list leak in ipv6_gso_segment().
      
       4) Use after free in ravb driver, from Eugeniu Rosca.
      
       5) Don't use udp_poll() in ping protocol driver, from Eric Dumazet.
      
       6) Don't crash in PCI error recovery of cxgb4 driver, from Guilherme
          Piccoli.
      
       7) _SRC_NAT_DONE_BIT needs to be cleared using atomics, from Liping
          Zhang.
      
       8) Use after free in vxlan deletion, from Mark Bloch.
      
       9) Fix ordering of NAPI poll enabled in ethoc driver, from Max
          Filippov.
      
      10) Fix stmmac hangs with TSO, from Niklas Cassel.
      
      11) Fix crash in CALIPSO ipv6, from Richard Haines.
      
      12) Clear nh_flags properly on mpls link up. From Roopa Prabhu.
      
      13) Fix regression in sk_err socket error queue handling, noticed by
          ping applications. From Soheil Hassas Yeganeh.
      
      14) Update mlx4/mlx5 MAINTAINERS information.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (78 commits)
        net: stmmac: fix a broken u32 less than zero check
        net: stmmac: fix completely hung TX when using TSO
        net: ethoc: enable NAPI before poll may be scheduled
        net: bridge: fix a null pointer dereference in br_afspec
        ravb: Fix use-after-free on `ifconfig eth0 down`
        net/ipv6: Fix CALIPSO causing GPF with datagram support
        net: stmmac: ensure jumbo_frm error return is correctly checked for -ve value
        Revert "sit: reload iphdr in ipip6_rcv"
        i40e/i40evf: proper update of the page_offset field
        i40e: Fix state flags for bit set and clean operations of PF
        iwlwifi: fix host command memory leaks
        iwlwifi: fix min API version for 7265D, 3168, 8000 and 8265
        iwlwifi: mvm: clear new beacon command template struct
        iwlwifi: mvm: don't fail when removing a key from an inexisting sta
        iwlwifi: pcie: only use d0i3 in suspend/resume if system_pm is set to d0i3
        iwlwifi: mvm: fix firmware debug restart recording
        iwlwifi: tt: move ucode_loaded check under mutex
        iwlwifi: mvm: support ibss in dqa mode
        iwlwifi: mvm: Fix command queue number on d0i3 flow
        iwlwifi: mvm: rs: start using LQ command color
        ...
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc · e87f327e
      Linus Torvalds authored
      Pull sparc fixes from David Miller:
      
       1) Fix TLB context wrap races, from Pavel Tatashin.
      
       2) Cure some gcc-7 build issues.
      
       3) Handle invalid setup_hugepagesz command line values properly, from
          Liam R Howlett.
      
       4) Copy TSB using the correct address shift for the huge TSB, from Mike
          Kravetz.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
        sparc64: delete old wrap code
        sparc64: new context wrap
        sparc64: add per-cpu mm of secondary contexts
        sparc64: redefine first version
        sparc64: combine activate_mm and switch_mm
        sparc64: reset mm cpumask after wrap
        sparc/mm/hugepages: Fix setup_hugepagesz for invalid values.
        sparc: Machine description indices can vary
        sparc64: mm: fix copy_tsb to correctly copy huge page TSBs
        arch/sparc: support NR_CPUS = 4096
        sparc64: Add __multi3 for gcc 7.x and later.
        sparc64: Fix build warnings with gcc 7.
        arch/sparc: increase CONFIG_NODES_SHIFT on SPARC64 to 5
    • compiler, clang: suppress warning for unused static inline functions · abb2ea7d
      David Rientjes authored
      GCC explicitly does not warn for unused static inline functions for
      -Wunused-function.  The manual states:
      
      	Warn whenever a static function is declared but not defined or
      	a non-inline static function is unused.
      
      Clang does warn for static inline functions that are unused.
      
      It turns out that suppressing the warnings avoids potentially complex
      #ifdef directives, which also reduces LOC.
      
      Suppress the warning for clang.
      Signed-off-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge branch 'sparc64-context-wrap-fixes' · b3aefc2f
      David S. Miller authored
      Pavel Tatashin says:
      
      ====================
      sparc64: context wrap fixes
      
      This patch series contains fixes for context wrap: when we are out of
      context ids, and need to get a new version.
      
      It fixes memory corruption issues that occur when more processes than
      there are context ids (currently set to 8K) are started simultaneously,
      so that processes can get a wrong context.
      
      sparc64: new context wrap:
      - contains explanation of new wrap method, and also explanation of races
        that it solves
      sparc64: reset mm cpumask after wrap
      - explains issue of not resetting cpu mask on a wrap
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sparc64: delete old wrap code · 0197e41c
      Pavel Tatashin authored
      The old method that is using xcall and softint to get new context id is
      deleted, as it is replaced by a method of using per_cpu_secondary_mm
      without xcall to perform the context wrap.
      Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
      Reviewed-by: Bob Picco <bob.picco@oracle.com>
      Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sparc64: new context wrap · a0582f26
      Pavel Tatashin authored
      The current wrap implementation has a race issue: it is called outside of
      the ctx_alloc_lock, and also does not wait for all CPUs to complete the
      wrap.  This means that a thread can get a new context with a new version
      and another thread might still be running with the same context. The
      problem is especially severe on CPUs with shared TLBs, like sun4v. I used
      the following test to very quickly reproduce the problem:
      - start over 8K processes (must be more than context IDs)
      - write and read values at a memory location in every process.
      
      Very quickly memory corruptions start happening, and what we read back
      does not equal what we wrote.
      
      Several approaches were explored before settling on this one:
      
      Approach 1:
      Move smp_new_mmu_context_version() inside ctx_alloc_lock, and wait for
      every process to complete the wrap. (Note: every CPU must WAIT before
      leaving smp_new_mmu_context_version_client() until every one arrives).
      
      This approach ends up with deadlocks, as some threads own locks which other
      threads are waiting for, and they never receive softint until these threads
      exit smp_new_mmu_context_version_client(). Since we do not allow the exit,
      deadlock happens.
      
      Approach 2:
      Handle wrap right during mondo interrupt. Use etrap/rtrap to enter
      into C code, and issue new versions to every CPU.
      This approach adds some overhead to runtime: in switch_mm() we must add
      some checks to make sure that versions have not changed due to wrap while
      we were loading the new secondary context. (This could be protected by
      PSTATE_IE, but that degrades performance: on M7 and older CPUs it takes
      50 cycles for each access.) Also, we still need a global per-cpu array of
      MMs to know where we need to load new contexts, otherwise we can change
      context to a thread that is going away (if we received mondo between
      switch_mm() and switch_to() time). Finally, there are some issues with
      window registers in rtrap() when context IDs are changed during CPU mondo
      time.
      
      The approach in this patch is the simplest and has almost no impact on
      runtime.  We use the array with mm's where last secondary contexts were
      loaded onto CPUs and bump their versions to the new generation without
      changing context IDs. If a new process comes in to get a context ID, it
      will go through get_new_mmu_context() because of version mismatch. But the
      running processes do not need to be interrupted. And wrap is quicker as we
      do not need to xcall and wait for everyone to receive and complete wrap.
      Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
      Reviewed-by: Bob Picco <bob.picco@oracle.com>
      Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sparc64: add per-cpu mm of secondary contexts · 7a5b4bbf
      Pavel Tatashin authored
      The new wrap is going to use information from this array to figure out
      mm's that currently have valid secondary contexts setup.
      Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
      Reviewed-by: Bob Picco <bob.picco@oracle.com>
      Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sparc64: redefine first version · c4415235
      Pavel Tatashin authored
      CTX_FIRST_VERSION defines the first context version, but also it defines
      first context. This patch redefines it to only include the first context
      version.
      Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
      Reviewed-by: Bob Picco <bob.picco@oracle.com>
      Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sparc64: combine activate_mm and switch_mm · 14d0334c
      Pavel Tatashin authored
      The only difference between these two functions is that in activate_mm we
      unconditionally flush context. However, there is no need to keep this
      difference after fixing a bug where cpumask was not reset on a wrap. So, in
      this patch we combine these.
      Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
      Reviewed-by: Bob Picco <bob.picco@oracle.com>
      Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sparc64: reset mm cpumask after wrap · 58897485
      Pavel Tatashin authored
      After a wrap (getting a new context version) a process must get a new
      context id, which means that we would need to flush the context id from
      the TLB before running for the first time with this ID on every CPU. But,
      we use mm_cpumask to determine if this process has been running on this CPU
      before, and this mask is not reset after a wrap. So, there are two possible
      fixes for this issue:
      
      1. Clear mm cpumask whenever mm gets a new context id
      2. Unconditionally flush context every time process is running on a CPU
      
      This patch implements the first solution.
      Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
      Reviewed-by: Bob Picco <bob.picco@oracle.com>
      Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      58897485
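The first fix described above can be sketched as a small model. This is a simplified illustration, not the sparc64 code: `struct mm_model`, `model_wrap`, `model_switch_mm`, and the flush counter are hypothetical names, and the model assumes fewer than 64 CPUs so the mask fits in one word.

```c
#include <stdint.h>

/* Hypothetical, simplified model of an mm's context state. */
struct mm_model {
    uint64_t ctx_version; /* version under which the context id was issued */
    uint64_t cpumask;     /* bit n set: mm already ran on CPU n with this id */
};

static uint64_t global_version = 1; /* bumped on every wrap */
static unsigned int flush_count;    /* counts modelled TLB context flushes */

/* A wrap starts a new context version; all old context ids become stale. */
static void model_wrap(void)
{
    global_version++;
}

static void model_switch_mm(struct mm_model *mm, unsigned int cpu)
{
    if (mm->ctx_version != global_version) {
        /* New context id: forget which CPUs ran the old one (fix 1). */
        mm->ctx_version = global_version;
        mm->cpumask = 0;
    }
    if (!(mm->cpumask & (UINT64_C(1) << cpu))) {
        /* First run on this CPU with this id: flush before use. */
        mm->cpumask |= UINT64_C(1) << cpu;
        flush_count++;
    }
}
```

Without the `mm->cpumask = 0` line, a CPU that ran the mm before the wrap would skip the flush for the new id, which is the failure the commit message describes.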
    • Liam R. Howlett's avatar
      sparc/mm/hugepages: Fix setup_hugepagesz for invalid values. · f322980b
      Liam R. Howlett authored
      hugetlb_bad_size needs to be called on invalid values.  Also change the
      pr_warn to a pr_err to better align with other platforms.
      Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f322980b
    • James Clarke's avatar
      sparc: Machine description indices can vary · c982aa9c
      James Clarke authored
      VIO devices were being looked up by their index in the machine
      description node block, but this often varies over time as devices are
      added and removed. Instead, store the ID and look up using the type,
      config handle and ID.
      Signed-off-by: James Clarke <jrtc27@jrtc27.com>
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=112541
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c982aa9c
    • Mike Kravetz's avatar
      sparc64: mm: fix copy_tsb to correctly copy huge page TSBs · 654f4807
      Mike Kravetz authored
      When a TSB grows beyond its current capacity, a new TSB is allocated
      and copy_tsb is called to copy entries from the old TSB to the new.
      A hash shift based on page size is used to calculate the index of an
      entry in the TSB.  copy_tsb has hard coded PAGE_SHIFT in these
      calculations.  However, for huge page TSBs the value REAL_HPAGE_SHIFT
      should be used.  As a result, when copy_tsb is called for a huge page
      TSB the entries are placed at the incorrect index in the newly
      allocated TSB.  When doing hardware table walk, the MMU does not
      match these entries and we end up in the TSB miss handling code.
      This code will then create and write an entry to the correct index
      in the TSB.  We take a performance hit for the table walk miss and
      recreation of these entries.
      
      Pass a new parameter to copy_tsb that is the page size shift to be
      used when copying the TSB.
      Suggested-by: Anthony Yznaga <anthony.yznaga@oracle.com>
      Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      654f4807
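To make the index mismatch concrete, here is a minimal sketch of a shift-based hash into a power-of-two table. The function name `tsb_index` and the two shift values (8 KiB base pages, 4 MiB real huge pages) are assumptions for illustration; the commit message does not state them.

```c
#include <stdint.h>

#define MODEL_PAGE_SHIFT       13 /* assumed: 8 KiB base page */
#define MODEL_REAL_HPAGE_SHIFT 22 /* assumed: 4 MiB real huge page */

/*
 * Index of a virtual address in a table of `nentries` slots.
 * `nentries` must be a power of two so the mask works as a modulo.
 */
static unsigned long tsb_index(uint64_t vaddr, unsigned int shift,
                               unsigned long nentries)
{
    return (unsigned long)((vaddr >> shift) & (nentries - 1));
}
```

With 512 entries, address `0x400000` hashes to slot 0 under the base page shift but to slot 1 under the huge page shift, so an entry copied with the wrong shift sits where a lookup using the correct shift will not find it.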
    • Jane Chu's avatar
      arch/sparc: support NR_CPUS = 4096 · c79a1373
      Jane Chu authored
      Linux SPARC64 limits NR_CPUS to 4064 because init_cpu_send_mondo_info()
      only allocates a single page for NR_CPUS mondo entries. Thus we cannot
      use all 4096 CPUs on some SPARC platforms.
      
      To fix, allocate (2^order) pages where order is set according to the size
      of cpu_list for possible cpus. Since cpu_list_pa and cpu_mondo_block_pa
      are not used in asm code, there are no imm13 offsets from the base PA
      that will break because they can only reach one page.
      
      Orabug: 25505750
      Signed-off-by: Jane Chu <jane.chu@oracle.com>
      Reviewed-by: Bob Picco <bob.picco@oracle.com>
      Reviewed-by: Atish Patra <atish.patra@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c79a1373
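The order calculation can be sketched as follows. The sizes here are assumptions, not taken from the commit: an 8 KiB page, 2-byte CPU list entries, and a 64-byte block sharing the allocation with the list (a layout that would account for the 4064 figure, since (8192 - 64) / 2 = 4064). `mondo_alloc_order` is a hypothetical helper name.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE   8192u /* assumed page size in bytes */
#define MODEL_MONDO_BLOCK 64u   /* assumed bytes reserved ahead of the list */

/* Smallest order such that (2^order) pages hold the block plus the list. */
static unsigned int mondo_alloc_order(unsigned int ncpus)
{
    size_t bytes = MODEL_MONDO_BLOCK + (size_t)ncpus * sizeof(uint16_t);
    size_t span = MODEL_PAGE_SIZE;
    unsigned int order = 0;

    while (span < bytes) {
        span <<= 1;
        order++;
    }
    return order;
}
```

Under these assumptions, 4064 CPUs fit exactly in one page (order 0), while 4096 CPUs need 8256 bytes and therefore an order-1 allocation of two pages.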
    • Colin Ian King's avatar
      net: stmmac: fix a broken u32 less than zero check · 1d3028f4
      Colin Ian King authored
      The check that queue is greater than or equal to zero is always true
      because queue is a u32; when queue is decremented past zero it wraps
      around and never goes negative. Fix this by making queue an int.
      
      Detected by CoverityScan, CID#1428988 ("Unsigned compared against 0")
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1d3028f4
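The wraparound can be reproduced outside the driver with two countdown loops that differ only in the type of the counter. This is an illustrative sketch; the function names and the iteration cap (added so the unsigned version terminates) are not from the stmmac code.

```c
#include <stdint.h>

/* Countdown guarded by "queue >= 0" with an unsigned counter. */
static unsigned int countdown_unsigned(uint32_t start, unsigned int cap)
{
    uint32_t queue = start;
    unsigned int iterations = 0;

    /* "queue >= 0" is always true for a u32, so only the cap stops us. */
    while (queue >= 0 && iterations < cap) {
        iterations++;
        queue--; /* wraps from 0 to UINT32_MAX instead of reaching -1 */
    }
    return iterations;
}

/* Same loop with a signed counter: stops once queue reaches -1. */
static unsigned int countdown_signed(int start, unsigned int cap)
{
    int queue = start;
    unsigned int iterations = 0;

    while (queue >= 0 && iterations < cap) {
        iterations++;
        queue--;
    }
    return iterations;
}
```

Starting from 2 with a cap of 10, the signed loop runs three times (for 2, 1, and 0), while the unsigned loop runs until the cap is hit. Some compilers flag the unsigned comparison when extra warnings are enabled, which is the same class of diagnostic the Coverity report raised.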
    • Niklas Cassel's avatar
      net: stmmac: fix completely hung TX when using TSO · 426849e6
      Niklas Cassel authored
      stmmac_tso_allocator can fail to set the Last Descriptor bit
      on a descriptor that actually was the last descriptor.
      
      This happens when the buffer of the last descriptor ends
      up having a size of exactly TSO_MAX_BUFF_SIZE.
      
      When the IP eventually reaches the next last descriptor,
      which actually has the bit set, the DMA will hang.
      
      When the DMA hangs, we get a tx timeout, however,
      since stmmac does not do a complete reset of the IP
      in stmmac_tx_timeout, we end up in a state with
      completely hung TX.
      Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
      Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
      Acked-by: Alexandre TORGUE <alexandre.torgue@st.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      426849e6
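The boundary case can be illustrated with a small model of splitting a payload across fixed-size descriptor buffers. This is a sketch under assumptions: `MODEL_MAX_BUFF` is a stand-in value, `count_last_bits` is a hypothetical helper, and the strict comparison is one plausible form of the faulty check, not a quote from the driver.

```c
#include <stdbool.h>

#define MODEL_MAX_BUFF 16u /* hypothetical per-descriptor buffer limit */

/*
 * Splits `len` bytes across descriptors and returns how many of them
 * would be flagged as the last descriptor.
 */
static unsigned int count_last_bits(unsigned int len, bool strict_check)
{
    unsigned int remaining = len;
    unsigned int marked = 0;

    while (remaining > 0) {
        unsigned int buff = remaining < MODEL_MAX_BUFF ? remaining
                                                       : MODEL_MAX_BUFF;
        /* Strict form misses a final buffer that is exactly full. */
        bool last = strict_check ? (buff < MODEL_MAX_BUFF)
                                 : (remaining <= MODEL_MAX_BUFF);

        if (last)
            marked++;
        remaining -= buff;
    }
    return marked;
}
```

For a payload of exactly two full buffers, the strict form flags no descriptor as last, matching the failure described above; comparing the remaining length against the limit flags exactly one.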
    • Max Filippov's avatar
      net: ethoc: enable NAPI before poll may be scheduled · d220b942
      Max Filippov authored
      ethoc_reset enables device interrupts, so ethoc_interrupt may schedule a
      NAPI poll before NAPI is enabled in ethoc_open. This leaves the device
      unable to send or receive anything until it is closed and reopened. If
      the device is flooded with ingress packets, it may be unable to recover
      at all.
      Move napi_enable above ethoc_reset in ethoc_open to fix that.
      
      Fixes: a1702857 ("net: Add support for the OpenCores 10/100 Mbps Ethernet MAC.")
      Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
      Reviewed-by: Tobias Klauser <tklauser@distanz.ch>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d220b942