1. 08 Sep, 2021 1 commit
  2. 04 Sep, 2021 15 commits
    • ksmbd: add validation for ndr read/write functions · 303fff2b
      Namjae Jeon authored
      If ndr->length is smaller than the expected size, ksmbd can make an
      invalid access in ndr->data. This patch adds validation to check whether
      ndr->offset is over ndr->length, and adds exception handling to check
      the return value of the ndr read/write functions.
      
      Cc: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
      Signed-off-by: Steve French <stfrench@microsoft.com>
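The bounds check described in the commit message above can be illustrated with a small userspace sketch. This is not ksmbd's code: `struct demo_ndr` and `demo_ndr_read_u32()` are hypothetical names, and the sketch only assumes a buffer with a data pointer, a current offset, and a total length.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a decode buffer; not ksmbd's own structure. */
struct demo_ndr {
	const uint8_t *data;	/* start of the encoded bytes */
	size_t offset;		/* current read position */
	size_t length;		/* number of valid bytes in data */
};

/*
 * Read one 32-bit value in host byte order.
 * Returns 0 on success, or -EINVAL if the read would go past the end of
 * the buffer. The offset is advanced only on success.
 */
static int demo_ndr_read_u32(struct demo_ndr *n, uint32_t *value)
{
	assert(n != NULL && value != NULL);

	/* Compare against the remaining bytes so the check cannot overflow. */
	if (n->offset > n->length || n->length - n->offset < sizeof(*value))
		return -EINVAL;

	memcpy(value, n->data + n->offset, sizeof(*value));
	n->offset += sizeof(*value);
	return 0;
}
```

A caller would check the return value of every read and stop decoding on the first error, which matches the "exception handling" part of the commit message.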
    • ksmbd: remove unused ksmbd_file_table_flush function · 687c59e7
      Namjae Jeon authored
      ksmbd_file_table_flush is a leftover from SMB1. This function is no longer
      needed as SMB1 has been removed from ksmbd.
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
      Signed-off-by: Steve French <stfrench@microsoft.com>
    • ksmbd: smbd: fix dma mapping error in smb_direct_post_send_data · 72d6cbb5
      Hyunchul Lee authored
      Because the smb direct header is mapped and msg->num_sge
      has already been incremented, the decrement should be
      removed from the condition.
      Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com>
      Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
      Signed-off-by: Steve French <stfrench@microsoft.com>
    • ksmbd: Reduce error log 'speed is unknown' to debug · d475866e
      Per Forlin authored
      This log happens on servers with a network bridge since
      the bridge does not have a specified link speed.
      This is not a real error so change the error log to debug instead.
      Signed-off-by: Per Forlin <perfn@axis.com>
      Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
      Signed-off-by: Steve French <stfrench@microsoft.com>
    • ksmbd: defer notify_change() call · 28a5d3de
      Christian Brauner authored
      When ownership is changed we might in certain scenarios lose the
      ability to alter the inode after we changed ownership. This can e.g.
      happen when we are on an idmapped mount where uid 0 is mapped to uid
      1000 and uid 1000 is mapped to uid 0.
      A caller with fs*id 1000 will be able to create files as *id 1000 on
      disk. They will also be able to change ownership of files owned by *id 0
      to *id 1000 but they won't be able to change ownership in the other
      direction. This means acl operations following notify_change() would
      fail. Move the notify_change() call after the acls have been updated.
      This guarantees that we don't end up with spurious "hash value diff"
      warnings later on because we managed to change ownership but didn't
      manage to alter acls.
      
      Cc: Steve French <stfrench@microsoft.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Namjae Jeon <namjae.jeon@samsung.com>
      Cc: Hyunchul Lee <hyc.lee@gmail.com>
      Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
      Cc: linux-cifs@vger.kernel.org
      Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
      Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
      Signed-off-by: Steve French <stfrench@microsoft.com>
    • ksmbd: remove setattr preparations in set_file_basic_info() · db7fb6fe
      Christian Brauner authored
      Permission checking and copying over ownership information is the task
      of the underlying filesystem not ksmbd. The order is also wrong here.
      This modifies the inode before notify_change(). If notify_change() fails
      this will have changed ownership nonetheless. All of this is unnecessary
      though since the underlying filesystem's ->setattr handler will do all
      this (if required) by itself.
      
      Cc: Steve French <stfrench@microsoft.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Namjae Jeon <namjae.jeon@samsung.com>
      Cc: Hyunchul Lee <hyc.lee@gmail.com>
      Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
      Cc: linux-cifs@vger.kernel.org
      Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
      Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
      Signed-off-by: Steve French <stfrench@microsoft.com>
    • ksmbd: ensure error is surfaced in set_file_basic_info() · eb5784f0
      Christian Brauner authored
      It seems the error was accidentally ignored until now. Make sure it is
      surfaced.
      
      Cc: Steve French <stfrench@microsoft.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Namjae Jeon <namjae.jeon@samsung.com>
      Cc: Hyunchul Lee <hyc.lee@gmail.com>
      Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
      Cc: linux-cifs@vger.kernel.org
      Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
      Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
      Signed-off-by: Steve French <stfrench@microsoft.com>
    • ndr: fix translation in ndr_encode_posix_acl() · 9467a0ce
      Christian Brauner authored
      The sid_to_id() helper encodes raw ownership information suitable for
      s*id handling. This is conceptually equivalent to reporting ownership
      information via stat to userspace. In this case the consumer is ksmbd
      instead of a regular user. So when encoding raw ownership information
      suitable for s*id handling later we need to map the id up according to
      the user namespace of ksmbd itself taking any idmapped mounts into
      account.
      
      Cc: Steve French <stfrench@microsoft.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Namjae Jeon <namjae.jeon@samsung.com>
      Cc: Hyunchul Lee <hyc.lee@gmail.com>
      Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
      Cc: linux-cifs@vger.kernel.org
      Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
      Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
      Signed-off-by: Steve French <stfrench@microsoft.com>
    • ksmbd: fix translation in sid_to_id() · 55cd04d7
      Christian Brauner authored
      The sid_to_id() function is relevant when changing ownership of
      filesystem objects based on acl information. In this case we need to
      first translate the relevant s*ids into k*ids in ksmbd's user namespace
      and account for any idmapped mounts. Requesting a change in ownership
      requires the inverse translation to be applied when we would report
      ownership to userspace. So k*id_from_mnt() must be used here.
      
      Cc: Steve French <stfrench@microsoft.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Namjae Jeon <namjae.jeon@samsung.com>
      Cc: Hyunchul Lee <hyc.lee@gmail.com>
      Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
      Cc: linux-cifs@vger.kernel.org
      Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
      Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
      Signed-off-by: Steve French <stfrench@microsoft.com>
    • ksmbd: fix subauth 0 handling in sid_to_id() · f0bb29d5
      Christian Brauner authored
      It's not obvious why subauth 0 would be excluded from translation. This
      would lead to wrong results whenever a non-identity idmapping is used.
      
      Cc: Steve French <stfrench@microsoft.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Namjae Jeon <namjae.jeon@samsung.com>
      Cc: Hyunchul Lee <hyc.lee@gmail.com>
      Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
      Cc: linux-cifs@vger.kernel.org
      Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
      Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
      Signed-off-by: Steve French <stfrench@microsoft.com>
    • ksmbd: fix translation in acl entries · 0e844efe
      Christian Brauner authored
      The ksmbd server performs translation of posix acls to smb acls.
      Currently the translation is wrong since the idmapping of the mount is
      used to map the ids into raw userspace ids but what is relevant is the
      user namespace of ksmbd itself, which is the initial user namespace.
      The operation is similar to asking "What
      *ids would a userspace process see given that k*id in the relevant user
      namespace?". Before the final translation we need to apply the idmapping
      of the mount in case any is used. Add two simple helpers for ksmbd.
      
      Cc: Steve French <stfrench@microsoft.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Namjae Jeon <namjae.jeon@samsung.com>
      Cc: Hyunchul Lee <hyc.lee@gmail.com>
      Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
      Cc: linux-cifs@vger.kernel.org
      Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
      Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
      Signed-off-by: Steve French <stfrench@microsoft.com>
    • ksmbd: fix translation in ksmbd_acls_fattr() · 43205ca7
      Christian Brauner authored
      When creating new filesystem objects ksmbd translates between k*ids and
      s*ids. For this it often uses struct smb_fattr and stashes the k*ids in
      cf_uid and cf_gid. Let cf_uid and cf_gid always contain the final
      information taking any potential idmapped mounts into account. When
      finally translating cf_*id into s*ids, translate them into the user
      namespace of ksmbd since that is the relevant user namespace here.
      
      Cc: Steve French <stfrench@microsoft.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Namjae Jeon <namjae.jeon@samsung.com>
      Cc: Hyunchul Lee <hyc.lee@gmail.com>
      Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
      Cc: linux-cifs@vger.kernel.org
      Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
      Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
      Signed-off-by: Steve French <stfrench@microsoft.com>
    • ksmbd: fix translation in create_posix_rsp_buf() · 3cdc20e7
      Christian Brauner authored
      When transferring ownership information to the client the k*ids are
      translated into raw *ids before they are sent over the wire. The
      function currently erroneously translates the k*ids according to the
      mount's idmapping. Instead, when reporting the owning *ids to userspace,
      the underlying k*ids need to be mapped up in the caller's user namespace.
      This is how stat() works.
      The caller in this instance is ksmbd itself and ksmbd always runs in the
      initial user namespace. Translate according to that taking any potential
      idmapped mounts into account.
      
      Switch to from_k*id_munged() which ensures that the overflow*id is
      returned instead of the (*id_t)-1 when the k*id can't be translated.
      
      Cc: Steve French <stfrench@microsoft.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Namjae Jeon <namjae.jeon@samsung.com>
      Cc: Hyunchul Lee <hyc.lee@gmail.com>
      Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
      Cc: linux-cifs@vger.kernel.org
      Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
      Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
      Signed-off-by: Steve French <stfrench@microsoft.com>
    • ksmbd: fix translation in smb2_populate_readdir_entry() · 475d6f98
      Christian Brauner authored
      When transferring ownership information to the
      client the k*ids are translated into raw *ids before they are sent over
      the wire. The function currently erroneously translates the k*ids
      according to the mount's idmapping. Instead, when reporting the owning
      *ids to userspace, the underlying k*ids need to be mapped up in the caller's
      user namespace. This is how stat() works.
      The caller in this instance is ksmbd itself and ksmbd always runs in the
      initial user namespace. Translate according to that.
      
      The idmapping of the mount is already taken into account by the lower
      filesystem and so kstat->*id will contain the mapped k*ids.
      
      Switch to from_k*id_munged() which ensures that the overflow*id is
      returned instead of the (*id_t)-1 when the k*id can't be translated.
      
      Cc: Steve French <stfrench@microsoft.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Namjae Jeon <namjae.jeon@samsung.com>
      Cc: Hyunchul Lee <hyc.lee@gmail.com>
      Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
      Cc: linux-cifs@vger.kernel.org
      Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
      Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
      Signed-off-by: Steve French <stfrench@microsoft.com>
    • ksmbd: fix lookup on idmapped mounts · da1e7ada
      Christian Brauner authored
      It's great that the new in-kernel ksmbd server will support idmapped
      mounts out of the box! However, lookup is currently broken. Lookup
      helpers such as lookup_one_len() call inode_permission() internally to
      ensure that the caller is privileged over the inode of the base dentry
      they are trying to lookup under. So the permission checking here is
      currently wrong.
      
      Linux v5.15 will gain a new lookup helper lookup_one() that does take
      idmappings into account. I've added it as part of my patch series to
      make btrfs support idmapped mounts. The new helper is in linux-next as
      part of David's (Sterba) btrfs for-next branch as commit
      c972214c133b ("namei: add mapping aware lookup helper").
      
      I've said it before during one of my first reviews: I would very much
      recommend adding fstests to [1]. It already seems to have very
      rudimentary cifs support. There is a completely generic idmapped mount
      testsuite that supports idmapped mounts.
      
      [1]: https://git.kernel.org/pub/scm/fs/xfs/xfsprogs-dev.git/
      Cc: Colin Ian King <colin.king@canonical.com>
      Cc: Steve French <stfrench@microsoft.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Namjae Jeon <namjae.jeon@samsung.com>
      Cc: Hyunchul Lee <hyc.lee@gmail.com>
      Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
      Cc: David Sterba <dsterba@suse.com>
      Cc: linux-cifs@vger.kernel.org
      Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
      Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
      Signed-off-by: Steve French <stfrench@microsoft.com>
  3. 31 Aug, 2021 8 commits
    • Merge tag '5.15-rc-smb3-fixes-part1' of git://git.samba.org/sfrench/cifs-2.6 · 9c849ce8
      Linus Torvalds authored
      Pull cifs client updates from Steve French:
       "Eleven cifs/smb3 client fixes:
      
         - mostly restructuring to allow disabling less secure algorithms
           (this will allow eventual removing rc4 and md4 from general use in
           the kernel)
      
         - four fixes, including two for stable
      
         - enable r/w support with fscache and cifs.ko
      
        I am working on a larger set of changes (the usual ... multichannel,
        auth and signing improvements), but wanted to get these in earlier to
        reduce chance of merge conflicts later in the merge window"
      
      * tag '5.15-rc-smb3-fixes-part1' of git://git.samba.org/sfrench/cifs-2.6:
        cifs: Do not leak EDEADLK to dgetents64 for STATUS_USER_SESSION_DELETED
        cifs: add cifs_common directory to MAINTAINERS file
        cifs: cifs_md4 convert to SPDX identifier
        cifs: create a MD4 module and switch cifs.ko to use it
        cifs: fork arc4 and create a separate module for it for cifs and other users
        cifs: remove support for NTLM and weaker authentication algorithms
        cifs: enable fscache usage even for files opened as rw
        oid_registry: Add OIDs for missing Spnego auth mechanisms to Macs
        smb3: fix posix extensions mount option
        cifs: fix wrong release in sess_alloc_buffer() failed path
        CIFS: Fix a potencially linear read overflow
    • Merge tag '5.15-rc-first-ksmbd-merge' of git://git.samba.org/ksmbd · e24c567b
      Linus Torvalds authored
      Pull initial ksmbd implementation from Steve French:
       "Initial merge of kernel smb3 file server, ksmbd.
      
        The SMB family of protocols is the most widely deployed network
        filesystem protocol, the default on Windows and Macs (and even on many
        phones and tablets), with clients and servers on all major operating
        systems, but lacked a kernel server for Linux. For many cases the
        current userspace server choices were suboptimal either due to memory
        footprint, performance or difficulty integrating well with advanced
        Linux features.
      
        ksmbd is a new kernel module which implements the server-side of the
        SMB3 protocol. The target is to provide optimized performance, GPLv2
        SMB server, and better lease handling (distributed caching). The
        bigger goal is to add new features more rapidly (e.g. RDMA aka
        "smbdirect", and recent encryption and signing improvements to the
        protocol) which are easier to develop on a smaller, more tightly
        optimized kernel server than for example in Samba.
      
        The Samba project is much broader in scope (tools, security services,
        LDAP, Active Directory Domain Controller, and a cross platform file
        server for a wider variety of purposes) but the user space file server
        portion of Samba has proved hard to optimize for some Linux workloads,
        including for smaller devices.
      
        This is not meant to replace Samba, but rather be an extension to
        allow better optimizing for Linux, and will continue to integrate well
        with Samba user space tools and libraries where appropriate. Working
        with the Samba team we have already made sure that the configuration
        files and xattrs are in a compatible format between the kernel and
        user space server.
      
        Various types of functional and regression tests are regularly run
        against it. One example is the automated 'buildbot' regression tests
        which use the Linux client to test against ksmbd, e.g.
      
           http://smb3-test-rhel-75.southcentralus.cloudapp.azure.com/#/builders/8/builds/56
      
        but other test suites, including Samba's smbtorture functional test
        suite are also used regularly"
      
      * tag '5.15-rc-first-ksmbd-merge' of git://git.samba.org/ksmbd: (219 commits)
        ksmbd: fix __write_overflow warning in ndr_read_string
        MAINTAINERS: ksmbd: add cifs_common directory to ksmbd entry
        MAINTAINERS: ksmbd: update my email address
        ksmbd: fix permission check issue on chown and chmod
        ksmbd: don't set FILE DELETE and FILE_DELETE_CHILD in access mask by default
        MAINTAINERS: add git adddress of ksmbd
        ksmbd: update SMB3 multi-channel support in ksmbd.rst
        ksmbd: smbd: fix kernel oops during server shutdown
        ksmbd: remove select FS_POSIX_ACL in Kconfig
        ksmbd: use proper errno instead of -1 in smb2_get_ksmbd_tcon()
        ksmbd: update the comment for smb2_get_ksmbd_tcon()
        ksmbd: change int data type to boolean
        ksmbd: Fix multi-protocol negotiation
        ksmbd: fix an oops in error handling in smb2_open()
        ksmbd: add ipv6_addr_v4mapped check to know if connection from client is ipv4
        ksmbd: fix missing error code in smb2_lock
        ksmbd: use channel signingkey for binding SMB2 session setup
        ksmbd: don't set RSS capable in FSCTL_QUERY_NETWORK_INTERFACE_INFO
        ksmbd: Return STATUS_OBJECT_PATH_NOT_FOUND if smb2_creat() returns ENOENT
        ksmbd: fix -Wstringop-truncation warnings
        ...
    • Merge tag 'for-5.15/io_uring-vfs-2021-08-30' of git://git.kernel.dk/linux-block · b91db6a0
      Linus Torvalds authored
      Pull io_uring mkdirat/symlinkat/linkat support from Jens Axboe:
       "This adds io_uring support for mkdirat, symlinkat, and linkat"
      
      * tag 'for-5.15/io_uring-vfs-2021-08-30' of git://git.kernel.dk/linux-block:
        io_uring: add support for IORING_OP_LINKAT
        io_uring: add support for IORING_OP_SYMLINKAT
        io_uring: add support for IORING_OP_MKDIRAT
        namei: update do_*() helpers to return ints
        namei: make do_linkat() take struct filename
        namei: add getname_uflags()
        namei: make do_symlinkat() take struct filename
        namei: make do_mknodat() take struct filename
        namei: make do_mkdirat() take struct filename
        namei: change filename_parentat() calling conventions
        namei: ignore ERR/NULL names in putname()
    • Merge tag 'io_uring-bio-cache.5-2021-08-30' of git://git.kernel.dk/linux-block · 3b629f8d
      Linus Torvalds authored
      Pull support for struct bio recycling from Jens Axboe:
       "This adds bio recycling support for polled IO, allowing quick reuse of
        a bio for high IOPS scenarios via a percpu bio_set list.
      
        It's good for almost a 10% improvement in performance, bumping our
        per-core IO limit from ~3.2M IOPS to ~3.5M IOPS"
      
      * tag 'io_uring-bio-cache.5-2021-08-30' of git://git.kernel.dk/linux-block:
        bio: improve kerneldoc documentation for bio_alloc_kiocb()
        block: provide bio_clear_hipri() helper
        block: use the percpu bio cache in __blkdev_direct_IO
        io_uring: enable use of bio alloc cache
        block: clear BIO_PERCPU_CACHE flag if polling isn't supported
        bio: add allocation cache abstraction
        fs: add kiocb alloc cache flag
        bio: optimize initialization of a bio
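The recycling idea in the pull message above (keep recently freed objects on a small list and hand them back out instead of going to the allocator) can be sketched as follows. This is a single-threaded userspace illustration, not the block layer's implementation: `struct demo_obj`, `struct demo_cache` and both helpers are hypothetical, and the real cache is per-CPU, which this sketch does not model.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical cached object; the next pointer is only used while free. */
struct demo_obj {
	struct demo_obj *next;
	char payload[64];
};

/* Hypothetical bounded free list standing in for one per-CPU cache. */
struct demo_cache {
	struct demo_obj *free_list;
	unsigned int nr;	/* objects currently on the list */
	unsigned int max;	/* upper bound on cached objects */
};

/* Reuse a cached object if one is available, else fall back to malloc(). */
static struct demo_obj *demo_cache_alloc(struct demo_cache *c)
{
	struct demo_obj *o;

	assert(c != NULL);
	o = c->free_list;
	if (o) {
		c->free_list = o->next;
		c->nr--;
		return o;
	}
	return malloc(sizeof(*o));
}

/* Park the object on the list if there is room, else really free it. */
static void demo_cache_free(struct demo_cache *c, struct demo_obj *o)
{
	assert(c != NULL);
	if (!o)
		return;
	if (c->nr < c->max) {
		o->next = c->free_list;
		c->free_list = o;
		c->nr++;
		return;
	}
	free(o);
}
```

For reference, the quoted figures are consistent with each other: going from about 3.2M to about 3.5M IOPS is (3.5 - 3.2) / 3.2 ≈ 9.4%, that is, "almost 10%".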
    • Merge tag 'for-5.15/io_uring-2021-08-30' of git://git.kernel.dk/linux-block · c547d89a
      Linus Torvalds authored
      Pull io_uring updates from Jens Axboe:
      
       - cancellation cleanups (Hao, Pavel)
      
       - io-wq accounting cleanup (Hao)
      
       - io_uring submit locking fix (Hao)
      
       - io_uring link handling fixes (Hao)
      
       - fixed file improvements (wangyangbo, Pavel)
      
       - allow updates of linked timeouts like regular timeouts (Pavel)
      
       - IOPOLL fix (Pavel)
      
       - remove batched file get optimization (Pavel)
      
       - improve reference handling (Pavel)
      
       - IRQ task_work batching (Pavel)
      
       - allow pure fixed file, and add support for open/accept (Pavel)
      
       - GFP_ATOMIC RT kernel fix
      
       - multiple CQ ring waiter improvement
      
       - funnel IRQ completions through task_work
      
       - add support for limiting async workers explicitly
      
       - add different clocksource support for timeouts
      
       - io-wq wakeup race fix
      
       - lots of cleanups and improvement (Pavel et al)
      
      * tag 'for-5.15/io_uring-2021-08-30' of git://git.kernel.dk/linux-block: (87 commits)
        io-wq: fix wakeup race when adding new work
        io-wq: wqe and worker locks no longer need to be IRQ safe
        io-wq: check max_worker limits if a worker transitions bound state
        io_uring: allow updating linked timeouts
        io_uring: keep ltimeouts in a list
        io_uring: support CLOCK_BOOTTIME/REALTIME for timeouts
        io-wq: provide a way to limit max number of workers
        io_uring: add build check for buf_index overflows
        io_uring: clarify io_req_task_cancel() locking
        io_uring: add task-refs-get helper
        io_uring: fix failed linkchain code logic
        io_uring: remove redundant req_set_fail()
        io_uring: don't free request to slab
        io_uring: accept directly into fixed file table
        io_uring: hand code io_accept() fd installing
        io_uring: openat directly into fixed fd table
        net: add accept helper not installing fd
        io_uring: fix io_try_cancel_userdata race for iowq
        io_uring: IRQ rw completion batching
        io_uring: batch task work locking
        ...
    • Merge tag 'for-5.15/libata-2021-08-30' of git://git.kernel.dk/linux-block · 44d7d3b0
      Linus Torvalds authored
      Pull libata updates from Jens Axboe:
       "libata changes for the 5.15 release:
      
         - NCQ priority improvements (Damien, Niklas)
      
         - coccinelle warning fix (Jing)
      
         - dwc_460ex phy fix (Andy)"
      
      * tag 'for-5.15/libata-2021-08-30' of git://git.kernel.dk/linux-block:
        include:libata: fix boolreturn.cocci warnings
        docs: sysfs-block-device: document ncq_prio_supported
        docs: sysfs-block-device: improve ncq_prio_enable documentation
        libata: Introduce ncq_prio_supported sysfs sttribute
        libata: print feature list on device scan
        libata: fix ata_read_log_page() warning
        libata: cleanup NCQ priority handling
        libata: cleanup ata_dev_configure()
        libata: cleanup device sleep capability detection
        libata: simplify ata_scsi_rbuf_fill()
        libata: fix ata_host_start()
        ata: sata_dwc_460ex: No need to call phy_exit() befre phy_init()
    • Merge tag 'for-5.15/drivers-2021-08-30' of git://git.kernel.dk/linux-block · 9a1d6c9e
      Linus Torvalds authored
      Pull block driver updates from Jens Axboe:
       "Sitting on top of the core block changes, here are the driver changes
        for the 5.15 merge window:
      
         - NVMe updates via Christoph:
             - suspend improvements for devices with an HMB (Keith Busch)
             - handle double completions more gracefully (Sagi Grimberg)
             - cleanup the selects for the nvme core code a bit (Sagi Grimberg)
             - don't update queue count when failing to set io queues (Ruozhu Li)
             - various nvmet connect fixes (Amit Engel)
             - cleanup lightnvm leftovers (Keith Busch, me)
             - small cleanups (Colin Ian King, Hou Pu)
             - add tracing for the Set Features command (Hou Pu)
             - CMB sysfs cleanups (Keith Busch)
             - add a mutex_destroy call (Keith Busch)
      
         - remove lightnvm subsystem. It's served its purpose and ultimately
           led to zoned nvme support, we no longer need it (Christoph)
      
         - revert floppy O_NDELAY fix (Denis)
      
         - nbd fixes (Hou, Pavel, Baokun)
      
         - nbd locking fixes (Tetsuo)
      
         - nbd device removal fixes (Christoph)
      
         - raid10 rcu warning fix (Xiao)
      
         - raid1 write behind fix (Guoqing)
      
         - rnbd fixes (Gioh, Md Haris)
      
         - misc fixes (Colin)"
      
      * tag 'for-5.15/drivers-2021-08-30' of git://git.kernel.dk/linux-block: (42 commits)
        Revert "floppy: reintroduce O_NDELAY fix"
        raid1: ensure write behind bio has less than BIO_MAX_VECS sectors
        md/raid10: Remove unnecessary rcu_dereference in raid10_handle_discard
        nbd: remove nbd->destroy_complete
        nbd: only return usable devices from nbd_find_unused
        nbd: set nbd->index before releasing nbd_index_mutex
        nbd: prevent IDR lookups from finding partially initialized devices
        nbd: reset NBD to NULL when restarting in nbd_genl_connect
        nbd: add missing locking to the nbd_dev_add error path
        nvme: remove the unused NVME_NS_* enum
        nvme: remove nvm_ndev from ns
        nvme: Have NVME_FABRICS select NVME_CORE instead of transport drivers
        block: nbd: add sanity check for first_minor
        nvmet: check that host sqsize does not exceed ctrl MQES
        nvmet: avoid duplicate qid in connect cmd
        nvmet: pass back cntlid on successful completion
        nvme-rdma: don't update queue count when failing to set io queues
        nvme-tcp: don't update queue count when failing to set io queues
        nvme-tcp: pair send_mutex init with destroy
        nvme: allow user toggling hmb usage
        ...
    • Merge tag 'for-5.15/block-2021-08-30' of git://git.kernel.dk/linux-block · 67936911
      Linus Torvalds authored
      Pull block updates from Jens Axboe:
       "Nothing major in here - lots of good cleanups and tech debt handling,
        which is also evident in the diffstats. In particular:
      
         - Add disk sequence numbers (Matteo)
      
         - Discard merge fix (Ming)
      
         - Relax disk zoned reporting restrictions (Niklas)
      
         - Bio error handling zoned leak fix (Pavel)
      
         - Start of proper add_disk() error handling (Luis, Christoph)
      
         - blk crypto fix (Eric)
      
         - Non-standard GPT location support (Dmitry)
      
         - IO priority improvements and cleanups (Damien)
      
         - blk-throtl improvements (Chunguang)
      
         - diskstats_show() stack reduction (Abd-Alrhman)
      
         - Loop scheduler selection (Bart)
      
         - Switch block layer to use kmap_local_page() (Christoph)
      
         - Remove obsolete disk_name helper (Christoph)
      
         - block_device refcounting improvements (Christoph)
      
         - Ensure gendisk always has a request queue reference (Christoph)
      
         - Misc fixes/cleanups (Shaokun, Oliver, Guoqing)"
      
      * tag 'for-5.15/block-2021-08-30' of git://git.kernel.dk/linux-block: (129 commits)
        sg: pass the device name to blk_trace_setup
        block, bfq: cleanup the repeated declaration
        blk-crypto: fix check for too-large dun_bytes
        blk-zoned: allow BLKREPORTZONE without CAP_SYS_ADMIN
        blk-zoned: allow zone management send operations without CAP_SYS_ADMIN
        block: mark blkdev_fsync static
        block: refine the disk_live check in del_gendisk
        mmc: sdhci-tegra: Enable MMC_CAP2_ALT_GPT_TEGRA
        mmc: block: Support alternative_gpt_sector() operation
        partitions/efi: Support non-standard GPT location
        block: Add alternative_gpt_sector() operation
        bio: fix page leak bio_add_hw_page failure
        block: remove CONFIG_DEBUG_BLOCK_EXT_DEVT
        block: remove a pointless call to MINOR() in device_add_disk
        null_blk: add error handling support for add_disk()
        virtio_blk: add error handling support for add_disk()
        block: add error handling for device_add_disk / add_disk
        block: return errors from disk_alloc_events
        block: return errors from blk_integrity_add
        block: call blk_register_queue earlier in device_add_disk
        ...
      67936911
  4. 30 Aug, 2021 16 commits
    • Linus Torvalds's avatar
      Merge tag 'timers-core-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 8596e589
      Linus Torvalds authored
      Pull timer updates from Thomas Gleixner:
       "Updates for timekeeping, timers and related drivers:
      
        Core code:
      
         - Cure a couple of correctness issues in the posix CPU timer code so
           that the tick dependency for NOHZ full is not kept alive for no
           reason.
      
         - Avoid expensive double reprogramming of the clockevent device in
           hrtimer_start_range_ns().
      
         - Avoid pointless SMP function calls when the clock was set to avoid
           disturbing CPUs which do not have any affected timers queued.
      
         - Make the clocksource watchdog test work correctly when CONFIG_HZ is
           less than 100.
      
        Drivers:
      
         - Prefer the ARM architected timer over the Exynos timer which is way
           more expensive to access.
      
         - Add device tree bindings for new Ingenic SoCs
      
         - The usual improvements and cleanups all over the place"
      
      * tag 'timers-core-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (29 commits)
        clocksource: Make clocksource watchdog test safe for slow-HZ systems
        dt-bindings: timer: Add ABIs for new Ingenic SoCs
        clocksource/drivers/fttmr010: Pass around less pointers
        clocksource/drivers/mediatek: Optimize systimer irq clear flow on shutdown
        clocksource/drivers/ingenic: Use bitfield macro helpers
        clocksource/drivers/sh_cmt: Fix wrong setting if don't request IRQ for clock source channel
        dt-bindings: timer: convert rockchip,rk-timer.txt to YAML
        clocksource/drivers/exynos_mct: Mark MCT device as CLOCK_EVT_FEAT_PERCPU
        clocksource/drivers/exynos_mct: Prioritise Arm arch timer on arm64
        hrtimer: Unbreak hrtimer_force_reprogram()
        hrtimer: Use raw_cpu_ptr() in clock_was_set()
        hrtimer: Avoid more SMP function calls in clock_was_set()
        hrtimer: Avoid unnecessary SMP function calls in clock_was_set()
        hrtimer: Add bases argument to clock_was_set()
        time/timekeeping: Avoid invoking clock_was_set() twice
        timekeeping: Distangle resume and clock-was-set events
        timerfd: Provide timerfd_resume()
        hrtimer: Force clock_was_set() handling for the HIGHRES=n, NOHZ=y case
        hrtimer: Ensure timerfd notification for HIGHRES=n
        hrtimer: Consolidate reprogramming code
        ...
      8596e589
    • Linus Torvalds's avatar
      Merge tag 'x86-misc-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · bed91667
      Linus Torvalds authored
      Pull misc x86 updates from Thomas Gleixner:
       "A set of updates for the x86 reboot code:
      
         - Limit the Dell Optiplex 990 quirk to early BIOS versions to avoid
           the full 'power cycle'-like reboot which is required for the buggy
           BIOSes.
      
         - Update documentation for the reboot=pci command line option and
           document how DMI platform quirks can be overridden"
      
      * tag 'x86-misc-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/reboot: Limit Dell Optiplex 990 quirk to early BIOS versions
        x86/reboot: Document how to override DMI platform quirks
        x86/reboot: Document the "reboot=pci" option
      bed91667
    • Linus Torvalds's avatar
      Merge tag 'x86-irq-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · ccd8ec4a
      Linus Torvalds authored
      Pull x86 PIRQ updates from Thomas Gleixner:
       "A set of updates to support port 0x22/0x23 based PCI configuration
        space which can be found on various ALi chipsets and is also available
        on older Intel systems which expose a PIRQ router.
      
        While the Intel support is more or less nostalgia, the ALi chips are
        still in use on popular embedded boards used for routers"
      
      * tag 'x86-irq-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86: Fix typo s/ECLR/ELCR/ for the PIC register
        x86: Avoid magic number with ELCR register accesses
        x86/PCI: Add support for the Intel 82426EX PIRQ router
        x86/PCI: Add support for the Intel 82374EB/82374SB (ESC) PIRQ router
        x86/PCI: Add support for the ALi M1487 (IBC) PIRQ router
        x86: Add support for 0x22/0x23 port I/O configuration space
      ccd8ec4a
    • Linus Torvalds's avatar
      Merge tag 'x86-cpu-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 0a096f24
      Linus Torvalds authored
      Pull x86 cache flush updates from Thomas Gleixner:
       "A reworked version of the opt-in L1D flush mechanism.
      
        This is a stop gap for potential future speculation related hardware
        vulnerabilities and a mechanism for truly security paranoid
        applications.
      
        It allows a task to request that the L1D cache is flushed when the
        kernel switches to a different mm. This can be requested via prctl().
      
        Changes vs the previous versions:
      
         - Get rid of the software flush fallback
      
         - Make the handling consistent with other mitigations
      
         - Kill the task when it ends up on a SMT enabled core which defeats
           the purpose of L1D flushing obviously"
      
      * tag 'x86-cpu-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        Documentation: Add L1D flushing Documentation
        x86, prctl: Hook L1D flushing in via prctl
        x86/mm: Prepare for opt-in based L1D flush in switch_mm()
        x86/process: Make room for TIF_SPEC_L1D_FLUSH
        sched: Add task_work callback for paranoid L1D flush
        x86/mm: Refactor cond_ibpb() to support other use cases
        x86/smp: Add a per-cpu view of SMT state
      0a096f24
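      The prctl() opt-in described in the merge message above can be sketched
      from userspace roughly as follows. This is a minimal sketch, not code
      taken from the series: the helper name request_l1d_flush() is
      hypothetical, and the fallback #defines are assumptions for older
      userspace headers, mirroring the values in include/uapi/linux/prctl.h.
      The call can fail, for example when the kernel or CPU does not provide
      the interface, so the error is returned to the caller.

```c
#include <errno.h>
#include <sys/prctl.h>

/* Fallbacks for older userspace headers (assumed to match the
 * values in include/uapi/linux/prctl.h). */
#ifndef PR_SET_SPECULATION_CTRL
#define PR_SET_SPECULATION_CTRL 53
#endif
#ifndef PR_SPEC_L1D_FLUSH
#define PR_SPEC_L1D_FLUSH 2
#endif
#ifndef PR_SPEC_ENABLE
#define PR_SPEC_ENABLE (1UL << 1)
#endif

/*
 * Hypothetical helper: ask the kernel to flush the L1D cache whenever
 * it switches away from this task's mm.
 * Returns 0 on success or a negative errno value on failure.
 */
int request_l1d_flush(void)
{
    if (prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_L1D_FLUSH,
              PR_SPEC_ENABLE, 0, 0) == 0)
        return 0;
    return -errno;
}
```

      A caller would typically invoke request_l1d_flush() once at startup and
      treat a negative return value as "mitigation not available" rather than
      as a fatal error.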
    • Linus Torvalds's avatar
      Merge tag 'irq-core-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 7d6e3fa8
      Linus Torvalds authored
      Pull irq updates from Thomas Gleixner:
       "Updates to the interrupt core and driver subsystems:
      
        Core changes:
      
         - The usual set of small fixes and improvements all over the place,
           but nothing stands out
      
        MSI changes:
      
         - Further consolidation of the PCI/MSI interrupt chip code
      
         - Make MSI sysfs code independent of PCI/MSI and expose the MSI
           interrupts of platform devices in the same way as PCI exposes them.
      
        Driver changes:
      
         - Support for ARM GICv3 EPPI partitions
      
         - Treewide conversion to generic_handle_domain_irq() for all chained
           interrupt controllers
      
         - Conversion to bitmap_zalloc() throughout the irq chip drivers
      
         - The usual set of small fixes and improvements"
      
      * tag 'irq-core-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (57 commits)
        platform-msi: Add ABI to show msi_irqs of platform devices
        genirq/msi: Move MSI sysfs handling from PCI to MSI core
        genirq/cpuhotplug: Demote debug printk to KERN_DEBUG
        irqchip/qcom-pdc: Trim unused levels of the interrupt hierarchy
        irqdomain: Export irq_domain_disconnect_hierarchy()
        irqchip/gic-v3: Fix priority comparison when non-secure priorities are used
        irqchip/apple-aic: Fix irq_disable from within irq handlers
        pinctrl/rockchip: drop the gpio related codes
        gpio/rockchip: drop irq_gc_lock/irq_gc_unlock for irq set type
        gpio/rockchip: support next version gpio controller
        gpio/rockchip: use struct rockchip_gpio_regs for gpio controller
        gpio/rockchip: add driver for rockchip gpio
        dt-bindings: gpio: change items restriction of clock for rockchip,gpio-bank
        pinctrl/rockchip: add pinctrl device to gpio bank struct
        pinctrl/rockchip: separate struct rockchip_pin_bank to a head file
        pinctrl/rockchip: always enable clock for gpio controller
        genirq: Fix kernel doc indentation
        EDAC/altera: Convert to generic_handle_domain_irq()
        powerpc: Bulk conversion to generic_handle_domain_irq()
        nios2: Bulk conversion to generic_handle_domain_irq()
        ...
      7d6e3fa8
    • Linus Torvalds's avatar
      Merge tag 'locking-core-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e5e726f7
      Linus Torvalds authored
      Pull locking and atomics updates from Thomas Gleixner:
       "The regular pile:
      
         - A few improvements to the mutex code
      
         - Documentation updates for atomics to clarify the difference between
           cmpxchg() and try_cmpxchg() and to explain the forward progress
           expectations.
      
         - Simplification of the atomics fallback generator
      
         - The addition of arch_atomic_long*() variants and generic arch_*()
           bitops based on them.
      
         - Add the missing might_sleep() invocations to the down*() operations
           of semaphores.
      
        The PREEMPT_RT locking core:
      
         - Scheduler updates to support the state preserving mechanism for
           'sleeping' spin- and rwlocks on RT.
      
           This mechanism is carefully preserving the state of the task when
           blocking on a 'sleeping' spin- or rwlock and takes regular wake-ups
           targeted at the same task into account. The preserved or updated
           (via a regular wakeup) state is restored when the lock has been
           acquired.
      
         - Restructuring of the rtmutex code so it can be utilized and
           extended for the RT specific lock variants.
      
         - Restructuring of the ww_mutex code to allow sharing of the ww_mutex
           specific functionality for rtmutex based ww_mutexes.
      
         - Header file disentangling to allow substitution of the regular lock
           implementations with the PREEMPT_RT variants without creating an
           unmaintainable #ifdef mess.
      
         - Shared base code for the PREEMPT_RT specific rw_semaphore and
           rwlock implementations.
      
           Contrary to the regular rw_semaphores and rwlocks the PREEMPT_RT
           implementation is writer unfair because it is infeasible to do
           priority inheritance on multiple readers. Experience over the years
           has shown that real-time workloads are not the typical workloads
           which are sensitive to writer starvation.
      
           The alternative solution would be to allow only a single reader
           which has been tried and discarded as it is a major bottleneck
           especially for mmap_sem. Aside of that many of the writer
           starvation critical usage sites have been converted to a writer
           side mutex/spinlock and RCU read side protections in the past
           decade so that the issue is less prominent than it used to be.
      
         - The actual rtmutex based lock substitutions for PREEMPT_RT enabled
           kernels which affect mutex, ww_mutex, rw_semaphore, spinlock_t and
           rwlock_t. The spin/rw_lock*() functions disable migration across
           the critical section to preserve the existing semantics vs per-CPU
           variables.
      
         - Rework of the futex REQUEUE_PI mechanism to handle the case of
           early wake-ups which interleave with a re-queue operation to
           prevent the situation that a task would be blocked on both the
           rtmutex associated to the outer futex and the rtmutex based hash
           bucket spinlock.
      
           While this situation cannot happen on !RT enabled kernels the
           changes make the underlying concurrency problems easier to
           understand in general. As a result the difference between !RT and
           RT kernels is reduced to the handling of waiting for the critical
           section. !RT kernels simply spin-wait as before and RT kernels
           utilize rcu_wait().
      
         - The substitution of local_lock for PREEMPT_RT with a spinlock which
           protects the critical section while staying preemptible. The CPU
           locality is established by disabling migration.
      
        The underlying concepts of this code have been in use in PREEMPT_RT for
        way more than a decade. The code has been refactored several times over
        the years and this final incarnation has been optimized once again to be
        as non-intrusive as possible, i.e. the RT specific parts are mostly
        isolated.
      
        It has been extensively tested in the 5.14-rt patch series and it has
        been verified that !RT kernels are not affected by these changes"
      
      * tag 'locking-core-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (92 commits)
        locking/rtmutex: Return success on deadlock for ww_mutex waiters
        locking/rtmutex: Prevent spurious EDEADLK return caused by ww_mutexes
        locking/rtmutex: Dequeue waiter on ww_mutex deadlock
        locking/rtmutex: Dont dereference waiter lockless
        locking/semaphore: Add might_sleep() to down_*() family
        locking/ww_mutex: Initialize waiter.ww_ctx properly
        static_call: Update API documentation
        locking/local_lock: Add PREEMPT_RT support
        locking/spinlock/rt: Prepare for RT local_lock
        locking/rtmutex: Add adaptive spinwait mechanism
        locking/rtmutex: Implement equal priority lock stealing
        preempt: Adjust PREEMPT_LOCK_OFFSET for RT
        locking/rtmutex: Prevent lockdep false positive with PI futexes
        futex: Prevent requeue_pi() lock nesting issue on RT
        futex: Simplify handle_early_requeue_pi_wakeup()
        futex: Reorder sanity checks in futex_requeue()
        futex: Clarify comment in futex_requeue()
        futex: Restructure futex_requeue()
        futex: Correct the number of requeued waiters for PI
        futex: Remove bogus condition for requeue PI
        ...
      e5e726f7
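      The difference between cmpxchg() and try_cmpxchg() mentioned in the
      merge message above can be illustrated with a userspace model built on
      C11 atomics. This is a sketch under stated assumptions, not the kernel
      implementation: the model_*() and inc_with_*() names are hypothetical,
      and atomic_compare_exchange_strong() stands in for the kernel
      primitives. The cmpxchg() style returns the value it found, so the
      caller compares it against the expected value; the try_cmpxchg() style
      returns a boolean and refreshes the expected value on failure, which
      tends to give a shorter retry loop.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* cmpxchg() style: returns the value found in *ptr.
 * The swap happened if and only if the return value equals old. */
int model_cmpxchg(atomic_int *ptr, int old, int new_val)
{
    /* On failure the C11 call writes the current value into old;
     * on success old is left as the previous value. */
    atomic_compare_exchange_strong(ptr, &old, new_val);
    return old;
}

/* try_cmpxchg() style: returns true on success.
 * On failure *old is updated to the value currently in *ptr. */
bool model_try_cmpxchg(atomic_int *ptr, int *old, int new_val)
{
    return atomic_compare_exchange_strong(ptr, old, new_val);
}

/* Atomic increment written with the cmpxchg() style. */
int inc_with_cmpxchg(atomic_int *ptr)
{
    int old = atomic_load(ptr);

    for (;;) {
        int seen = model_cmpxchg(ptr, old, old + 1);

        if (seen == old)
            return old + 1;
        old = seen; /* retry with the value we actually observed */
    }
}

/* The same increment with the try_cmpxchg() style. */
int inc_with_try_cmpxchg(atomic_int *ptr)
{
    int old = atomic_load(ptr);

    /* old is refreshed by the failed attempt, so no manual reload. */
    while (!model_try_cmpxchg(ptr, &old, old + 1))
        ;
    return old + 1;
}
```

      Both increment helpers return the new value; they differ only in how
      the retry loop learns the current value after a failed attempt.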
    • Linus Torvalds's avatar
      Merge tag 'smp-core-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 08403e21
      Linus Torvalds authored
      Pull SMP core updates from Thomas Gleixner:
      
       - Replace get/put_online_cpus() in various places. The final removal
         will happen shortly before v5.15-rc1 when the rest of the patches
         have been merged.
      
       - Add debug code to help the analysis of CPU hotplug failures
      
       - A set of kernel doc updates
      
      * tag 'smp-core-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        mm: Replace deprecated CPU-hotplug functions.
        md/raid5: Replace deprecated CPU-hotplug functions.
        Documentation: Replace deprecated CPU-hotplug functions.
        smp: Fix all kernel-doc warnings
        cpu/hotplug: Add debug printks for hotplug callback failures
        cpu/hotplug: Use DEVICE_ATTR_*() macro
        cpu/hotplug: Eliminate all kernel-doc warnings
        cpu/hotplug: Fix kernel doc warnings for __cpuhp_setup_state_cpuslocked()
        cpu/hotplug: Fix comment typo
        smpboot: Replace deprecated CPU-hotplug functions.
      08403e21
    • Linus Torvalds's avatar
      Merge tag 'core-debugobjects-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e4c3562e
      Linus Torvalds authored
      Pull debugobjects update from Thomas Gleixner:
       "A single commit for debugobjects to make them work on PREEMPT_RT by
        preventing object pool refill in atomic contexts"
      
      * tag 'core-debugobjects-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        debugobjects: Make them PREEMPT_RT aware
      e4c3562e
    • Linus Torvalds's avatar
      Merge tag 'efi-core-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 46f4945e
      Linus Torvalds authored
      Pull EFI updates from Ingo Molnar:
       "A handful of EFI changes for this cycle:
      
         - EFI CPER parsing improvements
      
         - Don't take the address of efi_guid_t internal fields"
      
      * tag 'efi-core-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        efi: cper: check section header more appropriately
        efi: Don't use knowledge about efi_guid_t internals
        efi: cper: fix scnprintf() use in cper_mem_err_location()
      46f4945e
    • Linus Torvalds's avatar
      Merge tag 'perf-core-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 4a2b88eb
      Linus Torvalds authored
      Pull x86 perf event updates from Ingo Molnar:
      
       - Add support for Intel Sapphire Rapids server CPU uncore events
      
       - Allow the AMD uncore driver to be built as a module
      
       - Misc cleanups and fixes
      
      * tag 'perf-core-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
        perf/x86/amd/ibs: Add bitfield definitions in new <asm/amd-ibs.h> header
        perf/amd/uncore: Allow the driver to be built as a module
        x86/cpu: Add get_llc_id() helper function
        perf/amd/uncore: Clean up header use, use <linux/ include paths instead of <asm/
        perf/amd/uncore: Simplify code, use free_percpu()'s built-in check for NULL
        perf/hw_breakpoint: Replace deprecated CPU-hotplug functions
        perf/x86/intel: Replace deprecated CPU-hotplug functions
        perf/x86: Remove unused assignment to pointer 'e'
        perf/x86/intel/uncore: Fix IIO cleanup mapping procedure for SNR/ICX
        perf/x86/intel/uncore: Support IMC free-running counters on Sapphire Rapids server
        perf/x86/intel/uncore: Support IIO free-running counters on Sapphire Rapids server
        perf/x86/intel/uncore: Factor out snr_uncore_mmio_map()
        perf/x86/intel/uncore: Add alias PMU name
        perf/x86/intel/uncore: Add Sapphire Rapids server MDF support
        perf/x86/intel/uncore: Add Sapphire Rapids server M3UPI support
        perf/x86/intel/uncore: Add Sapphire Rapids server UPI support
        perf/x86/intel/uncore: Add Sapphire Rapids server M2M support
        perf/x86/intel/uncore: Add Sapphire Rapids server IMC support
        perf/x86/intel/uncore: Add Sapphire Rapids server PCU support
        perf/x86/intel/uncore: Add Sapphire Rapids server M2PCIe support
        ...
      4a2b88eb
    • Linus Torvalds's avatar
      Merge tag 'sched-core-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 5d3c0db4
      Linus Torvalds authored
      Pull scheduler updates from Ingo Molnar:
      
       - The biggest change in this cycle is scheduler support for asymmetric
         scheduling affinity, to support the execution of legacy 32-bit tasks
         on AArch32 systems that also have 64-bit-only CPUs.
      
         Architectures can fill in this functionality by defining their own
         task_cpu_possible_mask(p). When this is done, the scheduler will make
         sure the task will only be scheduled on CPUs that support it.
      
         (The actual arm64 specific changes are not part of this tree.)
      
         For other architectures there will be no change in functionality.
      
       - Add cgroup SCHED_IDLE support
      
       - Increase node-distance flexibility & delay determining it until a CPU
         is brought online. (This enables platforms where node distance isn't
         final until the CPU is online.)
      
       - Deadline scheduler enhancements & fixes
      
       - Misc fixes & cleanups.
      
      * tag 'sched-core-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
        eventfd: Make signal recursion protection a task bit
        sched/fair: Mark tg_is_idle() an inline in the !CONFIG_FAIR_GROUP_SCHED case
        sched: Introduce dl_task_check_affinity() to check proposed affinity
        sched: Allow task CPU affinity to be restricted on asymmetric systems
        sched: Split the guts of sched_setaffinity() into a helper function
        sched: Introduce task_struct::user_cpus_ptr to track requested affinity
        sched: Reject CPU affinity changes based on task_cpu_possible_mask()
        cpuset: Cleanup cpuset_cpus_allowed_fallback() use in select_fallback_rq()
        cpuset: Honour task_cpu_possible_mask() in guarantee_online_cpus()
        cpuset: Don't use the cpu_possible_mask as a last resort for cgroup v1
        sched: Introduce task_cpu_possible_mask() to limit fallback rq selection
        sched: Cgroup SCHED_IDLE support
        sched/topology: Skip updating masks for non-online nodes
        sched: Replace deprecated CPU-hotplug functions.
        sched: Skip priority checks with SCHED_FLAG_KEEP_PARAMS
        sched: Fix UCLAMP_FLAG_IDLE setting
        sched/deadline: Fix missing clock update in migrate_task_rq_dl()
        sched/fair: Avoid a second scan of target in select_idle_cpu
        sched/fair: Use prev instead of new target as recent_used_cpu
        sched: Don't report SCHED_FLAG_SUGOV in sched_getattr()
        ...
      5d3c0db4
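      The effect of task_cpu_possible_mask(p) described in the merge message
      above can be modelled in a few lines of plain C. This is an
      illustrative model only, not scheduler code: cpu_mask_t and
      model_restrict_affinity() are hypothetical names, the 64-bit bitmask
      stands in for the kernel's cpumask type, and returning -EINVAL for an
      empty intersection is an assumption of the sketch. The idea shown is
      that a requested affinity is intersected with the set of CPUs that can
      run the task, and a request with no usable CPU is rejected.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: bit N set means CPU N. */
typedef uint64_t cpu_mask_t;

/*
 * Model of restricting a requested affinity to the CPUs that are able
 * to execute the task (the role task_cpu_possible_mask() plays).
 * Returns 0 and stores the usable mask in *effective, or -EINVAL
 * (an assumption of this sketch) if no requested CPU can run the task.
 */
int model_restrict_affinity(cpu_mask_t requested, cpu_mask_t possible,
                            cpu_mask_t *effective)
{
    cpu_mask_t allowed = requested & possible;

    if (allowed == 0)
        return -EINVAL;

    *effective = allowed;
    return 0;
}
```

      For example, on a hypothetical 8-CPU system where only CPUs 0-3 can
      run 32-bit tasks (possible mask 0x0F), a request for all CPUs (0xFF)
      is narrowed to 0x0F, while a request for CPUs 4-7 only (0xF0) is
      rejected.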
    • Linus Torvalds's avatar
      Merge tag 'x86_cleanups_for_v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 230bda08
      Linus Torvalds authored
      Pull x86 cleanups from Borislav Petkov:
       "The usual round of minor cleanups and fixes"
      
      * tag 'x86_cleanups_for_v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/kaslr: Have process_mem_region() return a boolean
        x86/power: Fix kernel-doc warnings in cpu.c
        x86/mce/inject: Replace deprecated CPU-hotplug functions.
        x86/microcode: Replace deprecated CPU-hotplug functions.
        x86/mtrr: Replace deprecated CPU-hotplug functions.
        x86/mmiotrace: Replace deprecated CPU-hotplug functions.
      230bda08
    • Linus Torvalds's avatar
      Merge tag 'x86_cache_for_v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 42f6e869
      Linus Torvalds authored
      Pull x86 resource control updates from Borislav Petkov:
       "A first round of changes towards splitting the arch-specific bits from
        the filesystem bits of resctrl, the ultimate goal being to support
        ARM's equivalent technology MPAM, with the same fs interface (James
        Morse)"
      
      * tag 'x86_cache_for_v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits)
        x86/resctrl: Make resctrl_arch_get_config() return its value
        x86/resctrl: Merge the CDP resources
        x86/resctrl: Expand resctrl_arch_update_domains()'s msr_param range
        x86/resctrl: Remove rdt_cdp_peer_get()
        x86/resctrl: Merge the ctrl_val arrays
        x86/resctrl: Calculate the index from the configuration type
        x86/resctrl: Apply offset correction when config is staged
        x86/resctrl: Make ctrlval arrays the same size
        x86/resctrl: Pass configuration type to resctrl_arch_get_config()
        x86/resctrl: Add a helper to read a closid's configuration
        x86/resctrl: Rename update_domains() to resctrl_arch_update_domains()
        x86/resctrl: Allow different CODE/DATA configurations to be staged
        x86/resctrl: Group staged configuration into a separate struct
        x86/resctrl: Move the schemata names into struct resctrl_schema
        x86/resctrl: Add a helper to read/set the CDP configuration
        x86/resctrl: Swizzle rdt_resource and resctrl_schema in pseudo_lock_region
        x86/resctrl: Pass the schema to resctrl filesystem functions
        x86/resctrl: Add resctrl_arch_get_num_closid()
        x86/resctrl: Store the effective num_closid in the schema
        x86/resctrl: Walk the resctrl schema list instead of an arch list
        ...
      42f6e869
    • Linus Torvalds's avatar
      Merge tag 'x86_build_for_v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · ced119b6
      Linus Torvalds authored
      Pull x86 build updates from Borislav Petkov:
      
       - Remove cc-option checks which are old and already supported by the
         minimal compiler version the kernel uses and thus avoid the need to
         invoke the compiler unnecessarily.
      
       - Cleanups
      
      * tag 'x86_build_for_v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/build: Move the install rule to arch/x86/Makefile
        x86/build: Remove the left-over bzlilo target
        x86/tools/relocs: Mark die() with the printf function attr format
        x86/build: Remove stale cc-option checks
      ced119b6
    • Linus Torvalds's avatar
      Merge tag 'ras_core_for_v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 8f645b42
      Linus Torvalds authored
      Pull RAS update from Borislav Petkov:
       "A single RAS change for 5.15:
      
         - Do not start processing MCEs logged early because the decoding
           chain is not up yet - delay that processing until everything is
           ready"
      
      * tag 'ras_core_for_v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/mce: Defer processing of early errors
      8f645b42
    • Linus Torvalds's avatar
      Merge tag 'edac_updates_for_v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras · 05b5fdb2
      Linus Torvalds authored
      Pull EDAC updates from Borislav Petkov:
       "The usual EDAC stuff which managed to trickle in for 5.15:
      
         - Add new HBM2 (High Bandwidth Memory Gen 2) type and add support for
           it to the Intel SKx drivers
      
         - Print additional useful per-channel error information on i10nm,
           like on SKL
      
         - Don't load the AMD EDAC decoder in virtual images
      
         - The usual round of fixes and cleanups"
      
      * tag 'edac_updates_for_v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
        EDAC/i10nm: Retrieve and print retry_rd_err_log registers
        EDAC/i10nm: Fix NVDIMM detection
        EDAC/skx_common: Set the memory type correctly for HBM memory
        EDAC/altera: Skip defining unused structures for specific configs
        EDAC/mce_amd: Do not load edac_mce_amd module on guests
        EDAC/mc: Add new HBM2 memory type
        EDAC/amd64: Use DEVICE_ATTR helper macros
      05b5fdb2