- 14 Jul, 2020 1 commit
-
-
Sargun Dhillon authored
The current SECCOMP_RET_USER_NOTIF API allows for syscall supervision over an fd. It is often used in settings where a supervising task emulates syscalls on behalf of a supervised task in userspace, either to further restrict the supervisee's syscall abilities or to circumvent kernel enforced restrictions the supervisor deems safe to lift (e.g. actually performing a mount(2) for an unprivileged container). While SECCOMP_RET_USER_NOTIF allows for the interception of any syscall, only a certain subset of syscalls could be correctly emulated. Over the last few development cycles, the set of syscalls which can't be emulated has been reduced due to the addition of pidfd_getfd(2). With this we are now able to, for example, intercept syscalls that require the supervisor to operate on file descriptors of the supervisee such as connect(2). However, syscalls that cause new file descriptors to be installed can not currently be correctly emulated since there is no way for the supervisor to inject file descriptors into the supervisee. This patch adds a new addfd ioctl to remove this restriction by allowing the supervisor to install file descriptors into the intercepted task. By implementing this feature via seccomp the supervisor effectively instructs the supervisee to install a set of file descriptors into its own file descriptor table during the intercepted syscall. This way it is possible to intercept syscalls such as open() or accept(), and install (or replace, like dup2(2)) the supervisor's resulting fd into the supervisee. One replacement use-case would be to redirect the stdout and stderr of a supervisee into log file descriptors opened by the supervisor. The ioctl handling is based on the discussions[1] of how Extensible Arguments should interact with ioctls. Instead of building size into the addfd structure, make it a function of the ioctl command (which is how sizes are normally passed to ioctls). 
To support forward and backward compatibility, just mask out the direction and size, and match everything. The size (and any future direction) checks are done along with copy_struct_from_user() logic. As a note, the seccomp_notif_addfd structure is laid out based on 8-byte alignment without requiring packing as there have been packing issues with uapi highlighted before[2][3]. Although we could overload the newfd field and use -1 to indicate that it is not to be used, doing so requires changing the size of the fd field, and introduces struct packing complexity. [1]: https://lore.kernel.org/lkml/87o8w9bcaf.fsf@mid.deneb.enyo.de/ [2]: https://lore.kernel.org/lkml/a328b91d-fd8f-4f27-b3c2-91a9c45f18c0@rasmusvillemoes.dk/ [3]: https://lore.kernel.org/lkml/20200612104629.GA15814@ircssh-2.c.rugged-nimbus-611.internal Cc: Christoph Hellwig <hch@lst.de> Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: Tycho Andersen <tycho@tycho.ws> Cc: Jann Horn <jannh@google.com> Cc: Robert Sesek <rsesek@google.com> Cc: Chris Palmer <palmer@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-api@vger.kernel.org Suggested-by: Matt Denton <mpdenton@google.com> Link: https://lore.kernel.org/r/20200603011044.7972-4-sargun@sargun.me Signed-off-by: Sargun Dhillon <sargun@sargun.me> Reviewed-by: Will Drewry <wad@chromium.org> Co-developed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kees Cook <keescook@chromium.org>
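The "mask out the direction and size, and match everything" idea can be illustrated with a small userspace sketch. This is not the kernel's code: the struct layouts, the `EXAMPLE_*` names, and the `'E'` type byte are hypothetical, and only the `_IOW`, `_IOC_SIZE`, `IOC_INOUT`, and `IOCSIZE_MASK` macros come from the Linux uapi ioctl header. It assumes a Linux build environment.

```c
#include <linux/ioctl.h>
#include <stdint.h>

/* Hypothetical v1 and v2 layouts of an extensible ioctl argument.
 * Fields are grouped on 8-byte boundaries, so no packing attribute
 * is needed and sizeof() is the same on 32-bit and 64-bit ABIs. */
struct example_arg_v1 {
    uint64_t id;
    uint32_t flags;
    uint32_t fd;
};

struct example_arg_v2 {
    uint64_t id;
    uint32_t flags;
    uint32_t fd;
    uint64_t extra; /* field appended by a later revision */
};

#define EXAMPLE_MAGIC  'E' /* assumption: arbitrary ioctl type byte */
#define EXAMPLE_CMD_V1 _IOW(EXAMPLE_MAGIC, 3, struct example_arg_v1)
#define EXAMPLE_CMD_V2 _IOW(EXAMPLE_MAGIC, 3, struct example_arg_v2)

/* Clear the direction and size bit fields, keeping type and number. */
#define EXAMPLE_STRIP(cmd) \
    ((unsigned int)(cmd) & ~(unsigned int)(IOC_INOUT | IOCSIZE_MASK))

/* Returns 1 when cmd is any size/direction variant of command 3. */
int example_cmd_matches(unsigned int cmd)
{
    return EXAMPLE_STRIP(cmd) == EXAMPLE_STRIP(EXAMPLE_CMD_V1);
}

/* The caller's struct size travels inside the command number itself. */
unsigned int example_cmd_size(unsigned int cmd)
{
    return _IOC_SIZE(cmd);
}
```

A handler built this way would match first, then hand `example_cmd_size(cmd)` to its size-aware copy routine, so older and newer callers share one code path.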
-
- 13 Jul, 2020 7 commits
-
-
Kees Cook authored
Expand __receive_fd() with support for replace_fd() for the coming seccomp "addfd" ioctl(). Add new wrapper receive_fd_replace() for the new behavior and update existing wrappers to retain old behavior. Thanks to Colin Ian King <colin.king@canonical.com> for pointing out an uninitialized variable exposure in an earlier version of this patch. Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Dmitry Kadashev <dkadashev@gmail.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Sargun Dhillon <sargun@sargun.me> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Kees Cook <keescook@chromium.org>
-
Kees Cook authored
Replace the open-coded version of receive_fd() with a call to the new helper. Thanks to Vamshi K Sthambamkadi <vamshi.k.sthambamkadi@gmail.com> for catching a missed fput() in an earlier version of this patch. Cc: Christoph Hellwig <hch@lst.de> Cc: Jakub Kicinski <kuba@kernel.org> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Reviewed-by: Sargun Dhillon <sargun@sargun.me> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Kees Cook <keescook@chromium.org>
-
Kees Cook authored
For both pidfd and seccomp, the __user pointer is not used. Update __receive_fd() to make writing to ufd optional via a NULL check. However, for the receive_fd_user() wrapper, ufd is NULL checked so an -EFAULT can be returned to avoid changing the SCM_RIGHTS interface behavior. Add new wrapper receive_fd() for pidfd and seccomp that does not use the ufd argument. For the new helper, the allocated fd needs to be returned on success. Update the existing callers to handle it. Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Sargun Dhillon <sargun@sargun.me> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Kees Cook <keescook@chromium.org>
-
Kees Cook authored
In preparation for users of the "install a received file" logic outside of net/ (pidfd and seccomp), relocate and rename __scm_install_fd() from net/core/scm.c to __receive_fd() in fs/file.c, and provide a wrapper named receive_fd_user(), as future patches will change the interface to __receive_fd(). Additionally add a comment to fd_install() as a counterpoint to how __receive_fd() interacts with fput(). Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Dmitry Kadashev <dkadashev@gmail.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Sargun Dhillon <sargun@sargun.me> Cc: Ido Schimmel <idosch@idosch.org> Cc: Ioana Ciornei <ioana.ciornei@nxp.com> Cc: linux-fsdevel@vger.kernel.org Cc: netdev@vger.kernel.org Reviewed-by: Sargun Dhillon <sargun@sargun.me> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Kees Cook <keescook@chromium.org>
-
Kees Cook authored
Duplicate the cleanups from commit 2618d530 ("net/scm: cleanup scm_detach_fds") into the compat code. Replace open-coded __receive_sock() with a call to the helper. Move the check added in commit 1f466e1f ("net: cleanly handle kernel vs user buffers for ->msg_control") to before the compat call, even though it should be impossible for an in-kernel call to also be compat. Correct the int "flags" argument to unsigned int to match fd_install() and similar APIs. Regularize any remaining differences, including a whitespace issue, a checkpatch warning, and add the check from commit 6900317f ("net, scm: fix PaX detected msg_controllen overflow in scm_detach_fds") which fixed an overflow unique to 64-bit. To avoid confusion when comparing the compat handler to the native handler, just include the same check in the compat handler. Cc: Christoph Hellwig <hch@lst.de> Cc: Sargun Dhillon <sargun@sargun.me> Cc: Jakub Kicinski <kuba@kernel.org> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Kees Cook <keescook@chromium.org>
-
Kees Cook authored
The sock counting (sock_update_netprioidx() and sock_update_classid()) was missing from pidfd's implementation of received fd installation. Add a call to the new __receive_sock() helper. Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Sargun Dhillon <sargun@sargun.me> Cc: Jakub Kicinski <kuba@kernel.org> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org Fixes: 8649c322 ("pid: Implement pidfd_getfd syscall") Signed-off-by: Kees Cook <keescook@chromium.org>
-
Kees Cook authored
Add missed sock updates to compat path via a new helper, which will be used more in coming patches. (The net/core/scm.c code is left as-is here to assist with -stable backports for the compat path.) Cc: Christoph Hellwig <hch@lst.de> Cc: Sargun Dhillon <sargun@sargun.me> Cc: Jakub Kicinski <kuba@kernel.org> Cc: stable@vger.kernel.org Fixes: 48a87cc2 ("net: netprio: fd passed in SCM_RIGHTS datagram not set correctly") Fixes: d8429506 ("net: net_cls: fd passed in SCM_RIGHTS datagram not set correctly") Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Kees Cook <keescook@chromium.org>
-
- 10 Jul, 2020 21 commits
-
-
Kees Cook authored
There should be no difference between -1 and other negative syscalls while tracing. Cc: Keno Fischer <keno@juliacomputing.com> Tested-by: Will Deacon <will@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org>
-
Kees Cook authored
Now that the selftest harness has variants, use them to eliminate a bunch of copy/paste duplication. Reviewed-by: Jakub Kicinski <kuba@kernel.org> Tested-by: Will Deacon <will@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org>
-
Kees Cook authored
The FIXTURE*() macro kern-doc examples had the wrong names for the C code examples associated with them. Fix those and clarify that FIXTURE_DATA() usage should be avoided. Cc: Shuah Khan <shuah@kernel.org> Fixes: 74bc7c97 ("kselftest: add fixture variants") Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org>
-
Kees Cook authored
The terminator for the mode 1 syscalls list was a 0, but that could be a valid syscall number (e.g. x86_64 __NR_read). By luck, __NR_read was listed first and the loop construct would not test it, so there was no bug. However, this is fragile. Replace the terminator with -1 instead, and make the variable name for mode 1 syscall lists more descriptive. Cc: Andy Lutomirski <luto@amacapital.net> Cc: Will Drewry <wad@chromium.org> Signed-off-by: Kees Cook <keescook@chromium.org>
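A minimal sketch of a negative-terminated list, assuming illustrative x86_64 syscall numbers; the array and function names are hypothetical and this is not the kernel's loop.

```c
/* Hypothetical allow-list: read (0), write (1), exit (60) and
 * rt_sigreturn (15) as numbered on x86_64. The list ends with -1
 * because 0 is itself a valid syscall number. */
static const int example_allowed[] = { 0, 1, 60, 15, -1 };

/* Returns 1 if nr is on the list, 0 otherwise. Every entry,
 * including the first, is compared before the terminator check
 * advances the pointer. */
int example_is_allowed(int nr)
{
    const int *p;

    for (p = example_allowed; *p != -1; p++) {
        if (*p == nr)
            return 1;
    }
    return 0;
}
```

With a 0 terminator the same loop would stop at the first entry and never allow `read`, which is the fragility the commit message describes.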
-
Kees Cook authored
When SECCOMP_IOCTL_NOTIF_ID_VALID was first introduced it had the wrong direction flag set. While this isn't a big deal as nothing currently enforces these bits in the kernel, it should be defined correctly. Fix the define and provide support for the old command until it is no longer needed for backward compatibility. Fixes: 6a21cc50 ("seccomp: add a return code to trap to userspace") Signed-off-by: Kees Cook <keescook@chromium.org>
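Accepting both the corrected and the legacy encoding can be sketched as follows. The `EXAMPLE_*` names and the `'E'` type byte are hypothetical; only `_IOR` and `_IOW` are real uapi macros, and the sketch assumes a Linux build environment.

```c
#include <linux/ioctl.h>
#include <stdint.h>

#define EXAMPLE_MAGIC 'E' /* assumption: arbitrary ioctl type byte */

/* Corrected definition: userspace writes an id for the kernel to read. */
#define EXAMPLE_ID_VALID           _IOW(EXAMPLE_MAGIC, 2, uint64_t)
/* Legacy definition with the direction bits set the wrong way round. */
#define EXAMPLE_ID_VALID_WRONG_DIR _IOR(EXAMPLE_MAGIC, 2, uint64_t)

/* Returns 1 if cmd is handled, 0 if it is unknown. Old binaries built
 * against the legacy value keep working because both encodings fall
 * through to the same handler. */
int example_dispatch(unsigned long cmd)
{
    switch (cmd) {
    case EXAMPLE_ID_VALID_WRONG_DIR:
    case EXAMPLE_ID_VALID:
        return 1;
    default:
        return 0;
    }
}
```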
-
Kees Cook authored
The user_trap_syscall() helper creates a filter with SECCOMP_RET_USER_NOTIF. To avoid confusion with SECCOMP_RET_TRAP, rename the helper to user_notif_syscall(). Cc: Andy Lutomirski <luto@amacapital.net> Cc: Will Drewry <wad@chromium.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Song Liu <songliubraving@fb.com> Cc: Yonghong Song <yhs@fb.com> Cc: Andrii Nakryiko <andriin@fb.com> Cc: John Fastabend <john.fastabend@gmail.com> Cc: KP Singh <kpsingh@chromium.org> Cc: linux-kselftest@vger.kernel.org Cc: netdev@vger.kernel.org Cc: bpf@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>
-
Kees Cook authored
The seccomp tests are a bit noisy without CONFIG_CHECKPOINT_RESTORE (due to missing the kcmp() syscall). The seccomp tests are more accurate with kcmp(), but it's not strictly required. Refactor the tests to use alternatives (comparing fd numbers), and provide a central test for kcmp() so there is a single SKIP instead of many. Continue to produce warnings for the other tests, though. Additionally adds some more bad flag EINVAL tests to the addfd selftest. Cc: Andy Lutomirski <luto@amacapital.net> Cc: Will Drewry <wad@chromium.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Song Liu <songliubraving@fb.com> Cc: Yonghong Song <yhs@fb.com> Cc: Andrii Nakryiko <andriin@fb.com> Cc: John Fastabend <john.fastabend@gmail.com> Cc: KP Singh <kpsingh@chromium.org> Cc: linux-kselftest@vger.kernel.org Cc: netdev@vger.kernel.org Cc: bpf@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>
-
Kees Cook authored
Avoid open-coding "seccomp: " prefixes for pr_*() calls. Signed-off-by: Kees Cook <keescook@chromium.org>
-
Kees Cook authored
The seccomp benchmark calibration loop did not need to take so long. Instead, use a simple 1 second timeout and multiply up to target. It does not need to be accurate. Signed-off-by: Kees Cook <keescook@chromium.org>
-
Thadeu Lima de Souza Cascardo authored
As seccomp_benchmark tries to calibrate how many samples will take more than 5 seconds to execute, it may end up picking up a number of samples that take 10 (but up to 12) seconds. As the calibration will take double that time, it takes around 20 seconds. Then, it executes the whole thing again, and then once more, with some added overhead. So, the thing might take more than 40 seconds, which is too close to the 45s timeout. That is very dependent on the system where it's executed, so may not be observed always, but it has been observed on x86 VMs. Using a 90s timeout seems safe enough. Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Link: https://lore.kernel.org/r/20200601123202.1183526-1-cascardo@canonical.com Signed-off-by: Kees Cook <keescook@chromium.org>
-
Kees Cook authored
It's useful to see how much (at a minimum) each filter adds to the syscall overhead. Add additional calculations. Signed-off-by: Kees Cook <keescook@chromium.org>
-
Christian Brauner authored
This verifies we're correctly notified when a seccomp filter becomes unused when a notifier is in use. Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Link: https://lore.kernel.org/r/20200531115031.391515-4-christian.brauner@ubuntu.com Signed-off-by: Kees Cook <keescook@chromium.org>
-
Christian Brauner authored
We've been making heavy use of the seccomp notifier to intercept and handle certain syscalls for containers. This patch allows a syscall supervisor listening on a given notifier to be notified when a seccomp filter has become unused. A container is often managed by a singleton supervisor process, the so-called "monitor". This monitor process has an event loop which has various event handlers registered. If the user specified a seccomp profile that included a notifier for various syscalls then we also register a seccomp notify event handler. For any container using a separate pid namespace the lifecycle of the seccomp notifier is bound to the init process of the pid namespace, i.e. when the init process exits the filter must be unused. If a new process attaches to a container we force it to assume a seccomp profile. This can either be the same seccomp profile as the container was started with or a modified one. If the attaching process makes use of the seccomp notifier we will register a new seccomp notifier handler in the monitor's event loop. However, when the attaching process exits we can't simply delete the handler since other child processes could've been created (daemons spawned etc.) that have inherited the seccomp filter and so we need to keep the seccomp notifier fd alive in the event loop. But this is problematic since we don't get a notification when the seccomp filter has become unused and so we currently never remove the seccomp notifier fd from the event loop and just keep accumulating fds in the event loop. We've had this issue for a while but it has recently become more pressing as more and larger users make use of this. To fix this, we introduce a new "users" reference counter that tracks any tasks and dependent filters making use of a filter. When a notifier is registered waiting tasks will be notified that the filter is now empty by receiving a (E)POLLHUP event. The concept this patch introduces is the same as for signal_struct, i.e. reference counting for life-cycle management is decoupled from reference counting tasks using the object. There's probably some trickery possible but the second counter is just the correct way of doing this IMHO and has precedent. Cc: Tycho Andersen <tycho@tycho.ws> Cc: Kees Cook <keescook@chromium.org> Cc: Matt Denton <mpdenton@google.com> Cc: Sargun Dhillon <sargun@sargun.me> Cc: Jann Horn <jannh@google.com> Cc: Chris Palmer <palmer@google.com> Cc: Aleksa Sarai <cyphar@cyphar.com> Cc: Robert Sesek <rsesek@google.com> Cc: Jeffrey Vander Stoep <jeffv@google.com> Cc: Linux Containers <containers@lists.linux-foundation.org> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Link: https://lore.kernel.org/r/20200531115031.391515-3-christian.brauner@ubuntu.com Signed-off-by: Kees Cook <keescook@chromium.org>
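The split between a lifetime counter and a "users" counter can be modelled in a few lines. This is a single-threaded illustration only: all names are hypothetical, plain integers stand in for the kernel's atomic reference counts, and two flags stand in for the real wakeup and free operations.

```c
#include <stdbool.h>

/* Hypothetical model of a filter with two independent counters. */
struct example_filter {
    int refs;      /* lifetime: tasks plus other holders such as a notifier fd */
    int users;     /* only the tasks still running under the filter */
    bool hup_sent; /* stands in for waking pollers with EPOLLHUP */
    bool freed;    /* stands in for freeing the object */
};

/* Drop one lifetime reference; the object goes away at zero. */
static void example_put(struct example_filter *f)
{
    if (--f->refs == 0)
        f->freed = true;
}

/* A task starts using the filter: it counts as holder and as user. */
void example_task_attach(struct example_filter *f)
{
    f->refs++;
    f->users++;
}

/* The supervisor's notifier fd pins the object but is not a user. */
void example_notifier_open(struct example_filter *f)
{
    f->refs++;
}

/* When the last user exits, waiters are told the filter is unused,
 * even though the notifier fd still keeps the object alive. */
void example_task_exit(struct example_filter *f)
{
    if (--f->users == 0)
        f->hup_sent = true;
    example_put(f);
}

void example_notifier_close(struct example_filter *f)
{
    example_put(f);
}
```

With a single counter the notifier fd's own reference would keep the count above zero, so the "no tasks left" condition could never be observed through it.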
-
Christian Brauner authored
Lift the wait_queue from struct notification into struct seccomp_filter. This is cleaner overall and lets us avoid having to take the notifier mutex in the future for EPOLLHUP notifications since we need to neither read nor modify the notifier specific aspects of the seccomp filter. In the exit path I'd very much like to avoid having to take the notifier mutex for each filter in the task's filter hierarchy. Cc: Tycho Andersen <tycho@tycho.ws> Cc: Kees Cook <keescook@chromium.org> Cc: Matt Denton <mpdenton@google.com> Cc: Sargun Dhillon <sargun@sargun.me> Cc: Jann Horn <jannh@google.com> Cc: Chris Palmer <palmer@google.com> Cc: Aleksa Sarai <cyphar@cyphar.com> Cc: Robert Sesek <rsesek@google.com> Cc: Jeffrey Vander Stoep <jeffv@google.com> Cc: Linux Containers <containers@lists.linux-foundation.org> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Kees Cook <keescook@chromium.org>
-
Christian Brauner authored
The seccomp filter used to be released in free_task() which is called asynchronously via call_rcu() and assorted mechanisms. Since we need to inform tasks waiting on the seccomp notifier when a filter goes empty we will notify them as soon as a task has been marked fully dead in release_task(). To not split seccomp cleanup into two parts, move filter release out of free_task() and into release_task() after we've unhashed struct task from struct pid, exited signals, and unlinked it from the threadgroups' thread list. We'll put the empty filter notification infrastructure into it in a follow up patch. This also renames put_seccomp_filter() to seccomp_filter_release() which is a more descriptive name of what we're doing here especially once we've added the empty filter notification mechanism in there. We're also NULL-ing the task's filter tree entrypoint which seems cleaner than leaving a dangling pointer in there. Note that this shouldn't need any memory barriers since we're calling this when the task is in release_task() which means it's EXIT_DEAD. So it can't modify its seccomp filters anymore. You can also see this from the point where we're calling seccomp_filter_release(). It's after __exit_signal() and at this point, tsk->sighand will already have been NULLed which is required for thread-sync and filter installation alike. Cc: Tycho Andersen <tycho@tycho.ws> Cc: Kees Cook <keescook@chromium.org> Cc: Matt Denton <mpdenton@google.com> Cc: Sargun Dhillon <sargun@sargun.me> Cc: Jann Horn <jannh@google.com> Cc: Chris Palmer <palmer@google.com> Cc: Aleksa Sarai <cyphar@cyphar.com> Cc: Robert Sesek <rsesek@google.com> Cc: Jeffrey Vander Stoep <jeffv@google.com> Cc: Linux Containers <containers@lists.linux-foundation.org> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Link: https://lore.kernel.org/r/20200531115031.391515-2-christian.brauner@ubuntu.com Signed-off-by: Kees Cook <keescook@chromium.org>
-
Christian Brauner authored
Naming the lifetime counter of a seccomp filter "usage" suggests a little too strongly that it's about tasks that are using this filter while it also tracks other references such as the user notifier or ptrace. This also updates the documentation to note this fact. We'll be introducing an actual usage counter in a follow-up patch. Cc: Tycho Andersen <tycho@tycho.ws> Cc: Kees Cook <keescook@chromium.org> Cc: Matt Denton <mpdenton@google.com> Cc: Sargun Dhillon <sargun@sargun.me> Cc: Jann Horn <jannh@google.com> Cc: Chris Palmer <palmer@google.com> Cc: Aleksa Sarai <cyphar@cyphar.com> Cc: Robert Sesek <rsesek@google.com> Cc: Jeffrey Vander Stoep <jeffv@google.com> Cc: Linux Containers <containers@lists.linux-foundation.org> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Link: https://lore.kernel.org/r/20200531115031.391515-1-christian.brauner@ubuntu.com Signed-off-by: Kees Cook <keescook@chromium.org>
-
Sargun Dhillon authored
This adds a helper which can iterate through a seccomp_filter to find a notification matching an ID. It removes several replicated chunks of code. Signed-off-by: Sargun Dhillon <sargun@sargun.me> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Reviewed-by: Tycho Andersen <tycho@tycho.ws> Cc: Matt Denton <mpdenton@google.com> Cc: Kees Cook <keescook@google.com>, Cc: Jann Horn <jannh@google.com>, Cc: Robert Sesek <rsesek@google.com>, Cc: Chris Palmer <palmer@google.com> Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: Tycho Andersen <tycho@tycho.ws> Link: https://lore.kernel.org/r/20200601112532.150158-1-sargun@sargun.me Signed-off-by: Kees Cook <keescook@chromium.org>
-
Kees Cook authored
A common question asked when debugging seccomp filters is "how many filters are attached to your process?" Provide a way to easily answer this question through /proc/$pid/status with a "Seccomp_filters" line. Signed-off-by: Kees Cook <keescook@chromium.org>
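Reading the new field from userspace might look like the sketch below. It assumes the line appears in the status text as `Seccomp_filters:` followed by whitespace and a decimal count; the function name is hypothetical, and the caller is expected to have already read /proc/$pid/status into a string.

```c
#include <stdio.h>
#include <string.h>

/* Extract the filter count from the text of /proc/<pid>/status.
 * Returns the count, or -1 if the line is absent (for example on a
 * kernel without this field) or cannot be parsed. */
int example_parse_seccomp_filters(const char *status_text)
{
    const char *key = "Seccomp_filters:";
    const char *p = strstr(status_text, key);
    int count;

    if (p == NULL)
        return -1;
    /* %d skips the leading tab or spaces before the number. */
    if (sscanf(p + strlen(key), "%d", &count) != 1)
        return -1;
    return count;
}
```

The key includes the trailing colon, so the separate `Seccomp:` mode line is not matched by mistake.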
-
Kees Cook authored
The TSYNC ESRCH flag test will fail for regular users because NNP was not set yet. Add NNP setting. Fixes: 51891498 ("seccomp: allow TSYNC and USER_NOTIF together") Cc: stable@vger.kernel.org Reviewed-by: Tycho Andersen <tycho@tycho.ws> Signed-off-by: Kees Cook <keescook@chromium.org>
-
Kees Cook authored
Running the seccomp tests as a regular user shouldn't just fail tests that require CAP_SYS_ADMIN (for getting a PID namespace). Instead, detect those cases and SKIP them. Additionally, gracefully SKIP missing CONFIG_USER_NS (and add to "config" since we'd prefer to actually test this case). Signed-off-by: Kees Cook <keescook@chromium.org>
-
Kees Cook authored
The kselftests will be renaming XFAIL to SKIP in the test harness, and to avoid painful conflicts, rename XFAIL to SKIP now in a future-proofed way. Signed-off-by: Kees Cook <keescook@chromium.org>
-
- 14 Jun, 2020 4 commits
-
-
Linus Torvalds authored
-
git://github.com/micah-morton/linux
Linus Torvalds authored
Pull SafeSetID update from Micah Morton: "Add additional LSM hooks for SafeSetID SafeSetID is capable of making allow/deny decisions for set*uid calls on a system, and we want to add similar functionality for set*gid calls. The work to do that is not yet complete, so probably won't make it in for v5.8, but we are looking to get this simple patch in for v5.8 since we have it ready. We are planning on the rest of the work for extending the SafeSetID LSM being merged during the v5.9 merge window" * tag 'LSM-add-setgid-hook-5.8-author-fix' of git://github.com/micah-morton/linux: security: Add LSM hooks to set*gid syscalls
-
Thomas Cedeno authored
The SafeSetID LSM uses the security_task_fix_setuid hook to filter set*uid() syscalls according to its configured security policy. In preparation for adding analogous support in the LSM for set*gid() syscalls, we add the requisite hook here. Tested by putting print statements in the security_task_fix_setgid hook and seeing them get hit during kernel boot. Signed-off-by: Thomas Cedeno <thomascedeno@google.com> Signed-off-by: Micah Morton <mortonm@chromium.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Linus Torvalds authored
Pull btrfs updates from David Sterba: "This reverts the direct io port to iomap infrastructure of btrfs merged in the first pull request. We found problems in invalidate page that don't seem to be fixable as regressions or without changing iomap code that would not affect other filesystems. There are four reverts in total, but three of them are followup cleanups needed to revert a43a67a2 cleanly. The result is the buffer head based implementation of direct io. Reverts are not great, but under current circumstances I don't see better options" * tag 'for-5.8-part2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: Revert "btrfs: switch to iomap_dio_rw() for dio" Revert "fs: remove dio_end_io()" Revert "btrfs: remove BTRFS_INODE_READDIO_NEED_LOCK" Revert "btrfs: split btrfs_direct_IO to read and write part"
-
- 13 Jun, 2020 7 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds authored
Pull networking fixes from David Miller: 1) Fix cfg80211 deadlock, from Johannes Berg. 2) RXRPC fails to send notifications, from David Howells. 3) MPTCP RM_ADDR parsing has an off by one pointer error, fix from Geliang Tang. 4) Fix crash when using MSG_PEEK with sockmap, from Anny Hu. 5) The ucc_geth driver needs __netdev_watchdog_up exported, from Valentin Longchamp. 6) Fix hashtable memory leak in dccp, from Wang Hai. 7) Fix how nexthops are marked as FDB nexthops, from David Ahern. 8) Fix mptcp races between shutdown and recvmsg, from Paolo Abeni. 9) Fix crashes in tipc_disc_rcv(), from Tuong Lien. 10) Fix link speed reporting in iavf driver, from Brett Creeley. 11) When a channel is used for XSK and then reused again later for XSK, we forget to clear out the relevant data structures in mlx5 which causes all kinds of problems. Fix from Maxim Mikityanskiy. 12) Fix memory leak in genetlink, from Cong Wang. 13) Disallow sockmap attachments to UDP sockets, it simply won't work. From Lorenz Bauer. 
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (83 commits) net: ethernet: ti: ale: fix allmulti for nu type ale net: ethernet: ti: am65-cpsw-nuss: fix ale parameters init net: atm: Remove the error message according to the atomic context bpf: Undo internal BPF_PROBE_MEM in BPF insns dump libbpf: Support pre-initializing .bss global variables tools/bpftool: Fix skeleton codegen bpf: Fix memlock accounting for sock_hash bpf: sockmap: Don't attach programs to UDP sockets bpf: tcp: Recv() should return 0 when the peer socket is closed ibmvnic: Flush existing work items before device removal genetlink: clean up family attributes allocations net: ipa: header pad field only valid for AP->modem endpoint net: ipa: program upper nibbles of sequencer type net: ipa: fix modem LAN RX endpoint id net: ipa: program metadata mask differently ionic: add pcie_print_link_status rxrpc: Fix race between incoming ACK parser and retransmitter net/mlx5: E-Switch, Fix some error pointer dereferences net/mlx5: Don't fail driver on failure to create debugfs net/mlx5e: CT: Fix ipv6 nat header rewrite actions ...
-
David Sterba authored
This reverts commit a43a67a2. This patch reverts the main part of switching direct io implementation to iomap infrastructure. There's a problem in invalidate page that couldn't be solved as regression in this development cycle. The problem occurs when buffered and direct io are mixed, and the ranges overlap. Although this is not recommended, filesystems implement measures or fallbacks to make it somehow work. In this case, fallback to buffered IO would be an option for btrfs (this already happens when direct io is done on compressed data), but the change would be needed in the iomap code, bringing new semantics to other filesystems. Another problem arises when again the buffered and direct ios are mixed, invalidation fails, then -EIO is set on the mapping and fsync will fail, though there's no real error. There have been discussions how to fix that, but revert seems to be the least intrusive option. Link: https://lore.kernel.org/linux-btrfs/20200528192103.xm45qoxqmkw7i5yl@fiona/ Signed-off-by: David Sterba <dsterba@suse.com>
-
Grygorii Strashko authored
On AM65xx MCU CPSW2G NUSS and 66AK2E/L NUSS, the allmulti setting does not allow unregistered mcast packets to pass. This happens because ALE VLAN entries on these SoCs do not contain port masks for reg/unreg mcast packets, but instead store indexes of ALE_VLAN_MASK_MUXx_REG registers, which are intended to store port masks for reg/unreg mcast packets. This path was missed by commit 9d1f6447 ("net: ethernet: ti: ale: fix seeing unreg mcast packets with promisc and allmulti disabled"). Hence, fix it by taking into account ALE type in cpsw_ale_set_allmulti(). Fixes: 9d1f6447 ("net: ethernet: ti: ale: fix seeing unreg mcast packets with promisc and allmulti disabled") Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Grygorii Strashko authored
The ALE parameters structure is created on stack, so it has to be reset before passing to cpsw_ale_create() to avoid garbage values. Fixes: 93a76530 ("net: ethernet: ti: introduce am65x/j721e gigabit eth subsystem driver") Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
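The general pattern, clearing a stack-allocated parameter block before filling in the fields you care about, can be sketched as below. The struct and its fields are hypothetical stand-ins, not the driver's actual parameter structure.

```c
#include <string.h>

/* Hypothetical parameter block; a real one would have more fields,
 * some of which the caller never sets explicitly. */
struct example_params {
    int ports;
    int entries;
    int version_mask;
};

/* Zero the whole structure first so that any field left unset reads
 * as 0 rather than whatever happened to be on the stack. */
void example_params_init(struct example_params *p, int ports)
{
    memset(p, 0, sizeof(*p));
    p->ports = ports;
}
```

Declaring the variable with an initializer such as `struct example_params p = { 0 };` gives the same guarantee for the named members without an explicit memset() call.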
-
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
David S. Miller authored
Alexei Starovoitov says: ==================== pull-request: bpf 2020-06-12 The following pull-request contains BPF updates for your *net* tree. We've added 26 non-merge commits during the last 10 day(s) which contain a total of 27 files changed, 348 insertions(+), 93 deletions(-). The main changes are: 1) sock_hash accounting fix, from Andrey. 2) libbpf fix and probe_mem sanitizing, from Andrii. 3) sock_hash fixes, from Jakub. 4) devmap_val fix, from Jesper. 5) load_bytes_relative fix, from YiFei. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Liao Pingfang authored
Given the context (atomic!), the error message should be dropped. Signed-off-by: Liao Pingfang <liao.pingfang@zte.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds authored
Pull more cifs updates from Steve French: "12 cifs/smb3 fixes, 2 for stable. - add support for idsfromsid on create and chgrp/chown allowing ability to save owner information more naturally for some workloads - improve query info (getattr) when SMB3.1.1 posix extensions are negotiated by using new query info level" * tag '5.8-rc-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6: smb3: Add debug message for new file creation with idsfromsid mount option cifs: fix chown and chgrp when idsfromsid mount option enabled smb3: allow uid and gid owners to be set on create with idsfromsid mount option smb311: Add tracepoints for new compound posix query info smb311: add support for using info level for posix extensions query smb311: Add support for lookup with posix extensions query info smb311: Add support for SMB311 query info (non-compounded) SMB311: Add support for query info using posix extensions (level 100) smb3: add indatalen that can be a non-zero value to calculation of credit charge in smb2 ioctl smb3: fix typo in mount options displayed in /proc/mounts cifs: Add get_security_type_str function to return sec type. smb3: extend fscache mount volume coherency check
-