1. 21 Jan, 2022 15 commits
  2. 20 Jan, 2022 5 commits
  3. 19 Jan, 2022 12 commits
  4. 18 Jan, 2022 8 commits
    • Alexei Starovoitov's avatar
      Merge branch 'bpf: Batching iter for AF_UNIX sockets.' · 712d4793
      Alexei Starovoitov authored
      Kuniyuki Iwashima says:
      
      ====================
      
      Last year the commit afd20b92 ("af_unix: Replace the big lock with
      small locks.") landed on bpf-next.  Now we can use a batching algorithm
      for AF_UNIX bpf iter as TCP bpf iter.
      
      Changelog:
      - Add the 1st patch.
      - Call unix_get_first() in .start()/.next() to always acquire a lock in
        each iteration in the 2nd patch.
      ====================
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      712d4793
    • Kuniyuki Iwashima's avatar
      selftest/bpf: Fix a stale comment. · a796966b
      Kuniyuki Iwashima authored
      The commit b8a58aa6 ("af_unix: Cut unix_validate_addr() out of
      unix_mkname().") moved the bound test part into unix_validate_addr().
      Signed-off-by: default avatarKuniyuki Iwashima <kuniyu@amazon.co.jp>
      Link: https://lore.kernel.org/r/20220113002849.4384-6-kuniyu@amazon.co.jpSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      a796966b
    • Kuniyuki Iwashima's avatar
      selftest/bpf: Test batching and bpf_(get|set)sockopt in bpf unix iter. · 7ff8985c
      Kuniyuki Iwashima authored
      This patch adds a test for the batching and bpf_(get|set)sockopt in bpf
      unix iter.
      
      It does the following.
      
        1. Creates an abstract UNIX domain socket
        2. Call bpf_setsockopt()
        3. Call bpf_getsockopt() and save the value
        4. Call setsockopt()
        5. Call getsockopt() and save the value
        6. Compare the saved values
      Signed-off-by: default avatarKuniyuki Iwashima <kuniyu@amazon.co.jp>
      Link: https://lore.kernel.org/r/20220113002849.4384-5-kuniyu@amazon.co.jpSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      7ff8985c
    • Kuniyuki Iwashima's avatar
      bpf: Support bpf_(get|set)sockopt() in bpf unix iter. · eb7d8f1d
      Kuniyuki Iwashima authored
      This patch makes bpf_(get|set)sockopt() available when iterating AF_UNIX
      sockets.
      Signed-off-by: default avatarKuniyuki Iwashima <kuniyu@amazon.co.jp>
      Link: https://lore.kernel.org/r/20220113002849.4384-4-kuniyu@amazon.co.jpSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      eb7d8f1d
    • Kuniyuki Iwashima's avatar
      bpf: af_unix: Use batching algorithm in bpf unix iter. · 855d8e77
      Kuniyuki Iwashima authored
      The commit 04c7820b ("bpf: tcp: Bpf iter batching and lock_sock")
      introduces the batching algorithm to iterate TCP sockets with more
      consistency.
      
      This patch uses the same algorithm to iterate AF_UNIX sockets.
      Signed-off-by: default avatarKuniyuki Iwashima <kuniyu@amazon.co.jp>
      Link: https://lore.kernel.org/r/20220113002849.4384-3-kuniyu@amazon.co.jpSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      855d8e77
    • Kuniyuki Iwashima's avatar
      af_unix: Refactor unix_next_socket(). · 4408d55a
      Kuniyuki Iwashima authored
      Currently, unix_next_socket() is overloaded depending on the 2nd argument.
      If it is NULL, unix_next_socket() returns the first socket in the hash.  If
      not NULL, it returns the next socket in the same hash list or the first
      socket in the next non-empty hash list.
      
      This patch refactors unix_next_socket() into two functions unix_get_first()
      and unix_get_next().  unix_get_first() newly acquires a lock and returns
      the first socket in the list.  unix_get_next() returns the next socket in a
      list or releases a lock and falls back to unix_get_first().
      
      In the following patch, bpf iter holds entire sockets in a list and always
      releases the lock before .show().  It always calls unix_get_first() to
      acquire a lock in each iteration.  So, this patch makes the change easier
      to follow.
      Signed-off-by: default avatarKuniyuki Iwashima <kuniyu@amazon.co.jp>
      Link: https://lore.kernel.org/r/20220113002849.4384-2-kuniyu@amazon.co.jpSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      4408d55a
    • Alexei Starovoitov's avatar
      Merge branch 'Introduce unstable CT lookup helpers' · 2a1aff60
      Alexei Starovoitov authored
      Kumar Kartikeya says:
      
      ====================
      
      This series adds unstable conntrack lookup helpers using BPF kfunc support.  The
      patch adding the lookup helper is based off of Maxim's recent patch to aid in
      rebasing their series on top of this, all adjusted to work with module kfuncs [0].
      
        [0]: https://lore.kernel.org/bpf/20211019144655.3483197-8-maximmi@nvidia.com
      
      To enable returning a reference to struct nf_conn, the verifier is extended to
      support reference tracking for PTR_TO_BTF_ID, and kfunc is extended with support
      for working as acquire/release functions, similar to existing BPF helpers. kfunc
      returning pointer (limited to PTR_TO_BTF_ID in the kernel) can also return a
      PTR_TO_BTF_ID_OR_NULL now, typically needed when acquiring a resource can fail.
      kfunc can also receive PTR_TO_CTX and PTR_TO_MEM (with some limitations) as
      arguments now. There is also support for passing a mem, len pair as argument
      to kfunc now. In such cases, passing pointer to unsized type (void) is also
      permitted.
      
      Please see individual commits for details.
      
      Changelog:
      ----------
      v7 -> v8:
      v7: https://lore.kernel.org/bpf/20220111180428.931466-1-memxor@gmail.com
      
       * Move enum btf_kfunc_hook to btf.c (Alexei)
       * Drop verbose log for unlikely failure case in __find_kfunc_desc_btf (Alexei)
       * Remove unnecessary barrier in register_btf_kfunc_id_set (Alexei)
       * Switch macro in bpf_nf test to __always_inline function (Alexei)
      
      v6 -> v7:
      v6: https://lore.kernel.org/bpf/20220102162115.1506833-1-memxor@gmail.com
      
       * Drop try_module_get_live patch, use flag in btf_module struct (Alexei)
       * Add comments and expand commit message detailing why we have to concatenate
         and sort vmlinux kfunc BTF ID sets (Alexei)
       * Use bpf_testmod for testing btf_try_get_module race (Alexei)
       * Use bpf_prog_type for both btf_kfunc_id_set_contains and
         register_btf_kfunc_id_set calls (Alexei)
       * In case of module set registration, directly assign set (Alexei)
       * Add CONFIG_USERFAULTFD=y to selftest config
       * Fix other nits
      
      v5 -> v6:
      v5: https://lore.kernel.org/bpf/20211230023705.3860970-1-memxor@gmail.com
      
       * Fix for a bug in btf_try_get_module leading to use-after-free
       * Drop *kallsyms_on_each_symbol loop, reinstate register_btf_kfunc_id_set (Alexei)
       * btf_free_kfunc_set_tab now takes struct btf, and handles resetting tab to NULL
       * Check return value btf_name_by_offset for param_name
       * Instead of using tmp_set, use btf->kfunc_set_tab directly, and simplify cleanup
      
      v4 -> v5:
      v4: https://lore.kernel.org/bpf/20211217015031.1278167-1-memxor@gmail.com
      
       * Move nf_conntrack helpers code to its own separate file (Toke, Pablo)
       * Remove verifier callbacks, put btf_id_sets in struct btf (Alexei)
        * Convert the in-kernel users away from the old API
       * Change len__ prefix convention to __sz suffix (Alexei)
       * Drop parent_ref_obj_id patch (Alexei)
      
      v3 -> v4:
      v3: https://lore.kernel.org/bpf/20211210130230.4128676-1-memxor@gmail.com
      
       * Guard unstable CT helpers with CONFIG_DEBUG_INFO_BTF_MODULES
       * Move addition of prog_test test kfuncs to selftest commit
       * Move negative kfunc tests to test_verifier suite
       * Limit struct nesting depth to 4, which should be enough for now
      
      v2 -> v3:
      v2: https://lore.kernel.org/bpf/20211209170929.3485242-1-memxor@gmail.com
      
       * Fix build error for !CONFIG_BPF_SYSCALL (Patchwork)
      
      RFC v1 -> v2:
      v1: https://lore.kernel.org/bpf/20211030144609.263572-1-memxor@gmail.com
      
       * Limit PTR_TO_MEM support to pointer to scalar, or struct with scalars (Alexei)
       * Use btf_id_set for checking acquire, release, ret type null (Alexei)
       * Introduce opts struct for CT helpers, move int err parameter to it
       * Add l4proto as parameter to CT helper's opts, remove separate tcp/udp helpers
       * Add support for mem, len argument pair to kfunc
       * Allow void * as pointer type for mem, len argument pair
       * Extend selftests to cover new additions to kfuncs
       * Copy ref_obj_id to PTR_TO_BTF_ID dst_reg on btf_struct_access, test it
       * Fix other misc nits, bugs, and expand commit messages
      ====================
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      2a1aff60
    • Kumar Kartikeya Dwivedi's avatar
      selftests/bpf: Add test for race in btf_try_get_module · 46565696
      Kumar Kartikeya Dwivedi authored
      This adds a complete test case to ensure we never take references to
      modules not in MODULE_STATE_LIVE, which can lead to UAF, and it also
      ensures we never access btf->kfunc_set_tab in an inconsistent state.
      
      The test uses userfaultfd to artificially widen the race.
      
      When run on an unpatched kernel, it leads to the following splat:
      
      [root@(none) bpf]# ./test_progs -t bpf_mod_race/ksym
      [   55.498171] BUG: unable to handle page fault for address: fffffbfff802548b
      [   55.499206] #PF: supervisor read access in kernel mode
      [   55.499855] #PF: error_code(0x0000) - not-present page
      [   55.500555] PGD a4fa9067 P4D a4fa9067 PUD a4fa5067 PMD 1b44067 PTE 0
      [   55.501499] Oops: 0000 [#1] PREEMPT SMP KASAN NOPTI
      [   55.502195] CPU: 0 PID: 83 Comm: kworker/0:2 Tainted: G           OE     5.16.0-rc4+ #151
      [   55.503388] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ArchLinux 1.15.0-1 04/01/2014
      [   55.504777] Workqueue: events bpf_prog_free_deferred
      [   55.505563] RIP: 0010:kasan_check_range+0x184/0x1d0
      [   55.509140] RSP: 0018:ffff88800560fcf0 EFLAGS: 00010282
      [   55.509977] RAX: fffffbfff802548b RBX: fffffbfff802548c RCX: ffffffff9337b6ba
      [   55.511096] RDX: fffffbfff802548c RSI: 0000000000000004 RDI: ffffffffc012a458
      [   55.512143] RBP: fffffbfff802548b R08: 0000000000000001 R09: ffffffffc012a45b
      [   55.513228] R10: fffffbfff802548b R11: 0000000000000001 R12: ffff888001b5f598
      [   55.514332] R13: ffff888004f49ac8 R14: 0000000000000000 R15: ffff888092449400
      [   55.515418] FS:  0000000000000000(0000) GS:ffff888092400000(0000) knlGS:0000000000000000
      [   55.516705] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   55.517560] CR2: fffffbfff802548b CR3: 0000000007c10006 CR4: 0000000000770ef0
      [   55.518672] PKRU: 55555554
      [   55.519022] Call Trace:
      [   55.519483]  <TASK>
      [   55.519884]  module_put.part.0+0x2a/0x180
      [   55.520642]  bpf_prog_free_deferred+0x129/0x2e0
      [   55.521478]  process_one_work+0x4fa/0x9e0
      [   55.522122]  ? pwq_dec_nr_in_flight+0x100/0x100
      [   55.522878]  ? rwlock_bug.part.0+0x60/0x60
      [   55.523551]  worker_thread+0x2eb/0x700
      [   55.524176]  ? __kthread_parkme+0xd8/0xf0
      [   55.524853]  ? process_one_work+0x9e0/0x9e0
      [   55.525544]  kthread+0x23a/0x270
      [   55.526088]  ? set_kthread_struct+0x80/0x80
      [   55.526798]  ret_from_fork+0x1f/0x30
      [   55.527413]  </TASK>
      [   55.527813] Modules linked in: bpf_testmod(OE) [last unloaded: bpf_testmod]
      [   55.530846] CR2: fffffbfff802548b
      [   55.531341] ---[ end trace 1af41803c054ad6d ]---
      [   55.532136] RIP: 0010:kasan_check_range+0x184/0x1d0
      [   55.535887] RSP: 0018:ffff88800560fcf0 EFLAGS: 00010282
      [   55.536711] RAX: fffffbfff802548b RBX: fffffbfff802548c RCX: ffffffff9337b6ba
      [   55.537821] RDX: fffffbfff802548c RSI: 0000000000000004 RDI: ffffffffc012a458
      [   55.538899] RBP: fffffbfff802548b R08: 0000000000000001 R09: ffffffffc012a45b
      [   55.539928] R10: fffffbfff802548b R11: 0000000000000001 R12: ffff888001b5f598
      [   55.541021] R13: ffff888004f49ac8 R14: 0000000000000000 R15: ffff888092449400
      [   55.542108] FS:  0000000000000000(0000) GS:ffff888092400000(0000) knlGS:0000000000000000
      [   55.543260]CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   55.544136] CR2: fffffbfff802548b CR3: 0000000007c10006 CR4: 0000000000770ef0
      [   55.545317] PKRU: 55555554
      [   55.545671] note: kworker/0:2[83] exited with preempt_count 1
      Signed-off-by: default avatarKumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20220114163953.1455836-11-memxor@gmail.comSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      46565696