1. 10 Feb, 2022 9 commits
    • net: ping6: support setting socket options via cmsg · 3ebb0b10
      Jakub Kicinski authored
      Minor reordering of the code and a call to sock_cmsg_send()
      gives us support for setting the common socket options via
      cmsg (the usual ones - SO_MARK, SO_TIMESTAMPING_OLD, SCM_TXTIME).
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ping6: support packet timestamping · e7b06046
      Jakub Kicinski authored
      Nothing prevents the user from requesting timestamping
      on ping6 sockets, yet timestamps are not going to be reported.
      Plumb the flags through.
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
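For context, a user would typically request timestamping with `setsockopt(SO_TIMESTAMPING)`. The sketch below assumes software transmit timestamps are wanted; the helper names `tx_tstamp_flags` and `request_tx_timestamps` are hypothetical, and the choice of flags is one plausible combination rather than what any particular ping implementation uses.

```c
#include <linux/net_tstamp.h>
#include <sys/socket.h>

/* Assumed flag set: software TX timestamps, reported with a per-packet ID. */
static unsigned int tx_tstamp_flags(void)
{
    return SOF_TIMESTAMPING_TX_SOFTWARE |
           SOF_TIMESTAMPING_SOFTWARE |
           SOF_TIMESTAMPING_OPT_ID;
}

/* Request the flags above on fd; returns the setsockopt() result. */
static int request_tx_timestamps(int fd)
{
    unsigned int flags = tx_tstamp_flags();

    return setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &flags, sizeof(flags));
}
```

Once requested, transmit timestamps are read back from the socket's error queue with `recvmsg(fd, &msg, MSG_ERRQUEUE)`.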
    • net: ping6: remove a pr_debug() statement · 42652239
      Jakub Kicinski authored
      We have ftrace and BPF today, there's no need for printing arguments
      at the start of a function.
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'ieee802154-for-davem-2022-02-10' of... · 9557167b
      David S. Miller authored
      Merge tag 'ieee802154-for-davem-2022-02-10' of git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan-next
      
      Stefan Schmidt says:
      
      ====================
      pull-request: ieee802154-next 2022-02-10
      
      An update from ieee802154 for your *net-next* tree.
      
      There is more ongoing in ieee802154 than usual. This will be the first pull
      request for this cycle, but I expect one more, depending on review and rework
      times.
      
      Pavel Skripkin ported the atusb driver over to the new USB API to avoid
      uninitialized-value problems as well as making use of the modern API without
      kmalloc() needs in the driver.
      
      Miquel Raynal landed some changes to ensure proper frame checksum checking with
      hwsim, documenting our use of wake and stop_queue and eliding a magic value by
      using the proper define.
      
      David Girault documented the address struct used in ieee802154.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue · adc27288
      David S. Miller authored
      Tony Nguyen says:
      
      ====================
      100GbE Intel Wired LAN Driver Updates 2022-02-09
      
      This series contains updates to ice driver only.
      
      Brett adds support for QinQ. This begins with code refactoring and
      re-organization of VLAN configuration functions to allow for
      introduction of VSI VLAN ops to enable setting and calling of
      respective operations based on device support of single or double
      VLANs. Implementations are added for outer VLAN support.
      
      To support QinQ, the device must be set to double VLAN mode (DVM).
      In order for this to occur, the DDP package and NVM must also support
      DVM. Functions to determine compatibility and properly configure the
      device are added as well as setting the proper bits to advertise and
      utilize the proper offloads. Support for VIRTCHNL_VF_OFFLOAD_VLAN_V2
      is also included to allow for VF to negotiate and utilize this
      functionality.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next · 45230829
      Jakub Kicinski authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter updates for net-next
      
      1) Conntrack sets CHECKSUM_UNNECESSARY for UDP packets with no checksum,
         from Kevin Mitchell.
      
      2) skb->priority support for nfqueue, from Nicolas Dichtel.
      
      3) Remove conntrack extension register API, from Florian Westphal.
      
      4) Move nat destroy hook to nf_nat_hook instead, to remove
         nf_ct_ext_destroy(), also from Florian.
      
      5) Wrap pptp conntrack NAT hooks into single structure, from Florian Westphal.
      
      6) Support for tcp option set to noop for nf_tables, also from Florian.
      
      7) Do not run x_tables comment match from packet path in nf_tables,
         from Florian Westphal.
      
      8) Replace spinlock by cmpxchg() loop to update missed ct event,
         from Florian Westphal.
      
      9) Wrap cttimeout hooks into single structure, from Florian.
      
      10) Add fast nft_cmp expression for up to 16 bytes.
      
      11) Use cb->ctx to store context in ctnetlink dump, instead of using
          cb->args[], from Florian Westphal.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
        netfilter: ctnetlink: use dump structure instead of raw args
        nfqueue: enable to set skb->priority
        netfilter: nft_cmp: optimize comparison for 16-bytes
        netfilter: cttimeout: use option structure
        netfilter: ecache: don't use nf_conn spinlock
        netfilter: nft_compat: suppress comment match
        netfilter: exthdr: add support for tcp option removal
        netfilter: conntrack: pptp: use single option structure
        netfilter: conntrack: remove extension register api
        netfilter: conntrack: handle ->destroy hook via nat_ops instead
        netfilter: conntrack: move extension sizes into core
        netfilter: conntrack: make all extensions 8-byte alignned
        netfilter: nfqueue: enable to get skb->priority
        netfilter: conntrack: mark UDP zero checksum as CHECKSUM_UNNECESSARY
      ====================
      
      Link: https://lore.kernel.org/r/20220209133616.165104-1-pablo@netfilter.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
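Item 8 in the netfilter pull request replaces a spinlock with a cmpxchg() loop. A user-space analogue of that pattern, written with C11 atomics, might look like the following. The function name `record_missed` and the bitmask semantics are assumptions for illustration and are not taken from the kernel source.

```c
#include <stdatomic.h>

/*
 * OR new event bits into *missed without taking a lock.
 * Returns the value that was stored before the update.
 */
static unsigned int record_missed(_Atomic unsigned int *missed,
                                  unsigned int events)
{
    unsigned int old = atomic_load(missed);

    /*
     * On failure, atomic_compare_exchange_weak() writes the current value
     * back into `old`, so each retry recomputes `old | events` from fresh
     * data.
     */
    while (!atomic_compare_exchange_weak(missed, &old, old | events))
        ;

    return old;
}
```

In plain user-space code `atomic_fetch_or()` would give the same result in one call; the explicit loop is kept here only because it mirrors the compare-and-exchange structure the commit summary describes.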
    • tcp: Don't acquire inet_listen_hashbucket::lock with disabled BH. · 4f9bf2a2
      Sebastian Andrzej Siewior authored
      Commit
         9652dc2e ("tcp: relax listening_hash operations")
      
      removed the need to disable bottom half while acquiring
      listening_hash.lock. There are still two callers left which disable
      bottom half before the lock is acquired.
      
      On PREEMPT_RT the softirqs are preemptible and local_bh_disable() acts
      as a lock to ensure that resources protected by disabling bottom halves
      remain protected.
      This leads to a circular locking dependency if a lock acquired with
      bottom halves disabled is also acquired with bottom halves enabled and
      then followed by disabling bottom halves. This is the reverse locking order.
      It has been observed with inet_listen_hashbucket::lock:
      
      local_bh_disable() + spin_lock(&ilb->lock):
        inet_listen()
          inet_csk_listen_start()
            sk->sk_prot->hash() := inet_hash()
      	local_bh_disable()
      	__inet_hash()
      	  spin_lock(&ilb->lock);
      	    acquire(&ilb->lock);
      
      Reverse order: spin_lock(&ilb2->lock) + local_bh_disable():
        tcp_seq_next()
          listening_get_next()
            spin_lock(&ilb2->lock);
      	acquire(&ilb2->lock);
      
        tcp4_seq_show()
          get_tcp4_sock()
            sock_i_ino()
      	read_lock_bh(&sk->sk_callback_lock);
      	  acquire(softirq_ctrl)	// <---- whoops
      	  acquire(&sk->sk_callback_lock)
      
      Drop local_bh_disable() around __inet_hash() which acquires
      listening_hash->lock. Split inet_unhash() so that the listen_hashbucket
      lock is acquired without disabling bottom halves, while the inet_ehash
      lock is still acquired with bottom halves disabled.
      Reported-by: Mike Galbraith <efault@gmx.de>
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Link: https://lkml.kernel.org/r/12d6f9879a97cd56c09fb53dee343cbb14f7f1f7.camel@gmx.de
      Link: https://lkml.kernel.org/r/X9CheYjuXWc75Spa@hirez.programming.kicks-ass.net
      Link: https://lore.kernel.org/r/YgQOebeZ10eNx1W6@linutronix.de
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
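The lock-order inversion described in this commit can be modelled with a toy order tracker. This is only a sketch of the idea behind lock-order validation, not how lockdep is implemented; the enum values and the `record_order` helper are hypothetical names, with `SOFTIRQ_CTRL` standing in for the per-CPU lock that local_bh_disable() takes on PREEMPT_RT.

```c
#include <stdbool.h>

enum lock_id { SOFTIRQ_CTRL, ILB_LOCK, NUM_LOCKS };

/* order_seen[a][b] is true once lock b has been taken while a was held. */
static bool order_seen[NUM_LOCKS][NUM_LOCKS];

/*
 * Record that `inner` was acquired while `outer` was held.
 * Returns true if the opposite order was recorded earlier, which means
 * two code paths could deadlock against each other (ABBA).
 */
static bool record_order(enum lock_id outer, enum lock_id inner)
{
    order_seen[outer][inner] = true;
    return order_seen[inner][outer];
}
```

Under these assumptions, the inet_hash() path corresponds to `record_order(SOFTIRQ_CTRL, ILB_LOCK)` and the /proc seq_file path to `record_order(ILB_LOCK, SOFTIRQ_CTRL)`; the second call reports the inversion. Dropping local_bh_disable() from the first path removes the first edge, so no cycle remains.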
    • Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next · 1127170d
      Jakub Kicinski authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf-next 2022-02-09
      
      We've added 126 non-merge commits during the last 16 day(s) which contain
      a total of 201 files changed, 4049 insertions(+), 2215 deletions(-).
      
      The main changes are:
      
      1) Add custom BPF allocator for JITs that pack multiple programs into a huge
         page to reduce iTLB pressure, from Song Liu.
      
      2) Add __user tagging support in vmlinux BTF and utilize it from BPF
         verifier when generating loads, from Yonghong Song.
      
      3) Add per-socket fast path check guarding from cgroup/BPF overhead when
         used by only some sockets, from Pavel Begunkov.
      
      4) Continued libbpf deprecation work of APIs/features and removal of their
         usage from samples, selftests, libbpf & bpftool, from Andrii Nakryiko
         and various others.
      
      5) Improve BPF instruction set documentation by adding byte swap
         instructions and cleaning up load/store section, from Christoph Hellwig.
      
      6) Switch BPF preload infra to light skeleton and remove libbpf dependency
         from it, from Alexei Starovoitov.
      
      7) Fix architecture-agnostic macros in libbpf for accessing syscall
         arguments from BPF progs for non-x86 architectures,
         from Ilya Leoshkevich.
      
      8) Rework port members in struct bpf_sk_lookup and struct bpf_sock to be
         16-bit fields with anonymous zero padding, from Jakub Sitnicki.
      
      9) Add new bpf_copy_from_user_task() helper to read memory from a different
         task than current. Add ability to create sleepable BPF iterator progs,
         from Kenny Yu.
      
      10) Implement XSK batching for ice's zero-copy driver used by AF_XDP and
          utilize TX batching API from XSK buffer pool, from Maciej Fijalkowski.
      
      11) Generate temporary netns names for BPF selftests to avoid naming
          collisions, from Hangbin Liu.
      
      12) Implement bpf_core_types_are_compat() with limited recursion for
          in-kernel usage, from Matteo Croce.
      
      13) Simplify pahole version detection and finally enable CONFIG_DEBUG_INFO_DWARF5
          to be selected with CONFIG_DEBUG_INFO_BTF, from Nathan Chancellor.
      
      14) Misc minor fixes to libbpf and selftests from various folks.
      
      * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (126 commits)
        selftests/bpf: Cover 4-byte load from remote_port in bpf_sk_lookup
        bpf: Make remote_port field in struct bpf_sk_lookup 16-bit wide
        libbpf: Fix compilation warning due to mismatched printf format
        selftests/bpf: Test BPF_KPROBE_SYSCALL macro
        libbpf: Add BPF_KPROBE_SYSCALL macro
        libbpf: Fix accessing the first syscall argument on s390
        libbpf: Fix accessing the first syscall argument on arm64
        libbpf: Allow overriding PT_REGS_PARM1{_CORE}_SYSCALL
        selftests/bpf: Skip test_bpf_syscall_macro's syscall_arg1 on arm64 and s390
        libbpf: Fix accessing syscall arguments on riscv
        libbpf: Fix riscv register names
        libbpf: Fix accessing syscall arguments on powerpc
        selftests/bpf: Use PT_REGS_SYSCALL_REGS in bpf_syscall_macro
        libbpf: Add PT_REGS_SYSCALL_REGS macro
        selftests/bpf: Fix an endianness issue in bpf_syscall_macro test
        bpf: Fix bpf_prog_pack build HPAGE_PMD_SIZE
        bpf: Fix leftover header->pages in sparc and powerpc code.
        libbpf: Fix signedness bug in btf_dump_array_data()
        selftests/bpf: Do not export subtest as standalone test
        bpf, x86_64: Fail gracefully on bpf_jit_binary_pack_finalize failures
        ...
      ====================
      
      Link: https://lore.kernel.org/r/20220209210050.8425-1-daniel@iogearbox.net
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
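Item 8 in the bpf-next summary describes a 16-bit port field followed by anonymous zero padding. The layout idea can be sketched in plain C as follows. `struct port_slot` and `load_slot_u32` are hypothetical stand-ins, not the kernel's definitions, and the sketch assumes the compiler accepts a `uint16_t` bit-field, as GCC and Clang do.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 16-bit port followed by 16 bits of padding. */
struct port_slot {
    uint16_t port;  /* 16-bit port value */
    uint16_t :16;   /* anonymous padding: occupies space, has no name */
};

/* Read the whole 4-byte slot, as a legacy 4-byte load would. */
static uint32_t load_slot_u32(const struct port_slot *s)
{
    uint32_t v;

    memcpy(&v, s, sizeof(v));
    return v;
}
```

Because the padding cannot be named, code cannot write to it directly, so it stays zero as long as the structure was zero-initialised. Where the port lands inside a 4-byte load depends on host byte order, which is why the summary lists a selftest covering that access.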
    • net: drop_monitor: support drop reason · 5cad527d
      Menglong Dong authored
      Commit c504e5c2 ("net: skb: introduce kfree_skb_reason()") introduced
      a drop reason to the kfree_skb tracepoint. drop_monitor can therefore
      report the drop reason to users via netlink.
      
      The drop reasons are reported to users as strings, exactly as they are
      when reported to ftrace.
      Signed-off-by: Menglong Dong <imagedong@tencent.com>
      Reviewed-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Link: https://lore.kernel.org/r/20220209060838.55513-1-imagedong@tencent.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
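Reporting a numeric drop reason as a string usually comes down to a bounds-checked table lookup. The sketch below illustrates that general technique only: the enum, the table and `drop_reason_name` are hypothetical and do not reproduce the kernel's reason list or the drop_monitor code.

```c
#include <stddef.h>

/* Hypothetical reason codes for illustration; not the kernel's enum. */
enum drop_reason {
    DROP_NOT_SPECIFIED,
    DROP_NO_SOCKET,
    DROP_REASON_MAX,
};

static const char *const drop_reason_names[DROP_REASON_MAX] = {
    [DROP_NOT_SPECIFIED] = "NOT_SPECIFIED",
    [DROP_NO_SOCKET]     = "NO_SOCKET",
};

/* Return the name for a reason code, or NULL if the code is out of range. */
static const char *drop_reason_name(unsigned int reason)
{
    if (reason >= DROP_REASON_MAX)
        return NULL;
    return drop_reason_names[reason];
}
```

Sharing one table between the tracing output and the netlink report is what keeps the two string forms identical, which matches the behaviour the commit message describes.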
  2. 09 Feb, 2022 31 commits