1. 21 Nov, 2020 9 commits
    • Jamie Iles's avatar
      bonding: wait for sysfs kobject destruction before freeing struct slave · b9ad3e9f
      Jamie Iles authored
      syzkaller found that with CONFIG_DEBUG_KOBJECT_RELEASE=y, releasing a
      struct slave device could result in the following splat:
      
        kobject: 'bonding_slave' (00000000cecdd4fe): kobject_release, parent 0000000074ceb2b2 (delayed 1000)
        bond0 (unregistering): (slave bond_slave_1): Releasing backup interface
        ------------[ cut here ]------------
        ODEBUG: free active (active state 0) object type: timer_list hint: workqueue_select_cpu_near kernel/workqueue.c:1549 [inline]
        ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x98 kernel/workqueue.c:1600
        WARNING: CPU: 1 PID: 842 at lib/debugobjects.c:485 debug_print_object+0x180/0x240 lib/debugobjects.c:485
        Kernel panic - not syncing: panic_on_warn set ...
        CPU: 1 PID: 842 Comm: kworker/u4:4 Tainted: G S                5.9.0-rc8+ #96
        Hardware name: linux,dummy-virt (DT)
        Workqueue: netns cleanup_net
        Call trace:
         dump_backtrace+0x0/0x4d8 include/linux/bitmap.h:239
         show_stack+0x34/0x48 arch/arm64/kernel/traps.c:142
         __dump_stack lib/dump_stack.c:77 [inline]
         dump_stack+0x174/0x1f8 lib/dump_stack.c:118
         panic+0x360/0x7a0 kernel/panic.c:231
         __warn+0x244/0x2ec kernel/panic.c:600
         report_bug+0x240/0x398 lib/bug.c:198
         bug_handler+0x50/0xc0 arch/arm64/kernel/traps.c:974
         call_break_hook+0x160/0x1d8 arch/arm64/kernel/debug-monitors.c:322
         brk_handler+0x30/0xc0 arch/arm64/kernel/debug-monitors.c:329
         do_debug_exception+0x184/0x340 arch/arm64/mm/fault.c:864
         el1_dbg+0x48/0xb0 arch/arm64/kernel/entry-common.c:65
         el1_sync_handler+0x170/0x1c8 arch/arm64/kernel/entry-common.c:93
         el1_sync+0x80/0x100 arch/arm64/kernel/entry.S:594
         debug_print_object+0x180/0x240 lib/debugobjects.c:485
         __debug_check_no_obj_freed lib/debugobjects.c:967 [inline]
         debug_check_no_obj_freed+0x200/0x430 lib/debugobjects.c:998
         slab_free_hook mm/slub.c:1536 [inline]
         slab_free_freelist_hook+0x190/0x210 mm/slub.c:1577
         slab_free mm/slub.c:3138 [inline]
         kfree+0x13c/0x460 mm/slub.c:4119
         bond_free_slave+0x8c/0xf8 drivers/net/bonding/bond_main.c:1492
         __bond_release_one+0xe0c/0xec8 drivers/net/bonding/bond_main.c:2190
         bond_slave_netdev_event drivers/net/bonding/bond_main.c:3309 [inline]
         bond_netdev_event+0x8f0/0xa70 drivers/net/bonding/bond_main.c:3420
         notifier_call_chain+0xf0/0x200 kernel/notifier.c:83
         __raw_notifier_call_chain kernel/notifier.c:361 [inline]
         raw_notifier_call_chain+0x44/0x58 kernel/notifier.c:368
         call_netdevice_notifiers_info+0xbc/0x150 net/core/dev.c:2033
         call_netdevice_notifiers_extack net/core/dev.c:2045 [inline]
         call_netdevice_notifiers net/core/dev.c:2059 [inline]
         rollback_registered_many+0x6a4/0xec0 net/core/dev.c:9347
         unregister_netdevice_many.part.0+0x2c/0x1c0 net/core/dev.c:10509
         unregister_netdevice_many net/core/dev.c:10508 [inline]
         default_device_exit_batch+0x294/0x338 net/core/dev.c:10992
         ops_exit_list.isra.0+0xec/0x150 net/core/net_namespace.c:189
         cleanup_net+0x44c/0x888 net/core/net_namespace.c:603
         process_one_work+0x96c/0x18c0 kernel/workqueue.c:2269
         worker_thread+0x3f0/0xc30 kernel/workqueue.c:2415
         kthread+0x390/0x498 kernel/kthread.c:292
         ret_from_fork+0x10/0x18 arch/arm64/kernel/entry.S:925
      
      This is a potential use-after-free if the sysfs nodes are being accessed
      whilst removing the struct slave, so wait for the object destruction to
      complete before freeing the struct slave itself.
      
      Fixes: 07699f9a ("bonding: add sysfs /slave dir for bond slave devices.")
      Fixes: a068aab4 ("bonding: Fix reference count leak in bond_sysfs_slave_add.")
      Cc: Qiushi Wu <wu000273@umn.edu>
      Cc: Jay Vosburgh <j.vosburgh@gmail.com>
      Cc: Veaceslav Falico <vfalico@gmail.com>
      Cc: Andy Gospodarek <andy@greyhouse.net>
      Signed-off-by: default avatarJamie Iles <jamie@nuviainc.com>
      Reviewed-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Link: https://lore.kernel.org/r/20201120142827.879226-1-jamie@nuviainc.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      b9ad3e9f
    • Jakub Kicinski's avatar
      Merge branch 's390-qeth-fixes-2020-11-20' · 207d0bfc
      Jakub Kicinski authored
      Julian Wiedmann says:
      
      ====================
      s390/qeth: fixes 2020-11-20
      
      This brings several fixes for qeth's af_iucv-specific code paths.
      
      Also one fix by Alexandra for the recently added BR_LEARNING_SYNC
      support. We want to trust the feature indication bit, so that HW can
      mask it out if there's any issues on their end.
      ====================
      
      Link: https://lore.kernel.org/r/20201120090939.101406-1-jwi@linux.ibm.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      207d0bfc
    • Julian Wiedmann's avatar
      s390/qeth: fix tear down of async TX buffers · 7ed10e16
      Julian Wiedmann authored
      When qeth_iqd_tx_complete() detects that a TX buffer requires additional
      async completion via QAOB, it might fail to replace the queue entry's
      metadata (and ends up triggering recovery).
      
      Assume now that the device gets torn down, overruling the recovery.
      If the QAOB notification then arrives before the tear down has
      sufficiently progressed, the buffer state is changed to
      QETH_QDIO_BUF_HANDLED_DELAYED by qeth_qdio_handle_aob().
      
      The tear down code calls qeth_drain_output_queue(), where
      qeth_cleanup_handled_pending() will then attempt to replace such a
      buffer _again_. If it succeeds this time, the buffer ends up dangling in
      its replacement's ->next_pending list ... where it will never be freed,
      since there's no further call to qeth_cleanup_handled_pending().
      
      But the second attempt isn't actually needed, we can simply leave the
      buffer on the queue and re-use it after a potential recovery has
      completed. The qeth_clear_output_buffer() in qeth_drain_output_queue()
      will ensure that it's in a clean state again.
      
      Fixes: 72861ae7 ("qeth: recovery through asynchronous delivery")
      Signed-off-by: default avatarJulian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      7ed10e16
    • Julian Wiedmann's avatar
      s390/qeth: fix af_iucv notification race · 8908f36d
      Julian Wiedmann authored
      The two expected notification sequences are
      1. TX_NOTIFY_PENDING with a subsequent TX_NOTIFY_DELAYED_*, when
         our TX completion code first observed the pending TX and the QAOB
         then completes at a later time; or
      2. TX_NOTIFY_OK, when qeth_qdio_handle_aob() picked up the QAOB
         completion before our TX completion code even noticed that the TX
         was pending.
      
      But as qeth_iqd_tx_complete() and qeth_qdio_handle_aob() can run
      concurrently, we may end up with a race that results in a sequence of
      TX_NOTIFY_DELAYED_* followed by TX_NOTIFY_PENDING. Which would confuse
      the af_iucv code in its tracking of pending transmits.
      
      Rework the notification code, so that qeth_qdio_handle_aob() defers its
      notification if the TX completion code is still active.
      
      Fixes: b3332930 ("qeth: add support for af_iucv HiperSockets transport")
      Signed-off-by: default avatarJulian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      8908f36d
    • Julian Wiedmann's avatar
      s390/qeth: make af_iucv TX notification call more robust · 34c7f50f
      Julian Wiedmann authored
      Calling into socket code is ugly already, at least check whether we are
      dealing with the expected sk_family. Only looking at skb->protocol is
      bound to cause troubles (consider eg. af_packet).
      
      Fixes: b3332930 ("qeth: add support for af_iucv HiperSockets transport")
      Signed-off-by: default avatarJulian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      34c7f50f
    • Alexandra Winter's avatar
      s390/qeth: Remove pnso workaround · 0d0e2b53
      Alexandra Winter authored
      Remove workaround that supported early hardware implementations
      of PNSO OC3. Rely on the 'enarf' feature bit instead.
      
      Fixes: fa115adf ("s390/qeth: Detect PNSO OC3 capability")
      Signed-off-by: default avatarAlexandra Winter <wintera@linux.ibm.com>
      Reviewed-by: default avatarJulian Wiedmann <jwi@linux.ibm.com>
      [jwi: use logical instead of bit-wise AND]
      Signed-off-by: default avatarJulian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      0d0e2b53
    • Jakub Kicinski's avatar
      Merge branch 'tcp-address-issues-with-ect0-not-being-set-in-dctcp-packets' · e10823c7
      Jakub Kicinski authored
      Alexander Duyck says:
      
      ====================
      tcp: Address issues with ECT0 not being set in DCTCP packets
      
      This patch set is meant to address issues seen with SYN/ACK packets not
      containing the ECT0 bit when DCTCP is configured as the congestion control
      algorithm for a TCP socket.
      
      A simple test using "tcpdump" and "test_progs -t bpf_tcp_ca" makes the
      issue obvious. Looking at the packets will result in the SYN/ACK packet
      with an ECT0 bit that does not match the other packets for the flow when
      the congestion control agorithm is switch from the default. So for example
      going from non-DCTCP to a DCTCP congestion control algorithm we will see
      the SYN/ACK IPV6 header will not have ECT0 set while the other packets in
      the flow will. Likewise if we switch from a default of DCTCP to cubic we
      will see the ECT0 bit set in the SYN/ACK while the other packets in the
      flow will not.
      ====================
      
      Link: https://lore.kernel.org/r/160582070138.66684.11785214534154816097.stgit@localhost.localdomainSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      e10823c7
    • Alexander Duyck's avatar
      tcp: Set INET_ECN_xmit configuration in tcp_reinit_congestion_control · 55472017
      Alexander Duyck authored
      When setting congestion control via a BPF program it is seen that the
      SYN/ACK for packets within a given flow will not include the ECT0 flag. A
      bit of simple printk debugging shows that when this is configured without
      BPF we will see the value INET_ECN_xmit value initialized in
      tcp_assign_congestion_control however when we configure this via BPF the
      socket is in the closed state and as such it isn't configured, and I do not
      see it being initialized when we transition the socket into the listen
      state. The result of this is that the ECT0 bit is configured based on
      whatever the default state is for the socket.
      
      Any easy way to reproduce this is to monitor the following with tcpdump:
      tools/testing/selftests/bpf/test_progs -t bpf_tcp_ca
      
      Without this patch the SYN/ACK will follow whatever the default is. If dctcp
      all SYN/ACK packets will have the ECT0 bit set, and if it is not then ECT0
      will be cleared on all SYN/ACK packets. With this patch applied the SYN/ACK
      bit matches the value seen on the other packets in the given stream.
      
      Fixes: 91b5b21c ("bpf: Add support for changing congestion control")
      Signed-off-by: default avatarAlexander Duyck <alexanderduyck@fb.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      55472017
    • Alexander Duyck's avatar
      tcp: Allow full IP tos/IPv6 tclass to be reflected in L3 header · 861602b5
      Alexander Duyck authored
      An issue was recently found where DCTCP SYN/ACK packets did not have the
      ECT bit set in the L3 header. A bit of code review found that the recent
      change referenced below had gone though and added a mask that prevented the
      ECN bits from being populated in the L3 header.
      
      This patch addresses that by rolling back the mask so that it is only
      applied to the flags coming from the incoming TCP request instead of
      applying it to the socket tos/tclass field. Doing this the ECT bits were
      restored in the SYN/ACK packets in my testing.
      
      One thing that is not addressed by this patch set is the fact that
      tcp_reflect_tos appears to be incompatible with ECN based congestion
      avoidance algorithms. At a minimum the feature should likely be documented
      which it currently isn't.
      
      Fixes: ac8f1710 ("tcp: reflect tos value received in SYN to the socket")
      Signed-off-by: default avatarAlexander Duyck <alexanderduyck@fb.com>
      Acked-by: default avatarWei Wang <weiwan@google.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      861602b5
  2. 20 Nov, 2020 8 commits
  3. 19 Nov, 2020 23 commits
    • Linus Torvalds's avatar
      Merge tag 'net-5.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 4d02da97
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Networking fixes for 5.10-rc5, including fixes from the WiFi
        (mac80211), can and bpf (including the strncpy_from_user fix).
      
        Current release - regressions:
      
         - mac80211: fix memory leak of filtered powersave frames
      
         - mac80211: free sta in sta_info_insert_finish() on errors to avoid
           sleeping in atomic context
      
         - netlabel: fix an uninitialized variable warning added in -rc4
      
        Previous release - regressions:
      
         - vsock: forward all packets to the host when no H2G is registered,
           un-breaking AWS Nitro Enclaves
      
         - net: Exempt multicast addresses from five-second neighbor lifetime
           requirement, decreasing the chances neighbor tables fill up
      
         - net/tls: fix corrupted data in recvmsg
      
         - qed: fix ILT configuration of SRC block
      
         - can: m_can: process interrupt only when not runtime suspended
      
        Previous release - always broken:
      
         - page_frag: Recover from memory pressure by not recycling pages
           allocating from the reserves
      
         - strncpy_from_user: Mask out bytes after NUL terminator
      
         - ip_tunnels: Set tunnel option flag only when tunnel metadata is
           present, always setting it confuses Open vSwitch
      
         - bpf, sockmap:
            - Fix partial copy_page_to_iter so progress can still be made
            - Fix socket memory accounting and obeying SO_RCVBUF
      
         - net: Have netpoll bring-up DSA management interface
      
         - net: bridge: add missing counters to ndo_get_stats64 callback
      
         - tcp: brr: only postpone PROBE_RTT if RTT is < current min_rtt
      
         - enetc: Workaround MDIO register access HW bug
      
         - net/ncsi: move netlink family registration to a subsystem init,
           instead of tying it to driver probe
      
         - net: ftgmac100: unregister NC-SI when removing driver to avoid
           crash
      
         - lan743x:
            - prevent interrupt storm on open
            - fix freeing skbs in the wrong context
      
         - net/mlx5e: Fix socket refcount leak on kTLS RX resync
      
         - net: dsa: mv88e6xxx: Avoid VLAN database corruption on 6097
      
         - fix 21 unset return codes and other mistakes on error paths, mostly
           detected by the Hulk Robot"
      
      * tag 'net-5.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (115 commits)
        fail_function: Remove a redundant mutex unlock
        selftest/bpf: Test bpf_probe_read_user_str() strips trailing bytes after NUL
        lib/strncpy_from_user.c: Mask out bytes after NUL terminator.
        net/smc: fix direct access to ib_gid_addr->ndev in smc_ib_determine_gid()
        net/smc: fix matching of existing link groups
        ipv6: Remove dependency of ipv6_frag_thdr_truncated on ipv6 module
        libbpf: Fix VERSIONED_SYM_COUNT number parsing
        net/mlx4_core: Fix init_hca fields offset
        atm: nicstar: Unmap DMA on send error
        page_frag: Recover from memory pressure
        net: dsa: mv88e6xxx: Wait for EEPROM done after HW reset
        mlxsw: core: Use variable timeout for EMAD retries
        mlxsw: Fix firmware flashing
        net: Have netpoll bring-up DSA management interface
        atl1e: fix error return code in atl1e_probe()
        atl1c: fix error return code in atl1c_probe()
        ah6: fix error return code in ah6_input()
        net: usb: qmi_wwan: Set DTR quirk for MR400
        can: m_can: process interrupt only when not runtime suspended
        can: flexcan: flexcan_chip_start(): fix erroneous flexcan_transceiver_enable() during bus-off recovery
        ...
      4d02da97
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma · 3be28e93
      Linus Torvalds authored
      Pull rdma fixes from Jason Gunthorpe:
       "The last two weeks have been quiet here, just the usual smattering of
        long standing bug fixes.
      
        A collection of error case bug fixes:
      
         - Improper nesting of spinlock types in cm
      
         - Missing error codes and kfree()
      
         - Ensure dma_virt_ops users have the right kconfig symbols to work
           properly
      
         - Compilation failure of tools/testing"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
        tools/testing/scatterlist: Fix test to compile and run
        IB/hfi1: Fix error return code in hfi1_init_dd()
        RMDA/sw: Don't allow drivers using dma_virt_ops on highmem configs
        RDMA/pvrdma: Fix missing kfree() in pvrdma_register_device()
        RDMA/cm: Make the local_id_table xarray non-irq
      3be28e93
    • Jakub Kicinski's avatar
      Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · e6ea60ba
      Jakub Kicinski authored
      Alexei Starovoitov says:
      
      ====================
      1) libbpf should not attempt to load unused subprogs, from Andrii.
      
      2) Make strncpy_from_user() mask out bytes after NUL terminator, from Daniel.
      
      3) Relax return code check for subprograms in the BPF verifier, from Dmitrii.
      
      4) Fix several sockmap issues, from John.
      
      * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
        fail_function: Remove a redundant mutex unlock
        selftest/bpf: Test bpf_probe_read_user_str() strips trailing bytes after NUL
        lib/strncpy_from_user.c: Mask out bytes after NUL terminator.
        libbpf: Fix VERSIONED_SYM_COUNT number parsing
        bpf, sockmap: Avoid failures from skb_to_sgvec when skb has frag_list
        bpf, sockmap: Handle memory acct if skb_verdict prog redirects to self
        bpf, sockmap: Avoid returning unneeded EAGAIN when redirecting to self
        bpf, sockmap: Use truesize with sk_rmem_schedule()
        bpf, sockmap: Ensure SO_RCVBUF memory is observed on ingress redirect
        bpf, sockmap: Fix partial copy_page_to_iter so progress can still be made
        selftests/bpf: Fix error return code in run_getsockopt_test()
        bpf: Relax return code check for subprograms
        tools, bpftool: Add missing close before bpftool net attach exit
        MAINTAINERS/bpf: Update Andrii's entry.
        selftests/bpf: Fix unused attribute usage in subprogs_unused test
        bpf: Fix unsigned 'datasec_id' compared with zero in check_pseudo_btf_id
        bpf: Fix passing zero to PTR_ERR() in bpf_btf_printf_prepare
        libbpf: Don't attempt to load unused subprog as an entry-point BPF program
      ====================
      
      Link: https://lore.kernel.org/r/20201119200721.288-1-alexei.starovoitov@gmail.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      e6ea60ba
    • Luo Meng's avatar
      fail_function: Remove a redundant mutex unlock · 2801a5da
      Luo Meng authored
      Fix a mutex_unlock() issue where before copy_from_user() is
      not called mutex_locked.
      
      Fixes: 4b1a29a7 ("error-injection: Support fault injection framework")
      Reported-by: default avatarHulk Robot <hulkci@huawei.com>
      Signed-off-by: default avatarLuo Meng <luomeng12@huawei.com>
      Signed-off-by: default avatarMasami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Acked-by: default avatarMasami Hiramatsu <mhiramat@kernel.org>
      Link: https://lore.kernel.org/bpf/160570737118.263807.8358435412898356284.stgit@devnote2
      2801a5da
    • Alexei Starovoitov's avatar
      Merge branch 'Fix bpf_probe_read_user_str() overcopying' · 14d6d86c
      Alexei Starovoitov authored
      Daniel Xu says:
      
      ====================
      
      6ae08ae3 ("bpf: Add probe_read_{user, kernel} and probe_read_{user,
      kernel}_str helpers") introduced a subtle bug where
      bpf_probe_read_user_str() would potentially copy a few extra bytes after
      the NUL terminator.
      
      This issue is particularly nefarious when strings are used as map keys,
      as seemingly identical strings can occupy multiple entries in a map.
      
      This patchset fixes the issue and introduces a selftest to prevent
      future regressions.
      
      v6 -> v7:
      * Add comments
      
      v5 -> v6:
      * zero-pad up to sizeof(unsigned long) after NUL
      
      v4 -> v5:
      * don't read potentially uninitialized memory
      
      v3 -> v4:
      * directly pass userspace pointer to prog
      * test more strings of different length
      
      v2 -> v3:
      * set pid filter before attaching prog in selftest
      * use long instead of int as bpf_probe_read_user_str() retval
      * style changes
      
      v1 -> v2:
      * add Fixes: tag
      * add selftest
      ====================
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      14d6d86c
    • Daniel Xu's avatar
      selftest/bpf: Test bpf_probe_read_user_str() strips trailing bytes after NUL · c8a36aed
      Daniel Xu authored
      Previously, bpf_probe_read_user_str() could potentially overcopy the
      trailing bytes after the NUL due to how do_strncpy_from_user() does the
      copy in long-sized strides. The issue has been fixed in the previous
      commit.
      
      This commit adds a selftest that ensures we don't regress
      bpf_probe_read_user_str() again.
      Signed-off-by: default avatarDaniel Xu <dxu@dxuuu.xyz>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Acked-by: default avatarSong Liu <songliubraving@fb.com>
      Acked-by: default avatarAndrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/4d977508fab4ec5b7b574b85bdf8b398868b6ee9.1605642949.git.dxu@dxuuu.xyz
      c8a36aed
    • Daniel Xu's avatar
      lib/strncpy_from_user.c: Mask out bytes after NUL terminator. · 6fa6d280
      Daniel Xu authored
      do_strncpy_from_user() may copy some extra bytes after the NUL
      terminator into the destination buffer. This usually does not matter for
      normal string operations. However, when BPF programs key BPF maps with
      strings, this matters a lot.
      
      A BPF program may read strings from user memory by calling the
      bpf_probe_read_user_str() helper which eventually calls
      do_strncpy_from_user(). The program can then key a map with the
      destination buffer. BPF map keys are fixed-width and string-agnostic,
      meaning that map keys are treated as a set of bytes.
      
      The issue is when do_strncpy_from_user() overcopies bytes after the NUL
      terminator, it can result in seemingly identical strings occupying
      multiple slots in a BPF map. This behavior is subtle and totally
      unexpected by the user.
      
      This commit masks out the bytes following the NUL while preserving
      long-sized stride in the fast path.
      
      Fixes: 6ae08ae3 ("bpf: Add probe_read_{user, kernel} and probe_read_{user, kernel}_str helpers")
      Signed-off-by: default avatarDaniel Xu <dxu@dxuuu.xyz>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/21efc982b3e9f2f7b0379eed642294caaa0c27a7.1605642949.git.dxu@dxuuu.xyz
      6fa6d280
    • Linus Torvalds's avatar
      Merge tag 'powerpc-cve-2020-4788' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · dda3f425
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
       "Fixes for CVE-2020-4788.
      
        From Daniel's cover letter:
      
        IBM Power9 processors can speculatively operate on data in the L1
        cache before it has been completely validated, via a way-prediction
        mechanism. It is not possible for an attacker to determine the
        contents of impermissible memory using this method, since these
        systems implement a combination of hardware and software security
        measures to prevent scenarios where protected data could be leaked.
      
        However these measures don't address the scenario where an attacker
        induces the operating system to speculatively execute instructions
        using data that the attacker controls. This can be used for example to
        speculatively bypass "kernel user access prevention" techniques, as
        discovered by Anthony Steinhauser of Google's Safeside Project. This
        is not an attack by itself, but there is a possibility it could be
        used in conjunction with side-channels or other weaknesses in the
        privileged code to construct an attack.
      
        This issue can be mitigated by flushing the L1 cache between privilege
        boundaries of concern.
      
        This patch series flushes the L1 cache on kernel entry (patch 2) and
        after the kernel performs any user accesses (patch 3). It also adds a
        self-test and performs some related cleanups"
      
      * tag 'powerpc-cve-2020-4788' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/64s: rename pnv|pseries_setup_rfi_flush to _setup_security_mitigations
        selftests/powerpc: refactor entry and rfi_flush tests
        selftests/powerpc: entry flush test
        powerpc: Only include kup-radix.h for 64-bit Book3S
        powerpc/64s: flush L1D after user accesses
        powerpc/64s: flush L1D on kernel entry
        selftests/powerpc: rfi_flush: disable entry flush if present
      dda3f425
    • Linus Torvalds's avatar
      Merge tag 'xtensa-20201119' of git://github.com/jcmvbkbc/linux-xtensa · 3494d588
      Linus Torvalds authored
      Pull xtensa fixes from Max Filippov:
      
       - fix placement of cache alias remapping area
      
       - disable preemption around cache alias management calls
      
       - add missing __user annotation to strncpy_from_user argument
      
      * tag 'xtensa-20201119' of git://github.com/jcmvbkbc/linux-xtensa:
        xtensa: uaccess: Add missing __user to strncpy_from_user() prototype
        xtensa: disable preemption around cache alias management calls
        xtensa: fix TLBTEMP area placement
      3494d588
    • Linus Torvalds's avatar
      Merge tag 'acpi-5.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 131ad0b6
      Linus Torvalds authored
      Pull ACPI fixes from Rafael Wysocki:
       "These fix recent regression in the APEI code and initialization issue
        in the ACPI fan driver.
      
        Specifics:
      
         - Make the APEI code avoid attempts to obtain logical addresses for
           registers located in the I/O address space to fix initialization
           issues (Aili Yao)
      
         - Fix sysfs attribute initialization in the ACPI fan driver (Guenter
           Roeck)"
      
      * tag 'acpi-5.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        ACPI, APEI, Fix error return value in apei_map_generic_address()
        ACPI: fan: Initialize performance state sysfs attribute
      131ad0b6
    • Linus Torvalds's avatar
      Merge tag 'pm-5.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 4ca35b4f
      Linus Torvalds authored
      Pull power management fixes from Rafael Wysocki:
       "These fix two issues in ARM cpufreq drivers and one cpuidle driver
        issue.
      
        Specifics:
      
         - Add missing RCU_NONIDLE() annotations to the Tegra cpuidle driver
           (Dmitry Osipenko)
      
         - Fix boot frequency computation in the tegra186 cpufreq driver (Jon
           Hunter)
      
         - Make the SCMI cpufreq driver register a dummy clock provider to
           avoid OPP addition failures (Sudeep Holla)"
      
      * tag 'pm-5.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        cpufreq: scmi: Fix OPP addition failure with a dummy clock provider
        cpufreq: tegra186: Fix get frequency callback
        cpuidle: tegra: Annotate tegra_pm_set_cpu_in_lp2() with RCU_NONIDLE
      4ca35b4f
    • Linus Torvalds's avatar
      Merge tag 'spi-fix-v5.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi · fee3c824
      Linus Torvalds authored
      Pull spi fixes from Mark Brown:
       "This is a relatively large set of fixes, the bulk of it being a series
        from Lukas Wunner which fixes confusion with the lifetime of driver
        data allocated along with the SPI controller structure that's been
        created as part of the conversion to devm APIs.
      
        The simplest fix, explained in detail in Lukas' commit message, is to
        move to a devm_ function for allocation of the controller and hence
        driver data in order to push the free of that after anything tries to
        reference the driver data in the remove path. This results in a
        relatively large diff due to the addition of a new function but isn't
        particularly complex.
      
        There's also a fix from Sven van Asbroeck which fixes yet more fallout
        from the conflicts between the various different places one can
        configure the polarity of GPIOs in modern systems.
      
        Otherwise everything is fairly small and driver specific"
      
      * tag 'spi-fix-v5.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
        spi: npcm-fiu: Don't leak SPI master in probe error path
        spi: dw: Set transfer handler before unmasking the IRQs
        spi: cadence-quadspi: Fix error return code in cqspi_probe
        spi: bcm2835aux: Restore err assignment in bcm2835aux_spi_probe
        spi: lpspi: Fix use-after-free on unbind
        spi: bcm-qspi: Fix use-after-free on unbind
        spi: bcm2835aux: Fix use-after-free on unbind
        spi: bcm2835: Fix use-after-free on unbind
        spi: Introduce device-managed SPI controller allocation
        spi: fsi: Fix transfer returning without finalizing message
        spi: fix client driver breakages when using GPIO descriptors
      fee3c824
    • Jakub Kicinski's avatar
      Merge branch 'net-smc-fixes-2020-11-18' · 90b49784
      Jakub Kicinski authored
      Karsten Graul says:
      
      ====================
      net/smc: fixes 2020-11-18
      
      Patch 1 fixes the matching of link groups because with SMC-Dv2 the vlanid
      should no longer be part of this matching. Patch 2 removes a sparse message.
      ====================
      
      Link: https://lore.kernel.org/r/20201118214038.24039-1-kgraul@linux.ibm.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      90b49784
    • Karsten Graul's avatar
      net/smc: fix direct access to ib_gid_addr->ndev in smc_ib_determine_gid() · 41a0be3f
      Karsten Graul authored
      Sparse complaints 3 times about:
      net/smc/smc_ib.c:203:52: warning: incorrect type in argument 1 (different address spaces)
      net/smc/smc_ib.c:203:52:    expected struct net_device const *dev
      net/smc/smc_ib.c:203:52:    got struct net_device [noderef] __rcu *const ndev
      
      Fix that by using the existing and validated ndev variable instead of
      accessing attr->ndev directly.
      
      Fixes: 5102eca9 ("net/smc: Use rdma_read_gid_l2_fields to L2 fields")
      Signed-off-by: default avatarKarsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      41a0be3f
    • Karsten Graul's avatar
      net/smc: fix matching of existing link groups · 0530bd6e
      Karsten Graul authored
      With the multi-subnet support of SMC-Dv2 the match for existing link
      groups should not include the vlanid of the network device.
      Set ini->smcd_version accordingly before the call to smc_conn_create()
      and use this value in smc_conn_create() to skip the vlanid check.
      
      Fixes: 5c21c4cc ("net/smc: determine accepted ISM devices")
      Signed-off-by: default avatarKarsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      0530bd6e
    • Linus Torvalds's avatar
      Merge tag 'regulator-fix-v5.10-rc4' of... · d748287a
      Linus Torvalds authored
      Merge tag 'regulator-fix-v5.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
      
      Pull regulator fixes from Mark Brown:
       "Mostly core fixes here, one set from Michał Mirosław which cleans up
        some issues introduced as part of the coupled regulators work, one
        memory leak during probe and two due to regulators which have an input
        supply name and regulator name which are identical, which is very
        unusual.
      
        There's also a fix for our handling of the similarly unusual case
        where we can't determine if a regulator is enabled during boot"
      
      * tag 'regulator-fix-v5.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
        regulator: ti-abb: Fix array out of bound read access on the first transition
        regulator: workaround self-referent regulators
        regulator: avoid resolve_supply() infinite recursion
        regulator: fix memory leak with repeated set_machine_constraints()
        regulator: pfuze100: limit pfuze-support-disable-sw to pfuze{100,200}
        regulator: core: don't disable regulator if is_enabled return error.
      d748287a
    • Georg Kohmann's avatar
      ipv6: Remove dependency of ipv6_frag_thdr_truncated on ipv6 module · 2d8f6481
      Georg Kohmann authored
      IPV6=m
      NF_DEFRAG_IPV6=y
      
      ld: net/ipv6/netfilter/nf_conntrack_reasm.o: in function
      `nf_ct_frag6_gather':
      net/ipv6/netfilter/nf_conntrack_reasm.c:462: undefined reference to
      `ipv6_frag_thdr_truncated'
      
      Netfilter is depending on ipv6 symbol ipv6_frag_thdr_truncated. This
      dependency is forcing IPV6=y.
      
      Remove this dependency by moving ipv6_frag_thdr_truncated out of ipv6. This
      is the same solution as used with a similar issues: Referring to
      commit 70b095c8 ("ipv6: remove dependency of nf_defrag_ipv6 on ipv6
      module")
      
      Fixes: 9d9e937b ("ipv6/netfilter: Discard first fragment not including all headers")
      Reported-by: default avatarRandy Dunlap <rdunlap@infradead.org>
      Reported-by: default avatarkernel test robot <lkp@intel.com>
      Signed-off-by: default avatarGeorg Kohmann <geokohma@cisco.com>
      Acked-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
      Link: https://lore.kernel.org/r/20201119095833.8409-1-geokohma@cisco.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      2d8f6481
    • Linus Torvalds's avatar
      Merge tag 'thermal-v5.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux · 841d6e9e
      Linus Torvalds authored
      Pull thermal fix from Daniel Lezcano:
       "Disable the CPU PM notifier for OMAP4430 for suspend in order to
        prevent wrong temperature leading to a critical shutdown (Peter
        Ujfalusi)"
      
      * tag 'thermal-v5.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux:
        thermal: ti-soc-thermal: Disable the CPU PM notifier for OMAP4430
      841d6e9e
    • Jiri Olsa's avatar
      libbpf: Fix VERSIONED_SYM_COUNT number parsing · 1fd6cee1
      Jiri Olsa authored
      We remove "other info" from "readelf -s --wide" output when
      parsing GLOBAL_SYM_COUNT variable, which was added in [1].
      But we don't do that for VERSIONED_SYM_COUNT and it's failing
      the check_abi target on powerpc Fedora 33.
      
      The extra "other info" wasn't problem for VERSIONED_SYM_COUNT
      parsing until commit [2] added awk in the pipe, which assumes
      that the last column is symbol, but it can be "other info".
      
      Adding "other info" removal for VERSIONED_SYM_COUNT the same
      way as we did for GLOBAL_SYM_COUNT parsing.
      
      [1] aa915931 ("libbpf: Fix readelf output parsing for Fedora")
      [2] 746f534a ("tools/libbpf: Avoid counting local symbols in ABI check")
      
      Fixes: 746f534a ("tools/libbpf: Avoid counting local symbols in ABI check")
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Acked-by: default avatarAndrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/20201118211350.1493421-1-jolsa@kernel.org
      1fd6cee1
    • Rafael J. Wysocki's avatar
      Merge branch 'acpi-fan' · de15e20f
      Rafael J. Wysocki authored
      * acpi-fan:
        ACPI: fan: Initialize performance state sysfs attribute
      de15e20f
    • Rafael J. Wysocki's avatar
      Merge branch 'pm-cpuidle' · 3a8ac4d3
      Rafael J. Wysocki authored
      * pm-cpuidle:
        cpuidle: tegra: Annotate tegra_pm_set_cpu_in_lp2() with RCU_NONIDLE
      3a8ac4d3
    • Daniel Axtens's avatar
      powerpc/64s: rename pnv|pseries_setup_rfi_flush to _setup_security_mitigations · da631f7f
      Daniel Axtens authored
      pseries|pnv_setup_rfi_flush already does the count cache flush setup, and
      we just added entry and uaccess flushes. So the name is not very accurate
      any more. In both platforms we then also immediately setup the STF flush.
      
      Rename them to _setup_security_mitigations and fold the STF flush in.
      Signed-off-by: default avatarDaniel Axtens <dja@axtens.net>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      da631f7f
    • Daniel Axtens's avatar
      selftests/powerpc: refactor entry and rfi_flush tests · 0d239f3b
      Daniel Axtens authored
      For simplicity in backporting, the original entry_flush test contained
      a lot of duplicated code from the rfi_flush test. De-duplicate that code.
      Signed-off-by: default avatarDaniel Axtens <dja@axtens.net>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      0d239f3b