1. 11 Mar, 2021 1 commit
    • configfs: fix a use-after-free in __configfs_open_file · 14fbbc82
      Daiyue Zhang authored
      Commit b0841eef ("configfs: provide exclusion between IO and removals")
      uses ->frag_dead to mark the fragment state, so opening a file no longer
      takes an extra reference on the config_item. That commit removed the
      configfs_get_config_item call from __configfs_open_file but left the
      matching config_item_put in place. The refcount on config_item therefore
      becomes unbalanced, which can cause use-after-free issues such as this:
      
      Test:
      1. Mount configfs on /config with write-only items (no read permission):
      drwxrwx--- 289 root   root            0 2021-04-01 11:55 /config
      drwxr-xr-x   2 root   root            0 2021-04-01 11:54 /config/a
      --w--w--w-   1 root   root         4096 2021-04-01 11:53 /config/a/1.txt
      ......
      
      2. Then run:
      for file in /config
      do
      echo "$file"
      grep -R 'key' "$file"
      done
      
      3. __configfs_open_file is then called in parallel. The first call to
      run does:
      if (file->f_mode & FMODE_READ) {
      	if (!(inode->i_mode & S_IRUGO))
      		goto out_put_module;
      			config_item_put(buffer->item);
      				kref_put()
      					package_details_release()
      						kfree()
      
      the other one will run into use-after-free issues like this:
      BUG: KASAN: use-after-free in __configfs_open_file+0x1bc/0x3b0
      Read of size 8 at addr fffffff155f02480 by task grep/13096
      CPU: 0 PID: 13096 Comm: grep VIP: 00 Tainted: G        W       4.14.116-kasan #1
      TGID: 13096 Comm: grep
      Call trace:
      dump_stack+0x118/0x160
      kasan_report+0x22c/0x294
      __asan_load8+0x80/0x88
      __configfs_open_file+0x1bc/0x3b0
      configfs_open_file+0x28/0x34
      do_dentry_open+0x2cc/0x5c0
      vfs_open+0x80/0xe0
      path_openat+0xd8c/0x2988
      do_filp_open+0x1c4/0x2fc
      do_sys_open+0x23c/0x404
      SyS_openat+0x38/0x48
      
      Allocated by task 2138:
      kasan_kmalloc+0xe0/0x1ac
      kmem_cache_alloc_trace+0x334/0x394
      packages_make_item+0x4c/0x180
      configfs_mkdir+0x358/0x740
      vfs_mkdir2+0x1bc/0x2e8
      SyS_mkdirat+0x154/0x23c
      el0_svc_naked+0x34/0x38
      
      Freed by task 13096:
      kasan_slab_free+0xb8/0x194
      kfree+0x13c/0x910
      package_details_release+0x524/0x56c
      kref_put+0xc4/0x104
      config_item_put+0x24/0x34
      __configfs_open_file+0x35c/0x3b0
      configfs_open_file+0x28/0x34
      do_dentry_open+0x2cc/0x5c0
      vfs_open+0x80/0xe0
      path_openat+0xd8c/0x2988
      do_filp_open+0x1c4/0x2fc
      do_sys_open+0x23c/0x404
      SyS_openat+0x38/0x48
      el0_svc_naked+0x34/0x38
      
      To fix this issue, remove the config_item_put in
      __configfs_open_file to balance the refcount of config_item.
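The refcount imbalance described above can be illustrated with a small userspace model. This is only a sketch: `toy_item`, `toy_get`, `toy_put`, `toy_open_buggy` and `toy_open_fixed` are hypothetical names invented for the illustration and are not configfs code. It assumes the item starts with one reference held by its owner.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a kref-counted config_item (hypothetical, not kernel code). */
struct toy_item {
    int refcount;
    bool freed;
};

/* Take a reference; only legal while the item is still alive. */
void toy_get(struct toy_item *it)
{
    assert(!it->freed);
    it->refcount++;
}

/* Drop a reference; the last put "frees" the item. */
void toy_put(struct toy_item *it)
{
    assert(!it->freed && it->refcount > 0);
    if (--it->refcount == 0)
        it->freed = true;   /* a real release callback would kfree() here */
}

/* Buggy open: the error path drops a reference that open never took. */
int toy_open_buggy(struct toy_item *it, bool readable)
{
    if (!readable) {
        toy_put(it);        /* unbalanced put */
        return -1;
    }
    return 0;
}

/* Fixed open: the error path leaves the refcount alone. */
int toy_open_fixed(struct toy_item *it, bool readable)
{
    (void)it;
    return readable ? 0 : -1;
}
```

In this model a single failed `toy_open_buggy()` call on an item whose only reference belongs to its owner marks the item as freed, so any later access is a use-after-free; `toy_open_fixed()` fails the same way but leaves the count untouched.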
      
      Fixes: b0841eef ("configfs: provide exclusion between IO and removals")
      Signed-off-by: Daiyue Zhang <zhangdaiyue1@huawei.com>
      Signed-off-by: Yi Chen <chenyi77@huawei.com>
      Signed-off-by: Ge Qiu <qiuge@huawei.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
  2. 10 Mar, 2021 17 commits
    • Merge tag 's390-5.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · a74e6a01
      Linus Torvalds authored
      Pull s390 fixes from Heiko Carstens:
      
       - fix various user space visible copy_to_user() instances which return
         the number of bytes left to copy instead of -EFAULT
      
       - make TMPFS_INODE64 available again for s390 and alpha, now that both
         architectures have been switched to 64-bit ino_t (see commit
         96c0a6a7: "s390,alpha: switch to 64-bit ino_t")
      
       - make sure to release a shared hypervisor resource within the zcore
         device driver also on restart and power down; also remove unneeded
         surrounding debugfs_create return value checks
      
       - for the new hardware counter set device driver rename the uapi header
         file to be a bit more generic; also remove the 60 second read limit,
         which is not really necessary; without the limit the interface is
         easier to test
      
       - some small cleanups, the largest being to convert all long long in
         our time and idle code to longs
      
       - update defconfigs
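The first item in the list above refers to a common pattern: copy_to_user() returns the number of bytes that could not be copied, and that count must not be passed back to user space as the result. The following is a userspace sketch only; `toy_copy_out`, `toy_read_buggy` and `toy_read_fixed` are hypothetical names and do not correspond to any s390 code.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical userspace stand-in for copy_to_user(): copies what fits and
 * returns the number of bytes that could NOT be copied (0 on full success).
 */
size_t toy_copy_out(void *dst, size_t dst_len, const void *src, size_t n)
{
    size_t copied = n < dst_len ? n : dst_len;

    memcpy(dst, src, copied);
    return n - copied;
}

/* Buggy pattern: the "bytes left" count becomes the return value. */
long toy_read_buggy(void *dst, size_t dst_len, const void *src, size_t n)
{
    return (long)toy_copy_out(dst, dst_len, src, n);
}

/* Fixed pattern: any short copy is reported as -EFAULT. */
long toy_read_fixed(void *dst, size_t dst_len, const void *src, size_t n)
{
    if (toy_copy_out(dst, dst_len, src, n))
        return -EFAULT;
    return 0;
}
```

With a 4-byte destination and an 8-byte source, the buggy variant returns a positive count that a caller could mistake for success, while the fixed variant returns -EFAULT.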
      
      * tag 's390-5.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390: remove IBM_PARTITION and CONFIGFS_FS from zfcpdump defconfig
        s390: update defconfigs
        s390,alpha: make TMPFS_INODE64 available again
        s390/cio: return -EFAULT if copy_to_user() fails
        s390/tty3270: avoid comma separated statements
        s390/cpumf: remove unneeded semicolon
        s390/crypto: return -EFAULT if copy_to_user() fails
        s390/cio: return -EFAULT if copy_to_user() fails
        s390/cpumf: rename header file to hwctrset.h
        s390/zcore: release dump save area on restart or power down
        s390/zcore: no need to check return value of debugfs_create functions
        s390/cpumf: remove 60 seconds read limit
        s390/topology: remove always false if check
        s390/time,idle: get rid of unsigned long long
    • Revert "mm, slub: consider rest of partial list if acquire_slab() fails" · 9b1ea29b
      Linus Torvalds authored
      This reverts commit 8ff60eb0.
      
      The kernel test robot reports a huge performance regression due to the
      commit, and the reason seems fairly straightforward: when there is
      contention on the page list (which is what causes acquire_slab() to
      fail), we do _not_ want to just loop and try again, because that will
      transfer the contention to the 'n->list_lock' spinlock we hold, and
      just make things even worse.
      
      This is admittedly likely a problem only on big machines - the kernel
      test robot report comes from a 96-thread dual socket Intel Xeon Gold
      6252 setup, but the regression there really is quite noticeable:
      
         -47.9% regression of stress-ng.rawpkt.ops_per_sec
      
      and the commit that was marked as being fixed (7ced3719: "slub:
      Acquire_slab() avoid loop") actually did the loop exit early very
      intentionally (the hint being that "avoid loop" part of that commit
      message), exactly to avoid this issue.
      
      The correct thing to do may be to pick some kind of reasonable middle
      ground: instead of breaking out of the loop on the very first sign of
      contention, or trying over and over and over again, the right thing may
      be to re-try _once_, and then give up on the second failure (or pick
      your favorite value for "once"..).
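The "re-try once" middle ground suggested above could look roughly like the following bounded-retry helper. This is a generic sketch, not slub code: `acquire_bounded`, `try_acquire_fn` and `fail_n_times` are hypothetical names, and the callback merely stands in for an acquisition that can fail under contention.

```c
#include <stdbool.h>

/* Hypothetical acquire callback: returns true on success, false on contention. */
typedef bool (*try_acquire_fn)(void *ctx);

/*
 * Attempt the acquisition at most 1 + max_retries times, then give up.
 * max_retries == 0 models "break out on the first sign of contention";
 * max_retries == 1 models the "re-try once" middle ground.
 */
bool acquire_bounded(try_acquire_fn try_acquire, void *ctx,
                     unsigned int max_retries)
{
    for (unsigned int attempt = 0; ; attempt++) {
        if (try_acquire(ctx))
            return true;
        if (attempt == max_retries)
            return false;
    }
}

/* Example callback: fails until *remaining_failures reaches zero. */
bool fail_n_times(void *ctx)
{
    unsigned int *remaining_failures = ctx;

    if (*remaining_failures > 0) {
        (*remaining_failures)--;
        return false;
    }
    return true;
}
```

The point of the bound is that the caller never spins indefinitely while holding another lock; the retry budget is a tunable, as the commit message itself notes.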
      Reported-by: kernel test robot <oliver.sang@intel.com>
      Link: https://lore.kernel.org/lkml/20210301080404.GF12822@xsang-OptiPlex-9020/
      Cc: Jann Horn <jannh@google.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Acked-by: Christoph Lameter <cl@linux.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge tag 'for-linus-2021-03-10' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux · d3110f25
      Linus Torvalds authored
      Pull detached mounts fix from Christian Brauner:
       "Creating a series of detached mounts, attaching them to the
        filesystem, and unmounting them can be used to trigger an integer
        overflow in ns->mounts causing the kernel to block any new mounts in
        count_mounts() and returning ENOSPC because it falsely assumes that
        the maximum number of mounts in the mount namespace has been reached,
        i.e. it thinks it can't fit the new mounts into the mount namespace
        anymore.
      
        Without this fix, heavy use of the new mount API with move_mount() will
        cause the host to become unusable, which blocks some xfstest
        patches I want to resend.
      
        Depending on the number of mounts in your system, this can be
        reproduced on any kernel that supports open_tree() and move_mount().
      
        A reproducer has been sent for inclusion with xfstests. It takes care
        to do this in another mount namespace, not in the host's mount
        namespace so there shouldn't be any risk in running it but if one did
        run it on the host it would require a reboot in order to be able to
        mount again. See
      
            https://lore.kernel.org/fstests/20210309121041.753359-1-christian.brauner@ubuntu.com
      
        The root cause of this is that detached mounts aren't handled
        correctly when source and target mount are identical and reside on a
        shared mount. This results in a broken mount tree in which the
        detached source itself is propagated, something that propagation
        prevents for regular bind-mounts and new mounts.
      
        This ultimately leads to a miscalculation of the number of mounts in
        the mount namespace.
      
        Detached mounts created via 'open_tree(fd, path, OPEN_TREE_CLONE)' are
        essentially like an unattached bind-mount. They can then later on be
        attached to the filesystem via move_mount() which calls into
        attach_recursive_mount().
      
        Part of attaching it to the filesystem is making sure that mounts get
        correctly propagated in case the destination mountpoint is MS_SHARED,
        i.e. is a shared mountpoint. This is done by calling into
        propagate_mnt() which walks the list of peers calling propagate_one()
        on each mount in this list making sure it receives the propagation
        event. The propagate_one() function thereby skips both new mounts and
        bind mounts to not propagate them "into themselves". Both are
        identified by checking whether the mount is already attached to any
        mount namespace in mnt->mnt_ns. This is what the IS_MNT_NEW() helper is
        responsible for.
      
        However, detached mounts have an anonymous mount namespace attached to
        them stashed in mnt->mnt_ns which means that IS_MNT_NEW() doesn't
        realize they need to be skipped causing the mount to propagate "into
        itself" breaking the mount table and causing a disconnect between the
        number of mounts recorded as being beneath or reachable from the
        target mountpoint and the number of mounts actually recorded/counted
        in ns->mounts ultimately causing an overflow which in turn prevents
        any new mounts via the ENOSPC issue.
      
        So teach propagation to handle detached mounts by making it aware of
        them. I've been tracking this issue down for the last couple of days
        and then verifying that the fix is correct by unmounting everything in
        my current mount table leaving only /proc and /sys mounted and running
        the reproducer above overnight verifying the number of mounts counted
        in ns->mounts. With this fix the counts are correct and the ENOSPC
        issue can't be reproduced.
      
        This change will only have an effect on mounts created with the new
        mount API since detached mounts cannot be created with the old mount
        API so regressions are extremely unlikely.
      
        Here's an illustration:
      
          #### mount():
          ubuntu@f1-vm:~$ sudo mount --bind /mnt/ /mnt/
          ubuntu@f1-vm:~$ findmnt  | grep -i mnt
          ├─/mnt                                /dev/sda2[/mnt] ext4       rw,relatime
      
          #### open_tree(OPEN_TREE_CLONE) + move_mount() with bug:
          ubuntu@f1-vm:~$ sudo ./mount-new /mnt/ /mnt/
          ubuntu@f1-vm:~$ findmnt  | grep -i mnt
          ├─/mnt                                /dev/sda2[/mnt] ext4       rw,relatime
          │ └─/mnt                              /dev/sda2[/mnt] ext4       rw,relatime
      
          #### open_tree(OPEN_TREE_CLONE) + move_mount() with the fix:
          ubuntu@f1-vm:~$ sudo ./mount-new /mnt /mnt
          ubuntu@f1-vm:~$ findmnt | grep -i mnt
          └─/mnt                                /dev/sda2[/mnt] ext4       rw,relatime"
      
      * tag 'for-linus-2021-03-10' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
        mount: fix mounting of detached mounts onto targets that reside on shared mounts
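The IS_MNT_NEW() problem described in the pull message above can be modelled in a few lines. This is a toy sketch under stated assumptions: `toy_mnt_ns`, `toy_mount`, `toy_is_new_old` and `toy_is_new_fixed` are hypothetical, and the `anonymous` flag is an invented stand-in for however the kernel recognises the namespace of a detached mount.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy mount namespace; 'anonymous' marks the namespace of a detached mount. */
struct toy_mnt_ns {
    bool anonymous;
};

struct toy_mount {
    struct toy_mnt_ns *mnt_ns;  /* NULL when not attached to any namespace */
};

/* Check as described before the fix: only a NULL namespace counts as "new". */
bool toy_is_new_old(const struct toy_mount *m)
{
    return m->mnt_ns == NULL;
}

/* Detached-aware check: an anonymous namespace also counts as "new". */
bool toy_is_new_fixed(const struct toy_mount *m)
{
    return m->mnt_ns == NULL || m->mnt_ns->anonymous;
}
```

In this model a detached mount has a non-NULL but anonymous namespace, so the old check does not skip it during propagation while the detached-aware check does.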
    • Merge tag '5.12-rc2-smb3' of git://git.samba.org/sfrench/cifs-2.6 · d0df9aab
      Linus Torvalds authored
      Pull cifs fixes from Steve French:
       "Six cifs/smb3 fixes, three of them for stable, including some
        important multichannel crediting fixes, and a fix for statfs error
        handling"
      
      * tag '5.12-rc2-smb3' of git://git.samba.org/sfrench/cifs-2.6:
        cifs: do not send close in compound create+close requests
        cifs: return proper error code in statfs(2)
        cifs: change noisy error message to FYI
        cifs: print MIDs in decimal notation
        cifs: ask for more credit on async read/write code paths
        cifs: fix credit accounting for extra channel
    • Merge git://git.kernel.org:/pub/scm/linux/kernel/git/netdev/net · 05a59d79
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Fix transmissions in dynamic SMPS mode in ath9k, from Felix Fietkau.
      
       2) TX skb error handling fix in mt76 driver, also from Felix.
      
       3) Fix BPF_FETCH atomic in x86 JIT, from Brendan Jackman.
      
       4) Avoid double free of percpu pointers when freeing a cloned bpf prog.
          From Cong Wang.
      
       5) Use correct printf format for dma_addr_t in ath11k, from Geert
          Uytterhoeven.
      
       6) Fix resolve_btfids build with older toolchains, from Kun-Chuan
          Hsieh.
      
       7) Don't report truncated frames to mac80211 in mt76 driver, from
          Lorenzo Bianconi.

       8) Fix watchdog timeout on suspend/resume of stmmac, from Joakim Zhang.

       9) mscc ocelot needs NET_DEVLINK select in Kconfig, from Arnd Bergmann.
      
      10) Fix sign comparison bug in TCP_ZEROCOPY_RECEIVE getsockopt(), from
          Arjun Roy.
      
      11) Ignore routes with deleted nexthop object in mlxsw, from Ido
          Schimmel.
      
      12) Need to undo tcp early demux lookup sometimes in nf_nat, from
          Florian Westphal.
      
      13) Fix gro aggregation for udp encaps with zero csum, from Daniel
          Borkmann.
      
      14) Make sure to always use icmp*_ndo_send when necessary, from Jason A.
          Donenfeld.
      
      15) Fix TRSCER masks in sh_eth driver from Sergey Shtylyov.
      
      16) Prevent overly huge skb allocations in qrtr, from Pavel Skripkin.

      17) Prevent rx ring consumer index loss of sync in enetc, from Vladimir
          Oltean.

      18) Make sure textsearch control block is large enough, from Willem de
          Bruijn.

      19) Revert MAC changes to r8152 leading to instability, from Hayes Wang.

      20) Advance iov in 9p even for empty reads, from Jisheng Zhang.

      21) Double hook unregister in nftables, from Pablo Neira Ayuso.

      22) Fix memleak in ixgbe, from Dinghao Liu.
      
      23) Avoid dups in pkt scheduler class dumps, from Maximilian Heyne.
      
      24) Various mptcp fixes from Florian Westphal, Paolo Abeni, and Geliang
          Tang.
      
      25) Fix DOI refcount bugs in cipso, from Paul Moore.
      
      26) One too many irqsave in ibmvnic, from Junlin Yang.
      
      27) Fix infinite loop with MPLS gso segmenting via virtio_net, from
          Balazs Nemeth.
      
      * git://git.kernel.org:/pub/scm/linux/kernel/git/netdev/net: (164 commits)
        s390/qeth: fix notification for pending buffers during teardown
        s390/qeth: schedule TX NAPI on QAOB completion
        s390/qeth: improve completion of pending TX buffers
        s390/qeth: fix memory leak after failed TX Buffer allocation
        net: avoid infinite loop in mpls_gso_segment when mpls_hlen == 0
        net: check if protocol extracted by virtio_net_hdr_set_proto is correct
        net: dsa: xrs700x: check if partner is same as port in hsr join
        net: lapbether: Remove netif_start_queue / netif_stop_queue
        atm: idt77252: fix null-ptr-dereference
        atm: uPD98402: fix incorrect allocation
        atm: fix a typo in the struct description
        net: qrtr: fix error return code of qrtr_sendmsg()
        mptcp: fix length of ADD_ADDR with port sub-option
        net: bonding: fix error return code of bond_neigh_init()
        net: enetc: allow hardware timestamping on TX queues with tc-etf enabled
        net: enetc: set MAC RX FIFO to recommended value
        net: davicom: Use platform_get_irq_optional()
        net: davicom: Fix regulator not turned off on driver removal
        net: davicom: Fix regulator not turned off on failed probe
        net: dsa: fix switchdev objects on bridge master mistakenly being applied on ports
        ...
    • Merge git://git.kernel.org:/pub/scm/linux/kernel/git/davem/sparc · 6a30bedf
      Linus Torvalds authored
      Pull sparc fixes from David Miller:
       "Fix opcode filtering for exceptions, and clean up defconfig"
      
      * git://git.kernel.org:/pub/scm/linux/kernel/git/davem/sparc:
        sparc: sparc64_defconfig: remove duplicate CONFIGs
        sparc64: Fix opcode filtering in handling of no fault loads
    • sparc: sparc64_defconfig: remove duplicate CONFIGs · 69264b4a
      Corentin Labbe authored
      After my patch there is CONFIG_ATA defined twice.
      Remove the duplicate one.
      Same problem for CONFIG_HAPPYMEAL, except I added as builtin for boot
      test with NFS.
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Fixes: a57cdeb3 ("sparc: sparc64_defconfig: add necessary configs for qemu")
      Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sparc64: Fix opcode filtering in handling of no fault loads · e5e8b80d
      Rob Gardner authored
      is_no_fault_exception() has two bugs which were discovered via random
      opcode testing with stress-ng. Both are caused by improper filtering
      of opcodes.
      
      The first bug can be triggered by a floating point store with a no-fault
      ASI, for instance "sta %f0, [%g0] #ASI_PNF", opcode C1A01040.
      
      The code first tests op3[5] (0x1000000), which denotes a floating
      point instruction, and then tests op3[2] (0x200000), which denotes a
      store instruction. But these bits are not mutually exclusive, and the
      above mentioned opcode has both bits set. The intent is to filter out
      stores, so the test for stores must be done first in order to have
      any effect.
      
      The second bug can be triggered by a floating point load with one of
      the invalid ASI values 0x8e or 0x8f, which pass this check in
      is_no_fault_exception():
           if ((asi & 0xf2) == ASI_PNF)
      
      An example instruction is "ldqa [%l7 + %o7] #ASI 0x8f, %f38",
      opcode CF95D1EF. ASI values greater than 0x8b (ASI_SNFL) are fatal
      in handle_ldf_stq(), and is_no_fault_exception() must not allow these
      invalid ASI values to make it that far.
      
      In both of these cases, handle_ldf_stq() reacts by calling
      sun4v_data_access_exception() or spitfire_data_access_exception(),
      which call is_no_fault_exception() again, resulting in infinite
      recursion.
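Both filtering problems described above can be checked with plain bit arithmetic. The sketch below is not the kernel function: the helper names are hypothetical, the value 0x82 for ASI_PNF is an assumption not stated in the message, and the tighter 0xf6 mask is just one mask that rejects 0x8e and 0x8f while still accepting 0x82, 0x83, 0x8a and 0x8b.

```c
#include <stdbool.h>
#include <stdint.h>

#define ASI_PNF     0x82u       /* assumed value of the primary no-fault ASI */
#define OP3_FP_BIT  0x1000000u  /* op3[5]: floating point instruction */
#define OP3_ST_BIT  0x200000u   /* op3[2]: store instruction */

/* ASI field of an instruction using an immediate ASI (insn bits 12:5). */
uint32_t insn_asi(uint32_t insn)
{
    return (insn >> 5) & 0xffu;
}

/* Order described as buggy: the FP test runs first, so FP stores slip through. */
bool accepts_fp_first(uint32_t insn)
{
    if (insn & OP3_FP_BIT)
        return true;
    if (insn & OP3_ST_BIT)
        return false;
    return true;
}

/* Store test first: every store is filtered out, floating point or not. */
bool accepts_store_first(uint32_t insn)
{
    if (insn & OP3_ST_BIT)
        return false;
    return true;
}

/* The quoted check: also matches the invalid ASIs 0x8e and 0x8f. */
bool asi_check_loose(uint32_t asi)
{
    return (asi & 0xf2u) == ASI_PNF;
}

/* A tighter mask (assumption): additionally requires bit 2 to be clear. */
bool asi_check_tight(uint32_t asi)
{
    return (asi & 0xf6u) == ASI_PNF;
}
```

Feeding the two opcodes from the message through these helpers shows the effect: C1A01040 has both the floating point and the store bit set, and CF95D1EF carries ASI 0x8f, which the loose mask accepts.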
      Signed-off-by: Rob Gardner <rob.gardner@oracle.com>
      Tested-by: Anatoly Pugachev <matorola@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 's390-qeth-fixes' · 85154557
      David S. Miller authored
      Julian Wiedmann says:
      
      ====================
      s390/qeth: fixes 2021-03-09
      
      please apply the following patch series to netdev's net tree.
      
      This brings one fix for a memleak in an error path of the setup code.
      Also several fixes for dealing with pending TX buffers - two for old
      bugs in their completion handling, and one recent regression in a
      teardown path.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • s390/qeth: fix notification for pending buffers during teardown · 7eefda7f
      Julian Wiedmann authored
      The cited commit reworked the state machine for pending TX buffers.
      In qeth_iqd_tx_complete() it turned PENDING into a transient state, and
      uses NEED_QAOB for buffers that get parked while waiting for their QAOB
      completion.
      
      But it failed to adjust the check in qeth_tx_complete_buf(). So if
      qeth_tx_complete_pending_bufs() is called during teardown to drain
      the parked TX buffers, we no longer raise a notification for af_iucv.
      
      Instead of updating the checked state, just move this code into
      qeth_tx_complete_pending_bufs() itself. This also gets rid of the
      special-case in the common TX completion path.
      
      Fixes: 8908f36d ("s390/qeth: fix af_iucv notification race")
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • s390/qeth: schedule TX NAPI on QAOB completion · 3e83d467
      Julian Wiedmann authored
      When a QAOB notifies us that a pending TX buffer has been delivered, the
      actual TX completion processing by qeth_tx_complete_pending_bufs()
      is done within the context of a TX NAPI instance. We shouldn't rely on
      this instance being scheduled by some other TX event, but just do it
      ourselves.
      
      qeth_qdio_handle_aob() is called from qeth_poll(), i.e. our main NAPI
      instance. To avoid touching the TX queue's NAPI instance
      before/after it is (un-)registered, reorder the code in qeth_open()
      and qeth_stop() accordingly.
      
      Fixes: 0da9581d ("qeth: exploit asynchronous delivery of storage blocks")
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • s390/qeth: improve completion of pending TX buffers · c20383ad
      Julian Wiedmann authored
      The current design attaches a pending TX buffer to a custom
      single-linked list, which is anchored at the buffer's slot on the
      TX ring. The buffer is then checked for final completion whenever
      this slot is processed during a subsequent TX NAPI poll cycle.
      
      But if there's insufficient traffic on the ring, we might never make
      enough progress to get back to this ring slot and discover the pending
      buffer's final TX completion. This is a particular problem if the missing
      TX completion blocks the application from sending further traffic.
      
      So convert the custom single-linked list code to a per-queue list_head,
      and scan this list on every TX NAPI cycle.
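The idea of keeping pending buffers on a per-queue list and scanning it on every poll can be sketched as follows. This is a simplified model, not driver code: `pending_buf`, `toy_queue`, `queue_add_pending` and `queue_poll` are hypothetical names, and a singly linked list is used for brevity where the commit uses a kernel list_head.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy pending TX buffer; 'delivered' is set once final completion arrived. */
struct pending_buf {
    struct pending_buf *next;
    bool delivered;
};

/* Toy TX queue owning the list of all of its pending buffers. */
struct toy_queue {
    struct pending_buf *pending;
};

/* Park a buffer on the queue-wide pending list. */
void queue_add_pending(struct toy_queue *q, struct pending_buf *buf)
{
    buf->delivered = false;
    buf->next = q->pending;
    q->pending = buf;
}

/*
 * One poll cycle: scan the whole pending list, unlink every delivered
 * buffer, and return how many were completed. Because the list hangs off
 * the queue rather than a ring slot, progress does not depend on traffic.
 */
unsigned int queue_poll(struct toy_queue *q)
{
    unsigned int completed = 0;
    struct pending_buf **link = &q->pending;

    while (*link) {
        struct pending_buf *buf = *link;

        if (buf->delivered) {
            *link = buf->next;  /* unlink the completed buffer */
            buf->next = NULL;
            completed++;
        } else {
            link = &buf->next;
        }
    }
    return completed;
}
```

Each poll visits every parked buffer, so a completion is noticed on the next cycle even if the ring itself makes no progress.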
      
      Fixes: 0da9581d ("qeth: exploit asynchronous delivery of storage blocks")
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • s390/qeth: fix memory leak after failed TX Buffer allocation · e7a36d27
      Julian Wiedmann authored
      When qeth_alloc_qdio_queues() fails to allocate one of the buffers that
      back an Output Queue, the 'out_freeoutqbufs' path will free all
      previously allocated buffers for this queue. But it fails to free the
      half-finished queue struct itself.
      
      Move the buffer allocation into qeth_alloc_output_queue(), and deal with
      such errors internally.
      
      Fixes: 0da9581d ("qeth: exploit asynchronous delivery of storage blocks")
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'virtio_net-infinite-loop' · b005c9ef
      David S. Miller authored
      Balazs Nemeth says:
      
      ====================
      net: prevent infinite loop caused by incorrect proto from virtio_net_hdr_set_proto
      
      These patches prevent an infinite loop for gso packets with a protocol
      from virtio net hdr that doesn't match the protocol in the packet.
      Note that packets coming from a device without
      header_ops->parse_protocol being implemented will not be caught by
      the check in virtio_net_hdr_to_skb, but the infinite loop will still
      be prevented by the check in the gso layer.
      
      Changes from v2 to v3:
        - Remove unused *eth.
        - Use MPLS_HLEN to also check if the MPLS header length is a multiple
          of four.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: avoid infinite loop in mpls_gso_segment when mpls_hlen == 0 · d348ede3
      Balazs Nemeth authored
      A packet with skb_inner_network_header(skb) == skb_network_header(skb)
      and ETH_P_MPLS_UC will prevent mpls_gso_segment from pulling any headers
      from the packet. Subsequently, the call to skb_mac_gso_segment will
      again call mpls_gso_segment with the same packet leading to an infinite
      loop. In addition, ensure that the header length is a multiple of four,
      which should hold irrespective of the number of stacked labels.
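The two conditions described above (a non-zero header length that is also a multiple of four) reduce to a small predicate. A minimal sketch, assuming a 4-byte MPLS label stack entry; `TOY_MPLS_HLEN`, `toy_mpls_hlen` and `mpls_hlen_valid` are hypothetical names, not the kernel's.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MPLS_HLEN 4u  /* assumed size of one MPLS label stack entry */

/* Header length as the distance between the outer and inner network headers. */
size_t toy_mpls_hlen(size_t network_off, size_t inner_network_off)
{
    return inner_network_off - network_off;
}

/*
 * A zero length means no header would be pulled, so segmentation would be
 * re-entered with the same packet; a length that is not a multiple of four
 * cannot be a stack of whole labels.
 */
bool mpls_hlen_valid(size_t mpls_hlen)
{
    return mpls_hlen != 0 && mpls_hlen % TOY_MPLS_HLEN == 0;
}
```

A caller would reject the packet when the predicate is false instead of handing it back to the segmentation path.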
      Signed-off-by: Balazs Nemeth <bnemeth@redhat.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: check if protocol extracted by virtio_net_hdr_set_proto is correct · 924a9bc3
      Balazs Nemeth authored
      For gso packets, virtio_net_hdr_set_proto sets the protocol (if it isn't
      set) based on the type in the virtio net hdr, but the skb could contain
      anything since it could come from packet_snd through a raw socket. If
      there is a mismatch between what virtio_net_hdr_set_proto sets and
      the actual protocol, then the skb could be handled incorrectly later
      on.
      
      An example where this poses an issue is with the subsequent call to
      skb_flow_dissect_flow_keys_basic which relies on skb->protocol being set
      correctly. A specially crafted packet could fool
      skb_flow_dissect_flow_keys_basic, preventing EINVAL from being returned.
      
      Avoid blindly trusting the information provided by the virtio net header
      by checking that the protocol in the packet actually matches the
      protocol set by virtio_net_hdr_set_proto. Note that since the protocol
      is only checked if skb->dev implements header_ops->parse_protocol,
      packets from devices without the implementation are not checked at this
      stage.
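The check described above amounts to comparing the protocol claimed by the header with the protocol actually found in the frame, and skipping the comparison when the device offers no parser. The sketch below is a userspace model with hypothetical names (`parse_protocol_fn`, `check_hdr_proto`, `toy_eth_parse`); it assumes a plain Ethernet frame with the EtherType at bytes 12 and 13.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-device parser: returns the protocol found in the frame. */
typedef uint16_t (*parse_protocol_fn)(const uint8_t *frame, size_t len);

/*
 * Returns 0 when the claimed protocol is acceptable and -EINVAL on mismatch.
 * Without a parser the claim cannot be verified at this stage, so it passes.
 */
int check_hdr_proto(uint16_t hdr_proto, parse_protocol_fn parse,
                    const uint8_t *frame, size_t len)
{
    if (!parse)
        return 0;
    return parse(frame, len) == hdr_proto ? 0 : -EINVAL;
}

/* Example parser: EtherType of an Ethernet frame, 0 if the frame is too short. */
uint16_t toy_eth_parse(const uint8_t *frame, size_t len)
{
    if (len < 14)
        return 0;
    return (uint16_t)((frame[12] << 8) | frame[13]);
}
```

The NULL-parser branch mirrors the caveat in the message: frames from devices without a parser are not caught here and rely on later checks.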
      
      Fixes: 9274124f ("net: stricter validation of untrusted gso packets")
      Signed-off-by: Balazs Nemeth <bnemeth@redhat.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: xrs700x: check if partner is same as port in hsr join · 286a8624
      George McCollister authored
      Don't assign dp to partner if it's the same port that xrs700x_hsr_join
      was called with. The partner port is supposed to be the other port in
      the HSR/PRP redundant pair not the same port. This fixes an issue
      observed in testing where forwarding between redundant HSR ports on this
      switch didn't work depending on the order the ports were added to the
      hsr device.
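The partner lookup described above has to skip the port that is joining, otherwise a port can be picked as its own partner depending on iteration order. A minimal model, with hypothetical names (`toy_port`, `find_partner`) that do not reflect the driver's data structures:

```c
#include <stddef.h>

/* Toy switch port; hsr_dev identifies the HSR/PRP device it belongs to. */
struct toy_port {
    int index;
    const void *hsr_dev;  /* NULL when the port is not part of any HSR device */
};

/*
 * Find the other member of self's redundant pair. The joining port itself
 * is skipped explicitly, so the result never depends on the order in which
 * ports were added.
 */
const struct toy_port *find_partner(const struct toy_port *ports, size_t n,
                                    const struct toy_port *self)
{
    for (size_t i = 0; i < n; i++) {
        if (&ports[i] == self)
            continue;  /* never pair a port with itself */
        if (ports[i].hsr_dev == self->hsr_dev)
            return &ports[i];
    }
    return NULL;
}
```

Without the self check, a lookup for the first port in the array would return that same port, which matches the order-dependent failure the message describes.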
      
      Fixes: bd62e6f5 ("net: dsa: xrs700x: add HSR offloading support")
      Signed-off-by: George McCollister <george.mccollister@gmail.com>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 09 Mar, 2021 9 commits
    • Merge tag 'gpio-fixes-for-v5.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux · 4b3d9f9c
      Linus Torvalds authored
      Pull gpio fixes from Bartosz Golaszewski:
       "A bunch of fixes for the GPIO subsystem. We have two regressions in
        the core code spotted right after the merge window, a series of fixes
        for ACPI GPIO and a subsequent fix for a related regression in
        gpio-pca953x + a minor tweak in .gitignore and a rework of handling of
        the gpio-line-names to remedy a regression in stm32mp151.
      
        Summary:
      
         - fix two regressions in core GPIO subsystem code: one NULL-pointer
           dereference and one list corruption
      
         - read GPIO line names from fwnode instead of using the generic
           device properties to fix a regression on stm32mp151
      
         - fixes to ACPI GPIO and gpio-pca953x to handle a regression in IRQ
           handling on Intel Galileo
      
         - update .gitignore in GPIO selftests"
      
      * tag 'gpio-fixes-for-v5.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
        gpiolib: Read "gpio-line-names" from a firmware node
        gpio: pca953x: Set IRQ type when handle Intel Galileo Gen 2
        gpiolib: acpi: Allow to find GpioInt() resource by name and index
        gpiolib: acpi: Add ACPI_GPIO_QUIRK_ABSOLUTE_NUMBER quirk
        gpiolib: acpi: Add missing IRQF_ONESHOT
        gpio: fix gpio-device list corruption
        gpio: fix NULL-deref-on-deregistration regression
        selftests: gpio: update .gitignore
    • Merge tag 'mips-fixes_5.12_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux · 9c39198a
      Linus Torvalds authored
      Pull MIPS fixes from Thomas Bogendoerfer:
      
       - fixes for boot breakage because of misaligned FDTs
      
       - fix for overwritten exception handlers
      
       - enable MIPS optimized crypto for all MIPS CPUs to improve wireguard
         performance
      
      * tag 'mips-fixes_5.12_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
        MIPS: kernel: Reserve exception base early to prevent corruption
        MIPS: vmlinux.lds.S: align raw appended dtb to 8 bytes
        crypto: mips/poly1305 - enable for all MIPS processors
        MIPS: boot/compressed: Copy DTB to aligned address
    • net: lapbether: Remove netif_start_queue / netif_stop_queue · f7d9d485
      Xie He authored
      For the devices in this driver, the default qdisc is "noqueue",
      because their "tx_queue_len" is 0.
      
      In function "__dev_queue_xmit" in "net/core/dev.c", devices with the
      "noqueue" qdisc are specially handled. Packets are transmitted without
      being queued after a "dev->flags & IFF_UP" check. However, it's possible
      that even if this check succeeds, "ops->ndo_stop" may still have already
      been called. This is because in "__dev_close_many", "ops->ndo_stop" is
      called before clearing the "IFF_UP" flag.
      
      If we call "netif_stop_queue" in "ops->ndo_stop", then it's possible that
      "__dev_queue_xmit" sees the "IFF_UP" flag still set, then checks
      "netif_xmit_stopped" and finds that the queue is already stopped.
      In this case, it will complain that:
      "Virtual device ... asks to queue packet!"
      
      To prevent "__dev_queue_xmit" from generating this complaint, we should
      not call "netif_stop_queue" in "ops->ndo_stop".
      
      We also don't need to call "netif_start_queue" in "ops->ndo_open",
      because after a netdev is allocated and registered, the
      "__QUEUE_STATE_DRV_XOFF" flag is initially not set, so there is no need
      to call "netif_start_queue" to clear it.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Xie He <xie.he.0141@gmail.com>
      Acked-by: Martin Schiller <ms@dev.tdt.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
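The ordering problem described in the lapbether commit above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `model_dev`, `model_xmit`, `model_ndo_stop`, and `xmit_during_close` are hypothetical names invented for illustration, not kernel APIs, and the model reduces "__dev_queue_xmit" and "__dev_close_many" to the two checks the message mentions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct model_dev {
    bool iff_up;   /* models the "dev->flags & IFF_UP" check */
    bool drv_xoff; /* models the "__QUEUE_STATE_DRV_XOFF" flag */
};

enum xmit_result { XMIT_OK, XMIT_DEVICE_DOWN, XMIT_COMPLAINT };

/* Models the "noqueue" transmit path: IFF_UP check, then stopped-queue check. */
static enum xmit_result model_xmit(const struct model_dev *dev)
{
    assert(dev != NULL);
    if (!dev->iff_up)
        return XMIT_DEVICE_DOWN;
    if (dev->drv_xoff)
        return XMIT_COMPLAINT; /* "Virtual device ... asks to queue packet!" */
    return XMIT_OK;
}

/* Models "ops->ndo_stop"; the old driver behaviour stopped the queue here. */
static void model_ndo_stop(struct model_dev *dev, bool driver_stops_queue)
{
    if (driver_stops_queue)
        dev->drv_xoff = true;
}

/*
 * Models a transmit that lands between "ops->ndo_stop" and the clearing
 * of IFF_UP, which is the window the commit message describes.
 */
static enum xmit_result xmit_during_close(bool driver_stops_queue)
{
    /* A freshly registered device starts with the XOFF flag clear. */
    struct model_dev dev = { .iff_up = true, .drv_xoff = false };

    model_ndo_stop(&dev, driver_stops_queue);
    enum xmit_result result = model_xmit(&dev); /* IFF_UP is still set */
    dev.iff_up = false;                         /* cleared only afterwards */
    return result;
}
```

In this model, `xmit_during_close(true)` reaches the complaint branch, while `xmit_during_close(false)` transmits normally, which matches the reasoning for dropping the "netif_stop_queue" call.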
    • MIPS: kernel: Reserve exception base early to prevent corruption · bd67b711
      Thomas Bogendoerfer authored
      BMIPS is one of the few platforms that do change the exception base.
      After commit 2dcb3964 ("memblock: do not start bottom-up allocations
      with kernel_end") we started seeing BMIPS boards fail to boot with the
      built-in FDT being corrupted.
      
      Before the cited commit, early allocations would be in the [kernel_end,
      RAM_END] range, but after the commit they would be within [RAM_START +
      PAGE_SIZE, RAM_END].
      
      The custom exception base handler installed by bmips_ebase_setup() for
      BMIPS5000 CPUs ends up trampling on the memory region allocated by
      unflatten_and_copy_device_tree(), thus corrupting the FDT used by the
      kernel.
      
      To fix this, we need to perform an early reservation of the custom
      exception space. Additionally, we reserve the first 4k (1k for R3k) for
      either the normal exception vector space (legacy CPUs) or special
      vectors like cache exceptions.
      
      Huge thanks to Serge for analysing and proposing a solution to this
      issue.
      
      Fixes: 2dcb3964 ("memblock: do not start bottom-up allocations with kernel_end")
      Reported-by: Kamal Dasu <kdasu.kdev@gmail.com>
      Debugged-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
      Acked-by: Mike Rapoport <rppt@linux.ibm.com>
      Tested-by: Florian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
      Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
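The effect of reserving the exception space early, as described in the MIPS commit above, can be sketched with a toy bottom-up allocator. This is a user-space model under stated assumptions, not the memblock API: `model_memblock`, `model_reserve`, and `model_alloc` are hypothetical names, and the addresses used in the usage note are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RESERVED 8

/* Hypothetical model of a reservation list; not the kernel's memblock. */
struct region { uint64_t base, size; };

struct model_memblock {
    uint64_t ram_start, ram_end;
    struct region reserved[MAX_RESERVED];
    size_t nr_reserved;
};

/* True if [a, a + asz) and [b, b + bsz) share at least one byte. */
static bool overlaps(uint64_t a, uint64_t asz, uint64_t b, uint64_t bsz)
{
    return a < b + bsz && b < a + asz;
}

/* Record a region that later allocations must avoid. */
static void model_reserve(struct model_memblock *mb, uint64_t base, uint64_t size)
{
    assert(mb->nr_reserved < MAX_RESERVED);
    mb->reserved[mb->nr_reserved++] = (struct region){ base, size };
}

/*
 * Bottom-up first fit starting at `start`. Skips past every reserved
 * region it collides with, records the result as reserved, and returns
 * its base address, or 0 if nothing fits below ram_end.
 */
static uint64_t model_alloc(struct model_memblock *mb, uint64_t start, uint64_t size)
{
    uint64_t addr = start;
    bool moved;

    do {
        moved = false;
        for (size_t i = 0; i < mb->nr_reserved; i++) {
            const struct region *r = &mb->reserved[i];
            if (overlaps(addr, size, r->base, r->size)) {
                addr = r->base + r->size;
                moved = true;
            }
        }
    } while (moved);

    if (addr + size > mb->ram_end)
        return 0;
    model_reserve(mb, addr, size);
    return addr;
}
```

With nothing reserved, an allocation starting at RAM_START + PAGE_SIZE lands exactly where a relocated exception base at that (assumed) address would later be written. Reserving the first page and the exception base region beforehand pushes the allocation past both.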
    • Merge git://git.kernel.org:/pub/scm/linux/kernel/git/davem/sparc · 987a0874
      Linus Torvalds authored
      Pull sparc updates from David Miller:
       "Just some more random bits from Al, including a conversion over to
        generic extables"
      
      * git://git.kernel.org:/pub/scm/linux/kernel/git/davem/sparc:
        sparc32: take ->thread.flags out
        sparc32: get rid of fake_swapper_regs
        sparc64: get rid of fake_swapper_regs
        sparc32: switch to generic extables
        sparc32: switch copy_user.S away from range exception table entries
        sparc32: get rid of range exception table entries in checksum_32.S
        sparc32: switch __bzero() away from range exception table entries
        sparc32: kill lookup_fault()
        sparc32: don't bother with lookup_fault() in __bzero()
    • cifs: do not send close in compound create+close requests · 04ad69c3
      Paulo Alcantara authored
      In case of interrupted syscalls, prevent sending CLOSE commands for
      compound CREATE+CLOSE requests by introducing a
      CIFS_CP_CREATE_CLOSE_OP flag to indicate to lower layers that they
      should not send a CLOSE command for the MIDs corresponding to the
      compound CREATE+CLOSE request.
      
      A simple reproducer:
      
          #!/bin/bash
      
          mount //server/share /mnt -o username=foo,password=***
          tc qdisc add dev eth0 root netem delay 450ms
          stat -f /mnt &>/dev/null & pid=$!
          sleep 0.01
          kill $pid
          tc qdisc del dev eth0 root
          umount /mnt
      
      Before patch:
      
          ...
          6 0.256893470 192.168.122.2 → 192.168.122.15 SMB2 402 Create Request File: ;GetInfo Request FS_INFO/FileFsFullSizeInformation;Close Request
          7 0.257144491 192.168.122.15 → 192.168.122.2 SMB2 498 Create Response File: ;GetInfo Response;Close Response
          9 0.260798209 192.168.122.2 → 192.168.122.15 SMB2 146 Close Request File:
         10 0.260841089 192.168.122.15 → 192.168.122.2 SMB2 130 Close Response, Error: STATUS_FILE_CLOSED
      Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
      Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
      Reviewed-by: Aurelien Aptel <aaptel@suse.com>
      CC: <stable@vger.kernel.org>
      Signed-off-by: Steve French <stfrench@microsoft.com>
    • cifs: return proper error code in statfs(2) · 14302ee3
      Paulo Alcantara authored
      In cifs_statfs(), if server->ops->queryfs is not NULL, then we should
      use its return value rather than always returning 0.  Return the rc
      variable instead, as it is properly set to 0 when there is no
      server->ops->queryfs.
      Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
      Reviewed-by: Aurelien Aptel <aaptel@suse.com>
      Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
      CC: <stable@vger.kernel.org>
      Signed-off-by: Steve French <stfrench@microsoft.com>
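The statfs(2) change above comes down to propagating an optional callback's return value instead of discarding it. The following is a minimal sketch of that pattern, not the actual cifs_statfs() code: `model_server_ops`, `model_queryfs_fail`, `model_statfs_before`, and `model_statfs_after` are hypothetical names, and the -EIO failure is an assumed example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the server operations table. */
struct model_server_ops {
    int (*queryfs)(void); /* may be NULL when the dialect has no queryfs */
};

/* Example callback that fails, e.g. because the request was interrupted. */
static int model_queryfs_fail(void)
{
    return -EIO;
}

/* Behaviour described as the bug: the callback's result is dropped. */
static int model_statfs_before(const struct model_server_ops *ops)
{
    int rc = 0;

    assert(ops != NULL);
    if (ops->queryfs)
        rc = ops->queryfs();
    (void)rc; /* result ignored */
    return 0;
}

/* Behaviour described as the fix: rc stays 0 only if queryfs is absent. */
static int model_statfs_after(const struct model_server_ops *ops)
{
    int rc = 0;

    assert(ops != NULL);
    if (ops->queryfs)
        rc = ops->queryfs();
    return rc;
}
```

With a failing callback, the first variant reports success to the caller while the second returns the error; with no callback installed, both return 0.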
    • cifs: change noisy error message to FYI · e3d100ea
      Paulo Alcantara authored
      A customer has reported that their dmesg was being flooded by
      
        CIFS: VFS: \\server Cancelling wait for mid xxx cmd: a
        CIFS: VFS: \\server Cancelling wait for mid yyy cmd: b
        CIFS: VFS: \\server Cancelling wait for mid zzz cmd: c
      
      because some processes that were performing statfs(2) on the share had
      been interrupted due to their automount setup when certain users
      logged in and out.
      
      Change these messages to FYI, as they are mostly informative rather
      than errors.
      Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
      Reviewed-by: Aurelien Aptel <aaptel@suse.com>
      Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
    • cifs: print MIDs in decimal notation · bf1bc694
      Paulo Alcantara authored
      The MIDs are mostly printed as decimal, so let's make it consistent.
      Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
      Reviewed-by: Aurelien Aptel <aaptel@suse.com>
      Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
  4. 08 Mar, 2021 13 commits