1. 24 Sep, 2021 2 commits
    • mptcp: don't return sockets in foreign netns · ea1300b9
      Florian Westphal authored
      mptcp_token_get_sock() may return a mptcp socket that is in
      a different net namespace than the socket that received the token value.
      
      The mptcp syncookie code path had an explicit check for this;
      this patch moves the test into the mptcp_token_get_sock() function.
      
      Eventually token.c should be converted to pernet storage, but
      such a change is not suitable for the net tree.
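
      A minimal sketch of the namespace check described above, in plain
      userspace C. The names struct net_ns, struct msk and token_get_sock()
      are hypothetical stand-ins, not the kernel's types or the actual patch;
      the sketch only illustrates that a lookup in a table shared by all
      namespaces should reject a match owned by a different namespace.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a network namespace. */
struct net_ns {
	int id;
};

/* Hypothetical stand-in for an MPTCP socket registered under a token. */
struct msk {
	uint32_t token;
	const struct net_ns *net;
};

/*
 * Look up a socket by token in a table shared by all namespaces.
 * Tokens are assumed unique, so the first match decides: it is returned
 * only when it belongs to the caller's namespace, otherwise NULL.
 */
const struct msk *token_get_sock(const struct msk *table, size_t n,
				 const struct net_ns *net, uint32_t token)
{
	for (size_t i = 0; i < n; i++) {
		if (table[i].token != token)
			continue;
		return table[i].net == net ? &table[i] : NULL;
	}
	return NULL;
}
```

      Doing the comparison inside the lookup helper means every caller gets
      the check, rather than each call site having to remember it.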
      
      Fixes: 2c5ebd00 ("mptcp: refactor token container")
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: break out if skb_header_pointer returns NULL in sctp_rcv_ootb · f7e745f8
      Xin Long authored
      We should always check if skb_header_pointer's return is NULL before
      using it, otherwise it may cause null-ptr-deref, as syzbot reported:
      
        KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
        RIP: 0010:sctp_rcv_ootb net/sctp/input.c:705 [inline]
        RIP: 0010:sctp_rcv+0x1d84/0x3220 net/sctp/input.c:196
        Call Trace:
        <IRQ>
         sctp6_rcv+0x38/0x60 net/sctp/ipv6.c:1109
         ip6_protocol_deliver_rcu+0x2e9/0x1ca0 net/ipv6/ip6_input.c:422
         ip6_input_finish+0x62/0x170 net/ipv6/ip6_input.c:463
         NF_HOOK include/linux/netfilter.h:307 [inline]
         NF_HOOK include/linux/netfilter.h:301 [inline]
         ip6_input+0x9c/0xd0 net/ipv6/ip6_input.c:472
         dst_input include/net/dst.h:460 [inline]
         ip6_rcv_finish net/ipv6/ip6_input.c:76 [inline]
         NF_HOOK include/linux/netfilter.h:307 [inline]
         NF_HOOK include/linux/netfilter.h:301 [inline]
         ipv6_rcv+0x28c/0x3c0 net/ipv6/ip6_input.c:297
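
      The pattern at issue can be sketched in userspace C. Everything below
      is a hypothetical illustration, not the sctp code: hdr_ptr() only
      mimics the contract of skb_header_pointer() (return NULL when the
      requested bytes are not available), and count_chunks() shows a chunk
      walk that breaks out of the loop on NULL instead of dereferencing it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified chunk header: type, flags, 16-bit big-endian length. */
struct chunk_hdr {
	uint8_t type;
	uint8_t flags;
	uint8_t len_be[2];
};

/*
 * Copy `want` bytes at offset `off` into `scratch` and return it,
 * or return NULL when the buffer does not hold that many bytes.
 */
const void *hdr_ptr(const uint8_t *buf, size_t buf_len, size_t off,
		    size_t want, void *scratch)
{
	if (off > buf_len || want > buf_len - off)
		return NULL;
	memcpy(scratch, buf + off, want);
	return scratch;
}

/* Count the complete chunk headers in buf, stopping at the first bad one. */
int count_chunks(const uint8_t *buf, size_t buf_len)
{
	size_t off = 0;
	int n = 0;

	while (off < buf_len) {
		struct chunk_hdr tmp;
		const struct chunk_hdr *ch;
		size_t len;

		ch = hdr_ptr(buf, buf_len, off, sizeof(tmp), &tmp);
		if (!ch)		/* truncated header: break out */
			break;

		len = ((size_t)ch->len_be[0] << 8) | ch->len_be[1];
		if (len < sizeof(*ch))	/* malformed length: stop walking */
			break;

		n++;
		off += (len + 3) & ~(size_t)3;	/* chunks are 4-byte padded */
	}
	return n;
}
```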
      
      Fixes: 3acb50c1 ("sctp: delay as much as possible skb_linearize")
      Reported-by: syzbot+581aff2ae6b860625116@syzkaller.appspotmail.com
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 23 Sep, 2021 9 commits
    • Merge tag 'net-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 9bc62afe
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Current release - regressions:
      
         - dsa: bcm_sf2: fix array overrun in bcm_sf2_num_active_ports()
      
        Previous releases - regressions:
      
         - introduce a shutdown method to mdio device drivers, and make DSA
           switch drivers compatible with masters disappearing on shutdown;
           preventing infinite reference wait
      
         - fix issues in mdiobus users related to ->shutdown vs ->remove
      
         - virtio-net: fix pages leaking when building skb in big mode
      
         - xen-netback: correct success/error reporting for the
           SKB-with-fraglist
      
         - dsa: tear down devlink port regions when tearing down the devlink
           port on error
      
         - nexthop: fix division by zero while replacing a resilient group
      
         - hns3: check queue, vf, vlan ids range before using
      
        Previous releases - always broken:
      
         - napi: fix race against netpoll causing NAPI getting stuck
      
         - mlx4_en: ensure link operstate is updated even if link comes up
           before netdev registration
      
         - bnxt_en: fix TX timeout when TX ring size is set to the smallest
      
         - enetc: fix illegal access when reading affinity_hint; prevent oops
           on sysfs access
      
         - mtk_eth_soc: avoid creating duplicate offload entries
      
        Misc:
      
         - core: correct the sock::sk_lock.owned lockdep annotations"
      
      * tag 'net-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (51 commits)
        atlantic: Fix issue in the pm resume flow.
        net/mlx4_en: Don't allow aRFS for encapsulated packets
        net: mscc: ocelot: fix forwarding from BLOCKING ports remaining enabled
        net: ethernet: mtk_eth_soc: avoid creating duplicate offload entries
        nfc: st-nci: Add SPI ID matching DT compatible
        MAINTAINERS: remove Guvenc Gulce as net/smc maintainer
        nexthop: Fix memory leaks in nexthop notification chain listeners
        mptcp: ensure tx skbs always have the MPTCP ext
        qed: rdma - don't wait for resources under hw error recovery flow
        s390/qeth: fix deadlock during failing recovery
        s390/qeth: Fix deadlock in remove_discipline
        s390/qeth: fix NULL deref in qeth_clear_working_pool_list()
        net: dsa: realtek: register the MDIO bus under devres
        net: dsa: don't allocate the slave_mii_bus using devres
        Doc: networking: Fox a typo in ice.rst
        net: dsa: fix dsa_tree_setup error path
        net/smc: fix 'workqueue leaked lock' in smc_conn_abort_work
        net/smc: add missing error check in smc_clc_prfx_set()
        net: hns3: fix a return value error in hclge_get_reset_status()
        net: hns3: check vlan id before using it
        ...
    • memcg: flush lruvec stats in the refault · 1f828223
      Shakeel Butt authored
      Prior to the commit 7e1c0d6f ("memcg: switch lruvec stats to rstat")
      and the commit aa48e47e ("memcg: infrastructure to flush memcg
      stats"), each lruvec memcg stats can be off by (nr_cgroups * nr_cpus *
      32) at worst and for unbounded amount of time.  The commit aa48e47e
      moved the lruvec stats to rstat infrastructure and the commit
      7e1c0d6f bounded the error for all the lruvec stats to (nr_cpus *
      32) at worst for at most 2 seconds.  More specifically it decoupled the
      number of stats and the number of cgroups from the error rate.
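
      As a worked example of the two bounds quoted above, the following
      sketch evaluates them for sample values. The function names and the
      STAT_BATCH macro are hypothetical; only the formulas
      (nr_cgroups * nr_cpus * 32) and (nr_cpus * 32) come from the text.

```c
#include <stdint.h>

/* Per-CPU batching threshold quoted in the commit message. */
#define STAT_BATCH 32

/* Worst-case error per lruvec stat before the rstat conversion. */
uint64_t worst_error_before(uint64_t nr_cgroups, uint64_t nr_cpus)
{
	return nr_cgroups * nr_cpus * STAT_BATCH;
}

/* Worst-case error afterwards: independent of the number of cgroups. */
uint64_t worst_error_after(uint64_t nr_cpus)
{
	return nr_cpus * STAT_BATCH;
}
```

      For instance, with 100 cgroups on a 64-CPU machine the old bound is
      204800 while the new bound is 2048.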
      
      However this reduction in error comes with the cost of triggering the
      slowpath of stats update more frequently.  Previously in the slowpath
      the kernel adds the stats up the memcg tree.  After aa48e47e, the
      kernel triggers the async lruvec stats flush through queue_work().  This
      causes regression reports from 0day kernel bot [1] as well as from
      phoronix test suite [2].
      
      We tried two options to fix the regression:
      
       1) Increase the threshold to trigger the slowpath in lruvec stats
          update codepath from 32 to 512.
      
       2) Remove the slowpath from lruvec stats update codepath and instead
          flush the stats in the page refault codepath. The assumption is that
          the kernel flushes the stats in a timely manner, so the update tree
          would be small enough in the refault codepath not to cause a
          performance impact.
      
      Following are the results of will-it-scale/page_fault[1|2|3] benchmark
      on four settings i.e.  (1) 5.15-rc1 as baseline (2) 5.15-rc1 with
      aa48e47e and 7e1c0d6f reverted (3) 5.15-rc1 with option-1
      (4) 5.15-rc1 with option-2.
      
        test       (1)      (2)               (3)               (4)
        pg_f1   368563   406277 (10.23%)   399693  (8.44%)   416398 (12.97%)
        pg_f2   338399   372133  (9.96%)   369180  (9.09%)   381024 (12.59%)
        pg_f3   500853   575399 (14.88%)   570388 (13.88%)   576083 (15.02%)
      
      From the above results, it seems that option-2 not only solves the
      regression but also improves the performance for at least these
      benchmarks.
      
      Feng Tang (intel) ran the aim7 benchmark with these two options and
      confirms that option-1 reduces the regression but option-2 removes the
      regression.
      
      Michael Larabel (phoronix) ran multiple benchmarks with these options
      and reported the results at [3] and it shows for most benchmarks
      option-2 removes the regression introduced by the commit aa48e47e
      ("memcg: infrastructure to flush memcg stats").
      
      Based on the experiment results, this patch proposes option-2 as the
      solution to resolve the regression.
      
      Link: https://lore.kernel.org/all/20210726022421.GB21872@xsang-OptiPlex-9020 [1]
      Link: https://www.phoronix.com/scan.php?page=article&item=linux515-compile-regress [2]
      Link: https://openbenchmarking.org/result/2109226-DEBU-LINUX5104 [3]
      Fixes: aa48e47e ("memcg: infrastructure to flush memcg stats")
      Signed-off-by: Shakeel Butt <shakeelb@google.com>
      Tested-by: Michael Larabel <Michael@phoronix.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Roman Gushchin <guro@fb.com>
      Cc: Feng Tang <feng.tang@intel.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Hillf Danton <hdanton@sina.com>
      Cc: Michal Koutný <mkoutny@suse.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • atlantic: Fix issue in the pm resume flow. · 4d88c339
      Sudarsana Reddy Kalluru authored
      After fixing the hibernation resume flow, another use case was found
      which should be explicitly handled: resume when the device is in the
      "down" state. Invoke aq_nic_init jointly with aq_nic_start only if ndev
      was already up during suspend/hibernate. We still need to perform
      nic_deinit() if the caller requests it, to handle the freeze/resume
      scenarios.
      
      Fixes: 57f780f1 ("atlantic: Fix driver resume flow.")
      Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
      Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx4_en: Don't allow aRFS for encapsulated packets · fdbccea4
      Aya Levin authored
      The driver doesn't support aRFS for encapsulated packets; return an
      error early in such a case.
      
      Fixes: 1eb8c695 ("net/mlx4_en: Add accelerated RFS support")
      Signed-off-by: Aya Levin <ayal@nvidia.com>
      Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mscc: ocelot: fix forwarding from BLOCKING ports remaining enabled · acc64f52
      Vladimir Oltean authored
      The blamed commit made the fatally incorrect assumption that ports which
      aren't in the FORWARDING STP state should not have packets forwarded
      towards them, and that is all that needs to be done.
      
      However, that logic alone permits BLOCKING ports to forward to
      FORWARDING ports, which of course allows packet storms to occur when
      there is an L2 loop.
      
      The ocelot_get_bridge_fwd_mask should not only ask "what can the bridge
      do for you", but "what can you do for the bridge". This way, only
      FORWARDING ports forward to the other FORWARDING ports from the same
      bridging domain, and we are still compatible with the idea of multiple
      bridges.
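
      The rule described above can be sketched as a small mask computation.
      This is a hypothetical userspace model, not the ocelot driver code:
      struct port, the bridge ids and bridge_fwd_mask() are invented for
      illustration, and at most 32 ports are assumed. A port only gets a
      non-empty forwarding mask if it is itself FORWARDING, and the mask
      only contains other FORWARDING ports of the same bridge.

```c
#include <stdint.h>

enum stp_state { STP_BLOCKING, STP_FORWARDING };

/* Hypothetical per-port state: bridge id (-1 = standalone) and STP state. */
struct port {
	int bridge;
	enum stp_state state;
};

/* Bitmask of ports that traffic received on `src` may be forwarded to. */
uint32_t bridge_fwd_mask(const struct port *ports, int nports, int src)
{
	uint32_t mask = 0;

	/* "What can you do for the bridge": the source must be FORWARDING. */
	if (ports[src].bridge < 0 || ports[src].state != STP_FORWARDING)
		return 0;

	for (int p = 0; p < nports; p++) {
		if (p == src)
			continue;
		/* "What can the bridge do for you": same bridge, FORWARDING. */
		if (ports[p].bridge == ports[src].bridge &&
		    ports[p].state == STP_FORWARDING)
			mask |= UINT32_C(1) << p;
	}
	return mask;
}
```

      Without the first check, a BLOCKING port would still receive a mask
      containing the FORWARDING ports of its bridge, which is the loop-prone
      behavior the message describes.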
      
      Fixes: df291e54 ("net: ocelot: support multiple bridges")
      Suggested-by: Colin Foster <colin.foster@in-advantage.com>
      Reported-by: Colin Foster <colin.foster@in-advantage.com>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: mtk_eth_soc: avoid creating duplicate offload entries · e68daf61
      Felix Fietkau authored
      Sometimes multiple CLS_REPLACE calls are issued for the same connection.
      rhashtable_insert_fast does not check for these duplicates, so multiple
      hardware flow entries can be created.
      Fix this by checking for an existing entry early.
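
      A minimal sketch of the "look up before insert" idea, using a tiny
      array-backed table in userspace C. The struct flow_table type and the
      flow_lookup()/flow_replace() helpers are hypothetical; the linear scan
      merely stands in for a hash-table lookup and is not the driver's code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_FLOWS 8

/* Hypothetical flow table keyed by a per-connection cookie. */
struct flow_table {
	uint64_t cookie[MAX_FLOWS];
	size_t n;
};

bool flow_lookup(const struct flow_table *t, uint64_t cookie)
{
	for (size_t i = 0; i < t->n; i++)
		if (t->cookie[i] == cookie)
			return true;
	return false;
}

/*
 * Handle a replace request: refuse early if the flow is already present,
 * so that a repeated request cannot create a second entry.
 */
int flow_replace(struct flow_table *t, uint64_t cookie)
{
	if (flow_lookup(t, cookie))
		return -EEXIST;
	if (t->n == MAX_FLOWS)
		return -ENOSPC;
	t->cookie[t->n++] = cookie;
	return 0;
}
```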
      
      Fixes: 502e84e2 ("net: ethernet: mtk_eth_soc: add flow offloading support")
      Signed-off-by: Felix Fietkau <nbd@nbd.name>
      Signed-off-by: Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • nfc: st-nci: Add SPI ID matching DT compatible · 31339440
      Mark Brown authored
      Currently autoloading for SPI devices does not use the DT ID table; it
      uses SPI modaliases. Supporting OF modaliases is going to be difficult
      if not impractical: an attempt was made but has been reverted. So ensure
      that module autoloading works for this driver by adding the part name
      used in the compatible to the list of SPI IDs.
      
      Fixes: 96c8395e ("spi: Revert modalias changes")
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • MAINTAINERS: remove Guvenc Gulce as net/smc maintainer · 5b099870
      Guvenc Gulce authored
      Remove myself as net/smc maintainer, as I am
      leaving IBM soon and can not maintain net/smc anymore.
      
      Cc: Julian Wiedmann <jwi@linux.ibm.com>
      Acked-by: Karsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: Guvenc Gulce <guvenc@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • nexthop: Fix memory leaks in nexthop notification chain listeners · 3106a084
      Ido Schimmel authored
      syzkaller discovered memory leaks [1] that can be reduced to the
      following commands:
      
       # ip nexthop add id 1 blackhole
       # devlink dev reload pci/0000:06:00.0
      
      As part of the reload flow, mlxsw will unregister its netdevs and then
      unregister from the nexthop notification chain. Before unregistering
      from the notification chain, mlxsw will receive delete notifications for
      nexthop objects using netdevs registered by mlxsw or their uppers. mlxsw
      will not receive notifications for nexthops using netdevs that are not
      dismantled as part of the reload flow. For example, the blackhole
      nexthop above that internally uses the loopback netdev as its nexthop
      device.
      
      One way to fix this problem is to have listeners flush their nexthop
      tables after unregistering from the notification chain. This is
      error-prone as evident by this patch and also not symmetric with the
      registration path where a listener receives a dump of all the existing
      nexthops.
      
      Therefore, fix this problem by replaying delete notifications for the
      listener being unregistered. This is symmetric to the registration path
      and also consistent with the netdev notification chain.
      
      The above means that unregister_nexthop_notifier(), like
      register_nexthop_notifier(), will have to take RTNL in order to iterate
      over the existing nexthops and that any callers of the function cannot
      hold RTNL. This is true for mlxsw and netdevsim, but not for the VXLAN
      driver. To avoid a deadlock, change the latter to unregister its nexthop
      listener without holding RTNL, making it symmetric to the registration
      path.
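
      The symmetry argument above can be modeled in a few lines of userspace
      C. All names here (struct nh_table, struct listener, nh_register(),
      nh_unregister(), nh_add()) are hypothetical and there is no locking;
      the sketch only shows that replaying delete events on unregistration
      lets the listener release everything it allocated on add events.

```c
#include <stddef.h>

enum nh_event { NH_EVENT_ADD, NH_EVENT_DEL };

/* Hypothetical listener: counts the per-nexthop objects it has allocated. */
struct listener {
	int tracked;
};

/* Hypothetical nexthop table with at most one listener. */
struct nh_table {
	size_t nr_nexthops;
	struct listener *listener;
};

void listener_notify(struct listener *l, enum nh_event ev)
{
	if (ev == NH_EVENT_ADD)
		l->tracked++;
	else if (l->tracked > 0)
		l->tracked--;
}

/* Registration replays an add event for every existing nexthop. */
void nh_register(struct nh_table *t, struct listener *l)
{
	t->listener = l;
	for (size_t i = 0; i < t->nr_nexthops; i++)
		listener_notify(l, NH_EVENT_ADD);
}

/* Unregistration mirrors it: replay a delete for every existing nexthop. */
void nh_unregister(struct nh_table *t)
{
	struct listener *l = t->listener;

	if (!l)
		return;
	t->listener = NULL;
	for (size_t i = 0; i < t->nr_nexthops; i++)
		listener_notify(l, NH_EVENT_DEL);
}

void nh_add(struct nh_table *t)
{
	t->nr_nexthops++;
	if (t->listener)
		listener_notify(t->listener, NH_EVENT_ADD);
}
```

      In this model a listener that registers, sees some nexthops, and then
      unregisters always ends with tracked == 0, regardless of which netdev
      the nexthops use.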
      
      [1]
      unreferenced object 0xffff88806173d600 (size 512):
        comm "syz-executor.0", pid 1290, jiffies 4295583142 (age 143.507s)
        hex dump (first 32 bytes):
          41 9d 1e 60 80 88 ff ff 08 d6 73 61 80 88 ff ff  A..`......sa....
          08 d6 73 61 80 88 ff ff 01 00 00 00 00 00 00 00  ..sa............
        backtrace:
          [<ffffffff81a6b576>] kmemleak_alloc_recursive include/linux/kmemleak.h:43 [inline]
          [<ffffffff81a6b576>] slab_post_alloc_hook+0x96/0x490 mm/slab.h:522
          [<ffffffff81a716d3>] slab_alloc_node mm/slub.c:3206 [inline]
          [<ffffffff81a716d3>] slab_alloc mm/slub.c:3214 [inline]
          [<ffffffff81a716d3>] kmem_cache_alloc_trace+0x163/0x370 mm/slub.c:3231
          [<ffffffff82e8681a>] kmalloc include/linux/slab.h:591 [inline]
          [<ffffffff82e8681a>] kzalloc include/linux/slab.h:721 [inline]
          [<ffffffff82e8681a>] mlxsw_sp_nexthop_obj_group_create drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:4918 [inline]
          [<ffffffff82e8681a>] mlxsw_sp_nexthop_obj_new drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:5054 [inline]
          [<ffffffff82e8681a>] mlxsw_sp_nexthop_obj_event+0x59a/0x2910 drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:5239
          [<ffffffff813ef67d>] notifier_call_chain+0xbd/0x210 kernel/notifier.c:83
          [<ffffffff813f0662>] blocking_notifier_call_chain kernel/notifier.c:318 [inline]
          [<ffffffff813f0662>] blocking_notifier_call_chain+0x72/0xa0 kernel/notifier.c:306
          [<ffffffff8384b9c6>] call_nexthop_notifiers+0x156/0x310 net/ipv4/nexthop.c:244
          [<ffffffff83852bd8>] insert_nexthop net/ipv4/nexthop.c:2336 [inline]
          [<ffffffff83852bd8>] nexthop_add net/ipv4/nexthop.c:2644 [inline]
          [<ffffffff83852bd8>] rtm_new_nexthop+0x14e8/0x4d10 net/ipv4/nexthop.c:2913
          [<ffffffff833e9a78>] rtnetlink_rcv_msg+0x448/0xbf0 net/core/rtnetlink.c:5572
          [<ffffffff83608703>] netlink_rcv_skb+0x173/0x480 net/netlink/af_netlink.c:2504
          [<ffffffff833de032>] rtnetlink_rcv+0x22/0x30 net/core/rtnetlink.c:5590
          [<ffffffff836069de>] netlink_unicast_kernel net/netlink/af_netlink.c:1314 [inline]
          [<ffffffff836069de>] netlink_unicast+0x5ae/0x7f0 net/netlink/af_netlink.c:1340
          [<ffffffff83607501>] netlink_sendmsg+0x8e1/0xe30 net/netlink/af_netlink.c:1929
          [<ffffffff832fde84>] sock_sendmsg_nosec net/socket.c:704 [inline]
          [<ffffffff832fde84>] sock_sendmsg net/socket.c:724 [inline]
          [<ffffffff832fde84>] ____sys_sendmsg+0x874/0x9f0 net/socket.c:2409
          [<ffffffff83304a44>] ___sys_sendmsg+0x104/0x170 net/socket.c:2463
          [<ffffffff83304c01>] __sys_sendmsg+0x111/0x1f0 net/socket.c:2492
          [<ffffffff83304d5d>] __do_sys_sendmsg net/socket.c:2501 [inline]
          [<ffffffff83304d5d>] __se_sys_sendmsg net/socket.c:2499 [inline]
          [<ffffffff83304d5d>] __x64_sys_sendmsg+0x7d/0xc0 net/socket.c:2499
      
      Fixes: 2a014b20 ("mlxsw: spectrum_router: Add support for nexthop objects")
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: Petr Machata <petrm@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 22 Sep, 2021 14 commits
    • init: Revert accidental changes to print irqs_disabled() · 58e2cf5d
      Geert Uytterhoeven authored
      Commit f8ade8dd ("xsurf100: drop include of lib8390.c") accidentally
      changed init/main.c.  Revert that part.
      
      Fixes: f8ade8dd ("xsurf100: drop include of lib8390.c")
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • MAINTAINERS: Update Xen-[PCI,SWIOTLB,Block] maintainership · 40575257
      Konrad Rzeszutek Wilk authored
      Konrad's new job role is putting a serious cramp on him
      being a responsive maintainer and as such he is handing off
      the reins to Juergen, Roger, and Stefano.
      
      Thank you!
      Acked-by: Juergen Gross <jgross@suse.com>
      Acked-by: Roger Pau Monné <roger.pau@citrix.com>
      Acked-by: Stefano Stabellini <sstabellini@kernel.org>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • MAINTAINERS: Update SWIOTLB maintainership · 2e36a964
      Konrad Rzeszutek Wilk authored
      Konrad's new job role is putting a serious cramp on him
      being a responsive maintainer and as such he is handing off
      the reins to Christoph Hellwig.
      
      Thank you!
      Acked-by: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • MAINTAINERS: update entry for NIOS2 · c4aa1eeb
      Dinh Nguyen authored
      Ley Foon has left Intel and will no longer be able to maintain NIOS2.
      Update the MAINTAINERS entry to Dinh Nguyen.
      Acked-by: Ley Foon Tan <ley.foon.tan@intel.com>
      Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge tag 'spi-fix-v5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi · 9bedf10b
      Linus Torvalds authored
      Pull spi modalias fix from Mark Brown:
       "Fix modalias issues
      
        As reported by Russell King, the change to use OF style modaliases for
        DT enumerated devices broke at least the spi-nor driver; the patch here
        reverts that change to fix the regression.
      
        Sadly this will mean that anything that started loading since the
        change to OF modaliases will run into issues, there doesn't seem to be
        any approach which doesn't cause some problems and this seems like the
        least bad approach - gory details are in the commit log for the
        change.
      
        I'm currently working through the SPI drivers to add ID tables and
        missing IDs to tables which should address things from the other end,
        this seems more straightforward and robust than any other options"
      
      * tag 'spi-fix-v5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
        spi: Revert modalias changes
    • Merge tag 'nfsd-5.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux · cf1d2c3e
      Linus Torvalds authored
      Pull nfsd fixes from Chuck Lever:
       "Critical bug fixes:
      
         - Fix crash in NLM TEST procedure
      
         - NFSv4.1+ backchannel not restored after PATH_DOWN"
      
      * tag 'nfsd-5.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
        nfsd: back channel stuck in SEQ4_STATUS_CB_PATH_DOWN
        NLM: Fix svcxdr_encode_owner()
    • Merge tag 'platform-drivers-x86-v5.15-2' of... · bee42512
      Linus Torvalds authored
      Merge tag 'platform-drivers-x86-v5.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
      
      Pull x86 platform driver fixes from Hans de Goede:
       "The first round of bug-fixes for platform-drivers-x86 for 5.15,
        highlights:
      
         - amd-pmc fix for some suspend/resume issues
      
         - intel-hid fix to avoid false-positive SW_TABLET_MODE=1 reporting
      
         - some build error/warning fixes
      
         - various DMI quirk additions"
      
      * tag 'platform-drivers-x86-v5.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
        platform/x86: gigabyte-wmi: add support for B550I Aorus Pro AX
        platform/x86/intel: hid: Add DMI switches allow list
        platform/x86: dell: fix DELL_WMI_PRIVACY dependencies & build error
        platform/x86: amd-pmc: Increase the response register timeout
        platform/x86: touchscreen_dmi: Update info for the Chuwi Hi10 Plus (CWI527) tablet
        platform/x86: touchscreen_dmi: Add info for the Chuwi HiBook (CWI514) tablet
        lg-laptop: Correctly handle dmi_get_system_info() returning NULL
        platform/x86/intel: punit_ipc: Drop wrong use of ACPI_PTR()
    • MAINTAINERS: ARM/VT8500, remove defunct e-mail · 8f1b7ba5
      Jiri Slaby authored
      linux@prisktech.co.nz is defunct:
      
        4.1.2 <linux@prisktech.co.nz>: Recipient address rejected: Domain not found
      
      Remove it from MAINTAINERS and mark the ARM/VT8500 entry orphan.
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mptcp: ensure tx skbs always have the MPTCP ext · 977d293e
      Paolo Abeni authored
      Due to signed/unsigned comparison, the expression:
      
      	info->size_goal - skb->len > 0
      
      evaluates to true when the size goal is smaller than the
      skb size. That results in lack of tx cache refill, so that
      the skb allocated by the core TCP code lacks the required
      MPTCP skb extensions.
      
      Due to the above, syzbot is able to trigger the following WARN_ON():
      
      WARNING: CPU: 1 PID: 810 at net/mptcp/protocol.c:1366 mptcp_sendmsg_frag+0x1362/0x1bc0 net/mptcp/protocol.c:1366
      Modules linked in:
      CPU: 1 PID: 810 Comm: syz-executor.4 Not tainted 5.14.0-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:mptcp_sendmsg_frag+0x1362/0x1bc0 net/mptcp/protocol.c:1366
      Code: ff 4c 8b 74 24 50 48 8b 5c 24 58 e9 0f fb ff ff e8 13 44 8b f8 4c 89 e7 45 31 ed e8 98 57 2e fe e9 81 f4 ff ff e8 fe 43 8b f8 <0f> 0b 41 bd ea ff ff ff e9 6f f4 ff ff 4c 89 e7 e8 b9 8e d2 f8 e9
      RSP: 0018:ffffc9000531f6a0 EFLAGS: 00010216
      RAX: 000000000000697f RBX: 0000000000000000 RCX: ffffc90012107000
      RDX: 0000000000040000 RSI: ffffffff88eac9e2 RDI: 0000000000000003
      RBP: ffff888078b15780 R08: 0000000000000000 R09: 0000000000000000
      R10: ffffffff88eac017 R11: 0000000000000000 R12: ffff88801de0a280
      R13: 0000000000006b58 R14: ffff888066278280 R15: ffff88803c2fe9c0
      FS:  00007fd9f866e700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007faebcb2f718 CR3: 00000000267cb000 CR4: 00000000001506e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       __mptcp_push_pending+0x1fb/0x6b0 net/mptcp/protocol.c:1547
       mptcp_release_cb+0xfe/0x210 net/mptcp/protocol.c:3003
       release_sock+0xb4/0x1b0 net/core/sock.c:3206
       sk_stream_wait_memory+0x604/0xed0 net/core/stream.c:145
       mptcp_sendmsg+0xc39/0x1bc0 net/mptcp/protocol.c:1749
       inet6_sendmsg+0x99/0xe0 net/ipv6/af_inet6.c:643
       sock_sendmsg_nosec net/socket.c:704 [inline]
       sock_sendmsg+0xcf/0x120 net/socket.c:724
       sock_write_iter+0x2a0/0x3e0 net/socket.c:1057
       call_write_iter include/linux/fs.h:2163 [inline]
       new_sync_write+0x40b/0x640 fs/read_write.c:507
       vfs_write+0x7cf/0xae0 fs/read_write.c:594
       ksys_write+0x1ee/0x250 fs/read_write.c:647
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      RIP: 0033:0x4665f9
      Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007fd9f866e188 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      RAX: ffffffffffffffda RBX: 000000000056c038 RCX: 00000000004665f9
      RDX: 00000000000e7b78 RSI: 0000000020000000 RDI: 0000000000000003
      RBP: 00000000004bfcc4 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 000000000056c038
      R13: 0000000000a9fb1f R14: 00007fd9f866e300 R15: 0000000000022000
      
      Fix the issue by rewriting the relevant expression to avoid
      sign-related problems. Note: size_goal is always >= 0.
      
      Additionally, ensure that the skb in the tx cache always carries
      the relevant extension.
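
      The signed/unsigned pitfall described at the top of this message can
      be reproduced in a few lines of userspace C. The helper names and the
      parameter types (int for the size goal, unsigned int for the skb
      length) are assumptions made for this sketch, and has_room_fixed() is
      just one sign-safe way to write the check, not necessarily the
      expression used in the patch.

```c
#include <stdbool.h>

/*
 * Buggy form: `int - unsigned int` is computed in unsigned arithmetic,
 * so the difference wraps around instead of going negative and the
 * result is "true" whenever the two values differ.
 */
bool has_room_buggy(int size_goal, unsigned int skb_len)
{
	return size_goal - skb_len > 0;
}

/*
 * Sign-safe form: compare the two quantities directly. The explicit
 * check for a positive size_goal makes the cast well defined.
 */
bool has_room_fixed(int size_goal, unsigned int skb_len)
{
	return size_goal > 0 && (unsigned int)size_goal > skb_len;
}
```

      With size_goal = 100 and skb_len = 200, the buggy form reports room
      while the fixed form does not; both agree when the goal is larger
      than the length.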
      
      Reported-and-tested-by: syzbot+263a248eec3e875baa7b@syzkaller.appspotmail.com
      Fixes: 1094c6fe ("mptcp: fix possible divide by zero")
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qed: rdma - don't wait for resources under hw error recovery flow · 1ea78123
      Shai Malin authored
      If the HW device is in recovery, the HW resources will never be
      returned, hence we shouldn't wait for the CID (HW context ID) bitmaps
      to clear. This fix speeds up the error recovery flow.
      
      Fixes: 64515dc8 ("qed: Add infrastructure for error detection and recovery")
      Signed-off-by: Michal Kalderon <mkalderon@marvell.com>
      Signed-off-by: Ariel Elior <aelior@marvell.com>
      Signed-off-by: Shai Malin <smalin@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 's390-qeth-fixes-2021-09-21' · b52d3161
      Jakub Kicinski authored
      Julian Wiedmann says:
      
      ====================
      s390/qeth: fixes 2021-09-21
      
      This brings two fixes for deadlocks when a device is removed while it
      has certain types of async work pending. And one additional fix for a
      missing NULL check in an error case.
      ====================
      
      Link: https://lore.kernel.org/r/20210921145217.1584654-1-jwi@linux.ibm.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • s390/qeth: fix deadlock during failing recovery · d2b59bd4
      Alexandra Winter authored
      Commit 0b9902c1 ("s390/qeth: fix deadlock during recovery") removed
      taking discipline_mutex inside qeth_do_reset(), fixing potential
      deadlocks. An error path was missed though, that still takes
      discipline_mutex and thus has the original deadlock potential.
      
      Intermittent deadlocks were seen when a qeth channel path is configured
      offline, causing a race between qeth_do_reset and ccwgroup_remove.
      Call qeth_set_offline() directly in the qeth_do_reset() error case and
      then a new variant of ccwgroup_set_offline(), without taking
      discipline_mutex.
      
      Fixes: b41b554c ("s390/qeth: fix locking for discipline setup / removal")
      Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
      Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • s390/qeth: Fix deadlock in remove_discipline · ee909d0b
      Alexandra Winter authored
      Problem: qeth_close_dev_handler is a worker that tries to acquire
      card->discipline_mutex via drv->set_offline() in ccwgroup_set_offline().
      Since commit b41b554c
      ("s390/qeth: fix locking for discipline setup / removal")
      qeth_remove_discipline() is called under card->discipline_mutex and
      cancels the work and waits for it to finish.
      
      STOPLAN reception with reason code IPA_RC_VEPA_TO_VEB_TRANSITION is the
      only situation that schedules close_dev_work. In that situation scheduling
      qeth recovery will also result in an offline interface, when resetting the
      isolation mode fails, if the external switch is still set to VEB.
      And since commit 0b9902c1 ("s390/qeth: fix deadlock during recovery")
      qeth recovery does not acquire card->discipline_mutex anymore.
      
      So we accept the longer pathlength of qeth_schedule_recovery in this
      error situation and re-use the existing function.
      
      As a side-benefit this changes the hwtrap to behave like during recovery
      instead of like during a user-triggered set_offline.
      
      Fixes: b41b554c ("s390/qeth: fix locking for discipline setup / removal")
      Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
      Acked-by: Julian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      ee909d0b
    • Julian Wiedmann's avatar
      s390/qeth: fix NULL deref in qeth_clear_working_pool_list() · 248f064a
      Julian Wiedmann authored
      When qeth_set_online() calls qeth_clear_working_pool_list() to roll
      back after an error exit from qeth_hardsetup_card(), we are at risk of
      accessing card->qdio.in_q before it was allocated by
      qeth_alloc_qdio_queues() via qeth_mpc_initialize().
      
      qeth_clear_working_pool_list() then dereferences NULL, and by writing to
      queue->bufs[i].pool_entry scribbles all over the CPU's lowcore,
      resulting in a crash when those lowcore areas are next used (e.g. on
      the next machine-check interrupt).
      
      Such a scenario would typically happen when the device is first set
      online and its queues aren't allocated yet. An early IO error or certain
      misconfigs (eg. mismatched transport mode, bad portno) then cause us to
      error out from qeth_hardsetup_card() with card->qdio.in_q still being
      NULL.
      
      Fix it by checking the pointer for NULL before accessing it.
      
      Note that we also have (rare) paths inside qeth_mpc_initialize() where
      a configuration change can cause us to free the existing queues,
      expecting that subsequent code will allocate them again. If we then
      error out before that re-allocation happens, the same bug occurs.
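
      A minimal userspace sketch of the guard described above. The types and
      names here (struct card, struct queue, clear_working_pool_list(),
      NUM_BUFS) are simplified stand-ins invented for illustration, not the
      qeth definitions: the rollback helper returns early when the input
      queue was never allocated instead of writing through a NULL pointer.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_BUFS 4 /* illustrative; not the driver's buffer count */

struct buf { void *pool_entry; };
struct queue { struct buf bufs[NUM_BUFS]; };
struct card { struct queue *in_q; /* NULL until the queues are allocated */ };

/* Returns the number of buffers reset, or 0 if there was no queue. */
int clear_working_pool_list(struct card *card)
{
	struct queue *queue = card->in_q;
	int i;

	if (!queue)	/* error exit before allocation: nothing to roll back */
		return 0;

	for (i = 0; i < NUM_BUFS; i++)
		queue->bufs[i].pool_entry = NULL;
	return NUM_BUFS;
}
```

      Calling the helper on a card whose in_q is still NULL is then a no-op,
      which covers both the first-online case and the free-then-fail case.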
      
      Fixes: eff73e16 ("s390/qeth: tolerate pre-filled RX buffer")
      Reported-by: Stefan Raspl <raspl@linux.ibm.com>
      Root-caused-by: Heiko Carstens <hca@linux.ibm.com>
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      248f064a
  4. 21 Sep, 2021 14 commits
    • Mark Brown's avatar
      spi: Revert modalias changes · 96c8395e
      Mark Brown authored
      During the v5.13 cycle we updated the SPI subsystem to generate OF style
      modaliases for SPI devices, replacing the old Linux style modaliases we
      used to generate based on spi_device_id which are the DT style name with
      the vendor removed.  Unfortunately this means that we start only
      reporting OF style modaliases and not the old ones and there is nothing
      that ensures that drivers list every possible OF compatible string in
      their OF ID table.  The result is that there are systems which have been
      relying on loading modules based on the old style that are now broken,
      as found by Russell King with spi-nor on Macchiatobin.
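
      To illustrate the old naming scheme, here is a small sketch that derives
      the old-style name from a DT compatible by dropping the vendor prefix,
      so that "jedec,spi-nor" becomes "spi:spi-nor". legacy_spi_modalias() is
      a hypothetical helper written for this example, not a kernel function.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "spi:<name>" from a DT compatible such as "vendor,name".
 * Returns 0 on success, -1 if the result does not fit in out. */
int legacy_spi_modalias(const char *compatible, char *out, size_t len)
{
	const char *comma = strchr(compatible, ',');
	const char *name = comma ? comma + 1 : compatible;
	int n = snprintf(out, len, "spi:%s", name);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

      With this scheme a module autoloads only if the driver's spi_device_id
      table lists the vendor-less name, whichever vendor string the device
      tree used.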
      
      spi-nor is a particularly problematic case for this, it only lists a
      single generic DT compatible jedec,spi-nor in the driver but supports a
      huge raft of device specific compatibles, with a large set of part
      numbers many of which are offered by multiple vendors.  Russell's
      searches of upstream device trees have turned up examples with vendor
      names written in non-standard ways too.  To make matters worse up until
      8ff16cf7 ("Documentation: devicetree: m25p80: add "nor-jedec"
      binding") the generic compatible was not part of the binding so there
      are device trees out there written to that binding version which don't
      list it at all.  The sheer number of parts supported together with our
      previous approach of ignoring the vendor ID makes robustly fixing this
      by adding compatibles to the spi-nor driver seem problematic; the
      current DT binding document does not list all the parts supported by the
      driver at the minute (further patches will fix this).
      
      I've also investigated supporting both formats of modalias
      simultaneously but that doesn't seem possible, especially without
      breaking our userspace ABI which is obviously not viable.
      
      Instead revert the relevant changes for now:
      
      e09f2ab8 ("spi: update modalias_show after of_device_uevent_modalias support")
      3ce6c9e2 ("spi: add of_device_uevent_modalias support")
      
      This will unfortunately mean that any system which had started having
      modules autoload based on the OF compatibles for drivers that list
      things there but not in the spi_device_ids will now not have those
      modules load which is itself a regression.  Since it affects a narrower
      time window and the particularly problematic spi-nor driver may be
      critical to system boot on smaller systems this seems the best of a
      series of bad options.  I will start an audit of SPI drivers to identify
      and fix cases where things won't autoload using spi_device_id, this is
      not great but seems to be the best way forward that anyone has been able
      to identify.
      
      Thanks to Russell for both his report and the additional diagnostic and
      analysis work he has done here, the detailed research above was his
      work.
      
      Fixes: e09f2ab8 ("spi: update modalias_show after of_device_uevent_modalias support")
      Fixes: 3ce6c9e2 ("spi: add of_device_uevent_modalias support")
      Reported-by: Russell King (Oracle) <linux@armlinux.org.uk>
      Suggested-by: Russell King (Oracle) <linux@armlinux.org.uk>
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Tested-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Cc: Andreas Schwab <schwab@suse.de>
      Cc: Marco Felsch <m.felsch@pengutronix.de>
      96c8395e
    • Linus Torvalds's avatar
      Merge tag 's390-5.15-ebpf-jit-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 92477dd1
      Linus Torvalds authored
      Pull s390 eBPF fixes from Vasily Gorbik:
       "Johan Almbladh has implemented a number of new testcases for eBPF [1],
        which uncovered three miscompilation issues in the s390 eBPF JIT"
      
      Link: https://lore.kernel.org/bpf/20210902185229.1840281-1-johan.almbladh@anyfinetworks.com/ [1]
      
      * tag 's390-5.15-ebpf-jit-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390/bpf: Fix optimizing out zero-extensions
        s390/bpf: Fix 64-bit subtraction of the -0x80000000 constant
        s390/bpf: Fix branch shortening during codegen pass
      92477dd1
    • Linus Torvalds's avatar
      qnx4: work around gcc false positive warning bug · d5f65459
      Linus Torvalds authored
      In commit b7213ffa ("qnx4: avoid stringop-overread errors") I tried
      to teach gcc about how the directory entry structure can be two
      different things depending on a status flag.  It made the code clearer,
      and it seemed to make gcc happy.
      
      However, Arnd points to a gcc bug, where despite using two different
      members of a union, gcc then gets confused, and uses the size of one of
      the members to decide if a string overrun happens.  And not necessarily
      the right one.
      
      End result: with some configurations, gcc-11 will still complain about
      the source buffer size being overread:
      
        fs/qnx4/dir.c: In function 'qnx4_readdir':
        fs/qnx4/dir.c:76:32: error: 'strnlen' specified bound [16, 48] exceeds source size 1 [-Werror=stringop-overread]
           76 |                         size = strnlen(name, size);
              |                                ^~~~~~~~~~~~~~~~~~~
        fs/qnx4/dir.c:26:22: note: source object declared here
           26 |                 char de_name;
              |                      ^~~~~~~
      
      because gcc will get confused about which union member entry is actually
      getting accessed, even when the source code is very clear about it.  Gcc
      internally will have combined two "redundant" pointers (pointing to
      different union elements that are at the same offset), and takes the
      size checking from one or the other - not necessarily the right one.
      
      This is clearly a gcc bug, but we can work around it fairly easily.  The
      biggest thing here is the big honking comment about why we do what we
      do.
      
      Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99578#c6
      Reported-and-tested-by: Arnd Bergmann <arnd@kernel.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d5f65459
    • José Expósito's avatar
      platform/x86/intel: hid: Add DMI switches allow list · b201cb0e
      José Expósito authored
      Some devices, even non convertible ones, can send incorrect
      SW_TABLET_MODE reports.
      
      Add an allow list and accept such reports only from devices in it.
      
      Bug reported for Dell XPS 17 9710 on:
      https://gitlab.freedesktop.org/libinput/libinput/-/issues/662

      Reported-by: Tobias Gurtzick <magic@wizardtales.com>
      Suggested-by: Hans de Goede <hdegoede@redhat.com>
      Tested-by: Tobias Gurtzick <magic@wizardtales.com>
      Signed-off-by: José Expósito <jose.exposito89@gmail.com>
      Link: https://lore.kernel.org/r/20210920160312.9787-1-jose.exposito89@gmail.com
      [hdegoede@redhat.com: Check dmi_switches_auto_add_allow_list only once]
      Signed-off-by: Hans de Goede <hdegoede@redhat.com>
      b201cb0e
    • Randy Dunlap's avatar
      platform/x86: dell: fix DELL_WMI_PRIVACY dependencies & build error · 5b72dafa
      Randy Dunlap authored
      When DELL_WMI=y, DELL_WMI_PRIVACY=y, and LEDS_TRIGGER_AUDIO=m, there
      is a linker error since the LEDS trigger code is built as a loadable
      module. This happens because DELL_WMI_PRIVACY is a bool that depends
      on a tristate (LEDS_TRIGGER_AUDIO=m), which can be dangerous.
      
      ld: drivers/platform/x86/dell/dell-wmi-privacy.o: in function `dell_privacy_wmi_probe':
      dell-wmi-privacy.c:(.text+0x3df): undefined reference to `ledtrig_audio_get'
      
      Fixes: 8af9fa37 ("platform/x86: dell-privacy: Add support for Dell hardware privacy")
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Cc: Perry Yuan <Perry.Yuan@dell.com>
      Cc: Dell.Client.Kernel@dell.com
      Cc: platform-driver-x86@vger.kernel.org
      Cc: Hans de Goede <hdegoede@redhat.com>
      Cc: Mark Gross <mgross@linux.intel.com>
      Link: https://lore.kernel.org/r/20210918044829.19222-1-rdunlap@infradead.org
      Reviewed-by: Hans de Goede <hdegoede@redhat.com>
      Signed-off-by: Hans de Goede <hdegoede@redhat.com>
      5b72dafa
    • David S. Miller's avatar
      Merge branch 'dsa-devres' · b3f98404
      David S. Miller authored
      Vladimir Oltean says:
      
      ====================
      Fix mdiobus users with devres
      
      Commit ac3a68d5 ("net: phy: don't abuse devres in
      devm_mdiobus_register()") by Bartosz Golaszewski has introduced two
      classes of potential bugs by making the devres callback of
      devm_mdiobus_alloc stop calling mdiobus_unregister.
      
      The exact buggy circumstances are presented in the individual commit
      messages. I have searched the tree for other occurrences, but at the
      moment:
      
      - for issue (a) I have no concrete proof that other buses except SPI and
        I2C suffer from it, and the only SPI or I2C device drivers that call
        of_mdiobus_alloc are the DSA drivers that leave a NULL
        ds->slave_mii_bus and a non-NULL ds->ops->phy_read, aka ksz9477,
        ksz8795, lan9303_i2c, vsc73xx-spi.
      
      - for issue (b), all drivers which call of_mdiobus_alloc either use
        of_mdiobus_register too, or call mdiobus_unregister sometime within
        the ->remove path.
      
      Although at this point I've seen enough strangeness caused by this
      "device_del during ->shutdown" that I'm just going to copy the SPI and
      I2C subsystem maintainers to this patch series, to get their feedback
      whether they've had reports about things like this before. I don't think
      other buses behave in this way, it forces SPI and I2C devices to have to
      protect themselves from a really strange set of issues.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b3f98404
    • Vladimir Oltean's avatar
      net: dsa: realtek: register the MDIO bus under devres · 74b6d7d1
      Vladimir Oltean authored
      The Linux device model permits both the ->shutdown and ->remove driver
      methods to get called during a shutdown procedure. Example: a DSA switch
      which sits on an SPI bus, and the SPI bus driver calls this on its
      ->shutdown method:
      
      spi_unregister_controller
      -> device_for_each_child(&ctlr->dev, NULL, __unregister);
         -> spi_unregister_device(to_spi_device(dev));
            -> device_del(&spi->dev);
      
      So this is a simple pattern which can theoretically appear on any bus,
      although the only other bus on which I've been able to find it is
      I2C:
      
      i2c_del_adapter
      -> device_for_each_child(&adap->dev, NULL, __unregister_client);
         -> i2c_unregister_device(client);
            -> device_unregister(&client->dev);
      
      The implication of this pattern is that devices on these buses can be
      unregistered after having been shut down. The drivers for these devices
      might choose to return early either from ->remove or ->shutdown if the
      other callback has already run once, and they might choose that the
      ->shutdown method should only perform a subset of the teardown done by
      ->remove (to avoid unnecessary delays when rebooting).
      
      So in other words, the device driver may choose on ->remove to not
      do anything (therefore to not unregister an MDIO bus it has registered
      on ->probe), because this ->remove is actually triggered by the
      device_shutdown path, and its ->shutdown method has already run and done
      the minimally required cleanup.
      
      This used to be fine until the blamed commit, but now, the following
      BUG_ON triggers:
      
      void mdiobus_free(struct mii_bus *bus)
      {
      	/* For compatibility with error handling in drivers. */
      	if (bus->state == MDIOBUS_ALLOCATED) {
      		kfree(bus);
      		return;
      	}
      
      	BUG_ON(bus->state != MDIOBUS_UNREGISTERED);
      	bus->state = MDIOBUS_RELEASED;
      
      	put_device(&bus->dev);
      }
      
      In other words, there is an attempt to free an MDIO bus which was not
      unregistered. The attempt to free it comes from the devres release
      callbacks of the SPI device, which are executed after the device is
      unregistered.
      
      I'm not saying that the fact that MDIO buses allocated using devres
      would automatically get unregistered wasn't strange. I'm just saying
      that the commit didn't care about auditing existing call paths in the
      kernel, and now, the following code sequences are potentially buggy:
      
      (a) devm_mdiobus_alloc followed by plain mdiobus_register, for a device
          located on a bus that unregisters its children on shutdown. After
          the blamed patch, either both the alloc and the register should use
          devres, or none should.
      
      (b) devm_mdiobus_alloc followed by plain mdiobus_register, and then no
          mdiobus_unregister at all in the remove path. After the blamed
          patch, nobody unregisters the MDIO bus anymore, so this is even more
          buggy than the previous case which needs a specific bus
          configuration to be seen, this one is an unconditional bug.
      
      In this case, the Realtek drivers fall under category (b). To solve it,
      we can register the MDIO bus under devres too, which restores the
      previous behavior.
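
      The state check can be modelled in userspace. The sketch below assumes
      a cut-down struct bus with states named after the ones used by
      mdiobus_free() above; bus_register(), bus_unregister() and
      bus_free_ok() are hypothetical helpers, and bus_free_ok() reports
      whether a free would pass the BUG_ON instead of crashing.

```c
#include <assert.h>
#include <stdbool.h>

enum bus_state { ALLOCATED, REGISTERED, UNREGISTERED, RELEASED };
struct bus { enum bus_state state; };

void bus_register(struct bus *b)   { b->state = REGISTERED; }
void bus_unregister(struct bus *b) { b->state = UNREGISTERED; }

/* Mirrors the checks in mdiobus_free(): freeing is only legal for a bus
 * that was never registered or that has been unregistered again. */
bool bus_free_ok(const struct bus *b)
{
	return b->state == ALLOCATED || b->state == UNREGISTERED;
}
```

      Both (a) and (b) end in the same place in this model: the devres
      release callback reaches the free while the bus is still REGISTERED.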
      
      Fixes: ac3a68d5 ("net: phy: don't abuse devres in devm_mdiobus_register()")
      Reported-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
      Reported-by: Alvin Šipraga <alsi@bang-olufsen.dk>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      74b6d7d1
    • Vladimir Oltean's avatar
      net: dsa: don't allocate the slave_mii_bus using devres · 5135e96a
      Vladimir Oltean authored
      The Linux device model permits both the ->shutdown and ->remove driver
      methods to get called during a shutdown procedure. Example: a DSA switch
      which sits on an SPI bus, and the SPI bus driver calls this on its
      ->shutdown method:
      
      spi_unregister_controller
      -> device_for_each_child(&ctlr->dev, NULL, __unregister);
         -> spi_unregister_device(to_spi_device(dev));
            -> device_del(&spi->dev);
      
      So this is a simple pattern which can theoretically appear on any bus,
      although the only other bus on which I've been able to find it is
      I2C:
      
      i2c_del_adapter
      -> device_for_each_child(&adap->dev, NULL, __unregister_client);
         -> i2c_unregister_device(client);
            -> device_unregister(&client->dev);
      
      The implication of this pattern is that devices on these buses can be
      unregistered after having been shut down. The drivers for these devices
      might choose to return early either from ->remove or ->shutdown if the
      other callback has already run once, and they might choose that the
      ->shutdown method should only perform a subset of the teardown done by
      ->remove (to avoid unnecessary delays when rebooting).
      
      So in other words, the device driver may choose on ->remove to not
      do anything (therefore to not unregister an MDIO bus it has registered
      on ->probe), because this ->remove is actually triggered by the
      device_shutdown path, and its ->shutdown method has already run and done
      the minimally required cleanup.
      
      This used to be fine until the blamed commit, but now, the following
      BUG_ON triggers:
      
      void mdiobus_free(struct mii_bus *bus)
      {
      	/* For compatibility with error handling in drivers. */
      	if (bus->state == MDIOBUS_ALLOCATED) {
      		kfree(bus);
      		return;
      	}
      
      	BUG_ON(bus->state != MDIOBUS_UNREGISTERED);
      	bus->state = MDIOBUS_RELEASED;
      
      	put_device(&bus->dev);
      }
      
      In other words, there is an attempt to free an MDIO bus which was not
      unregistered. The attempt to free it comes from the devres release
      callbacks of the SPI device, which are executed after the device is
      unregistered.
      
      I'm not saying that the fact that MDIO buses allocated using devres
      would automatically get unregistered wasn't strange. I'm just saying
      that the commit didn't care about auditing existing call paths in the
      kernel, and now, the following code sequences are potentially buggy:
      
      (a) devm_mdiobus_alloc followed by plain mdiobus_register, for a device
          located on a bus that unregisters its children on shutdown. After
          the blamed patch, either both the alloc and the register should use
          devres, or none should.
      
      (b) devm_mdiobus_alloc followed by plain mdiobus_register, and then no
          mdiobus_unregister at all in the remove path. After the blamed
          patch, nobody unregisters the MDIO bus anymore, so this is even more
          buggy than the previous case which needs a specific bus
          configuration to be seen, this one is an unconditional bug.
      
      In this case, DSA falls into category (a), it tries to be helpful and
      registers an MDIO bus on behalf of the switch, which might be on such a
      bus. I've no idea why it does it under devres.
      
      It does this on probe:
      
      	if (!ds->slave_mii_bus && ds->ops->phy_read)
      		alloc and register mdio bus
      
      and this on remove:
      
      	if (ds->slave_mii_bus && ds->ops->phy_read)
      		unregister mdio bus
      
      I _could_ imagine using devres because the condition used on remove is
      different than the condition used on probe. So strictly speaking, DSA
      cannot determine whether the ds->slave_mii_bus it sees on remove is the
      ds->slave_mii_bus that _it_ has allocated on probe. Using devres would
      have solved that problem. But nonetheless, the existing code already
      proceeds to unregister the MDIO bus, even though it might be
      unregistering an MDIO bus it has never registered. So I can only guess
      that no driver that implements ds->ops->phy_read also allocates and
      registers ds->slave_mii_bus itself.
      
      So in that case, if unregistering is fine, freeing must be fine too.
      
      Stop using devres and free the MDIO bus manually. This will make devres
      stop attempting to free a still registered MDIO bus on ->shutdown.
      
      Fixes: ac3a68d5 ("net: phy: don't abuse devres in devm_mdiobus_register()")
      Reported-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Tested-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5135e96a
    • Masanari Iida's avatar
      Doc: networking: Fix a typo in ice.rst · 3e95cfa2
      Masanari Iida authored
      This patch fixes a spelling typo in ice.rst
      Signed-off-by: Masanari Iida <standby24x7@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3e95cfa2
    • Vladimir Oltean's avatar
      net: dsa: fix dsa_tree_setup error path · e5845aa0
      Vladimir Oltean authored
      Since the blamed commit, dsa_tree_teardown_switches() was split into two
      smaller functions, dsa_tree_teardown_switches and dsa_tree_teardown_ports.
      
      However, the error path of dsa_tree_setup stopped calling dsa_tree_teardown_ports.
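
      A rough model of the problem, assuming invented names throughout
      (tree_setup_model() and the two teardown helpers are illustrative, not
      the DSA functions): once a teardown has been split into two halves, an
      error path has to call both of them to leave nothing behind.

```c
#include <assert.h>

enum { SWITCHES = 1, PORTS = 2 }; /* bits for what is currently set up */

void teardown_switches_model(int *live) { *live &= ~SWITCHES; }
void teardown_ports_model(int *live)    { *live &= ~PORTS; }

/* Returns 0 on success, -1 after unwinding when a later step fails. */
int tree_setup_model(int later_step_fails, int *live)
{
	*live |= SWITCHES | PORTS;	/* one setup step brings up both */

	if (later_step_fails)
		goto teardown;
	return 0;

teardown:
	teardown_switches_model(live);
	teardown_ports_model(live);	/* the half the error path had lost */
	return -1;
}
```

      Dropping the second call in the teardown label leaves the PORTS bit
      set after a failed setup, which is the leak this model stands for.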
      
      Fixes: a57d8c21 ("net: dsa: flush switchdev workqueue before tearing down CPU/DSA ports")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e5845aa0
    • David S. Miller's avatar
      Merge branch 'smc-fixes' · 431db53c
      David S. Miller authored
      Karsten Graul says:
      
      ====================
      net/smc: fixes 2021-09-20
      
      Please apply the following patches for smc to netdev's net tree.
      
      The first patch adds a missing error check, and the second patch
      fixes a possible leak of a lock in a worker.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      431db53c
    • Karsten Graul's avatar
      net/smc: fix 'workqueue leaked lock' in smc_conn_abort_work · a18cee47
      Karsten Graul authored
      The abort_work is scheduled when a connection was detected to be
      out-of-sync after a link failure. The work calls smc_conn_kill(),
      which calls smc_close_active_abort() and that might end up calling
      smc_close_cancel_work().
      smc_close_cancel_work() cancels any pending close_work and tx_work but
      needs to release the sock_lock before and acquires the sock_lock again
      afterwards. So when the sock_lock was NOT acquired before then it may
      be held after the abort_work completes. That's why the sock_lock is
      acquired before the call to smc_conn_kill() in __smc_lgr_terminate(),
      but this is missing in smc_conn_abort_work().
      
      Fix that by acquiring the sock_lock first and releasing it after the
      call to smc_conn_kill().
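
      The lock imbalance can be shown with a toy model. Everything below is
      a hypothetical userspace stand-in (a boolean instead of a real socket
      lock, *_model names instead of the smc functions): a helper that drops
      and re-takes the lock leaves it held if the caller never took it.

```c
#include <assert.h>
#include <stdbool.h>

struct sock_model { bool locked; };

void lock_sock_model(struct sock_model *sk)    { sk->locked = true; }
void release_sock_model(struct sock_model *sk) { sk->locked = false; }

/* Like smc_close_cancel_work() as described above: drops the lock while
 * cancelling pending work, then takes it again before returning. */
void cancel_work_model(struct sock_model *sk)
{
	release_sock_model(sk);
	/* ... pending work would be cancelled here ... */
	lock_sock_model(sk);
}

/* Buggy worker: calls the helper without holding the lock. */
void abort_work_buggy(struct sock_model *sk)
{
	cancel_work_model(sk);
}

/* Fixed worker: take the lock first, release it afterwards. */
void abort_work_fixed(struct sock_model *sk)
{
	lock_sock_model(sk);
	cancel_work_model(sk);
	release_sock_model(sk);
}
```

      After abort_work_buggy() the model's lock is still held, matching the
      'workqueue leaked lock' report; after abort_work_fixed() it is free.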
      
      Fixes: b286a065 ("net/smc: handle incoming CDC validation message")
      Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a18cee47
    • Karsten Graul's avatar
      net/smc: add missing error check in smc_clc_prfx_set() · 6c907319
      Karsten Graul authored
      Coverity stumbled over a missing error check in smc_clc_prfx_set():
      
      *** CID 1475954:  Error handling issues  (CHECKED_RETURN)
      /net/smc/smc_clc.c: 233 in smc_clc_prfx_set()
      >>>     CID 1475954:  Error handling issues  (CHECKED_RETURN)
      >>>     Calling "kernel_getsockname" without checking return value (as is done elsewhere 8 out of 10 times).
      233     	kernel_getsockname(clcsock, (struct sockaddr *)&addrs);
      
      Add the return code check in smc_clc_prfx_set().
      
      Fixes: c246d942 ("net/smc: restructure netinfo for CLC proposal msgs")
      Reported-by: Julian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6c907319
  5. 20 Sep, 2021 1 commit
    • Linus Torvalds's avatar
      Merge tag 'afs-fixes-20210913' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs · d9fb6784
      Linus Torvalds authored
      Pull AFS fixes from David Howells:
       "Fixes for AFS problems that can cause data corruption due to
        interaction with another client modifying data cached locally:
      
         - When d_revalidating a dentry, don't look at the inode to which it
           points. Only check the directory to which the dentry belongs. This
           was confusing things and causing the silly-rename cleanup code to
           remove the file now at the dentry of a file that got deleted.
      
         - Fix mmap data coherency. When a callback break is received that
           relates to a file that we have cached, the data content may have
           been changed (there are other reasons, such as the user's rights
           having been changed). However, we're checking it lazily, only on
           entry to the kernel, which doesn't happen if we have a writeable
           shared mapped page on that file.
      
           We make the kernel keep track of mmapped files and clear all PTEs
           mapping to that file as soon as the callback comes in by calling
           unmap_mapping_pages() (we don't necessarily want to zap the
           pagecache). This causes the kernel to be reentered when userspace
           tries to access the mmapped address range again - and at that point
           we can query the server and, if we need to, zap the page cache.
      
           Ideally, I would check each file at the point of notification, but
           that involves poking the server[*] - which is holding an exclusive
           lock on the vnode it is changing, waiting for all the clients it
           notified to reply. This could then deadlock against the server.
           Further, invalidating the pagecache might call ->launder_page(),
           which would try to write to the file, which would definitely
           deadlock. (AFS doesn't lease file access).
      
           [*] Checking to see if the file content has changed is a matter of
               comparing the current data version number, but we have to ask
               the server for that. We also need to get a new callback promise
               and we need to poke the server for that too.
      
         - Add some more points at which the inode is validated, since we're
           doing it lazily, notably in ->read_iter() and ->page_mkwrite(), but
           also when performing some directory operations.
      
           Ideally, checking in ->read_iter() would be done in some derivation
           of filemap_read(). If we're going to call the server to read the
           file, then we get the file status fetch as part of that.
      
         - The above is now causing us to make a lot more calls to
           afs_validate() to check the inode - and afs_validate() takes the
           RCU read lock each time to make a quick check (ie.
           afs_check_validity()). This is entirely for the purpose of checking
           cb_s_break to see if the server we're using reinitialised its list
           of callbacks - however this isn't a very common event, so most of
           the time we're taking this needlessly.
      
           Add a new cell-wide counter to count the number of
           reinitialisations done by any server and check that - and only if
           that changes, take the RCU read lock and check the server list (the
           server list may change, but the cell a file is part of won't).
      
         - Don't update vnode->cb_s_break and ->cb_v_break inside the validity
           checking loop. The cb_lock is done with read_seqretry, so we might
           go round the loop a second time after resetting those values - and
           that could cause someone else checking validity to miss something
           (I think).
      
        Also included are patches for fixes for some bugs encountered whilst
        debugging this:
      
         - Fix a leak of afs_read objects and fix a leak of keys hidden by
           that.
      
         - Fix a leak of pages that couldn't be added to extend a writeback.
      
         - Fix the maintenance of i_blocks when i_size is changed by a local
           write or a local dir edit"
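
      The cell-wide counter idea from the list above can be sketched as
      follows. The struct and field names are assumptions made for this
      example and do not claim to match the AFS code: a vnode remembers the
      counter value it last saw, and only a mismatch sends it down the slow
      path that would need the RCU read lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Bumped whenever any server in the cell reinitialises its callbacks. */
struct cell_model  { unsigned int reinit_count; };
/* Counter value the vnode saw when it was last fully validated. */
struct vnode_model { unsigned int seen_reinit_count; };

/* Fast check: true only if some server reinitialised since last time,
 * i.e. the server list has to be examined under the RCU read lock. */
bool needs_server_list_check(const struct vnode_model *v,
			     const struct cell_model *c)
{
	return v->seen_reinit_count != c->reinit_count;
}

/* Called once the slow check has completed, outside any retry loop. */
void mark_revalidated(struct vnode_model *v, const struct cell_model *c)
{
	v->seen_reinit_count = c->reinit_count;
}
```

      In the common case the counters match and the check is a single
      comparison; the update is kept in a separate step so that a retried
      check does not see a half-updated value.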
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=214217 [1]
      Link: https://lore.kernel.org/r/163111665183.283156.17200205573146438918.stgit@warthog.procyon.org.uk/ # v1
      Link: https://lore.kernel.org/r/163113612442.352844.11162345591911691150.stgit@warthog.procyon.org.uk/ # i_blocks patch
      
      * tag 'afs-fixes-20210913' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
        afs: Fix updating of i_blocks on file/dir extension
        afs: Fix corruption in reads at fpos 2G-4G from an OpenAFS server
        afs: Try to avoid taking RCU read lock when checking vnode validity
        afs: Fix mmap coherency vs 3rd-party changes
        afs: Fix incorrect triggering of sillyrename on 3rd-party invalidation
        afs: Add missing vnode validation checks
        afs: Fix page leak
        afs: Fix missing put on afs_read objects and missing get on the key therein
      d9fb6784