1. 05 Oct, 2020 3 commits
    • Merge tag 'platform-drivers-x86-v5.9-2' of git://git.infradead.org/linux-platform-drivers-x86 · 7575fdda
      Linus Torvalds authored
      Pull x86 platform driver fixes from Andy Shevchenko:
       "We have some fixes for Tablet Mode reporting in particular, that users
        are complaining a lot about.
      
        Summary:
      
         - Attempt #3 of enabling Tablet Mode reporting w/o regressions
      
         - Improve battery recognition code in ASUS WMI driver
      
         - Fix Kconfig dependency warning for Fujitsu and LG laptop drivers
      
         - Add fixes in Thinkpad ACPI driver for _BCL method and NVRAM polling
      
         - Fix power supply extended topology in Mellanox driver
      
         - Fix memory leak in OLPC EC driver
      
         - Avoid static struct device in Intel PMC core driver
      
         - Add support for the touchscreen found in MPMAN Converter9 2-in-1
      
         - Update MAINTAINERS to reflect the real state of affairs"
      
      * tag 'platform-drivers-x86-v5.9-2' of git://git.infradead.org/linux-platform-drivers-x86:
        platform/x86: thinkpad_acpi: re-initialize ACPI buffer size when reuse
        MAINTAINERS: Add Mark Gross and Hans de Goede as x86 platform drivers maintainers
        platform/x86: intel-vbtn: Switch to an allow-list for SW_TABLET_MODE reporting
        platform/x86: intel-vbtn: Revert "Fix SW_TABLET_MODE always reporting 1 on the HP Pavilion 11 x360"
        platform/x86: intel_pmc_core: do not create a static struct device
        platform/x86: mlx-platform: Fix extended topology configuration for power supply units
        platform/x86: pcengines-apuv2: Fix typo on define of AMD_FCH_GPIO_REG_GPIO55_DEVSLP0
        platform/x86: fix kconfig dependency warning for FUJITSU_LAPTOP
        platform/x86: fix kconfig dependency warning for LG_LAPTOP
        platform/x86: thinkpad_acpi: initialize tp_nvram_state variable
        platform/x86: intel-vbtn: Fix SW_TABLET_MODE always reporting 1 on the HP Pavilion 11 x360
        platform/x86: asus-wmi: Add BATC battery name to the list of supported
        platform/x86: asus-nb-wmi: Revert "Do not load on Asus T100TA and T200TA"
        platform/x86: touchscreen_dmi: Add info for the MPMAN Converter9 2-in-1
        Documentation: laptops: thinkpad-acpi: fix underline length build warning
        Platform: OLPC: Fix memleak in olpc_ec_probe
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 165563c0
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Make sure SKB control block is in the proper state during IPSEC
          ESP-in-TCP encapsulation. From Sabrina Dubroca.
      
       2) Various kinds of attributes were not being cloned properly when we
          build new xfrm_state objects from existing ones. Fix from Antony
          Antony.
      
       3) Make sure to keep BTF sections, from Tony Ambardar.
      
       4) TX DMA channels need proper locking in lantiq driver, from Hauke
          Mehrtens.
      
       5) Honour route MTU during forwarding, always. From Maciej
          Żenczykowski.
      
       6) Fix races in kTLS which can result in crashes, from Rohit
          Maheshwari.
      
       7) Skip TCP DSACKs with ridiculous sequence ranges, from Priyaranjan
          Jha.
      
       8) Use correct address family in xfrm state lookups, from Herbert Xu.
      
       9) A bridge FDB flush should not clear out user managed fdb entries
          with the ext_learn flag set, from Nikolay Aleksandrov.
      
      10) Fix nested locking of netdev address lists, from Taehee Yoo.
      
      11) Fix handling of 32-bit DATA_FIN values in mptcp, from Mat Martineau.
      
      12) Fix r8169 data corruptions on RTL8402 chips, from Heiner Kallweit.
      
      13) Don't free command entries in mlx5 while comp handler could still be
          running, from Eran Ben Elisha.
      
      14) Error flow of request_irq() in mlx5 is busted, due to an off by one
          we try to free an IRQ never allocated. From Maor Gottlieb.
      
      15) Fix leak when dumping netlink policies, from Johannes Berg.
      
      16) Sendpage cannot be performed when a page is a slab page, or the page
          count is < 1. Some subsystems such as nvme were doing so. Create a
          "sendpage_ok()" helper and use it as needed, from Coly Li.
      
      17) Don't leak request socket when using syncookies with mptcp, from
          Paolo Abeni.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (111 commits)
        net/core: check length before updating Ethertype in skb_mpls_{push,pop}
        net: mvneta: fix double free of txq->buf
        net_sched: check error pointer in tcf_dump_walker()
        net: team: fix memory leak in __team_options_register
        net: typhoon: Fix a typo Typoon --> Typhoon
        net: hinic: fix DEVLINK build errors
        net: stmmac: Modify configuration method of EEE timers
        tcp: fix syn cookied MPTCP request socket leak
        libceph: use sendpage_ok() in ceph_tcp_sendpage()
        scsi: libiscsi: use sendpage_ok() in iscsi_tcp_segment_map()
        drbd: code cleanup by using sendpage_ok() to check page for kernel_sendpage()
        tcp: use sendpage_ok() to detect misused .sendpage
        nvme-tcp: check page by sendpage_ok() before calling kernel_sendpage()
        net: add WARN_ONCE in kernel_sendpage() for improper zero-copy send
        net: introduce helper sendpage_ok() in include/linux/net.h
        net: usb: pegasus: Proper error handing when setting pegasus' MAC address
        net: core: document two new elements of struct net_device
        netlink: fix policy dump leak
        net/mlx5e: Fix race condition on nhe->n pointer in neigh update
        net/mlx5e: Fix VLAN create flow
        ...
    • platform/x86: thinkpad_acpi: re-initialize ACPI buffer size when reuse · 720ef73d
      Aaron Ma authored
      Evaluating ACPI _BCL can fail, in which case the ACPI buffer size is set
      to 0. When this ACPI buffer is then reused, AE_BUFFER_OVERFLOW is
      triggered.
      
      Re-initializing the buffer size lets the ACPI evaluation succeed.
      
      Fixes: 46445b6b ("thinkpad-acpi: fix handle locate for video and query of _BCL")
      Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
      Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
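The failure mode described in this commit can be modeled in user space. The sketch below is only an illustration under stated assumptions: `struct buf`, `fake_evaluate()` and `ALLOCATE_BUFFER` are hypothetical stand-ins for the kernel's ACPI buffer type, its evaluation call and its "allocate for me" length marker, and this is not the driver's actual code.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the "please allocate" length marker. */
#define ALLOCATE_BUFFER ((size_t)-1)

enum status { OK, NOT_FOUND, NO_MEMORY, BUFFER_OVERFLOW };

/* Hypothetical stand-in for an ACPI buffer descriptor. */
struct buf {
	size_t length;   /* in: capacity or ALLOCATE_BUFFER; out: bytes used */
	void *pointer;
};

/*
 * Toy evaluator: a missing method fails and leaves length at 0. Otherwise
 * the caller must either ask for allocation or provide enough room.
 */
static enum status fake_evaluate(int method_exists, struct buf *b)
{
	const size_t needed = 16;

	if (!method_exists) {
		b->length = 0;
		return NOT_FOUND;
	}
	if (b->length != ALLOCATE_BUFFER && b->length < needed)
		return BUFFER_OVERFLOW;
	b->pointer = calloc(1, needed);
	if (!b->pointer)
		return NO_MEMORY;
	b->length = needed;
	return OK;
}

/* Fail on a first handle, then retry on a second one with the same buffer. */
static enum status query_twice(int reinit)
{
	struct buf b = { ALLOCATE_BUFFER, NULL };
	enum status st;

	(void)fake_evaluate(0, &b);          /* first attempt fails, length = 0 */
	if (reinit)
		b.length = ALLOCATE_BUFFER;  /* the fix: reset before reuse */
	st = fake_evaluate(1, &b);
	free(b.pointer);
	return st;
}
```

In this model, calling `query_twice(0)` reproduces the overflow status on the second evaluation, while `query_twice(1)` succeeds because the length is reset before the buffer is reused.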
  2. 04 Oct, 2020 6 commits
  3. 03 Oct, 2020 12 commits
  4. 02 Oct, 2020 19 commits
    • Merge tag 'mlx5-fixes-2020-09-30' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · ab0faf5f
      David S. Miller authored
      From: Saeed Mahameed <saeedm@nvidia.com>
      
      ====================
      This series introduces some fixes to mlx5 driver.
      
      v1->v2:
       - Patch #1 Don't return while mutex is held. (Dave)
      
      v2->v3:
       - Drop patch #1, will consider a better approach (Jakub)
       - use cpu_relax() instead of cond_resched() (Jakub)
       - while(i--) to reverse a loop (Jakub)
       - Drop old mellanox email sign-off and change the committer email
         (Jakub)
      
      Please pull and let me know if there is any problem.
      
      For -stable v4.15
       ('net/mlx5e: Fix VLAN cleanup flow')
       ('net/mlx5e: Fix VLAN create flow')
      
      For -stable v4.16
       ('net/mlx5: Fix request_irqs error flow')
      
      For -stable v5.4
       ('net/mlx5e: Add resiliency in Striding RQ mode for packets larger than MTU')
       ('net/mlx5: Avoid possible free of command entry while timeout comp handler')
      
      For -stable v5.7
       ('net/mlx5e: Fix return status when setting unsupported FEC mode')
      
      For -stable v5.8
       ('net/mlx5e: Fix race condition on nhe->n pointer in neigh update')
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
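The `while(i--)` idiom mentioned in the v2->v3 notes is a common way to unwind a partially completed allocation loop without touching the entry that failed. The sketch below is a user-space illustration only: `fake_request_irq()`, `fake_free_irq()` and `NVEC` are hypothetical names, not the mlx5 code.

```c
#include <stdbool.h>

#define NVEC 4

static bool allocated[NVEC];
static int frees_of_unallocated;

/* Hypothetical allocator: fails at index fail_at, succeeds otherwise. */
static int fake_request_irq(int i, int fail_at)
{
	if (i == fail_at)
		return -1;
	allocated[i] = true;
	return 0;
}

/* Hypothetical release: counts attempts to free something never allocated. */
static void fake_free_irq(int i)
{
	if (!allocated[i])
		frees_of_unallocated++;
	allocated[i] = false;
}

static int request_all(int fail_at)
{
	int i, err;

	for (i = 0; i < NVEC; i++) {
		err = fake_request_irq(i, fail_at);
		if (err)
			goto unwind;
	}
	return 0;

unwind:
	/* i is the index that failed, so while (i--) starts at i - 1. */
	while (i--)
		fake_free_irq(i);
	return err;
}

static int count_allocated(void)
{
	int i, n = 0;

	for (i = 0; i < NVEC; i++)
		n += allocated[i];
	return n;
}
```

Because the decrement happens before the loop body runs, the failed index itself is never passed to the release function, which avoids the off-by-one free of an entry that was never allocated.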
    • tcp: fix syn cookied MPTCP request socket leak · 9d8c05ad
      Paolo Abeni authored
      If a syn-cookies request socket doesn't pass the MPTCP-level
      validation done in syn_recv_sock(), we need to release
      it immediately, or it will be leaked.
      
      Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/89
      Fixes: 9466a1cc ("mptcp: enable JOIN requests even if cookies are in use")
      Reported-and-tested-by: Geliang Tang <geliangtang@gmail.com>
      Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'Introduce-sendpage_ok-to-detect-misused-sendpage-in-network-related-drivers' · e7d4005d
      David S. Miller authored
      Coly Li says:
      
      ====================
      Introduce sendpage_ok() to detect misused sendpage in network related drivers
      
      As Sagi Grimberg suggested, the original fix is refined into a more
      common inline routine:
          static inline bool sendpage_ok(struct page *page)
          {
              return  (!PageSlab(page) && page_count(page) >= 1);
          }
      If sendpage_ok() returns true, the checking page can be handled by the
      concrete zero-copy sendpage method in network layer.
      
      The v10 series has 7 patches, fixes a WARN_ONCE() usage from v9 series,
      - The 1st patch in this series introduces sendpage_ok() in header file
        include/linux/net.h.
      - The 2nd patch adds WARN_ONCE() for improper zero-copy send in
        kernel_sendpage().
      - The 3rd patch fixes the page checking issue in nvme-over-tcp driver.
      - The 4th patch adds page_count check by using sendpage_ok() in
        do_tcp_sendpages() as Eric Dumazet suggested.
      - The 5th and 6th patches just replace existing open coded checks with
        the inline sendpage_ok() routine.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • libceph: use sendpage_ok() in ceph_tcp_sendpage() · 40efc4dc
      Coly Li authored
      In libceph, ceph_tcp_sendpage() performs the following check before
      handing the page to the network layer's zero-copy sendpage method:
      	if (page_count(page) >= 1 && !PageSlab(page))
      
      This check is exactly what sendpage_ok() does. This patch replaces the
      open-coded check with sendpage_ok() as a code cleanup.
      Signed-off-by: Coly Li <colyli@suse.de>
      Acked-by: Jeff Layton <jlayton@kernel.org>
      Cc: Ilya Dryomov <idryomov@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • scsi: libiscsi: use sendpage_ok() in iscsi_tcp_segment_map() · 6aa25c73
      Coly Li authored
      In the iscsi driver, iscsi_tcp_segment_map() uses the following code to
      check whether the page should be handled by sendpage:
          if (!recv && page_count(sg_page(sg)) >= 1 && !PageSlab(sg_page(sg)))
      
      The "page_count(sg_page(sg)) >= 1 && !PageSlab(sg_page(sg))" part makes
      sure the page can be sent through the network layer's zero-copy path.
      This part is exactly what sendpage_ok() does.
      
      This patch uses sendpage_ok() in iscsi_tcp_segment_map() to replace
      the original open-coded checks.
      Signed-off-by: Coly Li <colyli@suse.de>
      Reviewed-by: Lee Duncan <lduncan@suse.com>
      Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
      Cc: Vasily Averin <vvs@virtuozzo.com>
      Cc: Cong Wang <amwang@redhat.com>
      Cc: Mike Christie <michaelc@cs.wisc.edu>
      Cc: Chris Leech <cleech@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Hannes Reinecke <hare@suse.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • drbd: code cleanup by using sendpage_ok() to check page for kernel_sendpage() · fb25ebe1
      Coly Li authored
      In _drbd_send_page(), a page is checked by the following code before it
      is sent by kernel_sendpage():
              (page_count(page) < 1) || PageSlab(page)
      If the check is true, the page is not sent by kernel_sendpage() and is
      handled by sock_no_sendpage() instead.
      
      This check is the negation of what the inline helper sendpage_ok() does;
      the helper was introduced in include/linux/net.h to solve a similar send
      page issue in the nvme-tcp code.
      
      This patch uses sendpage_ok() to replace the open-coded checks of page
      type and refcount in _drbd_send_page(), as a code cleanup.
      Signed-off-by: Coly Li <colyli@suse.de>
      Cc: Philipp Reisner <philipp.reisner@linbit.com>
      Cc: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: use sendpage_ok() to detect misused .sendpage · cf83a17e
      Coly Li authored
      commit a10674bf ("tcp: detecting the misuse of .sendpage for Slab
      objects") added the check for Slab pages, but pages that have no
      page_count are still missing from the check.
      
      The network layer's sendpage method is not designed to send page_count 0
      pages either, so both PageSlab() and page_count() should be checked for
      the page being sent. This is exactly what sendpage_ok() does.
      
      This patch uses sendpage_ok() in do_tcp_sendpages() to detect misused
      .sendpage, to make the code more robust.
      
      Fixes: a10674bf ("tcp: detecting the misuse of .sendpage for Slab objects")
      Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Coly Li <colyli@suse.de>
      Cc: Vasily Averin <vvs@virtuozzo.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: stable@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • nvme-tcp: check page by sendpage_ok() before calling kernel_sendpage() · 7d4194ab
      Coly Li authored
      Currently nvme_tcp_try_send_data() doesn't use kernel_sendpage() to
      send slab pages. But pages allocated by __get_free_pages() without
      __GFP_COMP, which also have a refcount of 0, are still sent by
      kernel_sendpage() to the remote end, which is problematic.
      
      The newly introduced helper sendpage_ok() checks both the PageSlab tag
      and the page_count counter, and returns true if the page being checked
      is OK to be sent by kernel_sendpage().
      
      This patch fixes the page checking issue of nvme_tcp_try_send_data()
      with sendpage_ok(). If sendpage_ok() returns true, the page is sent by
      kernel_sendpage(); otherwise sock_no_sendpage() handles the page.
      Signed-off-by: Coly Li <colyli@suse.de>
      Cc: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Hannes Reinecke <hare@suse.de>
      Cc: Jan Kara <jack@suse.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Mikhail Skorzhinskii <mskorzhinskiy@solarflare.com>
      Cc: Philipp Reisner <philipp.reisner@linbit.com>
      Cc: Sagi Grimberg <sagi@grimberg.me>
      Cc: Vlastimil Babka <vbabka@suse.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: add WARN_ONCE in kernel_sendpage() for improper zero-copy send · 7b62d31d
      Coly Li authored
      If a page sent into kernel_sendpage() is a slab page or has no
      refcount, it is improper to send it by the zero-copy sendpage()
      method. Such a page might be unexpectedly released in the network code
      path and cause an unpredictable panic due to corruption of the kernel
      memory management data structures.
      
      This patch adds a WARN_ONCE() on the page before it is passed to the
      concrete zero-copy sendpage() method. If the page is improper for the
      zero-copy sendpage() method, a warning message can be observed before
      the consequent unpredictable kernel panic.
      
      This patch does not change the existing kernel_sendpage() behavior for
      an improper zero-copy send; it just provides a warning message as a hint
      about a potential following panic due to kernel memory heap corruption.
      Signed-off-by: Coly Li <colyli@suse.de>
      Cc: Cong Wang <amwang@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Sridhar Samudrala <sri@us.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: introduce helper sendpage_ok() in include/linux/net.h · c381b079
      Coly Li authored
      The original problem was in the nvme-over-tcp code, which mistakenly
      uses kernel_sendpage() to send pages allocated by __get_free_pages()
      without the __GFP_COMP flag. Such pages have no refcount (page_count is
      0) on their tail pages, so sending them by kernel_sendpage() may trigger
      a kernel panic from a corrupted kernel heap, because these pages are
      incorrectly freed in the network stack as page_count 0 pages.
      
      This patch introduces a helper, sendpage_ok(), which returns true if the
      page being checked
      - is not a slab page: PageSlab(page) is false.
      - has a page refcount: page_count(page) is not zero.
      
      All drivers that want to send a page to the remote end by
      kernel_sendpage() may use this helper to check whether the page is OK.
      If the helper does not return true, the driver should try another,
      non-sendpage method (e.g. sock_no_sendpage()) to handle the page.
      Signed-off-by: Coly Li <colyli@suse.de>
      Cc: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Hannes Reinecke <hare@suse.de>
      Cc: Jan Kara <jack@suse.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Mikhail Skorzhinskii <mskorzhinskiy@solarflare.com>
      Cc: Philipp Reisner <philipp.reisner@linbit.com>
      Cc: Sagi Grimberg <sagi@grimberg.me>
      Cc: Vlastimil Babka <vbabka@suse.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
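The calling pattern this commit recommends can be sketched in user space. Everything below is a model under stated assumptions: the `struct page` here is a hypothetical two-field stand-in (not the kernel structure), its fields replace PageSlab() and page_count(), and `choose_send_path()` is an illustrative name for the driver-side decision between kernel_sendpage() and a copying fallback such as sock_no_sendpage().

```c
#include <stdbool.h>

/* Hypothetical user-space model of the two page properties being checked. */
struct page {
	bool slab;      /* stands in for PageSlab(page) */
	int refcount;   /* stands in for page_count(page) */
};

/* Same predicate as the helper quoted in the series cover letter. */
static bool sendpage_ok(const struct page *page)
{
	return !page->slab && page->refcount >= 1;
}

enum send_path { ZERO_COPY, COPY_FALLBACK };

/* Driver-side decision: zero-copy only when the page passes the check. */
static enum send_path choose_send_path(const struct page *page)
{
	return sendpage_ok(page) ? ZERO_COPY : COPY_FALLBACK;
}
```

In this model a slab page, or a page whose count is 0 (such as the tail pages described above), always takes the copying path, and only an ordinary refcounted page is eligible for zero-copy.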
    • net: usb: pegasus: Proper error handing when setting pegasus' MAC address · f30e25a9
      Petko Manolov authored
      v2:
      
      If reading the MAC address from the EEPROM fails, don't throw an error;
      use a randomly generated MAC instead.  Either way the adapter will soldier
      on, and the return type of set_ethernet_addr() can be reverted to void.
      
      v1:
      
      Fix a bug in set_ethernet_addr() which does not take into account possible
      errors (or partial reads) returned by its helpers.  This can potentially lead to
      writing random data into device's MAC address registers.
      Signed-off-by: Petko Manolov <petko.manolov@konsulko.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
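The v2 behavior (treat an error or a partial read as "no usable address" and fall back to a random MAC) might look roughly like the user-space sketch below. It is an assumption-laden illustration, not the pegasus driver: `read_mac_fn`, the three sample readers and `set_ethernet_addr_sketch()` are hypothetical names, and `rand()` stands in for the kernel's own random address helper.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ETH_ALEN 6

/* Hypothetical reader type: returns the number of bytes read, or < 0. */
typedef int (*read_mac_fn)(uint8_t *out, size_t len);

/* Illustrative readers: full read, partial read, hard failure. */
static int read_ok(uint8_t *out, size_t len)
{
	static const uint8_t eeprom[ETH_ALEN] = { 0x00, 0x11, 0x22, 0x33, 0x44, 0x55 };

	if (len < ETH_ALEN)
		return -1;
	memcpy(out, eeprom, ETH_ALEN);
	return ETH_ALEN;
}

static int read_short(uint8_t *out, size_t len)
{
	if (len < 2)
		return -1;
	out[0] = 0x00;
	out[1] = 0x11;
	return 2;               /* only part of the address was read */
}

static int read_fail(uint8_t *out, size_t len)
{
	(void)out;
	(void)len;
	return -1;
}

/* Random unicast, locally administered address. */
static void random_mac(uint8_t mac[ETH_ALEN])
{
	int i;

	for (i = 0; i < ETH_ALEN; i++)
		mac[i] = (uint8_t)rand();
	mac[0] &= 0xfe;         /* clear the multicast bit */
	mac[0] |= 0x02;         /* set the locally administered bit */
}

/* No return value: the caller always ends up with a usable address. */
static void set_ethernet_addr_sketch(read_mac_fn read_mac, uint8_t mac[ETH_ALEN])
{
	uint8_t tmp[ETH_ALEN];

	if (read_mac(tmp, sizeof(tmp)) == ETH_ALEN)
		memcpy(mac, tmp, ETH_ALEN);
	else
		random_mac(mac);    /* error or partial read */
}
```

Reading into a temporary buffer and copying only after a complete read is what keeps partially read data out of the final address, which is the v1 concern.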
    • net: core: document two new elements of struct net_device · a93bdcb9
      Mauro Carvalho Chehab authored
      As warned by "make htmldocs", there are two new struct elements
      that aren't documented:
      
      	../include/linux/netdevice.h:2159: warning: Function parameter or member 'unlink_list' not described in 'net_device'
      	../include/linux/netdevice.h:2159: warning: Function parameter or member 'nested_level' not described in 'net_device'
      
      Fixes: 1fc70edb ("net: core: add nested_level variable in net_device")
      Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'pinctrl-v5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · d3d45f82
      Linus Torvalds authored
      Pull pin control fixes from Linus Walleij:
       "Some pin control fixes here. All of them are driver fixes, the Intel
        Cherryview being the most interesting one.
      
         - Fix a mux problem for I2C in the MVEBU driver.
      
         - Fix a really hairy inversion problem in the Intel Cherryview
           driver.
      
         - Fix the register for the sdc2_clk in the Qualcomm SM8250 driver.
      
         - Check the virtual GPIO boot failure in the Mediatek driver"
      
      * tag 'pinctrl-v5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
        pinctrl: mediatek: check mtk_is_virt_gpio input parameter
        pinctrl: qcom: sm8250: correct sdc2_clk
        pinctrl: cherryview: Preserve CHV_PADCTRL1_INVRXTX_TXDATA flag on GPIOs
        pinctrl: mvebu: Fix i2c sda definition for 98DX3236
    • Merge tag 'pci-v5.9-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci · 4d9c3a68
      Linus Torvalds authored
      Pull PCI fixes from Bjorn Helgaas:
      
       - Fix rockchip regression in rockchip_pcie_valid_device() (Lorenzo
         Pieralisi)
      
       - Add Pali Rohár as aardvark PCI maintainer (Pali Rohár)
      
      * tag 'pci-v5.9-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
        MAINTAINERS: Add Pali Rohár as aardvark PCI maintainer
        PCI: rockchip: Fix bus checks in rockchip_pcie_valid_device()
    • Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · cb6f55af
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "Two patches in driver frameworks. The iscsi one corrects a bug induced
        by a BPF change to network locking and the other is a regression we
        introduced"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: iscsi: iscsi_tcp: Avoid holding spinlock while calling getpeername()
        scsi: target: Fix lun lookup for TARGET_SCF_LOOKUP_LUN_FROM_TAG case
    • Merge tag 'io_uring-5.9-2020-10-02' of git://git.kernel.dk/linux-block · 702bfc89
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
      
       - fix for async buffered reads if read-ahead is fully disabled (Hao)
      
       - double poll match fix
      
       - ->show_fdinfo() potential ABBA deadlock complaint fix
      
      * tag 'io_uring-5.9-2020-10-02' of git://git.kernel.dk/linux-block:
        io_uring: fix async buffered reads when readahead is disabled
        io_uring: fix potential ABBA deadlock in ->show_fdinfo()
        io_uring: always delete double poll wait entry on match
    • Merge tag 'block-5.9-2020-10-02' of git://git.kernel.dk/linux-block · f016a540
      Linus Torvalds authored
      Pull block fix from Jens Axboe:
       "Single fix for a ->commit_rqs failure case"
      
      * tag 'block-5.9-2020-10-02' of git://git.kernel.dk/linux-block:
        blk-mq: call commit_rqs while list empty but error happen
    • netlink: fix policy dump leak · a95bc734
      Johannes Berg authored
      If userspace doesn't complete the policy dump, we leak the
      allocated state. Fix this.
      
      Fixes: d07dcf9a ("netlink: add infrastructure to expose policies to userspace")
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Reviewed-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5e: Fix race condition on nhe->n pointer in neigh update · 1253935a
      Vlad Buslov authored
      The current neigh update event handler implementation takes a reference
      to the neighbour structure, assigns it to nhe->n, tries to schedule a
      workqueue task, and releases the reference if the task was already
      enqueued. This can overwrite an existing nhe->n pointer with another
      neighbour instance, which causes a double release of that instance (once
      in the neigh update handler that failed to enqueue to the workqueue, and
      once more in the neigh update workqueue task that processes the updated
      nhe->n pointer instead of the original one):
      
      [ 3376.512806] ------------[ cut here ]------------
      [ 3376.513534] refcount_t: underflow; use-after-free.
      [ 3376.521213] Modules linked in: act_skbedit act_mirred act_tunnel_key vxlan ip6_udp_tunnel udp_tunnel nfnetlink act_gact cls_flower sch_ingress openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 mlx5_ib mlx5_core mlxfw pci_hyperv_intf ptp pps_core nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache ib_isert iscsi_target_mod ib_srpt target_core_mod ib_srp rpcrdma rdma_ucm ib_umad ib_ipoib ib_iser rdma_cm ib_cm iw_cm rfkill ib_uverbs ib_core sunrpc kvm_intel kvm iTCO_wdt iTCO_vendor_support virtio_net irqbypass net_failover crc32_pclmul lpc_ich i2c_i801 failover pcspkr i2c_smbus mfd_core ghash_clmulni_intel sch_fq_codel drm i2c_core ip_tables crc32c_intel serio_raw [last unloaded: mlxfw]
      [ 3376.529468] CPU: 8 PID: 22756 Comm: kworker/u20:5 Not tainted 5.9.0-rc5+ #6
      [ 3376.530399] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
      [ 3376.531975] Workqueue: mlx5e mlx5e_rep_neigh_update [mlx5_core]
      [ 3376.532820] RIP: 0010:refcount_warn_saturate+0xd8/0xe0
      [ 3376.533589] Code: ff 48 c7 c7 e0 b8 27 82 c6 05 0b b6 09 01 01 e8 94 93 c1 ff 0f 0b c3 48 c7 c7 88 b8 27 82 c6 05 f7 b5 09 01 01 e8 7e 93 c1 ff <0f> 0b c3 0f 1f 44 00 00 8b 07 3d 00 00 00 c0 74 12 83 f8 01 74 13
      [ 3376.536017] RSP: 0018:ffffc90002a97e30 EFLAGS: 00010286
      [ 3376.536793] RAX: 0000000000000000 RBX: ffff8882de30d648 RCX: 0000000000000000
      [ 3376.537718] RDX: ffff8882f5c28f20 RSI: ffff8882f5c18e40 RDI: ffff8882f5c18e40
      [ 3376.538654] RBP: ffff8882cdf56c00 R08: 000000000000c580 R09: 0000000000001a4d
      [ 3376.539582] R10: 0000000000000731 R11: ffffc90002a97ccd R12: 0000000000000000
      [ 3376.540519] R13: ffff8882de30d600 R14: ffff8882de30d640 R15: ffff88821e000900
      [ 3376.541444] FS:  0000000000000000(0000) GS:ffff8882f5c00000(0000) knlGS:0000000000000000
      [ 3376.542732] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 3376.543545] CR2: 0000556e5504b248 CR3: 00000002c6f10005 CR4: 0000000000770ee0
      [ 3376.544483] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 3376.545419] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [ 3376.546344] PKRU: 55555554
      [ 3376.546911] Call Trace:
      [ 3376.547479]  mlx5e_rep_neigh_update.cold+0x33/0xe2 [mlx5_core]
      [ 3376.548299]  process_one_work+0x1d8/0x390
      [ 3376.548977]  worker_thread+0x4d/0x3e0
      [ 3376.549631]  ? rescuer_thread+0x3e0/0x3e0
      [ 3376.550295]  kthread+0x118/0x130
      [ 3376.550914]  ? kthread_create_worker_on_cpu+0x70/0x70
      [ 3376.551675]  ret_from_fork+0x1f/0x30
      [ 3376.552312] ---[ end trace d84e8f46d2a77eec ]---
      
      Fix the bug by moving work_struct to a dedicated dynamically-allocated
      structure. This enables every event handler to work on its own private
      neighbour pointer and removes the need to handle the case when the task
      is already enqueued.
      
      Fixes: 232c0013 ("net/mlx5e: Add support to neighbour update flow")
      Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
      Reviewed-by: Roi Dayan <roid@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
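The shape of the fix (one dynamically allocated work item per event, each owning its own neighbour reference) can be illustrated with a small user-space model. This is a sketch under stated assumptions, not the mlx5e code: `struct neigh`, `struct update_work`, `on_neigh_event()` and `run_pending()` are hypothetical names, and a simple linked list stands in for the workqueue.

```c
#include <stdlib.h>

/* Hypothetical refcounted object standing in for a neighbour entry. */
struct neigh {
	int refcnt;
};

static void neigh_hold(struct neigh *n)    { n->refcnt++; }
static void neigh_release(struct neigh *n) { n->refcnt--; }

/* One work item per event; each item owns exactly one reference. */
struct update_work {
	struct neigh *n;
	struct update_work *next;
};

static struct update_work *pending;  /* stands in for the workqueue */

/* Event handler: never touches a shared pointer, never fails to enqueue. */
static int on_neigh_event(struct neigh *n)
{
	struct update_work *w = malloc(sizeof(*w));

	if (!w)
		return -1;
	neigh_hold(n);
	w->n = n;                /* private pointer for this event only */
	w->next = pending;
	pending = w;
	return 0;
}

/* Worker: processes each item and drops the reference that item took. */
static void run_pending(void)
{
	while (pending) {
		struct update_work *w = pending;

		pending = w->next;
		/* ... the update for w->n would be processed here ... */
		neigh_release(w->n);
		free(w);
	}
}
```

Since every hold is paired with exactly one release on the same item, events for the same or different neighbours cannot overwrite each other's pointer, which is the property the commit message describes.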