1. 13 Jan, 2019 40 commits
    • Nicholas Piggin's avatar
      powerpc: avoid -mno-sched-epilog on GCC 4.9 and newer · f716d5e5
      Nicholas Piggin authored
      commit 6977f95e upstream.
      Signed-off-by: default avatarNicholas Piggin <npiggin@gmail.com>
      Reviewed-by: default avatarJoel Stanley <joel@jms.id.au>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      [nc: Adjust context due to lack of f2910f0e and 2a056f58]
      Signed-off-by: default avatarNathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f716d5e5
    • Vasily Averin's avatar
      sunrpc: use SVC_NET() in svcauth_gss_* functions · aa71dcfe
      Vasily Averin authored
      commit b8be5674 upstream.
      Signed-off-by: default avatarVasily Averin <vvs@virtuozzo.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      aa71dcfe
    • Vasily Averin's avatar
      sunrpc: fix cache_head leak due to queued request · 76da0179
      Vasily Averin authored
      commit 4ecd55ea upstream.
      
      After commit d202cce8, an expired cache_head can be removed from the
      cache_detail's hash.
      
      However, the expired cache_head may be waiting for a reply from a
      previously submitted request. Such a cache_head has an increased
      refcounter and therefore it won't be freed after cache_put(freeme).
      
      Because the cache_head was removed from the hash it cannot be found
      during cache_clean() and can be leaked forever, together with stalled
      cache_request and other taken resources.
      
      In our case we noticed it because an entry in the export cache was
      holding a reference on a filesystem.
      
      Fixes d202cce8 ("sunrpc: never return expired entries in sunrpc_cache_lookup")
      Cc: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
      Cc: stable@kernel.org # 2.6.35
      Signed-off-by: default avatarVasily Averin <vvs@virtuozzo.com>
      Reviewed-by: default avatarNeilBrown <neilb@suse.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      76da0179
    • Huang Ying's avatar
      mm, swap: fix swapoff with KSM pages · 89b03877
      Huang Ying authored
      commit 7af7a8e1 upstream.
      
      KSM pages may be mapped to the multiple VMAs that cannot be reached from
      one anon_vma.  So during swapin, a new copy of the page need to be
      generated if a different anon_vma is needed, please refer to comments of
      ksm_might_need_to_copy() for details.
      
      During swapoff, unuse_vma() uses anon_vma (if available) to locate VMA and
      virtual address mapped to the page, so not all mappings to a swapped out
      KSM page could be found.  So in try_to_unuse(), even if the swap count of
      a swap entry isn't zero, the page needs to be deleted from swap cache, so
      that, in the next round a new page could be allocated and swapin for the
      other mappings of the swapped out KSM page.
      
      But this contradicts with the THP swap support.  Where the THP could be
      deleted from swap cache only after the swap count of every swap entry in
      the huge swap cluster backing the THP has reach 0.  So try_to_unuse() is
      changed in commit e0709829 ("mm, THP, swap: support to reclaim swap
      space for THP swapped out") to check that before delete a page from swap
      cache, but this has broken KSM swapoff too.
      
      Fortunately, KSM is for the normal pages only, so the original behavior
      for KSM pages could be restored easily via checking PageTransCompound().
      That is how this patch works.
      
      The bug is introduced by e0709829 ("mm, THP, swap: support to reclaim
      swap space for THP swapped out"), which is merged by v4.14-rc1.  So I
      think we should backport the fix to from 4.14 on.  But Hugh thinks it may
      be rare for the KSM pages being in the swap device when swapoff, so nobody
      reports the bug so far.
      
      Link: http://lkml.kernel.org/r/20181226051522.28442-1-ying.huang@intel.com
      Fixes: e0709829 ("mm, THP, swap: support to reclaim swap space for THP swapped out")
      Signed-off-by: default avatar"Huang, Ying" <ying.huang@intel.com>
      Reported-by: default avatarHugh Dickins <hughd@google.com>
      Tested-by: default avatarHugh Dickins <hughd@google.com>
      Acked-by: default avatarHugh Dickins <hughd@google.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Shaohua Li <shli@kernel.org>
      Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      89b03877
    • Dan Williams's avatar
      mm, hmm: mark hmm_devmem_{add, add_resource} EXPORT_SYMBOL_GPL · c5a2c79d
      Dan Williams authored
      commit 02917e9f upstream.
      
      At Maintainer Summit, Greg brought up a topic I proposed around
      EXPORT_SYMBOL_GPL usage.  The motivation was considerations for when
      EXPORT_SYMBOL_GPL is warranted and the criteria for taking the exceptional
      step of reclassifying an existing export.  Specifically, I wanted to make
      the case that although the line is fuzzy and hard to specify in abstract
      terms, it is nonetheless clear that devm_memremap_pages() and HMM
      (Heterogeneous Memory Management) have crossed it.  The
      devm_memremap_pages() facility should have been EXPORT_SYMBOL_GPL from the
      beginning, and HMM as a derivative of that functionality should have
      naturally picked up that designation as well.
      
      Contrary to typical rules, the HMM infrastructure was merged upstream with
      zero in-tree consumers.  There was a promise at the time that those users
      would be merged "soon", but it has been over a year with no drivers
      arriving.  While the Nouveau driver is about to belatedly make good on
      that promise it is clear that HMM was targeted first and foremost at an
      out-of-tree consumer.
      
      HMM is derived from devm_memremap_pages(), a facility Christoph and I
      spearheaded to support persistent memory.  It combines a device lifetime
      model with a dynamically created 'struct page' / memmap array for any
      physical address range.  It enables coordination and control of the many
      code paths in the kernel built to interact with memory via 'struct page'
      objects.  With HMM the integration goes even deeper by allowing device
      drivers to hook and manipulate page fault and page free events.
      
      One interpretation of when EXPORT_SYMBOL is suitable is when it is
      exporting stable and generic leaf functionality.  The
      devm_memremap_pages() facility continues to see expanding use cases,
      peer-to-peer DMA being the most recent, with no clear end date when it
      will stop attracting reworks and semantic changes.  It is not suitable to
      export devm_memremap_pages() as a stable 3rd party driver API due to the
      fact that it is still changing and manipulates core behavior.  Moreover,
      it is not in the best interest of the long term development of the core
      memory management subsystem to permit any external driver to effectively
      define its own system-wide memory management policies with no
      encouragement to engage with upstream.
      
      I am also concerned that HMM was designed in a way to minimize further
      engagement with the core-MM.  That, with these hooks in place,
      device-drivers are free to implement their own policies without much
      consideration for whether and how the core-MM could grow to meet that
      need.  Going forward not only should HMM be EXPORT_SYMBOL_GPL, but the
      core-MM should be allowed the opportunity and stimulus to change and
      address these new use cases as first class functionality.
      
      Original changelog:
      
      hmm_devmem_add(), and hmm_devmem_add_resource() duplicated
      devm_memremap_pages() and are now simple now wrappers around the core
      facility to inject a dev_pagemap instance into the global pgmap_radix and
      hook page-idle events.  The devm_memremap_pages() interface is base
      infrastructure for HMM.  HMM has more and deeper ties into the kernel
      memory management implementation than base ZONE_DEVICE which is itself a
      EXPORT_SYMBOL_GPL facility.
      
      Originally, the HMM page structure creation routines copied the
      devm_memremap_pages() code and reused ZONE_DEVICE.  A cleanup to unify the
      implementations was discussed during the initial review:
      http://lkml.iu.edu/hypermail/linux/kernel/1701.2/00812.html Recent work to
      extend devm_memremap_pages() for the peer-to-peer-DMA facility enabled
      this cleanup to move forward.
      
      In addition to the integration with devm_memremap_pages() HMM depends on
      other GPL-only symbols:
      
          mmu_notifier_unregister_no_release
          percpu_ref
          region_intersects
          __class_create
      
      It goes further to consume / indirectly expose functionality that is not
      exported to any other driver:
      
          alloc_pages_vma
          walk_page_range
      
      HMM is derived from devm_memremap_pages(), and extends deep core-kernel
      fundamentals. Similar to devm_memremap_pages(), mark its entry points
      EXPORT_SYMBOL_GPL().
      
      [logang@deltatee.com: PCI/P2PDMA: match interface changes to devm_memremap_pages()]
        Link: http://lkml.kernel.org/r/20181130225911.2900-1-logang@deltatee.com
      Link: http://lkml.kernel.org/r/154275560565.76910.15919297436557795278.stgit@dwillia2-desk3.amr.corp.intel.comSigned-off-by: default avatarDan Williams <dan.j.williams@intel.com>
      Signed-off-by: default avatarLogan Gunthorpe <logang@deltatee.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Cc: Logan Gunthorpe <logang@deltatee.com>
      Cc: "Jérôme Glisse" <jglisse@redhat.com>
      Cc: Balbir Singh <bsingharora@gmail.com>,
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c5a2c79d
    • Dan Williams's avatar
      mm, hmm: use devm semantics for hmm_devmem_{add, remove} · 465c5cf0
      Dan Williams authored
      commit 58ef15b7 upstream.
      
      devm semantics arrange for resources to be torn down when
      device-driver-probe fails or when device-driver-release completes.
      Similar to devm_memremap_pages() there is no need to support an explicit
      remove operation when the users properly adhere to devm semantics.
      
      Note that devm_kzalloc() automatically handles allocating node-local
      memory.
      
      Link: http://lkml.kernel.org/r/154275559545.76910.9186690723515469051.stgit@dwillia2-desk3.amr.corp.intel.comSigned-off-by: default avatarDan Williams <dan.j.williams@intel.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarJérôme Glisse <jglisse@redhat.com>
      Cc: "Jérôme Glisse" <jglisse@redhat.com>
      Cc: Logan Gunthorpe <logang@deltatee.com>
      Cc: Balbir Singh <bsingharora@gmail.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      465c5cf0
    • Dan Williams's avatar
      mm, devm_memremap_pages: kill mapping "System RAM" support · 14327542
      Dan Williams authored
      commit 06489cfb upstream.
      
      Given the fact that devm_memremap_pages() requires a percpu_ref that is
      torn down by devm_memremap_pages_release() the current support for mapping
      RAM is broken.
      
      Support for remapping "System RAM" has been broken since the beginning and
      there is no existing user of this this code path, so just kill the support
      and make it an explicit error.
      
      This cleanup also simplifies a follow-on patch to fix the error path when
      setting a devm release action for devm_memremap_pages_release() fails.
      
      Link: http://lkml.kernel.org/r/154275557997.76910.14689813630968180480.stgit@dwillia2-desk3.amr.corp.intel.comSigned-off-by: default avatarDan Williams <dan.j.williams@intel.com>
      Reviewed-by: default avatar"Jérôme Glisse" <jglisse@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarLogan Gunthorpe <logang@deltatee.com>
      Cc: Balbir Singh <bsingharora@gmail.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      14327542
    • Dan Williams's avatar
      mm, devm_memremap_pages: mark devm_memremap_pages() EXPORT_SYMBOL_GPL · 47d24f8c
      Dan Williams authored
      commit 808153e1 upstream.
      
      devm_memremap_pages() is a facility that can create struct page entries
      for any arbitrary range and give drivers the ability to subvert core
      aspects of page management.
      
      Specifically the facility is tightly integrated with the kernel's memory
      hotplug functionality.  It injects an altmap argument deep into the
      architecture specific vmemmap implementation to allow allocating from
      specific reserved pages, and it has Linux specific assumptions about page
      structure reference counting relative to get_user_pages() and
      get_user_pages_fast().  It was an oversight and a mistake that this was
      not marked EXPORT_SYMBOL_GPL from the outset.
      
      Again, devm_memremap_pagex() exposes and relies upon core kernel internal
      assumptions and will continue to evolve along with 'struct page', memory
      hotplug, and support for new memory types / topologies.  Only an in-kernel
      GPL-only driver is expected to keep up with this ongoing evolution.  This
      interface, and functionality derived from this interface, is not suitable
      for kernel-external drivers.
      
      Link: http://lkml.kernel.org/r/154275557457.76910.16923571232582744134.stgit@dwillia2-desk3.amr.corp.intel.comSigned-off-by: default avatarDan Williams <dan.j.williams@intel.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Cc: "Jérôme Glisse" <jglisse@redhat.com>
      Cc: Balbir Singh <bsingharora@gmail.com>
      Cc: Logan Gunthorpe <logang@deltatee.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      47d24f8c
    • Michal Hocko's avatar
      hwpoison, memory_hotplug: allow hwpoisoned pages to be offlined · 2c25071b
      Michal Hocko authored
      commit b15c8726 upstream.
      
      We have received a bug report that an injected MCE about faulty memory
      prevents memory offline to succeed on 4.4 base kernel.  The underlying
      reason was that the HWPoison page has an elevated reference count and the
      migration keeps failing.  There are two problems with that.  First of all
      it is dubious to migrate the poisoned page because we know that accessing
      that memory is possible to fail.  Secondly it doesn't make any sense to
      migrate a potentially broken content and preserve the memory corruption
      over to a new location.
      
      Oscar has found out that 4.4 and the current upstream kernels behave
      slightly differently with his simply testcase
      
      ===
      
      int main(void)
      {
              int ret;
              int i;
              int fd;
              char *array = malloc(4096);
              char *array_locked = malloc(4096);
      
              fd = open("/tmp/data", O_RDONLY);
              read(fd, array, 4095);
      
              for (i = 0; i < 4096; i++)
                      array_locked[i] = 'd';
      
              ret = mlock((void *)PAGE_ALIGN((unsigned long)array_locked), sizeof(array_locked));
              if (ret)
                      perror("mlock");
      
              sleep (20);
      
              ret = madvise((void *)PAGE_ALIGN((unsigned long)array_locked), 4096, MADV_HWPOISON);
              if (ret)
                      perror("madvise");
      
              for (i = 0; i < 4096; i++)
                      array_locked[i] = 'd';
      
              return 0;
      }
      ===
      
      + offline this memory.
      
      In 4.4 kernels he saw the hwpoisoned page to be returned back to the LRU
      list
      kernel:  [<ffffffff81019ac9>] dump_trace+0x59/0x340
      kernel:  [<ffffffff81019e9a>] show_stack_log_lvl+0xea/0x170
      kernel:  [<ffffffff8101ac71>] show_stack+0x21/0x40
      kernel:  [<ffffffff8132bb90>] dump_stack+0x5c/0x7c
      kernel:  [<ffffffff810815a1>] warn_slowpath_common+0x81/0xb0
      kernel:  [<ffffffff811a275c>] __pagevec_lru_add_fn+0x14c/0x160
      kernel:  [<ffffffff811a2eed>] pagevec_lru_move_fn+0xad/0x100
      kernel:  [<ffffffff811a334c>] __lru_cache_add+0x6c/0xb0
      kernel:  [<ffffffff81195236>] add_to_page_cache_lru+0x46/0x70
      kernel:  [<ffffffffa02b4373>] extent_readpages+0xc3/0x1a0 [btrfs]
      kernel:  [<ffffffff811a16d7>] __do_page_cache_readahead+0x177/0x200
      kernel:  [<ffffffff811a18c8>] ondemand_readahead+0x168/0x2a0
      kernel:  [<ffffffff8119673f>] generic_file_read_iter+0x41f/0x660
      kernel:  [<ffffffff8120e50d>] __vfs_read+0xcd/0x140
      kernel:  [<ffffffff8120e9ea>] vfs_read+0x7a/0x120
      kernel:  [<ffffffff8121404b>] kernel_read+0x3b/0x50
      kernel:  [<ffffffff81215c80>] do_execveat_common.isra.29+0x490/0x6f0
      kernel:  [<ffffffff81215f08>] do_execve+0x28/0x30
      kernel:  [<ffffffff81095ddb>] call_usermodehelper_exec_async+0xfb/0x130
      kernel:  [<ffffffff8161c045>] ret_from_fork+0x55/0x80
      
      And that latter confuses the hotremove path because an LRU page is
      attempted to be migrated and that fails due to an elevated reference
      count.  It is quite possible that the reuse of the HWPoisoned page is some
      kind of fixed race condition but I am not really sure about that.
      
      With the upstream kernel the failure is slightly different.  The page
      doesn't seem to have LRU bit set but isolate_movable_page simply fails and
      do_migrate_range simply puts all the isolated pages back to LRU and
      therefore no progress is made and scan_movable_pages finds same set of
      pages over and over again.
      
      Fix both cases by explicitly checking HWPoisoned pages before we even try
      to get reference on the page, try to unmap it if it is still mapped.  As
      explained by Naoya:
      
      : Hwpoison code never unmapped those for no big reason because
      : Ksm pages never dominate memory, so we simply didn't have strong
      : motivation to save the pages.
      
      Also put WARN_ON(PageLRU) in case there is a race and we can hit LRU
      HWPoison pages which shouldn't happen but I couldn't convince myself about
      that.  Naoya has noted the following:
      
      : Theoretically no such gurantee, because try_to_unmap() doesn't have a
      : guarantee of success and then memory_failure() returns immediately
      : when hwpoison_user_mappings fails.
      : Or the following code (comes after hwpoison_user_mappings block) also impli=
      : es
      : that the target page can still have PageLRU flag.
      :
      :         /*
      :          * Torn down by someone else?
      :          */
      :         if (PageLRU(p) && !PageSwapCache(p) && p->mapping =3D=3D NULL) {
      :                 action_result(pfn, MF_MSG_TRUNCATED_LRU, MF_IGNORED);
      :                 res =3D -EBUSY;
      :                 goto out;
      :         }
      :
      : So I think it's OK to keep "if (WARN_ON(PageLRU(page)))" block in
      : current version of your patch.
      
      Link: http://lkml.kernel.org/r/20181206120135.14079-1-mhocko@kernel.orgSigned-off-by: default avatarMichal Hocko <mhocko@suse.com>
      Reviewed-by: default avatarOscar Salvador <osalvador@suse.com>
      Debugged-by: default avatarOscar Salvador <osalvador@suse.com>
      Tested-by: default avatarOscar Salvador <osalvador@suse.com>
      Acked-by: default avatarDavid Hildenbrand <david@redhat.com>
      Acked-by: default avatarNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2c25071b
    • Minchan Kim's avatar
      zram: fix double free backing device · ce8daa28
      Minchan Kim authored
      commit 5547932d upstream.
      
      If blkdev_get fails, we shouldn't do blkdev_put.  Otherwise, kernel emits
      below log.  This patch fixes it.
      
        WARNING: CPU: 0 PID: 1893 at fs/block_dev.c:1828 blkdev_put+0x105/0x120
        Modules linked in:
        CPU: 0 PID: 1893 Comm: swapoff Not tainted 4.19.0+ #453
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
        RIP: 0010:blkdev_put+0x105/0x120
        Call Trace:
          __x64_sys_swapoff+0x46d/0x490
          do_syscall_64+0x5a/0x190
          entry_SYSCALL_64_after_hwframe+0x49/0xbe
        irq event stamp: 4466
        hardirqs last  enabled at (4465):  __free_pages_ok+0x1e3/0x490
        hardirqs last disabled at (4466):  trace_hardirqs_off_thunk+0x1a/0x1c
        softirqs last  enabled at (3420):  __do_softirq+0x333/0x446
        softirqs last disabled at (3407):  irq_exit+0xd1/0xe0
      
      Link: http://lkml.kernel.org/r/20181127055429.251614-3-minchan@kernel.orgSigned-off-by: default avatarMinchan Kim <minchan@kernel.org>
      Reviewed-by: default avatarSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Reviewed-by: default avatarJoey Pabalinas <joeypabalinas@gmail.com>
      Cc: <stable@vger.kernel.org>	[4.14+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ce8daa28
    • David Herrmann's avatar
      fork: record start_time late · 3f2e4e1d
      David Herrmann authored
      commit 7b558513 upstream.
      
      This changes the fork(2) syscall to record the process start_time after
      initializing the basic task structure but still before making the new
      process visible to user-space.
      
      Technically, we could record the start_time anytime during fork(2).  But
      this might lead to scenarios where a start_time is recorded long before
      a process becomes visible to user-space.  For instance, with
      userfaultfd(2) and TLS, user-space can delay the execution of fork(2)
      for an indefinite amount of time (and will, if this causes network
      access, or similar).
      
      By recording the start_time late, it much closer reflects the point in
      time where the process becomes live and can be observed by other
      processes.
      
      Lastly, this makes it much harder for user-space to predict and control
      the start_time they get assigned.  Previously, user-space could fork a
      process and stall it in copy_thread_tls() before its pid is allocated,
      but after its start_time is recorded.  This can be misused to later-on
      cycle through PIDs and resume the stalled fork(2) yielding a process
      that has the same pid and start_time as a process that existed before.
      This can be used to circumvent security systems that identify processes
      by their pid+start_time combination.
      
      Even though user-space was always aware that start_time recording is
      flaky (but several projects are known to still rely on start_time-based
      identification), changing the start_time to be recorded late will help
      mitigate existing attacks and make it much harder for user-space to
      control the start_time a process gets assigned.
      Reported-by: default avatarJann Horn <jannh@google.com>
      Signed-off-by: default avatarTom Gundersen <teg@jklm.no>
      Signed-off-by: default avatarDavid Herrmann <dh.herrmann@gmail.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3f2e4e1d
    • Martin Kelly's avatar
      tools: fix cross-compile var clobbering · 5ee254ef
      Martin Kelly authored
      commit 7ed1c190 upstream.
      
      Currently a number of Makefiles break when used with toolchains that
      pass extra flags in CC and other cross-compile related variables (such
      as --sysroot).
      
      Thus we get this error when we use a toolchain that puts --sysroot in
      the CC var:
      
        ~/src/linux/tools$ make iio
        [snip]
        iio_event_monitor.c:18:10: fatal error: unistd.h: No such file or directory
          #include <unistd.h>
                   ^~~~~~~~~~
      
      This occurs because we clobber several env vars related to
      cross-compiling with lines like this:
      
        CC = $(CROSS_COMPILE)gcc
      
      Although this will point to a valid cross-compiler, we lose any extra
      flags that might exist in the CC variable, which can break toolchains
      that rely on them (for example, those that use --sysroot).
      
      This easily shows up using a Yocto SDK:
      
        $ . [snip]/sdk/environment-setup-cortexa8hf-neon-poky-linux-gnueabi
      
        $ echo $CC
        arm-poky-linux-gnueabi-gcc -march=armv7-a -mfpu=neon -mfloat-abi=hard
        -mcpu=cortex-a8
        --sysroot=[snip]/sdk/sysroots/cortexa8hf-neon-poky-linux-gnueabi
      
        $ echo $CROSS_COMPILE
        arm-poky-linux-gnueabi-
      
        $ echo ${CROSS_COMPILE}gcc
        krm-poky-linux-gnueabi-gcc
      
      Although arm-poky-linux-gnueabi-gcc is a cross-compiler, we've lost the
      --sysroot and other flags that enable us to find the right libraries to
      link against, so we can't find unistd.h and other libraries and headers.
      Normally with the --sysroot flag we would find unistd.h in the sdk
      directory in the sysroot:
      
        $ find [snip]/sdk/sysroots -path '*/usr/include/unistd.h'
        [snip]/sdk/sysroots/cortexa8hf-neon-poky-linux-gnueabi/usr/include/unistd.h
      
      The perf Makefile adds CC = $(CROSS_COMPILE)gcc if and only if CC is not
      already set, and it compiles correctly with the above toolchain.
      
      So, generalize the logic that perf uses in the common Makefile and
      remove the manual CC = $(CROSS_COMPILE)gcc lines from each Makefile.
      
      Note that this patch does not fix cross-compile for all the tools (some
      have other bugs), but it does fix it for all except usb and acpi, which
      still have other unrelated issues.
      
      I tested both with and without the patch on native and cross-build and
      there appear to be no regressions.
      
      Link: http://lkml.kernel.org/r/20180107214028.23771-1-martin@martingkelly.comSigned-off-by: default avatarMartin Kelly <martin@martingkelly.com>
      Acked-by: default avatarMark Brown <broonie@kernel.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Linus Walleij <linus.walleij@linaro.org>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: Stephen Hemminger <sthemmin@microsoft.com>
      Cc: Jonathan Cameron <jic23@kernel.org>
      Cc: Pali Rohar <pali.rohar@gmail.com>
      Cc: Richard Purdie <rpurdie@rpsys.net>
      Cc: Jacek Anaszewski <jacek.anaszewski@gmail.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Robert Moore <robert.moore@intel.com>
      Cc: Lv Zheng <lv.zheng@intel.com>
      Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Valentina Manea <valentina.manea.m@gmail.com>
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: Mario Limonciello <mario.limonciello@dell.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Ignat Korchagin <ignat@cloudflare.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5ee254ef
    • Thomas Gleixner's avatar
      genirq/affinity: Don't return with empty affinity masks on error · 76369ed5
      Thomas Gleixner authored
      commit 0211e12d upstream.
      
      When the allocation of node_to_possible_cpumask fails, then
      irq_create_affinity_masks() returns with a pointer to the empty affinity
      masks array, which will cause malfunction.
      
      Reorder the allocations so the masks array allocation comes last and every
      failure path returns NULL.
      
      Fixes: 9a0ef98e ("genirq/affinity: Assign vectors to all present CPUs")
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Ming Lei <ming.lei@redhat.com>
      Cc: Mihai Carabas <mihai.carabas@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      76369ed5
    • Ewan D. Milne's avatar
      scsi: lpfc: do not set queue->page_count to 0 if pc_sli4_params.wqpcnt is invalid · 0840feb3
      Ewan D. Milne authored
      commit 4e87eb2f upstream.
      
      Certain older adapters such as the OneConnect OCe10100 may not have a valid
      wqpcnt value.  In this case, do not set queue->page_count to 0 in
      lpfc_sli4_queue_alloc() as this will prevent the driver from initializing.
      
      Fixes: 895427bd ("scsi: lpfc: NVME Initiator: Base modifications")
      Cc: stable@vger.kernel.org # 4.11+
      Signed-off-by: default avatarEwan D. Milne <emilne@redhat.com>
      Reviewed-by: default avatarLaurence Oberman <loberman@redhat.com>
      Tested-by: default avatarLaurence Oberman <loberman@redhat.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0840feb3
    • Steffen Maier's avatar
      scsi: zfcp: fix posting too many status read buffers leading to adapter shutdown · 8eb9f245
      Steffen Maier authored
      commit 60a161b7 upstream.
      
      Suppose adapter (open) recovery is between opened QDIO queues and before
      (the end of) initial posting of status read buffers (SRBs). This time
      window can be seconds long due to FSF_PROT_HOST_CONNECTION_INITIALIZING
      causing by design looping with exponential increase sleeps in the function
      performing exchange config data during recovery
      [zfcp_erp_adapter_strat_fsf_xconf()]. Recovery triggered by local link up.
      
      Suppose an event occurs for which the FCP channel would send an unsolicited
      notification to zfcp by means of a previously posted SRB.  We saw it with
      local cable pull (link down) in multi-initiator zoning with multiple
      NPIV-enabled subchannels of the same shared FCP channel.
      
      As soon as zfcp_erp_adapter_strategy_open_fsf() starts posting the initial
      status read buffers from within the adapter's ERP thread, the channel does
      send an unsolicited notification.
      
      Since v2.6.27 commit d26ab06e ("[SCSI] zfcp: receiving an unsolicted
      status can lead to I/O stall"), zfcp_fsf_status_read_handler() schedules
      adapter->stat_work to re-fill the just consumed SRB from a work item.
      
      Now the ERP thread and the work item post SRBs in parallel.  Both contexts
      call the helper function zfcp_status_read_refill().  The tracking of
      missing (to be posted / re-filled) SRBs is not thread-safe due to separate
      atomic_read() and atomic_dec(), in order to depend on posting
      success. Hence, both contexts can see
      atomic_read(&adapter->stat_miss) == 1. One of the two contexts posts
      one too many SRB. Zfcp gets QDIO_ERROR_SLSB_STATE on the output queue
      (trace tag "qdireq1") leading to zfcp_erp_adapter_shutdown() in
      zfcp_qdio_handler_error().
      
      An obvious and seemingly clean fix would be to schedule stat_work from the
      ERP thread and wait for it to finish. This would serialize all SRB
      re-fills. However, we already have another work item wait on the ERP
      thread: adapter->scan_work runs zfcp_fc_scan_ports() which calls
      zfcp_fc_eval_gpn_ft(). The latter calls zfcp_erp_wait() to wait for all the
      open port recoveries during zfcp auto port scan, but in fact it waits for
      any pending recovery including an adapter recovery. This approach leads to
      a deadlock.  [see also v3.19 commit 18f87a67 ("zfcp: auto port scan
      resiliency"); v2.6.37 commit d3e1088d
      ("[SCSI] zfcp: No ERP escalation on gpn_ft eval");
      v2.6.28 commit fca55b6f
      ("[SCSI] zfcp: fix deadlock between wq triggered port scan and ERP")
      fixing v2.6.27 commit c57a39a4
      ("[SCSI] zfcp: wait until adapter is finished with ERP during auto-port");
      v2.6.27 commit cc8c2829
      ("[SCSI] zfcp: Automatically attach remote ports")]
      
      Instead make the accounting of missing SRBs atomic for parallel execution
      in both the ERP thread and adapter->stat_work.
      Signed-off-by: default avatarSteffen Maier <maier@linux.ibm.com>
      Fixes: d26ab06e ("[SCSI] zfcp: receiving an unsolicted status can lead to I/O stall")
      Cc: <stable@vger.kernel.org> #2.6.27+
      Reviewed-by: default avatarJens Remus <jremus@linux.ibm.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8eb9f245
    • Yangtao Li's avatar
      serial/sunsu: fix refcount leak · d87abc2a
      Yangtao Li authored
      [ Upstream commit d430aff8 ]
      
      The function of_find_node_by_path() acquires a reference to the node
      returned by it and that reference needs to be dropped by its caller.
      
      su_get_type() doesn't do that. The match node are used as an identifier
      to compare against the current node, so we can directly drop the refcount
      after getting the node from the path as it is not used as pointer.
      
      Fix this by use a single variable and drop the refcount right after
      of_find_node_by_path().
      Signed-off-by: default avatarYangtao Li <tiny.windzz@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      d87abc2a
    • Daniele Palmas's avatar
      qmi_wwan: Fix qmap header retrieval in qmimux_rx_fixup · be899fe4
      Daniele Palmas authored
      [ Upstream commit d667044f ]
      
      This patch fixes qmap header retrieval when modem is configured for
      dl data aggregation.
      Signed-off-by: default avatarDaniele Palmas <dnlplm@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      be899fe4
    • Kangjie Lu's avatar
      net: netxen: fix a missing check and an uninitialized use · 29b7def0
      Kangjie Lu authored
      [ Upstream commit d134e486 ]
      
      When netxen_rom_fast_read() fails, "bios" is left uninitialized and may
      contain random value, thus should not be used.
      
      The fix ensures that if netxen_rom_fast_read() fails, we return "-EIO".
      Signed-off-by: default avatarKangjie Lu <kjlu@umn.edu>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      29b7def0
    • Mantas Mikulėnas's avatar
      Input: synaptics - enable SMBus for HP EliteBook 840 G4 · dd707bdc
      Mantas Mikulėnas authored
      [ Upstream commit 7a717122 ]
      
      dmesg reports that "Your touchpad (PNP: SYN3052 SYN0100 SYN0002 PNP0f13)
      says it can support a different bus."
      
      I've tested the offered psmouse.synaptics_intertouch=1 with 4.18.x and
      4.19.x and it seems to work well. No problems seen with suspend/resume.
      
      Also, it appears that RMI/SMBus mode is actually required for 3-4 finger
      multitouch gestures to work -- otherwise they are not reported at all.
      
      Information from dmesg in both modes:
      
        psmouse serio3: synaptics: Touchpad model: 1, fw: 8.2, id: 0x1e2b1,
            caps: 0xf00123/0x840300/0x2e800/0x0, board id: 3139, fw id: 2000742
      
        psmouse serio3: synaptics: Trying to set up SMBus access
        rmi4_smbus 6-002c: registering SMbus-connected sensor
        rmi4_f01 rmi4-00.fn01: found RMI device,
            manufacturer: Synaptics, product: TM3139-001, fw id: 2000742
      Signed-off-by: default avatarMantas Mikulėnas <grawity@gmail.com>
      Reviewed-by: default avatarBenjamin Tissoires <benjamin.tissoires@redhat.com>
      Signed-off-by: default avatarDmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      dd707bdc
    • Uwe Kleine-König's avatar
      gpio: mvebu: only fail on missing clk if pwm is actually to be used · 7bee9f9a
      Uwe Kleine-König authored
      [ Upstream commit c8da642d ]
      
      The gpio IP on Armada 370 at offset 0x18180 has neither a clk nor pwm
      registers. So there is no need for a clk as the pwm isn't used anyhow.
      So only check for the clk in the presence of the pwm registers. This fixes
      a failure to probe the gpio driver for the above mentioned gpio device.
      
      Fixes: 757642f9 ("gpio: mvebu: Add limited PWM support")
      Signed-off-by: default avatarUwe Kleine-König <u.kleine-koenig@pengutronix.de>
      Reviewed-by: default avatarGregory CLEMENT <gregory.clement@bootlin.com>
      Signed-off-by: default avatarLinus Walleij <linus.walleij@linaro.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      7bee9f9a
    • Michael S. Tsirkin's avatar
      virtio: fix test build after uio.h change · 9adf9d71
      Michael S. Tsirkin authored
      [ Upstream commit c5c08bed ]
      
      Fixes: d3849953 ("fs: decouple READ and WRITE from the block layer ops")
      Signed-off-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      9adf9d71
    • Masahiro Yamada's avatar
      kbuild: fix false positive warning/error about missing libelf · b08f331e
      Masahiro Yamada authored
      [ Upstream commit ef7cfd00 ]
      
      For the same reason as commit 25896d07 ("x86/build: Fix compiler
      support check for CONFIG_RETPOLINE"), you cannot put this $(error ...)
      into the parse stage of the top Makefile.
      
      Perhaps I'd propose a more sophisticated solution later, but this is
      the best I can do for now.
      
      Link: https://lkml.org/lkml/2017/12/25/211Reported-by: default avatarPaul Gortmaker <paul.gortmaker@windriver.com>
      Reported-by: default avatarBernd Edlinger <bernd.edlinger@hotmail.de>
      Reported-by: default avatarQian Cai <cai@lca.pw>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: default avatarMasahiro Yamada <yamada.masahiro@socionext.com>
      Tested-by: default avatarQian Cai <cai@lca.pw>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      b08f331e
    • Sara Sharon's avatar
      mac80211: free skb fraglist before freeing the skb · bb2509c1
      Sara Sharon authored
      [ Upstream commit 34b1e0e9 ]
      
      mac80211 uses the frag list to build AMSDU. When freeing
      the skb, it may not be really freed, since someone is still
      holding a reference to it.
      In that case, when TCP skb is being retransmitted, the
      pointer to the frag list is being reused, while the data
      in there is no longer valid.
      Since we will never get frag list from the network stack,
      as mac80211 doesn't advertise the capability, we can safely
      free and nullify it before releasing the SKB.
      Signed-off-by: default avatarSara Sharon <sara.sharon@intel.com>
      Signed-off-by: default avatarLuca Coelho <luciano.coelho@intel.com>
      Signed-off-by: default avatarJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      bb2509c1
    • Colin Ian King's avatar
      vxge: ensure data0 is initialized in when fetching firmware version information · fb34793c
      Colin Ian King authored
      [ Upstream commit f7db2beb ]
      
      Currently variable data0 is not being initialized so a garbage value is
      being passed to vxge_hw_vpath_fw_api and this value is being written to
      the rts_access_steer_data0 register.  There are other occurrances where
      data0 is being initialized to zero (e.g. in function
      vxge_hw_upgrade_read_version) so I think it makes sense to ensure data0
      is initialized likewise to 0.
      
      Detected by CoverityScan, CID#140696 ("Uninitialized scalar variable")
      
      Fixes: 8424e00d ("vxge: serialize access to steering control register")
      Signed-off-by: default avatarColin Ian King <colin.king@canonical.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      fb34793c
    • Jason Martinsen's avatar
      lan78xx: Resolve issue with changing MAC address · 9c5239ee
      Jason Martinsen authored
      [ Upstream commit 15515aaa ]
      
      Current state for the lan78xx driver does not allow for changing the
      MAC address of the interface, without either removing the module (if
      you compiled it that way) or rebooting the machine.  If you attempt to
      change the MAC address, ifconfig will show the new address, however,
      the system/interface will not respond to any traffic using that
      configuration.  A few short-term options to work around this are to
      unload the module and reload it with the new MAC address, change the
      interface to "promisc", or reboot with the correct configuration to
      change the MAC.
      
      This patch enables the ability to change the MAC address via fairly normal means...
      ifdown <interface>
      modify entry in /etc/network/interfaces OR a similar method
      ifup <interface>
      Then test via any network communication, such as ICMP requests to gateway.
      
      My only test platform for this patch has been a raspberry pi model 3b+.
      Signed-off-by: default avatarJason Martinsen <jasonmartinsen@msn.com>
      
      -----
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      9c5239ee
    • Anssi Hannula's avatar
      net: macb: fix dropped RX frames due to a race · 4d45ed2d
      Anssi Hannula authored
      [ Upstream commit 8159ecab ]
      
      Bit RX_USED set to 0 in the address field allows the controller to write
      data to the receive buffer descriptor.
      
      The driver does not ensure the ctrl field is ready (cleared) when the
      controller sees the RX_USED=0 written by the driver. The ctrl field might
      only be cleared after the controller has already updated it according to
      a newly received frame, causing the frame to be discarded in gem_rx() due
      to unexpected ctrl field contents.
      
      A message is logged when the above scenario occurs:
      
        macb ff0b0000.ethernet eth0: not whole frame pointed by descriptor
      
      Fix the issue by ensuring that when the controller sees RX_USED=0 the
      ctrl field is already cleared.
      
      This issue was observed on a ZynqMP based system.
      
      Fixes: 4df95131 ("net/macb: change RX path for GEM")
      Signed-off-by: default avatarAnssi Hannula <anssi.hannula@bitwise.fi>
      Tested-by: default avatarClaudiu Beznea <claudiu.beznea@microchip.com>
      Cc: Nicolas Ferre <nicolas.ferre@microchip.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      4d45ed2d
    • Anssi Hannula's avatar
      net: macb: fix random memory corruption on RX with 64-bit DMA · 358e56b2
      Anssi Hannula authored
      [ Upstream commit e100a897 ]
      
      64-bit DMA addresses are split in upper and lower halves that are
      written in separate fields on GEM. For RX, bit 0 of the address is used
      as the ownership bit (RX_USED). When the RX_USED bit is unset the
      controller is allowed to write data to the buffer.
      
      The driver does not guarantee that the controller already sees the upper
      half when the RX_USED bit is cleared, possibly resulting in the
      controller writing an incoming frame to an address with an incorrect
      upper half and therefore possibly corrupting unrelated system memory.
      
      Fix that by adding the necessary DMA memory barrier between the writes.
      
      This corruption was observed on a ZynqMP based system.
      
      Fixes: fff8019a ("net: macb: Add 64 bit addressing support for GEM")
      Signed-off-by: default avatarAnssi Hannula <anssi.hannula@bitwise.fi>
      Acked-by: default avatarHarini Katakam <harini.katakam@xilinx.com>
      Tested-by: default avatarClaudiu Beznea <claudiu.beznea@microchip.com>
      Cc: Nicolas Ferre <nicolas.ferre@microchip.com>
      Cc: Michal Simek <michal.simek@xilinx.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      358e56b2
    • Dan Carpenter's avatar
      qed: Fix an error code qed_ll2_start_xmit() · 6de29b8c
      Dan Carpenter authored
      [ Upstream commit f07d4276 ]
      
      We accidentally deleted the code to set "rc = -ENOMEM;" and this patch
      adds it back.
      
      Fixes: d2201a21 ("qed: No need for LL2 frags indication")
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      6de29b8c
    • Trond Myklebust's avatar
      SUNRPC: Fix a race with XPRT_CONNECTING · 8265e34e
      Trond Myklebust authored
      [ Upstream commit cf76785d ]
      
      Ensure that we clear XPRT_CONNECTING before releasing the XPRT_LOCK so that
      we don't have races between the (asynchronous) socket setup code and
      tasks in xprt_connect().
      Signed-off-by: default avatarTrond Myklebust <trond.myklebust@hammerspace.com>
      Tested-by: default avatarChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      8265e34e
    • Yonglong Liu's avatar
      net: hns: Fix ping failed when use net bridge and send multicast · 1eac41ba
      Yonglong Liu authored
      [ Upstream commit 6adafc35 ]
      
      Create a net bridge, add eth and vnet to the bridge. The vnet is used
      by a virtual machine. When ping the virtual machine from the outside
      host and the virtual machine send multicast at the same time, the ping
      package will lost.
      
      The multicast package send to the eth, eth will send it to the bridge too,
      and the bridge learn the mac of eth. When outside host ping the virtual
      mechine, it will match the promisc entry of the eth which is not expected,
      and the bridge send it to eth not to vnet, cause ping lost.
      
      So this patch change promisc tcam entry position to the END of 512 tcam
      entries, which indicate lower priority. And separate one promisc entry to
      two: mc & uc, to avoid package match the wrong tcam entry.
      Signed-off-by: default avatarYonglong Liu <liuyonglong@huawei.com>
      Signed-off-by: default avatarPeng Li <lipeng321@huawei.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      1eac41ba
    • Yonglong Liu's avatar
      net: hns: Add mac pcs config when enable|disable mac · 31212d94
      Yonglong Liu authored
      [ Upstream commit 726ae5c9 ]
      
      In some case, when mac enable|disable and adjust link, may cause hard to
      link(or abnormal) between mac and phy. This patch adds the code for rx PCS
      to avoid this bug.
      
      Disable the rx PCS when driver disable the gmac, and enable the rx PCS
      when driver enable the mac.
      Signed-off-by: default avatarYonglong Liu <liuyonglong@huawei.com>
      Signed-off-by: default avatarPeng Li <lipeng321@huawei.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      31212d94
    • Yonglong Liu's avatar
      net: hns: Fix ntuple-filters status error. · 4f64bc8f
      Yonglong Liu authored
      [ Upstream commit 7e74a19c ]
      
      The ntuple-filters features is forced on by chip.
      But it shows "ntuple-filters: off [fixed]" when use ethtool.
      This patch make it correct with "ntuple-filters: on [fixed]".
      Signed-off-by: default avatarYonglong Liu <liuyonglong@huawei.com>
      Signed-off-by: default avatarPeng Li <lipeng321@huawei.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      4f64bc8f
    • Yonglong Liu's avatar
      net: hns: Avoid net reset caused by pause frames storm · b984806d
      Yonglong Liu authored
      [ Upstream commit a57275d3 ]
      
      There will be a large number of MAC pause frames on the net,
      which caused tx timeout of net device. And then the net device
      was reset to try to recover it. So that is not useful, and will
      cause some other problems.
      
      So need doubled ndev->watchdog_timeo if device watchdog occurred
      until watchdog_timeo up to 40s and then try resetting to recover
      it.
      
      When collecting dfx information such as hardware registers when tx timeout.
      Some registers for count were cleared when read. So need move this task
      before update net state which also read the count registers.
      Signed-off-by: default avatarYonglong Liu <liuyonglong@huawei.com>
      Signed-off-by: default avatarPeng Li <lipeng321@huawei.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      b984806d
    • Yonglong Liu's avatar
      net: hns: Free irq when exit from abnormal branch · 02ad6ced
      Yonglong Liu authored
      [ Upstream commit c82bd077 ]
      
      1.In "hns_nic_init_irq", if request irq fail at index i,
        the function return directly without releasing irq resources
        that already requested.
      
      2.In "hns_nic_net_up" after "hns_nic_init_irq",
        if exceptional branch occurs, irqs that already requested
        are not release.
      Signed-off-by: default avatarYonglong Liu <liuyonglong@huawei.com>
      Signed-off-by: default avatarPeng Li <lipeng321@huawei.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      02ad6ced
    • Yonglong Liu's avatar
      net: hns: Clean rx fbd when ae stopped. · 5ddca1a9
      Yonglong Liu authored
      [ Upstream commit 31f6b61d ]
      
      If there are packets in hardware when changing the speed or duplex,
      it may cause hardware hang up.
      
      This patch adds the code to wait rx fbd clean up when ae stopped.
      Signed-off-by: default avatarYonglong Liu <liuyonglong@huawei.com>
      Signed-off-by: default avatarPeng Li <lipeng321@huawei.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      5ddca1a9
    • Yonglong Liu's avatar
      net: hns: Fixed bug that netdev was opened twice · 1ef17642
      Yonglong Liu authored
      [ Upstream commit 5778b13b ]
      
      After resetting dsaf to try to repair chip error such as ecc error,
      the net device will be open if net interface is up. But at this time
      if there is the users set the net device up with the command ifconfig,
      the net device will be opened twice consecutively.
      
      Function napi_enable was called when open device. And Kernel panic will
      be occurred if it was called twice consecutively. Such as follow:
      static inline void napi_enable(struct napi_struct *n)
      {
               BUG_ON(!test_bit(NAPI_STATE_SCHED, &n->state));
               smp_mb__before_clear_bit();
               clear_bit(NAPI_STATE_SCHED, &n->state);
      }
      
      [37255.571996] Kernel panic - not syncing: BUG!
      [37255.595234] Call trace:
      [37255.597694] [<ffff80000008ab48>] dump_backtrace+0x0/0x1a0
      [37255.603114] [<ffff80000008ad08>] show_stack+0x20/0x28
      [37255.608187] [<ffff8000009c4944>] dump_stack+0x98/0xb8
      [37255.613258] [<ffff8000009c149c>] panic+0x10c/0x26c
      [37255.618070] [<ffff80000070f134>] hns_nic_net_up+0x30c/0x4e0
      [37255.623664] [<ffff80000070f39c>] hns_nic_net_open+0x94/0x12c
      [37255.629346] [<ffff80000084be78>] __dev_open+0xf4/0x168
      [37255.634504] [<ffff80000084c1ac>] __dev_change_flags+0x98/0x15c
      [37255.640359] [<ffff80000084c29c>] dev_change_flags+0x2c/0x68
      [37255.769580] [<ffff8000008dc400>] devinet_ioctl+0x650/0x704
      [37255.775086] [<ffff8000008ddc38>] inet_ioctl+0x98/0xb4
      [37255.780159] [<ffff800000827b7c>] sock_do_ioctl+0x44/0x84
      [37255.785490] [<ffff800000828e04>] sock_ioctl+0x248/0x30c
      [37255.790737] [<ffff80000026dc6c>] do_vfs_ioctl+0x480/0x618
      [37255.796156] [<ffff80000026de94>] SyS_ioctl+0x90/0xa4
      [37255.801139] SMP: stopping secondary CPUs
      [37255.805079] kbox: catch panic event.
      [37255.809586] collected_len = 128928, LOG_BUF_LEN_LOCAL = 131072
      [37255.816103] flush cache 0xffff80003f000000  size 0x800000
      [37255.822192] flush cache 0xffff80003f000000  size 0x800000
      [37255.828289] flush cache 0xffff80003f000000  size 0x800000
      [37255.834378] kbox: no notify die func register. no need to notify
      [37255.840413] ---[ end Kernel panic - not syncing: BUG!
      
      This patchset fix this bug according to the flag NIC_STATE_DOWN.
      Signed-off-by: default avatarYonglong Liu <liuyonglong@huawei.com>
      Signed-off-by: default avatarPeng Li <lipeng321@huawei.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      1ef17642
    • Yonglong Liu's avatar
      net: hns: Some registers use wrong address according to the datasheet. · 5fed62f0
      Yonglong Liu authored
      [ Upstream commit 4ad26f11 ]
      
      According to the hip06 datasheet:
      1.Six registers use wrong address:
        RCB_COM_SF_CFG_INTMASK_RING
        RCB_COM_SF_CFG_RING_STS
        RCB_COM_SF_CFG_RING
        RCB_COM_SF_CFG_INTMASK_BD
        RCB_COM_SF_CFG_BD_RINT_STS
        DSAF_INODE_VC1_IN_PKT_NUM_0_REG
      2.The offset of DSAF_INODE_VC1_IN_PKT_NUM_0_REG should be
        0x103C + 0x80 * all_chn_num
      3.The offset to show the value of DSAF_INODE_IN_DATA_STP_DISC_0_REG
        is wrong, so the value of DSAF_INODE_SW_VLAN_TAG_DISC_0_REG will be
        overwrite
      
      These registers are only used in "ethtool -d", so that did not cause ndev
      to misfunction.
      Signed-off-by: default avatarYonglong Liu <liuyonglong@huawei.com>
      Signed-off-by: default avatarPeng Li <lipeng321@huawei.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      5fed62f0
    • Yonglong Liu's avatar
      net: hns: All ports can not work when insmod hns ko after rmmod. · 64917117
      Yonglong Liu authored
      [ Upstream commit 308c6caf ]
      
      There are two test cases:
      1. Remove the 4 modules:hns_enet_drv/hns_dsaf/hnae/hns_mdio,
         and install them again, must use "ifconfig down/ifconfig up"
         command pair to bring port to work.
      
         This patch calls phy_stop function when init phy to fix this bug.
      
      2. Remove the 2 modules:hns_enet_drv/hns_dsaf, and install them again,
         all ports can not use anymore, because of the phy devices register
         failed(phy devices already exists).
      
         Phy devices are registered when hns_dsaf installed, this patch
         removes them when hns_dsaf removed.
      
      The two cases are sometimes related, fixing the second case also requires
      fixing the first case, so fix them together.
      Signed-off-by: default avatarYonglong Liu <liuyonglong@huawei.com>
      Signed-off-by: default avatarPeng Li <lipeng321@huawei.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      64917117
    • Yonglong Liu's avatar
      net: hns: Incorrect offset address used for some registers. · 9c97aac0
      Yonglong Liu authored
      [ Upstream commit 4e1d4be6 ]
      
      According to the hip06 Datasheet:
      1. The offset of INGRESS_SW_VLAN_TAG_DISC should be 0x1A00+4*all_chn_num
      2. The offset of INGRESS_IN_DATA_STP_DISC should be 0x1A50+4*all_chn_num
      Signed-off-by: default avatarYonglong Liu <liuyonglong@huawei.com>
      Signed-off-by: default avatarPeng Li <lipeng321@huawei.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      9c97aac0
    • Arnd Bergmann's avatar
      w90p910_ether: remove incorrect __init annotation · 491fe3ac
      Arnd Bergmann authored
      [ Upstream commit 51367e42 ]
      
      The get_mac_address() function is normally inline, but when it is
      not, we get a warning that this configuration is broken:
      
      WARNING: vmlinux.o(.text+0x4aff00): Section mismatch in reference from the function w90p910_ether_setup() to the function .init.text:get_mac_address()
      The function w90p910_ether_setup() references
      the function __init get_mac_address().
      This is often because w90p910_ether_setup lacks a __init
      
      Remove the __init to make it always do the right thing.
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      491fe3ac