1. 06 Aug, 2019 40 commits
    • Vlastimil Babka's avatar
      x86, mm, gup: prevent get_page() race with munmap in paravirt guest · d73af797
      Vlastimil Babka authored
      The x86 version of get_user_pages_fast() relies on disabled interrupts to
      synchronize gup_pte_range() between gup_get_pte(ptep); and get_page() against
      a parallel munmap. The munmap side nulls the pte, then flushes TLBs, then
      releases the page. As TLB flush is done synchronously via IPI disabling
      interrupts blocks the page release, and get_page(), which assumes existing
      reference on page, is thus safe.
      However when TLB flush is done by a hypercall, e.g. in a Xen PV guest, there is
      no blocking thanks to disabled interrupts, and get_page() can succeed on a page
      that was already freed or even reused.
      
      We have recently seen this happen with our 4.4 and 4.12 based kernels, with
      userspace (java) that exits a thread, where mm_release() performs a futex_wake()
      on tsk->clear_child_tid, and another thread in parallel unmaps the page where
      tsk->clear_child_tid points to. The spurious get_page() succeeds, but futex code
      immediately releases the page again, while it's already on a freelist. Symptoms
      include a bad page state warning, general protection faults acessing a poisoned
      list prev/next pointer in the freelist, or free page pcplists of two cpus joined
      together in a single list. Oscar has also reproduced this scenario, with a
      patch inserting delays before the get_page() to make the race window larger.
      
      Fix this by removing the dependency on TLB flush interrupts the same way as the
      generic get_user_pages_fast() code by using page_cache_add_speculative() and
      revalidating the PTE contents after pinning the page. Mainline is safe since
      4.13 where the x86 gup code was removed in favor of the common code. Accessing
      the page table itself safely also relies on disabled interrupts and TLB flush
      IPIs that don't happen with hypercalls, which was acknowledged in commit
      9e52fc2b ("x86/mm: Enable RCU based page table freeing
      (CONFIG_HAVE_RCU_TABLE_FREE=y)"). That commit with follups should also be
      backported for full safety, although our reproducer didn't hit a problem
      without that backport.
      Reproduced-by: default avatarOscar Salvador <osalvador@suse.de>
      Signed-off-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d73af797
    • Josh Poimboeuf's avatar
      objtool: Support GCC 9 cold subfunction naming scheme · 410b20dd
      Josh Poimboeuf authored
      commit bcb6fb5d upstream.
      
      Starting with GCC 8, a lot of unlikely code was moved out of line to
      "cold" subfunctions in .text.unlikely.
      
      For example, the unlikely bits of:
      
        irq_do_set_affinity()
      
      are moved out to the following subfunction:
      
        irq_do_set_affinity.cold.49()
      
      Starting with GCC 9, the numbered suffix has been removed.  So in the
      above example, the cold subfunction is instead:
      
        irq_do_set_affinity.cold()
      
      Tweak the objtool subfunction detection logic so that it detects both
      GCC 8 and GCC 9 naming schemes.
      Reported-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Tested-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/015e9544b1f188d36a7f02fa31e9e95629aa5f50.1541040800.git.jpoimboe@redhat.comSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      410b20dd
    • Miguel Ojeda's avatar
      include/linux/module.h: copy __init/__exit attrs to init/cleanup_module · 2c34c215
      Miguel Ojeda authored
      commit a6e60d84 upstream.
      
      The upcoming GCC 9 release extends the -Wmissing-attributes warnings
      (enabled by -Wall) to C and aliases: it warns when particular function
      attributes are missing in the aliases but not in their target.
      
      In particular, it triggers for all the init/cleanup_module
      aliases in the kernel (defined by the module_init/exit macros),
      ending up being very noisy.
      
      These aliases point to the __init/__exit functions of a module,
      which are defined as __cold (among other attributes). However,
      the aliases themselves do not have the __cold attribute.
      
      Since the compiler behaves differently when compiling a __cold
      function as well as when compiling paths leading to calls
      to __cold functions, the warning is trying to point out
      the possibly-forgotten attribute in the alias.
      
      In order to keep the warning enabled, we decided to silence
      this case. Ideally, we would mark the aliases directly
      as __init/__exit. However, there are currently around 132 modules
      in the kernel which are missing __init/__exit in their init/cleanup
      functions (either because they are missing, or for other reasons,
      e.g. the functions being called from somewhere else); and
      a section mismatch is a hard error.
      
      A conservative alternative was to mark the aliases as __cold only.
      However, since we would like to eventually enforce __init/__exit
      to be always marked,  we chose to use the new __copy function
      attribute (introduced by GCC 9 as well to deal with this).
      With it, we copy the attributes used by the target functions
      into the aliases. This way, functions that were not marked
      as __init/__exit won't have their aliases marked either,
      and therefore there won't be a section mismatch.
      
      Note that the warning would go away marking either the extern
      declaration, the definition, or both. However, we only mark
      the definition of the alias, since we do not want callers
      (which only see the declaration) to be compiled as if the function
      was __cold (and therefore the paths leading to those calls
      would be assumed to be unlikely).
      
      Link: https://lore.kernel.org/lkml/259986242.BvXPX32bHu@devpool35/
      Cc: <stable@vger.kernel.org>
      Link: https://lore.kernel.org/lkml/20190123173707.GA16603@gmail.com/
      Link: https://lore.kernel.org/lkml/20190206175627.GA20399@gmail.com/Suggested-by: default avatarMartin Sebor <msebor@gcc.gnu.org>
      Acked-by: default avatarJessica Yu <jeyu@kernel.org>
      Signed-off-by: default avatarMiguel Ojeda <miguel.ojeda.sandonis@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2c34c215
    • Miguel Ojeda's avatar
      Backport minimal compiler_attributes.h to support GCC 9 · fe584436
      Miguel Ojeda authored
      This adds support for __copy to v4.9.y so that we can use it in
      init/exit_module to avoid -Werror=missing-attributes errors on GCC 9.
      
      Link: https://lore.kernel.org/lkml/259986242.BvXPX32bHu@devpool35/
      Cc: <stable@vger.kernel.org>
      Suggested-by: default avatarRolf Eike Beer <eb@emlix.com>
      Signed-off-by: default avatarMiguel Ojeda <miguel.ojeda.sandonis@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fe584436
    • Jean Delvare's avatar
      eeprom: at24: make spd world-readable again · 0b326994
      Jean Delvare authored
      commit 25e5ef30 upstream.
      
      The integration of the at24 driver into the nvmem framework broke the
      world-readability of spd EEPROMs. Fix it.
      Signed-off-by: default avatarJean Delvare <jdelvare@suse.de>
      Cc: stable@vger.kernel.org
      Fixes: 57d15550 ("eeprom: at24: extend driver to plug into the NVMEM framework")
      Cc: Andrew Lunn <andrew@lunn.ch>
      Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Bartosz Golaszewski <brgl@bgdev.pl>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarBartosz Golaszewski <bgolaszewski@baylibre.com>
      [Bartosz: backported the patch to older branches]
      Signed-off-by: default avatarBartosz Golaszewski <bgolaszewski@baylibre.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0b326994
    • Andrea Arcangeli's avatar
      coredump: fix race condition between collapse_huge_page() and core dumping · d4fc64c9
      Andrea Arcangeli authored
      commit 59ea6d06 upstream.
      
      When fixing the race conditions between the coredump and the mmap_sem
      holders outside the context of the process, we focused on
      mmget_not_zero()/get_task_mm() callers in 04f5866e ("coredump: fix
      race condition between mmget_not_zero()/get_task_mm() and core
      dumping"), but those aren't the only cases where the mmap_sem can be
      taken outside of the context of the process as Michal Hocko noticed
      while backporting that commit to older -stable kernels.
      
      If mmgrab() is called in the context of the process, but then the
      mm_count reference is transferred outside the context of the process,
      that can also be a problem if the mmap_sem has to be taken for writing
      through that mm_count reference.
      
      khugepaged registration calls mmgrab() in the context of the process,
      but the mmap_sem for writing is taken later in the context of the
      khugepaged kernel thread.
      
      collapse_huge_page() after taking the mmap_sem for writing doesn't
      modify any vma, so it's not obvious that it could cause a problem to the
      coredump, but it happens to modify the pmd in a way that breaks an
      invariant that pmd_trans_huge_lock() relies upon.  collapse_huge_page()
      needs the mmap_sem for writing just to block concurrent page faults that
      call pmd_trans_huge_lock().
      
      Specifically the invariant that "!pmd_trans_huge()" cannot become a
      "pmd_trans_huge()" doesn't hold while collapse_huge_page() runs.
      
      The coredump will call __get_user_pages() without mmap_sem for reading,
      which eventually can invoke a lockless page fault which will need a
      functional pmd_trans_huge_lock().
      
      So collapse_huge_page() needs to use mmget_still_valid() to check it's
      not running concurrently with the coredump...  as long as the coredump
      can invoke page faults without holding the mmap_sem for reading.
      
      This has "Fixes: khugepaged" to facilitate backporting, but in my view
      it's more a bug in the coredump code that will eventually have to be
      rewritten to stop invoking page faults without the mmap_sem for reading.
      So the long term plan is still to drop all mmget_still_valid().
      
      Link: http://lkml.kernel.org/r/20190607161558.32104-1-aarcange@redhat.com
      Fixes: ba76149f ("thp: khugepaged")
      Signed-off-by: default avatarAndrea Arcangeli <aarcange@redhat.com>
      Reported-by: default avatarMichal Hocko <mhocko@suse.com>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Acked-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Jann Horn <jannh@google.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: Jason Gunthorpe <jgg@mellanox.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      [Ajay: Just adjusted to apply on v4.9]
      Signed-off-by: default avatarAjay Kaher <akaher@vmware.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d4fc64c9
    • Ajay Kaher's avatar
      infiniband: fix race condition between infiniband mlx4, mlx5 driver and core dumping · 91c3a6ce
      Ajay Kaher authored
      This patch is the extension of following upstream commit to fix
      the race condition between get_task_mm() and core dumping
      for IB->mlx4 and IB->mlx5 drivers:
      
      commit 04f5866e ("coredump: fix race condition between
      mmget_not_zero()/get_task_mm() and core dumping")'
      
      Thanks to Jason for pointing this.
      Signed-off-by: default avatarAjay Kaher <akaher@vmware.com>
      Reviewed-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      91c3a6ce
    • Andrea Arcangeli's avatar
      coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping · 16903f1a
      Andrea Arcangeli authored
      commit 04f5866e upstream.
      
      The core dumping code has always run without holding the mmap_sem for
      writing, despite that is the only way to ensure that the entire vma
      layout will not change from under it.  Only using some signal
      serialization on the processes belonging to the mm is not nearly enough.
      This was pointed out earlier.  For example in Hugh's post from Jul 2017:
      
        https://lkml.kernel.org/r/alpine.LSU.2.11.1707191716030.2055@eggly.anvils
      
        "Not strictly relevant here, but a related note: I was very surprised
         to discover, only quite recently, how handle_mm_fault() may be called
         without down_read(mmap_sem) - when core dumping. That seems a
         misguided optimization to me, which would also be nice to correct"
      
      In particular because the growsdown and growsup can move the
      vm_start/vm_end the various loops the core dump does around the vma will
      not be consistent if page faults can happen concurrently.
      
      Pretty much all users calling mmget_not_zero()/get_task_mm() and then
      taking the mmap_sem had the potential to introduce unexpected side
      effects in the core dumping code.
      
      Adding mmap_sem for writing around the ->core_dump invocation is a
      viable long term fix, but it requires removing all copy user and page
      faults and to replace them with get_dump_page() for all binary formats
      which is not suitable as a short term fix.
      
      For the time being this solution manually covers the places that can
      confuse the core dump either by altering the vma layout or the vma flags
      while it runs.  Once ->core_dump runs under mmap_sem for writing the
      function mmget_still_valid() can be dropped.
      
      Allowing mmap_sem protected sections to run in parallel with the
      coredump provides some minor parallelism advantage to the swapoff code
      (which seems to be safe enough by never mangling any vma field and can
      keep doing swapins in parallel to the core dumping) and to some other
      corner case.
      
      In order to facilitate the backporting I added "Fixes: 86039bd3"
      however the side effect of this same race condition in /proc/pid/mem
      should be reproducible since before 2.6.12-rc2 so I couldn't add any
      other "Fixes:" because there's no hash beyond the git genesis commit.
      
      Because find_extend_vma() is the only location outside of the process
      context that could modify the "mm" structures under mmap_sem for
      reading, by adding the mmget_still_valid() check to it, all other cases
      that take the mmap_sem for reading don't need the new check after
      mmget_not_zero()/get_task_mm().  The expand_stack() in page fault
      context also doesn't need the new check, because all tasks under core
      dumping are frozen.
      
      Link: http://lkml.kernel.org/r/20190325224949.11068-1-aarcange@redhat.com
      Fixes: 86039bd3 ("userfaultfd: add new syscall to provide memory externalization")
      Signed-off-by: default avatarAndrea Arcangeli <aarcange@redhat.com>
      Reported-by: default avatarJann Horn <jannh@google.com>
      Suggested-by: default avatarOleg Nesterov <oleg@redhat.com>
      Acked-by: default avatarPeter Xu <peterx@redhat.com>
      Reviewed-by: default avatarMike Rapoport <rppt@linux.ibm.com>
      Reviewed-by: default avatarOleg Nesterov <oleg@redhat.com>
      Reviewed-by: default avatarJann Horn <jannh@google.com>
      Acked-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      [akaher@vmware.com: stable 4.9 backport
      -  handle binder_update_page_range - mhocko@suse.com]
      Signed-off-by: default avatarAjay Kaher <akaher@vmware.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      16903f1a
    • Yishai Hadas's avatar
      IB/mlx5: Fix RSS Toeplitz setup to be aligned with the HW specification · ed56c249
      Yishai Hadas authored
      commit b7165bd0 upstream.
      
      The specification for the Toeplitz function doesn't require to set the key
      explicitly to be symmetric. In case a symmetric functionality is required
      a symmetric key can be simply used.
      
      Wrongly forcing the algorithm to symmetric causes the wrong packet
      distribution and a performance degradation.
      
      Link: https://lore.kernel.org/r/20190723065733.4899-7-leon@kernel.org
      Cc: <stable@vger.kernel.org> # 4.7
      Fixes: 28d61370 ("IB/mlx5: Add RSS QP support")
      Signed-off-by: default avatarYishai Hadas <yishaih@mellanox.com>
      Reviewed-by: default avatarAlex Vainman <alexv@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leonro@mellanox.com>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ed56c249
    • Juergen Gross's avatar
      xen/swiotlb: fix condition for calling xen_destroy_contiguous_region() · 436cd8a9
      Juergen Gross authored
      commit 50f6393f upstream.
      
      The condition in xen_swiotlb_free_coherent() for deciding whether to
      call xen_destroy_contiguous_region() is wrong: in case the region to
      be freed is not contiguous calling xen_destroy_contiguous_region() is
      the wrong thing to do: it would result in inconsistent mappings of
      multiple PFNs to the same MFN. This will lead to various strange
      crashes or data corruption.
      
      Instead of calling xen_destroy_contiguous_region() in that case a
      warning should be issued as that situation should never occur.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarJuergen Gross <jgross@suse.com>
      Reviewed-by: default avatarBoris Ostrovsky <boris.ostrovsky@oracle.com>
      Reviewed-by: default avatarJan Beulich <jbeulich@suse.com>
      Acked-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: default avatarJuergen Gross <jgross@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      436cd8a9
    • Will Deacon's avatar
      drivers/perf: arm_pmu: Fix failure path in PM notifier · 9224305c
      Will Deacon authored
      commit 0d7fd70f upstream.
      
      Handling of the CPU_PM_ENTER_FAILED transition in the Arm PMU PM
      notifier code incorrectly skips restoration of the counters. Fix the
      logic so that CPU_PM_ENTER_FAILED follows the same path as CPU_PM_EXIT.
      
      Cc: <stable@vger.kernel.org>
      Fixes: da4e4f18 ("drivers/perf: arm_pmu: implement CPU_PM notifier")
      Reported-by: default avatarAnders Roxell <anders.roxell@linaro.org>
      Acked-by: default avatarLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: default avatarWill Deacon <will@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9224305c
    • Stefan Haberland's avatar
      s390/dasd: fix endless loop after read unit address configuration · 0a878192
      Stefan Haberland authored
      commit 41995342 upstream.
      
      After getting a storage server event that causes the DASD device driver
      to update its unit address configuration during a device shutdown there is
      the possibility of an endless loop in the device driver.
      
      In the system log there will be ongoing DASD error messages with RC: -19.
      
      The reason is that the loop starting the ruac request only terminates when
      the retry counter is decreased to 0. But in the sleep_on function there are
      early exit paths that do not decrease the retry counter.
      
      Prevent an endless loop by handling those cases separately.
      
      Remove the unnecessary do..while loop since the sleep_on function takes
      care of retries by itself.
      
      Fixes: 8e09f215 ("[S390] dasd: add hyper PAV support to DASD device driver, part 1")
      Cc: stable@vger.kernel.org # 2.6.25+
      Signed-off-by: default avatarStefan Haberland <sth@linux.ibm.com>
      Reviewed-by: default avatarJan Hoeppner <hoeppner@linux.ibm.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0a878192
    • Ondrej Mosnacek's avatar
      selinux: fix memory leak in policydb_init() · ae190f04
      Ondrej Mosnacek authored
      commit 45385237 upstream.
      
      Since roles_init() adds some entries to the role hash table, we need to
      destroy also its keys/values on error, otherwise we get a memory leak in
      the error path.
      
      Cc: <stable@vger.kernel.org>
      Reported-by: syzbot+fee3a14d4cdf92646287@syzkaller.appspotmail.com
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: default avatarOndrej Mosnacek <omosnace@redhat.com>
      Signed-off-by: default avatarPaul Moore <paul@paul-moore.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ae190f04
    • Michael Wu's avatar
      gpiolib: fix incorrect IRQ requesting of an active-low lineevent · 7b185859
      Michael Wu authored
      commit 223ecaf1 upstream.
      
      When a pin is active-low, logical trigger edge should be inverted to match
      the same interrupt opportunity.
      
      For example, a button pushed triggers falling edge in ACTIVE_HIGH case; in
      ACTIVE_LOW case, the button pushed triggers rising edge. For user space the
      IRQ requesting doesn't need to do any modification except to configuring
      GPIOHANDLE_REQUEST_ACTIVE_LOW.
      
      For example, we want to catch the event when the button is pushed. The
      button on the original board drives level to be low when it is pushed, and
      drives level to be high when it is released.
      
      In user space we can do:
      
      	req.handleflags = GPIOHANDLE_REQUEST_INPUT;
      	req.eventflags = GPIOEVENT_REQUEST_FALLING_EDGE;
      
      	while (1) {
      		read(fd, &dat, sizeof(dat));
      		if (dat.id == GPIOEVENT_EVENT_FALLING_EDGE)
      			printf("button pushed\n");
      	}
      
      Run the same logic on another board which the polarity of the button is
      inverted; it drives level to be high when pushed, and level to be low when
      released. For this inversion we add flag GPIOHANDLE_REQUEST_ACTIVE_LOW:
      
      	req.handleflags = GPIOHANDLE_REQUEST_INPUT |
      		GPIOHANDLE_REQUEST_ACTIVE_LOW;
      	req.eventflags = GPIOEVENT_REQUEST_FALLING_EDGE;
      
      At the result, there are no any events caught when the button is pushed.
      By the way, button releasing will emit a "falling" event. The timing of
      "falling" catching is not expected.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarMichael Wu <michael.wu@vatics.com>
      Tested-by: default avatarBartosz Golaszewski <bgolaszewski@baylibre.com>
      Signed-off-by: default avatarBartosz Golaszewski <bgolaszewski@baylibre.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7b185859
    • Douglas Anderson's avatar
      mmc: dw_mmc: Fix occasional hang after tuning on eMMC · 9d06bfcb
      Douglas Anderson authored
      commit ba2d139b upstream.
      
      In commit 46d17952 ("mmc: dw_mmc: Wait for data transfer after
      response errors.") we fixed a tuning-induced hang that I saw when
      stress testing tuning on certain SD cards.  I won't re-hash that whole
      commit, but the summary is that as a normal part of tuning you need to
      deal with transfer errors and there were cases where these transfer
      errors was putting my system into a bad state causing all future
      transfers to fail.  That commit fixed handling of the transfer errors
      for me.
      
      In downstream Chrome OS my fix landed and had the same behavior for
      all SD/MMC commands.  However, it looks like when the commit landed
      upstream we limited it to only SD tuning commands.  Presumably this
      was to try to get around problems that Alim Akhtar reported on exynos
      [1].
      
      Unfortunately while stress testing reboots (and suspend/resume) on
      some rk3288-based Chromebooks I found the same problem on the eMMC on
      some of my Chromebooks (the ones with Hynix eMMC).  Since the eMMC
      tuning command is different (MMC_SEND_TUNING_BLOCK_HS200
      vs. MMC_SEND_TUNING_BLOCK) we were basically getting back into the
      same situation.
      
      I'm hoping that whatever problems exynos was having in the past are
      somehow magically fixed now and we can make the behavior the same for
      all commands.
      
      [1] https://lkml.kernel.org/r/CAGOxZ53WfNbaMe0_AM0qBqU47kAfgmPBVZC8K8Y-_J3mDMqW4A@mail.gmail.com
      
      Fixes: 46d17952 ("mmc: dw_mmc: Wait for data transfer after response errors.")
      Signed-off-by: default avatarDouglas Anderson <dianders@chromium.org>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Alim Akhtar <alim.akhtar@gmail.com>
      Cc: Enric Balletbo i Serra <enric.balletbo@collabora.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarUlf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9d06bfcb
    • Filipe Manana's avatar
      Btrfs: fix incremental send failure after deduplication · 09c63dcc
      Filipe Manana authored
      commit b4f9a1a8 upstream.
      
      When doing an incremental send operation we can fail if we previously did
      deduplication operations against a file that exists in both snapshots. In
      that case we will fail the send operation with -EIO and print a message
      to dmesg/syslog like the following:
      
        BTRFS error (device sdc): Send: inconsistent snapshot, found updated \
        extent for inode 257 without updated inode item, send root is 258, \
        parent root is 257
      
      This requires that we deduplicate to the same file in both snapshots for
      the same amount of times on each snapshot. The issue happens because a
      deduplication only updates the iversion of an inode and does not update
      any other field of the inode, therefore if we deduplicate the file on
      each snapshot for the same amount of time, the inode will have the same
      iversion value (stored as the "sequence" field on the inode item) on both
      snapshots, therefore it will be seen as unchanged between in the send
      snapshot while there are new/updated/deleted extent items when comparing
      to the parent snapshot. This makes the send operation return -EIO and
      print an error message.
      
      Example reproducer:
      
        $ mkfs.btrfs -f /dev/sdb
        $ mount /dev/sdb /mnt
      
        # Create our first file. The first half of the file has several 64Kb
        # extents while the second half as a single 512Kb extent.
        $ xfs_io -f -s -c "pwrite -S 0xb8 -b 64K 0 512K" /mnt/foo
        $ xfs_io -c "pwrite -S 0xb8 512K 512K" /mnt/foo
      
        # Create the base snapshot and the parent send stream from it.
        $ btrfs subvolume snapshot -r /mnt /mnt/mysnap1
        $ btrfs send -f /tmp/1.snap /mnt/mysnap1
      
        # Create our second file, that has exactly the same data as the first
        # file.
        $ xfs_io -f -c "pwrite -S 0xb8 0 1M" /mnt/bar
      
        # Create the second snapshot, used for the incremental send, before
        # doing the file deduplication.
        $ btrfs subvolume snapshot -r /mnt /mnt/mysnap2
      
        # Now before creating the incremental send stream:
        #
        # 1) Deduplicate into a subrange of file foo in snapshot mysnap1. This
        #    will drop several extent items and add a new one, also updating
        #    the inode's iversion (sequence field in inode item) by 1, but not
        #    any other field of the inode;
        #
        # 2) Deduplicate into a different subrange of file foo in snapshot
        #    mysnap2. This will replace an extent item with a new one, also
        #    updating the inode's iversion by 1 but not any other field of the
        #    inode.
        #
        # After these two deduplication operations, the inode items, for file
        # foo, are identical in both snapshots, but we have different extent
        # items for this inode in both snapshots. We want to check this doesn't
        # cause send to fail with an error or produce an incorrect stream.
      
        $ xfs_io -r -c "dedupe /mnt/bar 0 0 512K" /mnt/mysnap1/foo
        $ xfs_io -r -c "dedupe /mnt/bar 512K 512K 512K" /mnt/mysnap2/foo
      
        # Create the incremental send stream.
        $ btrfs send -p /mnt/mysnap1 -f /tmp/2.snap /mnt/mysnap2
        ERROR: send ioctl failed with -5: Input/output error
      
      This issue started happening back in 2015 when deduplication was updated
      to not update the inode's ctime and mtime and update only the iversion.
      Back then we would hit a BUG_ON() in send, but later in 2016 send was
      updated to return -EIO and print the error message instead of doing the
      BUG_ON().
      
      A test case for fstests follows soon.
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203933
      Fixes: 1c919a5e ("btrfs: don't update mtime/ctime on deduped inodes")
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      09c63dcc
    • Masahiro Yamada's avatar
      kbuild: initialize CLANG_FLAGS correctly in the top Makefile · 757ee027
      Masahiro Yamada authored
      commit 5241ab4c upstream.
      
      CLANG_FLAGS is initialized by the following line:
      
        CLANG_FLAGS     := --target=$(notdir $(CROSS_COMPILE:%-=%))
      
      ..., which is run only when CROSS_COMPILE is set.
      
      Some build targets (bindeb-pkg etc.) recurse to the top Makefile.
      
      When you build the kernel with Clang but without CROSS_COMPILE,
      the same compiler flags such as -no-integrated-as are accumulated
      into CLANG_FLAGS.
      
      If you run 'make CC=clang' and then 'make CC=clang bindeb-pkg',
      Kbuild will recompile everything needlessly due to the build command
      change.
      
      Fix this by correctly initializing CLANG_FLAGS.
      
      Fixes: 238bcbc4 ("kbuild: consolidate Clang compiler flags")
      Cc: <stable@vger.kernel.org> # v5.0+
      Signed-off-by: default avatarMasahiro Yamada <yamada.masahiro@socionext.com>
      Reviewed-by: default avatarNathan Chancellor <natechancellor@gmail.com>
      Acked-by: default avatarNick Desaulniers <ndesaulniers@google.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      757ee027
    • Zhenzhong Duan's avatar
      x86, boot: Remove multiple copy of static function sanitize_boot_params() · b2ca435c
      Zhenzhong Duan authored
      [ Upstream commit 8c5477e8 ]
      
      Kernel build warns:
       'sanitize_boot_params' defined but not used [-Wunused-function]
      
      at below files:
        arch/x86/boot/compressed/cmdline.c
        arch/x86/boot/compressed/error.c
        arch/x86/boot/compressed/early_serial_console.c
        arch/x86/boot/compressed/acpi.c
      
      That's becausethey each include misc.h which includes a definition of
      sanitize_boot_params() via bootparam_utils.h.
      
      Remove the inclusion from misc.h and have the c file including
      bootparam_utils.h directly.
      Signed-off-by: default avatarZhenzhong Duan <zhenzhong.duan@oracle.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Link: https://lkml.kernel.org/r/1563283092-1189-1-git-send-email-zhenzhong.duan@oracle.comSigned-off-by: default avatarSasha Levin <sashal@kernel.org>
      b2ca435c
    • Josh Poimboeuf's avatar
      x86/kvm: Don't call kvm_spurious_fault() from .fixup · 4d2bf579
      Josh Poimboeuf authored
      [ Upstream commit 3901336e ]
      
      After making a change to improve objtool's sibling call detection, it
      started showing the following warning:
      
        arch/x86/kvm/vmx/nested.o: warning: objtool: .fixup+0x15: sibling call from callable instruction with modified stack frame
      
      The problem is the ____kvm_handle_fault_on_reboot() macro.  It does a
      fake call by pushing a fake RIP and doing a jump.  That tricks the
      unwinder into printing the function which triggered the exception,
      rather than the .fixup code.
      
      Instead of the hack to make it look like the original function made the
      call, just change the macro so that the original function actually does
      make the call.  This allows removal of the hack, and also makes objtool
      happy.
      
      I triggered a vmx instruction exception and verified that the stack
      trace is still sane:
      
        kernel BUG at arch/x86/kvm/x86.c:358!
        invalid opcode: 0000 [#1] SMP PTI
        CPU: 28 PID: 4096 Comm: qemu-kvm Not tainted 5.2.0+ #16
        Hardware name: Lenovo THINKSYSTEM SD530 -[7X2106Z000]-/-[7X2106Z000]-, BIOS -[TEE113Z-1.00]- 07/17/2017
        RIP: 0010:kvm_spurious_fault+0x5/0x10
        Code: 00 00 00 00 00 8b 44 24 10 89 d2 45 89 c9 48 89 44 24 10 8b 44 24 08 48 89 44 24 08 e9 d4 40 22 00 0f 1f 40 00 0f 1f 44 00 00 <0f> 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 55 49 89 fd 41
        RSP: 0018:ffffbf91c683bd00 EFLAGS: 00010246
        RAX: 000061f040000000 RBX: ffff9e159c77bba0 RCX: ffff9e15a5c87000
        RDX: 0000000665c87000 RSI: ffff9e15a5c87000 RDI: ffff9e159c77bba0
        RBP: 0000000000000000 R08: 0000000000000000 R09: ffff9e15a5c87000
        R10: 0000000000000000 R11: fffff8f2d99721c0 R12: ffff9e159c77bba0
        R13: ffffbf91c671d960 R14: ffff9e159c778000 R15: 0000000000000000
        FS:  00007fa341cbe700(0000) GS:ffff9e15b7400000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 00007fdd38356804 CR3: 00000006759de003 CR4: 00000000007606e0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        PKRU: 55555554
        Call Trace:
         loaded_vmcs_init+0x4f/0xe0
         alloc_loaded_vmcs+0x38/0xd0
         vmx_create_vcpu+0xf7/0x600
         kvm_vm_ioctl+0x5e9/0x980
         ? __switch_to_asm+0x40/0x70
         ? __switch_to_asm+0x34/0x70
         ? __switch_to_asm+0x40/0x70
         ? __switch_to_asm+0x34/0x70
         ? free_one_page+0x13f/0x4e0
         do_vfs_ioctl+0xa4/0x630
         ksys_ioctl+0x60/0x90
         __x64_sys_ioctl+0x16/0x20
         do_syscall_64+0x55/0x1c0
         entry_SYSCALL_64_after_hwframe+0x44/0xa9
        RIP: 0033:0x7fa349b1ee5b
      Signed-off-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Acked-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Acked-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/64a9b64d127e87b6920a97afde8e96ea76f6524e.1563413318.git.jpoimboe@redhat.comSigned-off-by: default avatarSasha Levin <sashal@kernel.org>
      4d2bf579
    • Kees Cook's avatar
      ipc/mqueue.c: only perform resource calculation if user valid · 62369d5c
      Kees Cook authored
      [ Upstream commit a318f12e ]
      
      Andreas Christoforou reported:
      
        UBSAN: Undefined behaviour in ipc/mqueue.c:414:49 signed integer overflow:
        9 * 2305843009213693951 cannot be represented in type 'long int'
        ...
        Call Trace:
          mqueue_evict_inode+0x8e7/0xa10 ipc/mqueue.c:414
          evict+0x472/0x8c0 fs/inode.c:558
          iput_final fs/inode.c:1547 [inline]
          iput+0x51d/0x8c0 fs/inode.c:1573
          mqueue_get_inode+0x8eb/0x1070 ipc/mqueue.c:320
          mqueue_create_attr+0x198/0x440 ipc/mqueue.c:459
          vfs_mkobj+0x39e/0x580 fs/namei.c:2892
          prepare_open ipc/mqueue.c:731 [inline]
          do_mq_open+0x6da/0x8e0 ipc/mqueue.c:771
      
      Which could be triggered by:
      
              struct mq_attr attr = {
                      .mq_flags = 0,
                      .mq_maxmsg = 9,
                      .mq_msgsize = 0x1fffffffffffffff,
                      .mq_curmsgs = 0,
              };
      
              if (mq_open("/testing", 0x40, 3, &attr) == (mqd_t) -1)
                      perror("mq_open");
      
      mqueue_get_inode() was correctly rejecting the giant mq_msgsize, and
      preparing to return -EINVAL.  During the cleanup, it calls
      mqueue_evict_inode() which performed resource usage tracking math for
      updating "user", before checking if there was a valid "user" at all
      (which would indicate that the calculations would be sane).  Instead,
      delay this check to after seeing a valid "user".
      
      The overflow was real, but the results went unused, so while the flaw is
      harmless, it's noisy for kernel fuzzers, so just fix it by moving the
      calculation under the non-NULL "user" where it actually gets used.
      
      Link: http://lkml.kernel.org/r/201906072207.ECB65450@keescookSigned-off-by: default avatarKees Cook <keescook@chromium.org>
      Reported-by: default avatarAndreas Christoforou <andreaschristofo@gmail.com>
      Acked-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      62369d5c
    • Dan Carpenter's avatar
      drivers/rapidio/devices/rio_mport_cdev.c: NUL terminate some strings · bcebb441
      Dan Carpenter authored
      [ Upstream commit 156e0b1a ]
      
      The dev_info.name[] array has space for RIO_MAX_DEVNAME_SZ + 1
      characters.  But the problem here is that we don't ensure that the user
      put a NUL terminator on the end of the string.  It could lead to an out
      of bounds read.
      
      Link: http://lkml.kernel.org/r/20190529110601.GB19119@mwanda
      Fixes: e8de3701 ("rapidio: add mport char device driver")
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Acked-by: default avatarAlexandre Bounine <alex.bou9@gmail.com>
      Cc: Ira Weiny <ira.weiny@intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      bcebb441
    • Mikko Rapeli's avatar
      uapi linux/coda_psdev.h: move upc_req definition from uapi to kernel side headers · 317fc4dd
      Mikko Rapeli authored
      [ Upstream commit f90fb3c7 ]
      
      Only users of upc_req in kernel side fs/coda/psdev.c and
      fs/coda/upcall.c already include linux/coda_psdev.h.
      
      Suggested by Jan Harkes <jaharkes@cs.cmu.edu> in
        https://lore.kernel.org/lkml/20150531111913.GA23377@cs.cmu.edu/
      
      Fixes these include/uapi/linux/coda_psdev.h compilation errors in userspace:
      
        linux/coda_psdev.h:12:19: error: field `uc_chain' has incomplete type
        struct list_head    uc_chain;
                         ^
        linux/coda_psdev.h:13:2: error: unknown type name `caddr_t'
        caddr_t             uc_data;
        ^
        linux/coda_psdev.h:14:2: error: unknown type name `u_short'
        u_short             uc_flags;
        ^
        linux/coda_psdev.h:15:2: error: unknown type name `u_short'
        u_short             uc_inSize;  /* Size is at most 5000 bytes */
        ^
        linux/coda_psdev.h:16:2: error: unknown type name `u_short'
        u_short             uc_outSize;
        ^
        linux/coda_psdev.h:17:2: error: unknown type name `u_short'
        u_short             uc_opcode;  /* copied from data to save lookup */
        ^
        linux/coda_psdev.h:19:2: error: unknown type name `wait_queue_head_t'
        wait_queue_head_t   uc_sleep;   /* process' wait queue */
        ^
      
      Link: http://lkml.kernel.org/r/9f99f5ce6a0563d5266e6cf7aa9585aac2cae971.1558117389.git.jaharkes@cs.cmu.eduSigned-off-by: default avatarMikko Rapeli <mikko.rapeli@iki.fi>
      Signed-off-by: default avatarJan Harkes <jaharkes@cs.cmu.edu>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Colin Ian King <colin.king@canonical.com>
      Cc: Dan Carpenter <dan.carpenter@oracle.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Fabian Frederick <fabf@skynet.be>
      Cc: Sam Protsenko <semen.protsenko@linaro.org>
      Cc: Yann Droneaud <ydroneaud@opteya.com>
      Cc: Zhouyang Jia <jiazhouyang09@gmail.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      317fc4dd
    • Sam Protsenko's avatar
      coda: fix build using bare-metal toolchain · 90320a50
      Sam Protsenko authored
      [ Upstream commit b2a57e33 ]
      
      The kernel is self-contained project and can be built with bare-metal
      toolchain.  But bare-metal toolchain doesn't define __linux__.  Because
      of this u_quad_t type is not defined when using bare-metal toolchain and
      codafs build fails.  This patch fixes it by defining u_quad_t type
      unconditionally.
      
      Link: http://lkml.kernel.org/r/3cbb40b0a57b6f9923a9d67b53473c0b691a3eaa.1558117389.git.jaharkes@cs.cmu.eduSigned-off-by: default avatarSam Protsenko <semen.protsenko@linaro.org>
      Signed-off-by: default avatarJan Harkes <jaharkes@cs.cmu.edu>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Colin Ian King <colin.king@canonical.com>
      Cc: Dan Carpenter <dan.carpenter@oracle.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Fabian Frederick <fabf@skynet.be>
      Cc: Mikko Rapeli <mikko.rapeli@iki.fi>
      Cc: Yann Droneaud <ydroneaud@opteya.com>
      Cc: Zhouyang Jia <jiazhouyang09@gmail.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      90320a50
    • Zhouyang Jia's avatar
      coda: add error handling for fget · ca62806b
      Zhouyang Jia authored
      [ Upstream commit 02551c23 ]
      
      When fget fails, the lack of error-handling code may cause unexpected
      results.
      
      This patch adds error-handling code after calling fget.
      
      Link: http://lkml.kernel.org/r/2514ec03df9c33b86e56748513267a80dd8004d9.1558117389.git.jaharkes@cs.cmu.eduSigned-off-by: default avatarZhouyang Jia <jiazhouyang09@gmail.com>
      Signed-off-by: default avatarJan Harkes <jaharkes@cs.cmu.edu>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Colin Ian King <colin.king@canonical.com>
      Cc: Dan Carpenter <dan.carpenter@oracle.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Fabian Frederick <fabf@skynet.be>
      Cc: Mikko Rapeli <mikko.rapeli@iki.fi>
      Cc: Sam Protsenko <semen.protsenko@linaro.org>
      Cc: Yann Droneaud <ydroneaud@opteya.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      ca62806b
    • Doug Berger's avatar
      mm/cma.c: fail if fixed declaration can't be honored · ad204110
      Doug Berger authored
      [ Upstream commit c633324e ]
      
      The description of cma_declare_contiguous() indicates that if the
      'fixed' argument is true the reserved contiguous area must be exactly at
      the address of the 'base' argument.
      
      However, the function currently allows the 'base', 'size', and 'limit'
      arguments to be silently adjusted to meet alignment constraints.  This
      commit enforces the documented behavior through explicit checks that
      return an error if the region does not fit within a specified region.
      
      Link: http://lkml.kernel.org/r/1561422051-16142-1-git-send-email-opendmb@gmail.com
      Fixes: 5ea3b1b2 ("cma: add placement specifier for "cma=" kernel parameter")
      Signed-off-by: default avatarDoug Berger <opendmb@gmail.com>
      Acked-by: default avatarMichal Nazarewicz <mina86@mina86.com>
      Cc: Yue Hu <huyue2@yulong.com>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Laura Abbott <labbott@redhat.com>
      Cc: Peng Fan <peng.fan@nxp.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      ad204110
    • Arnd Bergmann's avatar
      x86: math-emu: Hide clang warnings for 16-bit overflow · bf4e8f2a
      Arnd Bergmann authored
      [ Upstream commit 29e7e966 ]
      
      clang warns about a few parts of the math-emu implementation
      where a 16-bit integer becomes negative during assignment:
      
      arch/x86/math-emu/poly_tan.c:88:35: error: implicit conversion from 'int' to 'short' changes value from 49216 to -16320 [-Werror,-Wconstant-conversion]
                                            (0x41 + EXTENDED_Ebias) | SIGN_Negative);
                                            ~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
      arch/x86/math-emu/fpu_emu.h:180:58: note: expanded from macro 'setexponent16'
       #define setexponent16(x,y)  { (*(short *)&((x)->exp)) = (y); }
                                                            ~  ^
      arch/x86/math-emu/reg_constant.c:37:32: error: implicit conversion from 'int' to 'short' changes value from 49085 to -16451 [-Werror,-Wconstant-conversion]
      FPU_REG const CONST_PI2extra = MAKE_REG(NEG, -66,
                                     ^~~~~~~~~~~~~~~~~~
      arch/x86/math-emu/reg_constant.c:21:25: note: expanded from macro 'MAKE_REG'
                      ((EXTENDED_Ebias+(e)) | ((SIGN_##s != 0)*0x8000)) }
                       ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
      arch/x86/math-emu/reg_constant.c:48:28: error: implicit conversion from 'int' to 'short' changes value from 65535 to -1 [-Werror,-Wconstant-conversion]
      FPU_REG const CONST_QNaN = MAKE_REG(NEG, EXP_OVER, 0x00000000, 0xC0000000);
                                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      arch/x86/math-emu/reg_constant.c:21:25: note: expanded from macro 'MAKE_REG'
                      ((EXTENDED_Ebias+(e)) | ((SIGN_##s != 0)*0x8000)) }
                       ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
      
      The code is correct as is, so add a typecast to shut up the warnings.
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Link: https://lkml.kernel.org/r/20190712090816.350668-1-arnd@arndb.deSigned-off-by: default avatarSasha Levin <sashal@kernel.org>
      bf4e8f2a
    • Qian Cai's avatar
      x86/apic: Silence -Wtype-limits compiler warnings · d4ce30c9
      Qian Cai authored
      [ Upstream commit ec633558 ]
      
      There are many compiler warnings like this,
      
      In file included from ./arch/x86/include/asm/smp.h:13,
                       from ./arch/x86/include/asm/mmzone_64.h:11,
                       from ./arch/x86/include/asm/mmzone.h:5,
                       from ./include/linux/mmzone.h:969,
                       from ./include/linux/gfp.h:6,
                       from ./include/linux/mm.h:10,
                       from arch/x86/kernel/apic/io_apic.c:34:
      arch/x86/kernel/apic/io_apic.c: In function 'check_timer':
      ./arch/x86/include/asm/apic.h:37:11: warning: comparison of unsigned
      expression >= 0 is always true [-Wtype-limits]
         if ((v) <= apic_verbosity) \
                 ^~
      arch/x86/kernel/apic/io_apic.c:2160:2: note: in expansion of macro
      'apic_printk'
        apic_printk(APIC_QUIET, KERN_INFO "..TIMER: vector=0x%02X "
        ^~~~~~~~~~~
      ./arch/x86/include/asm/apic.h:37:11: warning: comparison of unsigned
      expression >= 0 is always true [-Wtype-limits]
         if ((v) <= apic_verbosity) \
                 ^~
      arch/x86/kernel/apic/io_apic.c:2207:4: note: in expansion of macro
      'apic_printk'
          apic_printk(APIC_QUIET, KERN_ERR "..MP-BIOS bug: "
          ^~~~~~~~~~~
      
      APIC_QUIET is 0, so silence them by making apic_verbosity type int.
      Signed-off-by: default avatarQian Cai <cai@lca.pw>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Link: https://lkml.kernel.org/r/1562621805-24789-1-git-send-email-cai@lca.pwSigned-off-by: default avatarSasha Levin <sashal@kernel.org>
      d4ce30c9
    • Benjamin Poirier's avatar
      be2net: Signal that the device cannot transmit during reconfiguration · 0b4f4a4b
      Benjamin Poirier authored
      [ Upstream commit 7429c6c0 ]
      
      While changing the number of interrupt channels, be2net stops adapter
      operation (including netif_tx_disable()) but it doesn't signal that it
      cannot transmit. This may lead dev_watchdog() to falsely trigger during
      that time.
      
      Add the missing call to netif_carrier_off(), following the pattern used in
      many other drivers. netif_carrier_on() is already taken care of in
      be_open().
      Signed-off-by: default avatarBenjamin Poirier <bpoirier@suse.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      0b4f4a4b
    • Arnd Bergmann's avatar
      ACPI: fix false-positive -Wuninitialized warning · 0153dbcb
      Arnd Bergmann authored
      [ Upstream commit dfd6f9ad ]
      
      clang gets confused by an uninitialized variable in what looks
      to it like a never executed code path:
      
      arch/x86/kernel/acpi/boot.c:618:13: error: variable 'polarity' is uninitialized when used here [-Werror,-Wuninitialized]
              polarity = polarity ? ACPI_ACTIVE_LOW : ACPI_ACTIVE_HIGH;
                         ^~~~~~~~
      arch/x86/kernel/acpi/boot.c:606:32: note: initialize the variable 'polarity' to silence this warning
              int rc, irq, trigger, polarity;
                                            ^
                                             = 0
      arch/x86/kernel/acpi/boot.c:617:12: error: variable 'trigger' is uninitialized when used here [-Werror,-Wuninitialized]
              trigger = trigger ? ACPI_LEVEL_SENSITIVE : ACPI_EDGE_SENSITIVE;
                        ^~~~~~~
      arch/x86/kernel/acpi/boot.c:606:22: note: initialize the variable 'trigger' to silence this warning
              int rc, irq, trigger, polarity;
                                  ^
                                   = 0
      
      This is unfortunately a design decision in clang and won't be fixed.
      
      Changing the acpi_get_override_irq() macro to an inline function
      reliably avoids the issue.
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Reviewed-by: default avatarAndy Shevchenko <andriy.shevchenko@linux.intel.com>
      Reviewed-by: default avatarNathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      0153dbcb
    • Benjamin Block's avatar
      scsi: zfcp: fix GCC compiler warning emitted with -Wmaybe-uninitialized · a0e456f0
      Benjamin Block authored
      [ Upstream commit 48464708 ]
      
      GCC v9 emits this warning:
            CC      drivers/s390/scsi/zfcp_erp.o
          drivers/s390/scsi/zfcp_erp.c: In function 'zfcp_erp_action_enqueue':
          drivers/s390/scsi/zfcp_erp.c:217:26: warning: 'erp_action' may be used uninitialized in this function [-Wmaybe-uninitialized]
            217 |  struct zfcp_erp_action *erp_action;
                |                          ^~~~~~~~~~
      
      This is a possible false positive case, as also documented in the GCC
      documentations:
          https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wmaybe-uninitialized
      
      The actual code-sequence is like this:
          Various callers can invoke the function below with the argument "want"
          being one of:
          ZFCP_ERP_ACTION_REOPEN_ADAPTER,
          ZFCP_ERP_ACTION_REOPEN_PORT_FORCED,
          ZFCP_ERP_ACTION_REOPEN_PORT, or
          ZFCP_ERP_ACTION_REOPEN_LUN.
      
          zfcp_erp_action_enqueue(want, ...)
              ...
              need = zfcp_erp_required_act(want, ...)
                  need = want
                  ...
                  maybe: need = ZFCP_ERP_ACTION_REOPEN_PORT
                  maybe: need = ZFCP_ERP_ACTION_REOPEN_ADAPTER
                  ...
                  return need
              ...
              zfcp_erp_setup_act(need, ...)
                  struct zfcp_erp_action *erp_action; // <== line 217
                  ...
                  switch(need) {
                  case ZFCP_ERP_ACTION_REOPEN_LUN:
                          ...
                          erp_action = &zfcp_sdev->erp_action;
                          WARN_ON_ONCE(erp_action->port != port); // <== access
                          ...
                          break;
                  case ZFCP_ERP_ACTION_REOPEN_PORT:
                  case ZFCP_ERP_ACTION_REOPEN_PORT_FORCED:
                          ...
                          erp_action = &port->erp_action;
                          WARN_ON_ONCE(erp_action->port != port); // <== access
                          ...
                          break;
                  case ZFCP_ERP_ACTION_REOPEN_ADAPTER:
                          ...
                          erp_action = &adapter->erp_action;
                          WARN_ON_ONCE(erp_action->port != NULL); // <== access
                          ...
                          break;
                  }
                  ...
                  WARN_ON_ONCE(erp_action->adapter != adapter); // <== access
      
      When zfcp_erp_setup_act() is called, 'need' will never be anything else
      than one of the 4 possible enumeration-names that are used in the
      switch-case, and 'erp_action' is initialized for every one of them, before
      it is used. Thus the warning is a false positive, as documented.
      
      We introduce the extra if{} in the beginning to create an extra code-flow,
      so the compiler can be convinced that the switch-case will never see any
      other value.
      
      BUG_ON()/BUG() is intentionally not used to not crash anything, should
      this ever happen anyway - right now it's impossible, as argued above; and
      it doesn't introduce a 'default:' switch-case to retain warnings should
      'enum zfcp_erp_act_type' ever be extended and no explicit case be
      introduced. See also v5.0 commit 399b6c8b ("scsi: zfcp: drop old
      default switch case which might paper over missing case").
      Signed-off-by: default avatarBenjamin Block <bblock@linux.ibm.com>
      Reviewed-by: default avatarJens Remus <jremus@linux.ibm.com>
      Reviewed-by: default avatarSteffen Maier <maier@linux.ibm.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      a0e456f0
    • Jeff Layton's avatar
      ceph: return -ERANGE if virtual xattr value didn't fit in buffer · 679ff6a3
      Jeff Layton authored
      [ Upstream commit 3b421018 ]
      
      The getxattr manpage states that we should return ERANGE if the
      destination buffer size is too small to hold the value.
      ceph_vxattrcb_layout does this internally, but we should be doing
      this for all vxattrs.
      
      Fix the only caller of getxattr_cb to check the returned size
      against the buffer length and return -ERANGE if it doesn't fit.
      Drop the same check in ceph_vxattrcb_layout and just rely on the
      caller to handle it.
      Signed-off-by: default avatarJeff Layton <jlayton@kernel.org>
      Reviewed-by: default avatar"Yan, Zheng" <zyan@redhat.com>
      Acked-by: default avatarIlya Dryomov <idryomov@gmail.com>
      Signed-off-by: default avatarIlya Dryomov <idryomov@gmail.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      679ff6a3
    • Andrea Parri's avatar
      ceph: fix improper use of smp_mb__before_atomic() · 4e064062
      Andrea Parri authored
      [ Upstream commit 74960773 ]
      
      This barrier only applies to the read-modify-write operations; in
      particular, it does not apply to the atomic64_set() primitive.
      
      Replace the barrier with an smp_mb().
      
      Fixes: fdd4e158 ("ceph: rework dcache readdir")
      Reported-by: default avatar"Paul E. McKenney" <paulmck@linux.ibm.com>
      Reported-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: default avatarAndrea Parri <andrea.parri@amarulasolutions.com>
      Reviewed-by: default avatar"Yan, Zheng" <zyan@redhat.com>
      Signed-off-by: default avatarIlya Dryomov <idryomov@gmail.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      4e064062
    • David Sterba's avatar
      btrfs: fix minimum number of chunk errors for DUP · 44f7521a
      David Sterba authored
      [ Upstream commit 0ee5f8ae ]
      
      The list of profiles in btrfs_chunk_max_errors lists DUP as a profile
      DUP able to tolerate 1 device missing. Though this profile is special
      with 2 copies, it still needs the device, unlike the others.
      
      Looking at the history of changes, thre's no clear reason why DUP is
      there, functions were refactored and blocks of code merged to one
      helper.
      
      d20983b4 Btrfs: fix writing data into the seed filesystem
        - factor code to a helper
      
      de11cc12 Btrfs: don't pre-allocate btrfs bio
        - unrelated change, DUP still in the list with max errors 1
      
      a236aed1 Btrfs: Deal with failed writes in mirrored configurations
        - introduced the max errors, leaves DUP and RAID1 in the same group
      Reviewed-by: default avatarQu Wenruo <wqu@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      44f7521a
    • Russell King's avatar
      fs/adfs: super: fix use-after-free bug · 820402d2
      Russell King authored
      [ Upstream commit 5808b14a ]
      
      Fix a use-after-free bug during filesystem initialisation, where we
      access the disc record (which is stored in a buffer) after we have
      released the buffer.
      Signed-off-by: default avatarRussell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      820402d2
    • Geert Uytterhoeven's avatar
      dmaengine: rcar-dmac: Reject zero-length slave DMA requests · 1235f5e0
      Geert Uytterhoeven authored
      [ Upstream commit 78efb76a ]
      
      While the .device_prep_slave_sg() callback rejects empty scatterlists,
      it still accepts single-entry scatterlists with a zero-length segment.
      These may happen if a driver calls dmaengine_prep_slave_single() with a
      zero len parameter.  The corresponding DMA request will never complete,
      leading to messages like:
      
          rcar-dmac e7300000.dma-controller: Channel Address Error happen
      
      and DMA timeouts.
      
      Although requesting a zero-length DMA request is a driver bug, rejecting
      it early eases debugging.  Note that the .device_prep_dma_memcpy()
      callback already rejects requests to copy zero bytes.
      Reported-by: default avatarEugeniu Rosca <erosca@de.adit-jv.com>
      Analyzed-by: default avatarYoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
      Signed-off-by: default avatarGeert Uytterhoeven <geert+renesas@glider.be>
      Signed-off-by: default avatarVinod Koul <vkoul@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      1235f5e0
    • Petr Cvek's avatar
      MIPS: lantiq: Fix bitfield masking · f1741424
      Petr Cvek authored
      [ Upstream commit ba1bc0fc ]
      
      The modification of EXIN register doesn't clean the bitfield before
      the writing of a new value. After a few modifications the bitfield would
      accumulate only '1's.
      Signed-off-by: default avatarPetr Cvek <petrcvekcz@gmail.com>
      Signed-off-by: default avatarPaul Burton <paul.burton@mips.com>
      Cc: hauke@hauke-m.de
      Cc: john@phrozen.org
      Cc: linux-mips@vger.kernel.org
      Cc: openwrt-devel@lists.openwrt.org
      Cc: pakahmar@hotmail.com
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      f1741424
    • Prarit Bhargava's avatar
      kernel/module.c: Only return -EEXIST for modules that have finished loading · 10430769
      Prarit Bhargava authored
      [ Upstream commit 6e6de3de ]
      
      Microsoft HyperV disables the X86_FEATURE_SMCA bit on AMD systems, and
      linux guests boot with repeated errors:
      
      amd64_edac_mod: Unknown symbol amd_unregister_ecc_decoder (err -2)
      amd64_edac_mod: Unknown symbol amd_register_ecc_decoder (err -2)
      amd64_edac_mod: Unknown symbol amd_report_gart_errors (err -2)
      amd64_edac_mod: Unknown symbol amd_unregister_ecc_decoder (err -2)
      amd64_edac_mod: Unknown symbol amd_register_ecc_decoder (err -2)
      amd64_edac_mod: Unknown symbol amd_report_gart_errors (err -2)
      
      The warnings occur because the module code erroneously returns -EEXIST
      for modules that have failed to load and are in the process of being
      removed from the module list.
      
      module amd64_edac_mod has a dependency on module edac_mce_amd.  Using
      modules.dep, systemd will load edac_mce_amd for every request of
      amd64_edac_mod.  When the edac_mce_amd module loads, the module has
      state MODULE_STATE_UNFORMED and once the module load fails and the state
      becomes MODULE_STATE_GOING.  Another request for edac_mce_amd module
      executes and add_unformed_module() will erroneously return -EEXIST even
      though the previous instance of edac_mce_amd has MODULE_STATE_GOING.
      Upon receiving -EEXIST, systemd attempts to load amd64_edac_mod, which
      fails because of unknown symbols from edac_mce_amd.
      
      add_unformed_module() must wait to return for any case other than
      MODULE_STATE_LIVE to prevent a race between multiple loads of
      dependent modules.
      Signed-off-by: default avatarPrarit Bhargava <prarit@redhat.com>
      Signed-off-by: default avatarBarret Rhoden <brho@google.com>
      Cc: David Arcari <darcari@redhat.com>
      Cc: Jessica Yu <jeyu@kernel.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: default avatarJessica Yu <jeyu@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      10430769
    • Cheng Jian's avatar
      ftrace: Enable trampoline when rec count returns back to one · 81c09bab
      Cheng Jian authored
      [ Upstream commit a124692b ]
      
      Custom trampolines can only be enabled if there is only a single ops
      attached to it. If there's only a single callback registered to a function,
      and the ops has a trampoline registered for it, then we can call the
      trampoline directly. This is very useful for improving the performance of
      ftrace and livepatch.
      
      If more than one callback is registered to a function, the general
      trampoline is used, and the custom trampoline is not restored back to the
      direct call even if all the other callbacks were unregistered and we are
      back to one callback for the function.
      
      To fix this, set FTRACE_FL_TRAMP flag if rec count is decremented
      to one, and the ops that left has a trampoline.
      
      Testing After this patch :
      
      insmod livepatch_unshare_files.ko
      cat /sys/kernel/debug/tracing/enabled_functions
      
      	unshare_files (1) R I	tramp: 0xffffffffc0000000(klp_ftrace_handler+0x0/0xa0) ->ftrace_ops_assist_func+0x0/0xf0
      
      echo unshare_files > /sys/kernel/debug/tracing/set_ftrace_filter
      echo function > /sys/kernel/debug/tracing/current_tracer
      cat /sys/kernel/debug/tracing/enabled_functions
      
      	unshare_files (2) R I ->ftrace_ops_list_func+0x0/0x150
      
      echo nop > /sys/kernel/debug/tracing/current_tracer
      cat /sys/kernel/debug/tracing/enabled_functions
      
      	unshare_files (1) R I	tramp: 0xffffffffc0000000(klp_ftrace_handler+0x0/0xa0) ->ftrace_ops_assist_func+0x0/0xf0
      
      Link: http://lkml.kernel.org/r/1556969979-111047-1-git-send-email-cj.chengjian@huawei.comSigned-off-by: default avatarCheng Jian <cj.chengjian@huawei.com>
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      81c09bab
    • Douglas Anderson's avatar
      ARM: dts: rockchip: Mark that the rk3288 timer might stop in suspend · 614e14d6
      Douglas Anderson authored
      [ Upstream commit 8ef1ba39 ]
      
      This is similar to commit e6186820 ("arm64: dts: rockchip: Arch
      counter doesn't tick in system suspend").  Specifically on the rk3288
      it can be seen that the timer stops ticking in suspend if we end up
      running through the "osc_disable" path in rk3288_slp_mode_set().  In
      that path the 24 MHz clock will turn off and the timer stops.
      
      To test this, I ran this on a Chrome OS filesystem:
        before=$(date); \
        suspend_stress_test -c1 --suspend_min=30 --suspend_max=31; \
        echo ${before}; date
      
      ...and I found that unless I plug in a device that requests USB wakeup
      to be active that the two calls to "date" would show that fewer than
      30 seconds passed.
      
      NOTE: deep suspend (where the 24 MHz clock gets disabled) isn't
      supported yet on upstream Linux so this was tested on a downstream
      kernel.
      Signed-off-by: default avatarDouglas Anderson <dianders@chromium.org>
      Signed-off-by: default avatarHeiko Stuebner <heiko@sntech.de>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      614e14d6
    • Douglas Anderson's avatar
      ARM: dts: rockchip: Make rk3288-veyron-mickey's emmc work again · 2b0a7453
      Douglas Anderson authored
      [ Upstream commit 99fa0667 ]
      
      When I try to boot rk3288-veyron-mickey I totally fail to make the
      eMMC work.  Specifically my logs (on Chrome OS 4.19):
      
        mmc_host mmc1: card is non-removable.
        mmc_host mmc1: Bus speed (slot 0) = 400000Hz (slot req 400000Hz, actual 400000HZ div = 0)
        mmc_host mmc1: Bus speed (slot 0) = 50000000Hz (slot req 52000000Hz, actual 50000000HZ div = 0)
        mmc1: switch to bus width 8 failed
        mmc1: switch to bus width 4 failed
        mmc1: new high speed MMC card at address 0001
        mmcblk1: mmc1:0001 HAG2e 14.7 GiB
        mmcblk1boot0: mmc1:0001 HAG2e partition 1 4.00 MiB
        mmcblk1boot1: mmc1:0001 HAG2e partition 2 4.00 MiB
        mmcblk1rpmb: mmc1:0001 HAG2e partition 3 4.00 MiB, chardev (243:0)
        mmc_host mmc1: Bus speed (slot 0) = 400000Hz (slot req 400000Hz, actual 400000HZ div = 0)
        mmc_host mmc1: Bus speed (slot 0) = 50000000Hz (slot req 52000000Hz, actual 50000000HZ div = 0)
        mmc1: switch to bus width 8 failed
        mmc1: switch to bus width 4 failed
        mmc1: tried to HW reset card, got error -110
        mmcblk1: error -110 requesting status
        mmcblk1: recovery failed!
        print_req_error: I/O error, dev mmcblk1, sector 0
        ...
      
      When I remove the '/delete-property/mmc-hs200-1_8v' then everything is
      hunky dory.
      
      That line comes from the original submission of the mickey dts
      upstream, so presumably at the time the HS200 was failing and just
      enumerating things as a high speed device was fine.  ...or maybe it's
      just that some mickey devices work when enumerating at "high speed",
      just not mine?
      
      In any case, hs200 seems good now.  Let's turn it on.
      Signed-off-by: default avatarDouglas Anderson <dianders@chromium.org>
      Signed-off-by: default avatarHeiko Stuebner <heiko@sntech.de>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      2b0a7453