1. 13 Jun, 2017 4 commits
  2. 08 Jun, 2017 36 commits
    • Yisheng Xie's avatar
      mlock: fix mlock count can not decrease in race condition · 00fc586e
      Yisheng Xie authored
      [ Upstream commit 70feee0e ]
      
      Kefeng reported that when running the follow test, the mlock count in
      meminfo will increase permanently:
      
       [1] testcase
       linux:~ # cat test_mlockal
       grep Mlocked /proc/meminfo
        for j in `seq 0 10`
        do
       	for i in `seq 4 15`
       	do
       		./p_mlockall >> log &
       	done
       	sleep 0.2
       done
       # wait some time to let mlock counter decrease and 5s may not enough
       sleep 5
       grep Mlocked /proc/meminfo
      
       linux:~ # cat p_mlockall.c
       #include <sys/mman.h>
       #include <stdlib.h>
       #include <stdio.h>
      
       #define SPACE_LEN	4096
      
       int main(int argc, char ** argv)
       {
      	 	int ret;
      	 	void *adr = malloc(SPACE_LEN);
      	 	if (!adr)
      	 		return -1;
      
      	 	ret = mlockall(MCL_CURRENT | MCL_FUTURE);
      	 	printf("mlcokall ret = %d\n", ret);
      
      	 	ret = munlockall();
      	 	printf("munlcokall ret = %d\n", ret);
      
      	 	free(adr);
      	 	return 0;
      	 }
      
      In __munlock_pagevec() we should decrement NR_MLOCK for each page where
      we clear the PageMlocked flag.  Commit 1ebb7cc6 ("mm: munlock: batch
      NR_MLOCK zone state updates") has introduced a bug where we don't
      decrement NR_MLOCK for pages where we clear the flag, but fail to
      isolate them from the lru list (e.g.  when the pages are on some other
      cpu's percpu pagevec).  Since PageMlocked stays cleared, the NR_MLOCK
      accounting gets permanently disrupted by this.
      
      Fix it by counting the number of page whose PageMlock flag is cleared.
      
      Fixes: 1ebb7cc6 (" mm: munlock: batch NR_MLOCK zone state updates")
      Link: http://lkml.kernel.org/r/1495678405-54569-1-git-send-email-xieyisheng1@huawei.comSigned-off-by: default avatarYisheng Xie <xieyisheng1@huawei.com>
      Reported-by: default avatarKefeng Wang <wangkefeng.wang@huawei.com>
      Tested-by: default avatarKefeng Wang <wangkefeng.wang@huawei.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Joern Engel <joern@logfs.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Xishi Qiu <qiuxishi@huawei.com>
      Cc: zhongjiang <zhongjiang@huawei.com>
      Cc: Hanjun Guo <guohanjun@huawei.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      00fc586e
    • Naoya Horiguchi's avatar
      mm/memory-failure: introduce get_hwpoison_page() for consistent refcount handling · 00136071
      Naoya Horiguchi authored
      [ Upstream commit ead07f6a ]
      
      memory_failure() can run in 2 different mode (specified by
      MF_COUNT_INCREASED) in page refcount perspective.  When
      MF_COUNT_INCREASED is set, memory_failure() assumes that the caller
      takes a refcount of the target page.  And if cleared, memory_failure()
      takes it in it's own.
      
      In current code, however, refcounting is done differently in each caller.
      For example, madvise_hwpoison() uses get_user_pages_fast() and
      hwpoison_inject() uses get_page_unless_zero().  So this inconsistent
      refcounting causes refcount failure especially for thp tail pages.
      Typical user visible effects are like memory leak or
      VM_BUG_ON_PAGE(!page_count(page)) in isolate_lru_page().
      
      To fix this refcounting issue, this patch introduces get_hwpoison_page()
      to handle thp tail pages in the same manner for each caller of hwpoison
      code.
      
      memory_failure() might fail to split thp and in such case it returns
      without completing page isolation.  This is not good because PageHWPoison
      on the thp is still set and there's no easy way to unpoison such thps.  So
      this patch try to roll back any action to the thp in "non anonymous thp"
      case and "thp split failed" case, expecting an MCE(SRAR) generated by
      later access afterward will properly free such thps.
      
      [akpm@linux-foundation.org: fix CONFIG_HWPOISON_INJECT=m]
      Signed-off-by: default avatarNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      00136071
    • Naoya Horiguchi's avatar
      mm/memory-failure: split thp earlier in memory error handling · da7cbd0c
      Naoya Horiguchi authored
      [ Upstream commit 415c64c1 ]
      
      memory_failure() doesn't handle thp itself at this time and need to split
      it before doing isolation.  Currently thp is split in the middle of
      hwpoison_user_mappings(), but there're corner cases where memory_failure()
      wrongly tries to handle thp without splitting.
      
      1) "non anonymous" thp, which is not a normal operating mode of thp,
         but a memory error could hit a thp before anon_vma is initialized.  In
         such case, split_huge_page() fails and me_huge_page() (intended for
         hugetlb) is called for thp, which triggers BUG_ON in page_hstate().
      
      2) !PageLRU case, where hwpoison_user_mappings() returns with
         SWAP_SUCCESS and the result is the same as case 1.
      
      memory_failure() can't avoid splitting, so let's split it more earlier,
      which also reduces code which are prepared for both of normal page and
      thp.
      Signed-off-by: default avatarNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      da7cbd0c
    • Thomas Gleixner's avatar
      slub/memcg: cure the brainless abuse of sysfs attributes · aeb3435b
      Thomas Gleixner authored
      [ Upstream commit 478fe303 ]
      
      memcg_propagate_slab_attrs() abuses the sysfs attribute file functions
      to propagate settings from the root kmem_cache to a newly created
      kmem_cache.  It does that with:
      
           attr->show(root, buf);
           attr->store(new, buf, strlen(bug);
      
      Aside of being a lazy and absurd hackery this is broken because it does
      not check the return value of the show() function.
      
      Some of the show() functions return 0 w/o touching the buffer.  That
      means in such a case the store function is called with the stale content
      of the previous show().  That causes nonsense like invoking
      kmem_cache_shrink() on a newly created kmem_cache.  In the worst case it
      would cause handing in an uninitialized buffer.
      
      This should be rewritten proper by adding a propagate() callback to
      those slub_attributes which must be propagated and avoid that insane
      conversion to and from ASCII, but that's too large for a hot fix.
      
      Check at least the return value of the show() function, so calling
      store() with stale content is prevented.
      
      Steven said:
       "It can cause a deadlock with get_online_cpus() that has been uncovered
        by recent cpu hotplug and lockdep changes that Thomas and Peter have
        been doing.
      
           Possible unsafe locking scenario:
      
                 CPU0                    CPU1
                 ----                    ----
            lock(cpu_hotplug.lock);
                                         lock(slab_mutex);
                                         lock(cpu_hotplug.lock);
            lock(slab_mutex);
      
           *** DEADLOCK ***"
      
      Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1705201244540.2255@nanosSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reported-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Acked-by: default avatarDavid Rientjes <rientjes@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      aeb3435b
    • Tejun Heo's avatar
      blkcg: use blkg_free() in blkcg_init_queue() failure path · afc6ec14
      Tejun Heo authored
      [ Upstream commit 994b7832 ]
      
      When blkcg_init_queue() fails midway after creating a new blkg, it
      performs kfree() directly; however, this doesn't free the policy data
      areas.  Make it use blkg_free() instead.  In turn, blkg_free() is
      updated to handle root request_list special case.
      
      While this fixes a possible memory leak, it's on an unlikely failure
      path of an already cold path and the size leaked per occurrence is
      miniscule too.  I don't think it needs to be tagged for -stable.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      afc6ec14
    • Tejun Heo's avatar
      blkcg: always create the blkcg_gq for the root blkcg · f9fac98f
      Tejun Heo authored
      [ Upstream commit ec13b1d6 ]
      
      Currently, blkcg does a minor optimization where the root blkcg is
      created when the first blkcg policy is activated on a queue and
      destroyed on the deactivation of the last.  On systems where blkcg is
      configured but not used, this saves one blkcg_gq struct per queue.  On
      systems where blkcg is actually used, there's no difference.  The only
      case where this can lead to any meaninful, albeit still minute, save
      in memory consumption is when all blkcg policies are deactivated after
      being widely used in the system, which is a hihgly unlikely scenario.
      
      The conditional existence of root blkcg_gq has already created several
      bugs in blkcg and became an issue once again for the new per-cgroup
      wb_congested mechanism for cgroup writeback support leading to a NULL
      dereference when no blkcg policy is active.  This is really not worth
      bothering with.  This patch makes blkcg always allocate and link the
      root blkcg_gq and release it only on queue destruction.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Reported-by: default avatarFengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      f9fac98f
    • Herbert Xu's avatar
      iscsi-target: Use shash and ahash · 712b6a6d
      Herbert Xu authored
      [ Upstream commit 69110e3c ]
      
      This patch replaces uses of the long obsolete hash interface with
      either shash (for non-SG users) or ahash.
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      712b6a6d
    • Alexei Potashnik's avatar
      target/iscsi: Use proper SGL accessors for digest computation · 1bd31de3
      Alexei Potashnik authored
      [ Upstream commit aa75679c ]
      
      Current implementation assumes that all the buffers of an IO are linked
      with a single SG list, which is OK because target-core is only allocating
      a contigious scatterlist region.  However, this assumption is wrong for
      se_cmd descriptors that want to use chaining across multiple SGL regions.
      
      Fix this up by using proper SGL accessors for digest payload computation.
      Signed-off-by: default avatarAlexei Potashnik <alexei@purestorage.com>
      Cc: Roland Dreier <roland@purestorage.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      1bd31de3
    • Nicholas Bellinger's avatar
      iscsi-target: Fix initial login PDU asynchronous socket close OOPs · 89ff28d0
      Nicholas Bellinger authored
      [ Upstream commit 25cdda95 ]
      
      This patch fixes a OOPs originally introduced by:
      
         commit bb048357
         Author: Nicholas Bellinger <nab@linux-iscsi.org>
         Date:   Thu Sep 5 14:54:04 2013 -0700
      
         iscsi-target: Add sk->sk_state_change to cleanup after TCP failure
      
      which would trigger a NULL pointer dereference when a TCP connection
      was closed asynchronously via iscsi_target_sk_state_change(), but only
      when the initial PDU processing in iscsi_target_do_login() from iscsi_np
      process context was blocked waiting for backend I/O to complete.
      
      To address this issue, this patch makes the following changes.
      
      First, it introduces some common helper functions used for checking
      socket closing state, checking login_flags, and atomically checking
      socket closing state + setting login_flags.
      
      Second, it introduces a LOGIN_FLAGS_INITIAL_PDU bit to know when a TCP
      connection has dropped via iscsi_target_sk_state_change(), but the
      initial PDU processing within iscsi_target_do_login() in iscsi_np
      context is still running.  For this case, it sets LOGIN_FLAGS_CLOSED,
      but doesn't invoke schedule_delayed_work().
      
      The original NULL pointer dereference case reported by MNC is now handled
      by iscsi_target_do_login() doing a iscsi_target_sk_check_close() before
      transitioning to FFP to determine when the socket has already closed,
      or iscsi_target_start_negotiation() if the login needs to exchange
      more PDUs (eg: iscsi_target_do_login returned 0) but the socket has
      closed.  For both of these cases, the cleanup up of remaining connection
      resources will occur in iscsi_target_start_negotiation() from iscsi_np
      process context once the failure is detected.
      
      Finally, to handle to case where iscsi_target_sk_state_change() is
      called after the initial PDU procesing is complete, it now invokes
      conn->login_work -> iscsi_target_do_login_rx() to perform cleanup once
      existing iscsi_target_sk_check_close() checks detect connection failure.
      For this case, the cleanup of remaining connection resources will occur
      in iscsi_target_do_login_rx() from delayed workqueue process context
      once the failure is detected.
      Reported-by: default avatarMike Christie <mchristi@redhat.com>
      Reviewed-by: default avatarMike Christie <mchristi@redhat.com>
      Tested-by: default avatarMike Christie <mchristi@redhat.com>
      Cc: Mike Christie <mchristi@redhat.com>
      Reported-by: default avatarHannes Reinecke <hare@suse.com>
      Cc: Hannes Reinecke <hare@suse.com>
      Cc: Sagi Grimberg <sagi@grimberg.me>
      Cc: Varun Prakash <varun@chelsio.com>
      Cc: <stable@vger.kernel.org> # v3.12+
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      89ff28d0
    • Bart Van Assche's avatar
      target/iscsi: Fix indentation in iscsi_target_start_negotiation() · 09cb399b
      Bart Van Assche authored
      [ Upstream commit 1efaa949 ]
      
      This patch avoids that smatch complains about inconsistent
      indentation in iscsi_target_start_negotiation().
      Signed-off-by: default avatarBart Van Assche <bart.vanassche@sandisk.com>
      Reviewed-by: default avatarHannes Reinecke <hare@suse.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Cc: Nicholas A. Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      09cb399b
    • Nicholas Bellinger's avatar
      iscsi-target: Fix early sk_data_ready LOGIN_FLAGS_READY race · 68185cb1
      Nicholas Bellinger authored
      [ Upstream commit 8f0dfb3d ]
      
      There is a iscsi-target/tcp login race in LOGIN_FLAGS_READY
      state assignment that can result in frequent errors during
      iscsi discovery:
      
            "iSCSI Login negotiation failed."
      
      To address this bug, move the initial LOGIN_FLAGS_READY
      assignment ahead of iscsi_target_do_login() when handling
      the initial iscsi_target_start_negotiation() request PDU
      during connection login.
      
      As iscsi_target_do_login_rx() work_struct callback is
      clearing LOGIN_FLAGS_READ_ACTIVE after subsequent calls
      to iscsi_target_do_login(), the early sk_data_ready
      ahead of the first iscsi_target_do_login() expects
      LOGIN_FLAGS_READY to also be set for the initial
      login request PDU.
      
      As reported by Maged, this was first obsered using an
      MSFT initiator running across multiple VMWare host
      virtual machines with iscsi-target/tcp.
      Reported-by: default avatarMaged Mokhtar <mmokhtar@binarykinetics.com>
      Tested-by: default avatarMaged Mokhtar <mmokhtar@binarykinetics.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      68185cb1
    • David Arcari's avatar
      cpufreq: cpufreq_register_driver() should return -ENODEV if init fails · 5df474e6
      David Arcari authored
      [ Upstream commit 6c770036 ]
      
      For a driver that does not set the CPUFREQ_STICKY flag, if all of the
      ->init() calls fail, cpufreq_register_driver() should return an error.
      This will prevent the driver from loading.
      
      Fixes: ce1bcfe9 (cpufreq: check cpufreq_policy_list instead of scanning policies for all CPUs)
      Cc: 4.0+ <stable@vger.kernel.org> # 4.0+
      Signed-off-by: default avatarDavid Arcari <darcari@redhat.com>
      Acked-by: default avatarViresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      5df474e6
    • Eric Anholt's avatar
      drm/msm: Expose our reservation object when exporting a dmabuf. · 7e144ca4
      Eric Anholt authored
      [ Upstream commit 43523eba ]
      
      Without this, polling on the dma-buf (and presumably other devices
      synchronizing against our rendering) would return immediately, even
      while the BO was busy.
      Signed-off-by: default avatarEric Anholt <eric@anholt.net>
      Reviewed-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      Cc: stable@vger.kernel.org
      Cc: Rob Clark <robdclark@gmail.com>
      Cc: linux-arm-msm@vger.kernel.org
      Cc: freedreno@lists.freedesktop.org
      Reviewed-by: default avatarRob Clark <robdclark@gmail.com>
      Signed-off-by: default avatarRob Clark <robdclark@gmail.com>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      7e144ca4
    • Jan Kara's avatar
      xfs: Fix missed holes in SEEK_HOLE implementation · 7e185e00
      Jan Kara authored
      [ Upstream commit 5375023a ]
      
      XFS SEEK_HOLE implementation could miss a hole in an unwritten extent as
      can be seen by the following command:
      
      xfs_io -c "falloc 0 256k" -c "pwrite 0 56k" -c "pwrite 128k 8k"
             -c "seek -h 0" file
      wrote 57344/57344 bytes at offset 0
      56 KiB, 14 ops; 0.0000 sec (49.312 MiB/sec and 12623.9856 ops/sec)
      wrote 8192/8192 bytes at offset 131072
      8 KiB, 2 ops; 0.0000 sec (70.383 MiB/sec and 18018.0180 ops/sec)
      Whence	Result
      HOLE	139264
      
      Where we can see that hole at offset 56k was just ignored by SEEK_HOLE
      implementation. The bug is in xfs_find_get_desired_pgoff() which does
      not properly detect the case when pages are not contiguous.
      
      Fix the problem by properly detecting when found page has larger offset
      than expected.
      
      CC: stable@vger.kernel.org
      Fixes: d126d43fSigned-off-by: default avatarJan Kara <jack@suse.cz>
      Reviewed-by: default avatarBrian Foster <bfoster@redhat.com>
      Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      7e185e00
    • Eryu Guan's avatar
      xfs: fix off-by-one on max nr_pages in xfs_find_get_desired_pgoff() · 59acce81
      Eryu Guan authored
      [ Upstream commit 8affebe1 ]
      
      xfs_find_get_desired_pgoff() is used to search for offset of hole or
      data in page range [index, end] (both inclusive), and the max number
      of pages to search should be at least one, if end == index.
      Otherwise the only page is missed and no hole or data is found,
      which is not correct.
      
      When block size is smaller than page size, this can be demonstrated
      by preallocating a file with size smaller than page size and writing
      data to the last block. E.g. run this xfs_io command on a 1k block
      size XFS on x86_64 host.
      
        # xfs_io -fc "falloc 0 3k" -c "pwrite 2k 1k" \
        	    -c "seek -d 0" /mnt/xfs/testfile
        wrote 1024/1024 bytes at offset 2048
        1 KiB, 1 ops; 0.0000 sec (33.675 MiB/sec and 34482.7586 ops/sec)
        Whence  Result
        DATA    EOF
      
      Data at offset 2k was missed, and lseek(2) returned ENXIO.
      
      This is uncovered by generic/285 subtest 07 and 08 on ppc64 host,
      where pagesize is 64k. Because a recent change to generic/285
      reduced the preallocated file size to smaller than 64k.
      
      Cc: stable@vger.kernel.org # v3.7+
      Signed-off-by: default avatarEryu Guan <eguan@redhat.com>
      Reviewed-by: default avatarJan Kara <jack@suse.cz>
      Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      59acce81
    • Lyude's avatar
      drm/radeon: Unbreak HPD handling for r600+ · b96e5f18
      Lyude authored
      [ Upstream commit 3d18e337 ]
      
      We end up reading the interrupt register for HPD5, and then writing it
      to HPD6 which on systems without anything using HPD5 results in
      permanently disabling hotplug on one of the display outputs after the
      first time we acknowledge a hotplug interrupt from the GPU.
      
      This code is really bad. But for now, let's just fix this. I will
      hopefully have a large patch series to refactor all of this soon.
      Reviewed-by: default avatarChristian König <christian.koenig@amd.com>
      Signed-off-by: default avatarLyude <lyude@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      b96e5f18
    • Alexander Sverdlin's avatar
      dmaengine: ep93xx: Don't drain the transfers in terminate_all() · 81402e40
      Alexander Sverdlin authored
      [ Upstream commit 98f9de36 ]
      
      Draining the transfers in terminate_all callback happens with IRQs disabled,
      therefore induces huge latency:
      
       irqsoff latency trace v1.1.5 on 4.11.0
       --------------------------------------------------------------------
       latency: 39770 us, #57/57, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0)
          -----------------
          | task: process-129 (uid:0 nice:0 policy:2 rt_prio:50)
          -----------------
        => started at: _snd_pcm_stream_lock_irqsave
        => ended at:   snd_pcm_stream_unlock_irqrestore
      
                        _------=> CPU#
                       / _-----=> irqs-off
                      | / _----=> need-resched
                      || / _---=> hardirq/softirq
                      ||| / _--=> preempt-depth
                      |||| /     delay
        cmd     pid   ||||| time  |   caller
           \   /      |||||  \    |   /
      process-129     0d.s.    3us : _snd_pcm_stream_lock_irqsave
      process-129     0d.s1    9us : snd_pcm_stream_lock <-_snd_pcm_stream_lock_irqsave
      process-129     0d.s1   15us : preempt_count_add <-snd_pcm_stream_lock
      process-129     0d.s2   22us : preempt_count_add <-snd_pcm_stream_lock
      process-129     0d.s3   32us : snd_pcm_update_hw_ptr0 <-snd_pcm_period_elapsed
      process-129     0d.s3   41us : soc_pcm_pointer <-snd_pcm_update_hw_ptr0
      process-129     0d.s3   50us : dmaengine_pcm_pointer <-soc_pcm_pointer
      process-129     0d.s3   58us+: snd_dmaengine_pcm_pointer_no_residue <-dmaengine_pcm_pointer
      process-129     0d.s3   96us : update_audio_tstamp <-snd_pcm_update_hw_ptr0
      process-129     0d.s3  103us : snd_pcm_update_state <-snd_pcm_update_hw_ptr0
      process-129     0d.s3  112us : xrun <-snd_pcm_update_state
      process-129     0d.s3  119us : snd_pcm_stop <-xrun
      process-129     0d.s3  126us : snd_pcm_action <-snd_pcm_stop
      process-129     0d.s3  134us : snd_pcm_action_single <-snd_pcm_action
      process-129     0d.s3  141us : snd_pcm_pre_stop <-snd_pcm_action_single
      process-129     0d.s3  150us : snd_pcm_do_stop <-snd_pcm_action_single
      process-129     0d.s3  157us : soc_pcm_trigger <-snd_pcm_do_stop
      process-129     0d.s3  166us : snd_dmaengine_pcm_trigger <-soc_pcm_trigger
      process-129     0d.s3  175us : ep93xx_dma_terminate_all <-snd_dmaengine_pcm_trigger
      process-129     0d.s3  182us : preempt_count_add <-ep93xx_dma_terminate_all
      process-129     0d.s4  189us*: m2p_hw_shutdown <-ep93xx_dma_terminate_all
      process-129     0d.s4 39472us : m2p_hw_setup <-ep93xx_dma_terminate_all
      
       ... rest skipped...
      
      process-129     0d.s. 40080us : <stack trace>
       => ep93xx_dma_tasklet
       => tasklet_action
       => __do_softirq
       => irq_exit
       => __handle_domain_irq
       => vic_handle_irq
       => __irq_usr
       => 0xb66c6668
      
      Just abort the transfers and warn if the HW state is not what we expect.
      Move draining into device_synchronize callback.
      Signed-off-by: default avatarAlexander Sverdlin <alexander.sverdlin@gmail.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarVinod Koul <vinod.koul@intel.com>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      81402e40
    • Alexander Sverdlin's avatar
      dmaengine: ep93xx: Always start from BASE0 · 1a45b842
      Alexander Sverdlin authored
      [ Upstream commit 0037ae47 ]
      
      The current buffer is being reset to zero on device_free_chan_resources()
      but not on device_terminate_all(). It could happen that HW is restarted and
      expects BASE0 to be used, but the driver is not synchronized and will start
      from BASE1. One solution is to reset the buffer explicitly in
      m2p_hw_setup().
      Signed-off-by: default avatarAlexander Sverdlin <alexander.sverdlin@gmail.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarVinod Koul <vinod.koul@intel.com>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      1a45b842
    • Patrik Jakobsson's avatar
      drm/gma500/psb: Actually use VBT mode when it is found · 72a5ed83
      Patrik Jakobsson authored
      [ Upstream commit 82bc9a42 ]
      
      With LVDS we were incorrectly picking the pre-programmed mode instead of
      the prefered mode provided by VBT. Make sure we pick the VBT mode if
      one is provided. It is likely that the mode read-out code is still wrong
      but this patch fixes the immediate problem on most machines.
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78562
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarPatrik Jakobsson <patrik.r.jakobsson@gmail.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20170418114332.12183-1-patrik.r.jakobsson@gmail.comSigned-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      72a5ed83
    • Rafael J. Wysocki's avatar
      PCI / PM: Avoid resuming more devices during system suspend · 4f268a10
      Rafael J. Wysocki authored
      [ Upstream commit 2cef548a ]
      
      Commit bac2a909 (PCI / PM: Avoid resuming PCI devices during
      system suspend) introduced a mechanism by which some PCI devices that
      were runtime-suspended at the system suspend time might be left in
      that state for the duration of the system suspend-resume cycle.
      However, it overlooked devices that were marked as capable of waking
      up the system just because PME support was detected in their PCI
      config space.
      
      Namely, in that case, device_can_wakeup(dev) returns 'true' for the
      device and if the device is not configured for system wakeup,
      device_may_wakeup(dev) returns 'false' and it will be resumed during
      system suspend even though configuring it for system wakeup may not
      really make sense at all.
      
      To avoid this problem, simply disable PME for PCI devices that have
      not been configured for system wakeup and are runtime-suspended at
      the system suspend time for the duration of the suspend-resume cycle.
      
      If the device is in D3cold, its config space is not available and it
      shouldn't be written to, but that's only possible if the device
      has platform PM support and the platform code is responsible for
      checking whether or not the device's configuration is suitable for
      system suspend in that case.
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      4f268a10
    • Tadeusz Struk's avatar
      PCI: Add quirk for Intel DH895xCC VF PCI config erratum · b060ae49
      Tadeusz Struk authored
      [ Upstream commit 3388a614 ]
      
      The PCI capabilities list for Intel DH895xCC VFs (device id 0x0443) with
      QuickAssist Technology is prematurely terminated in hardware.
      Workaround the issue by hard-coding the known expected next capability
      pointer and saving the PCIE cap into internal buffer.
      
      Patch generated against cryptodev-2.6
      Signed-off-by: default avatarTadeusz Struk <tadeusz.struk@intel.com>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      b060ae49
    • Alexander Tsoy's avatar
      ALSA: hda - apply STAC_9200_DELL_M22 quirk for Dell Latitude D430 · e0bda32c
      Alexander Tsoy authored
      [ Upstream commit 1fc2e41f ]
      
      This model is actually called 92XXM2-8 in Windows driver. But since pin
      configs for M22 and M28 are identical, just reuse M22 quirk.
      
      Fixes external microphone (tested) and probably docking station ports
      (not tested).
      Signed-off-by: default avatarAlexander Tsoy <alexander@tsoy.me>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      e0bda32c
    • Srinath Mannam's avatar
      mmc: sdhci-iproc: suppress spurious interrupt with Multiblock read · 9dbe42c5
      Srinath Mannam authored
      [ Upstream commit f5f968f2 ]
      
      The stingray SDHCI hardware supports ACMD12 and automatically
      issues after multi block transfer completed.
      
      If ACMD12 in SDHCI is disabled, spurious tx done interrupts are seen
      on multi block read command with below error message:
      
      Got data interrupt 0x00000002 even though no data
      operation was in progress.
      
      This patch uses SDHCI_QUIRK_MULTIBLOCK_READ_ACMD12 to enable
      ACM12 support in SDHCI hardware and suppress spurious interrupt.
      Signed-off-by: default avatarSrinath Mannam <srinath.mannam@broadcom.com>
      Reviewed-by: default avatarRay Jui <ray.jui@broadcom.com>
      Reviewed-by: default avatarScott Branden <scott.branden@broadcom.com>
      Acked-by: default avatarAdrian Hunter <adrian.hunter@intel.com>
      Fixes: b580c52d ("mmc: sdhci-iproc: add IPROC SDHCI driver")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarUlf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      9dbe42c5
    • Sebastian Reichel's avatar
      i2c: i2c-tiny-usb: fix buffer not being DMA capable · 0210333e
      Sebastian Reichel authored
      [ Upstream commit 5165da59 ]
      
      Since v4.9 i2c-tiny-usb generates the below call trace
      and longer works, since it can't communicate with the
      USB device. The reason is, that since v4.9 the USB
      stack checks, that the buffer it should transfer is DMA
      capable. This was a requirement since v2.2 days, but it
      usually worked nevertheless.
      
      [   17.504959] ------------[ cut here ]------------
      [   17.505488] WARNING: CPU: 0 PID: 93 at drivers/usb/core/hcd.c:1587 usb_hcd_map_urb_for_dma+0x37c/0x570
      [   17.506545] transfer buffer not dma capable
      [   17.507022] Modules linked in:
      [   17.507370] CPU: 0 PID: 93 Comm: i2cdetect Not tainted 4.11.0-rc8+ #10
      [   17.508103] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
      [   17.509039] Call Trace:
      [   17.509320]  ? dump_stack+0x5c/0x78
      [   17.509714]  ? __warn+0xbe/0xe0
      [   17.510073]  ? warn_slowpath_fmt+0x5a/0x80
      [   17.510532]  ? nommu_map_sg+0xb0/0xb0
      [   17.510949]  ? usb_hcd_map_urb_for_dma+0x37c/0x570
      [   17.511482]  ? usb_hcd_submit_urb+0x336/0xab0
      [   17.511976]  ? wait_for_completion_timeout+0x12f/0x1a0
      [   17.512549]  ? wait_for_completion_timeout+0x65/0x1a0
      [   17.513125]  ? usb_start_wait_urb+0x65/0x160
      [   17.513604]  ? usb_control_msg+0xdc/0x130
      [   17.514061]  ? usb_xfer+0xa4/0x2a0
      [   17.514445]  ? __i2c_transfer+0x108/0x3c0
      [   17.514899]  ? i2c_transfer+0x57/0xb0
      [   17.515310]  ? i2c_smbus_xfer_emulated+0x12f/0x590
      [   17.515851]  ? _raw_spin_unlock_irqrestore+0x11/0x20
      [   17.516408]  ? i2c_smbus_xfer+0x125/0x330
      [   17.516876]  ? i2c_smbus_xfer+0x125/0x330
      [   17.517329]  ? i2cdev_ioctl_smbus+0x1c1/0x2b0
      [   17.517824]  ? i2cdev_ioctl+0x75/0x1c0
      [   17.518248]  ? do_vfs_ioctl+0x9f/0x600
      [   17.518671]  ? vfs_write+0x144/0x190
      [   17.519078]  ? SyS_ioctl+0x74/0x80
      [   17.519463]  ? entry_SYSCALL_64_fastpath+0x1e/0xad
      [   17.519959] ---[ end trace d047c04982f5ac50 ]---
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarSebastian Reichel <sebastian.reichel@collabora.co.uk>
      Reviewed-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Acked-by: default avatarTill Harbaum <till@harbaum.org>
      Signed-off-by: default avatarWolfram Sang <wsa@the-dreams.de>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      0210333e
    • Chen, Gong's avatar
      x86/mce: Don't use percpu workqueues · 8852d28b
      Chen, Gong authored
      [ Upstream commit 061120ae ]
      
      An MCE is a rare event. Therefore, there's no need to have
      per-CPU instances of both normal and IRQ workqueues. Make them
      both global.
      Signed-off-by: default avatarChen, Gong <gong.chen@linux.intel.com>
      [ Fold in subsequent patch from Rui/Boris/Tony for early boot logging. ]
      Signed-off-by: default avatarTony Luck <tony.luck@intel.com>
      [ Massage commit message. ]
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1439396985-12812-4-git-send-email-bp@alien8.deSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      8852d28b
    • Al Viro's avatar
      osf_wait4(): fix infoleak · 94d42e88
      Al Viro authored
      [ Upstream commit a8c39544 ]
      
      failing sys_wait4() won't fill struct rusage...
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      94d42e88
    • Wanpeng Li's avatar
      KVM: X86: Fix read out-of-bounds vulnerability in kvm pio emulation · 156c18c7
      Wanpeng Li authored
      [ Upstream commit cbfc6c91 ]
      
      Huawei folks reported a read out-of-bounds vulnerability in kvm pio emulation.
      
      - "inb" instruction to access PIT Mod/Command register (ioport 0x43, write only,
        a read should be ignored) in guest can get a random number.
      - "rep insb" instruction to access PIT register port 0x43 can control memcpy()
        in emulator_pio_in_emulated() to copy max 0x400 bytes but only read 1 bytes,
        which will disclose the unimportant kernel memory in host but no crash.
      
      The similar test program below can reproduce the read out-of-bounds vulnerability:
      
      void hexdump(void *mem, unsigned int len)
      {
              unsigned int i, j;
      
              for(i = 0; i < len + ((len % HEXDUMP_COLS) ? (HEXDUMP_COLS - len % HEXDUMP_COLS) : 0); i++)
              {
                      /* print offset */
                      if(i % HEXDUMP_COLS == 0)
                      {
                              printf("0x%06x: ", i);
                      }
      
                      /* print hex data */
                      if(i < len)
                      {
                              printf("%02x ", 0xFF & ((char*)mem)[i]);
                      }
                      else /* end of block, just aligning for ASCII dump */
                      {
                              printf("   ");
                      }
      
                      /* print ASCII dump */
                      if(i % HEXDUMP_COLS == (HEXDUMP_COLS - 1))
                      {
                              for(j = i - (HEXDUMP_COLS - 1); j <= i; j++)
                              {
                                      if(j >= len) /* end of block, not really printing */
                                      {
                                              putchar(' ');
                                      }
                                      else if(isprint(((char*)mem)[j])) /* printable char */
                                      {
                                              putchar(0xFF & ((char*)mem)[j]);
                                      }
                                      else /* other char */
                                      {
                                              putchar('.');
                                      }
                              }
                              putchar('\n');
                      }
              }
      }
      
      int main(void)
      {
      	int i;
      	if (iopl(3))
      	{
      		err(1, "set iopl unsuccessfully\n");
      		return -1;
      	}
      	static char buf[0x40];
      
      	/* test ioport 0x40,0x41,0x42,0x43,0x44,0x45 */
      
      	memset(buf, 0xab, sizeof(buf));
      
      	asm volatile("push %rdi;");
      	asm volatile("mov %0, %%rdi;"::"q"(buf));
      
      	asm volatile ("mov $0x40, %rdx;");
      	asm volatile ("in %dx,%al;");
      	asm volatile ("stosb;");
      
      	asm volatile ("mov $0x41, %rdx;");
      	asm volatile ("in %dx,%al;");
      	asm volatile ("stosb;");
      
      	asm volatile ("mov $0x42, %rdx;");
      	asm volatile ("in %dx,%al;");
      	asm volatile ("stosb;");
      
      	asm volatile ("mov $0x43, %rdx;");
      	asm volatile ("in %dx,%al;");
      	asm volatile ("stosb;");
      
      	asm volatile ("mov $0x44, %rdx;");
      	asm volatile ("in %dx,%al;");
      	asm volatile ("stosb;");
      
      	asm volatile ("mov $0x45, %rdx;");
      	asm volatile ("in %dx,%al;");
      	asm volatile ("stosb;");
      
      	asm volatile ("pop %rdi;");
      	hexdump(buf, 0x40);
      
      	printf("\n");
      
      	/* ins port 0x40 */
      
      	memset(buf, 0xab, sizeof(buf));
      
      	asm volatile("push %rdi;");
      	asm volatile("mov %0, %%rdi;"::"q"(buf));
      
      	asm volatile ("mov $0x20, %rcx;");
      	asm volatile ("mov $0x40, %rdx;");
      	asm volatile ("rep insb;");
      
      	asm volatile ("pop %rdi;");
      	hexdump(buf, 0x40);
      
      	printf("\n");
      
      	/* ins port 0x43 */
      
      	memset(buf, 0xab, sizeof(buf));
      
      	asm volatile("push %rdi;");
      	asm volatile("mov %0, %%rdi;"::"q"(buf));
      
      	asm volatile ("mov $0x20, %rcx;");
      	asm volatile ("mov $0x43, %rdx;");
      	asm volatile ("rep insb;");
      
      	asm volatile ("pop %rdi;");
      	hexdump(buf, 0x40);
      
      	printf("\n");
      	return 0;
      }
      
      The vcpu->arch.pio_data buffer is used by both in/out instrutions emulation
      w/o clear after using which results in some random datas are left over in
      the buffer. Guest reads port 0x43 will be ignored since it is write only,
      however, the function kernel_pio() can't distigush this ignore from successfully
      reads data from device's ioport. There is no new data fill the buffer from
      port 0x43, however, emulator_pio_in_emulated() will copy the stale data in
      the buffer to the guest unconditionally. This patch fixes it by clearing the
      buffer before in instruction emulation to avoid to grant guest the stale data
      in the buffer.
      
      In addition, string I/O is not supported for in kernel device. So there is no
      iteration to read ioport %RCX times for string I/O. The function kernel_pio()
      just reads one round, and then copy the io size * %RCX to the guest unconditionally,
      actually it copies the one round ioport data w/ other random datas which are left
      over in the vcpu->arch.pio_data buffer to the guest. This patch fixes it by
      introducing the string I/O support for in kernel device in order to grant the right
      ioport datas to the guest.
      
      Before the patch:
      
      0x000000: fe 38 93 93 ff ff ab ab .8......
      0x000008: ab ab ab ab ab ab ab ab ........
      0x000010: ab ab ab ab ab ab ab ab ........
      0x000018: ab ab ab ab ab ab ab ab ........
      0x000020: ab ab ab ab ab ab ab ab ........
      0x000028: ab ab ab ab ab ab ab ab ........
      0x000030: ab ab ab ab ab ab ab ab ........
      0x000038: ab ab ab ab ab ab ab ab ........
      
      0x000000: f6 00 00 00 00 00 00 00 ........
      0x000008: 00 00 00 00 00 00 00 00 ........
      0x000010: 00 00 00 00 4d 51 30 30 ....MQ00
      0x000018: 30 30 20 33 20 20 20 20 00 3
      0x000020: ab ab ab ab ab ab ab ab ........
      0x000028: ab ab ab ab ab ab ab ab ........
      0x000030: ab ab ab ab ab ab ab ab ........
      0x000038: ab ab ab ab ab ab ab ab ........
      
      0x000000: f6 00 00 00 00 00 00 00 ........
      0x000008: 00 00 00 00 00 00 00 00 ........
      0x000010: 00 00 00 00 4d 51 30 30 ....MQ00
      0x000018: 30 30 20 33 20 20 20 20 00 3
      0x000020: ab ab ab ab ab ab ab ab ........
      0x000028: ab ab ab ab ab ab ab ab ........
      0x000030: ab ab ab ab ab ab ab ab ........
      0x000038: ab ab ab ab ab ab ab ab ........
      
      After the patch:
      
      0x000000: 1e 02 f8 00 ff ff ab ab ........
      0x000008: ab ab ab ab ab ab ab ab ........
      0x000010: ab ab ab ab ab ab ab ab ........
      0x000018: ab ab ab ab ab ab ab ab ........
      0x000020: ab ab ab ab ab ab ab ab ........
      0x000028: ab ab ab ab ab ab ab ab ........
      0x000030: ab ab ab ab ab ab ab ab ........
      0x000038: ab ab ab ab ab ab ab ab ........
      
      0x000000: d2 e2 d2 df d2 db d2 d7 ........
      0x000008: d2 d3 d2 cf d2 cb d2 c7 ........
      0x000010: d2 c4 d2 c0 d2 bc d2 b8 ........
      0x000018: d2 b4 d2 b0 d2 ac d2 a8 ........
      0x000020: ab ab ab ab ab ab ab ab ........
      0x000028: ab ab ab ab ab ab ab ab ........
      0x000030: ab ab ab ab ab ab ab ab ........
      0x000038: ab ab ab ab ab ab ab ab ........
      
      0x000000: 00 00 00 00 00 00 00 00 ........
      0x000008: 00 00 00 00 00 00 00 00 ........
      0x000010: 00 00 00 00 00 00 00 00 ........
      0x000018: 00 00 00 00 00 00 00 00 ........
      0x000020: ab ab ab ab ab ab ab ab ........
      0x000028: ab ab ab ab ab ab ab ab ........
      0x000030: ab ab ab ab ab ab ab ab ........
      0x000038: ab ab ab ab ab ab ab ab ........
      Reported-by: default avatarMoguofang <moguofang@huawei.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Moguofang <moguofang@huawei.com>
      Signed-off-by: default avatarWanpeng Li <wanpeng.li@hotmail.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      156c18c7
    • Johan Hovold's avatar
      watchdog: pcwd_usb: fix NULL-deref at probe · e8b80de6
      Johan Hovold authored
      [ Upstream commit 46c319b8 ]
      
      Make sure to check the number of endpoints to avoid dereferencing a
      NULL-pointer should a malicious device lack endpoints.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      Reviewed-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarWim Van Sebroeck <wim@iguana.be>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      e8b80de6
    • Julius Werner's avatar
      drivers: char: mem: Check for address space wraparound with mmap() · 9ef27e6c
      Julius Werner authored
      [ Upstream commit b299cde2 ]
      
      /dev/mem currently allows mmap() mappings that wrap around the end of
      the physical address space, which should probably be illegal. It
      circumvents the existing STRICT_DEVMEM permission check because the loop
      immediately terminates (as the start address is already higher than the
      end address). On the x86_64 architecture it will then cause a panic
      (from the BUG(start >= end) in arch/x86/mm/pat.c:reserve_memtype()).
      
      This patch adds an explicit check to make sure offset + size will not
      wrap around in the physical address type.
      Signed-off-by: default avatarJulius Werner <jwerner@chromium.org>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      9ef27e6c
    • Lucas Stach's avatar
      serial: core: fix crash in uart_suspend_port · 682182e9
      Lucas Stach authored
      [ Upstream commit 88e2582e ]
      
      With serdev we might end up with serial ports that have no cdev exported
      to userspace, as they are used as the bus interface to other devices. In
      that case serial_match_port() won't be able to find a matching tty_dev.
      
      Skip the irq wakeup enabling in that case, as serdev will make sure to
      keep the port active, as long as there are devices depending on it.
      
      Fixes: 8ee3fde0 (tty_port: register tty ports with serdev bus)
      Signed-off-by: default avatarLucas Stach <l.stach@pengutronix.de>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      682182e9
    • Peter Hurley's avatar
      tty: Fix GPF in flush_to_ldisc() · b614900e
      Peter Hurley authored
      [ Upstream commit 9ce119f3 ]
      
      A line discipline which does not define a receive_buf() method can
      can cause a GPF if data is ever received [1]. Oddly, this was known
      to the author of n_tracesink in 2011, but never fixed.
      
      [1] GPF report
          BUG: unable to handle kernel NULL pointer dereference at           (null)
          IP: [<          (null)>]           (null)
          PGD 3752d067 PUD 37a7b067 PMD 0
          Oops: 0010 [#1] SMP KASAN
          Modules linked in:
          CPU: 2 PID: 148 Comm: kworker/u10:2 Not tainted 4.4.0-rc2+ #51
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
          Workqueue: events_unbound flush_to_ldisc
          task: ffff88006da94440 ti: ffff88006db60000 task.ti: ffff88006db60000
          RIP: 0010:[<0000000000000000>]  [<          (null)>]           (null)
          RSP: 0018:ffff88006db67b50  EFLAGS: 00010246
          RAX: 0000000000000102 RBX: ffff88003ab32f88 RCX: 0000000000000102
          RDX: 0000000000000000 RSI: ffff88003ab330a6 RDI: ffff88003aabd388
          RBP: ffff88006db67c48 R08: ffff88003ab32f9c R09: ffff88003ab31fb0
          R10: ffff88003ab32fa8 R11: 0000000000000000 R12: dffffc0000000000
          R13: ffff88006db67c20 R14: ffffffff863df820 R15: ffff88003ab31fb8
          FS:  0000000000000000(0000) GS:ffff88006dc00000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
          CR2: 0000000000000000 CR3: 0000000037938000 CR4: 00000000000006e0
          Stack:
           ffffffff829f46f1 ffff88006da94bf8 ffff88006da94bf8 0000000000000000
           ffff88003ab31fb0 ffff88003aabd438 ffff88003ab31ff8 ffff88006430fd90
           ffff88003ab32f9c ffffed0007557a87 1ffff1000db6cf78 ffff88003ab32078
          Call Trace:
           [<ffffffff8127cf91>] process_one_work+0x8f1/0x17a0 kernel/workqueue.c:2030
           [<ffffffff8127df14>] worker_thread+0xd4/0x1180 kernel/workqueue.c:2162
           [<ffffffff8128faaf>] kthread+0x1cf/0x270 drivers/block/aoe/aoecmd.c:1302
           [<ffffffff852a7c2f>] ret_from_fork+0x3f/0x70 arch/x86/entry/entry_64.S:468
          Code:  Bad RIP value.
          RIP  [<          (null)>]           (null)
           RSP <ffff88006db67b50>
          CR2: 0000000000000000
          ---[ end trace a587f8947e54d6ea ]---
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarPeter Hurley <peter@hurleysoftware.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      b614900e
    • Dmitry Vyukov's avatar
      tty: fix data race in flush_to_ldisc · 2e279b7d
      Dmitry Vyukov authored
      [ Upstream commit 7098296a ]
      
      flush_to_ldisc reads port->itty and checks that it is not NULL,
      concurrently release_tty sets port->itty to NULL. It is possible
      that flush_to_ldisc loads port->itty once, ensures that it is
      not NULL, but then reloads it again and uses. The second load
      can already return NULL, which will cause a crash.
      
      Use READ_ONCE to read port->itty.
      
      The data race was found with KernelThreadSanitizer (KTSAN).
      Signed-off-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Reviewed-by: default avatarPeter Hurley <peter@hurleysoftware.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      2e279b7d
    • Johan Hovold's avatar
      serial: ifx6x60: fix use-after-free on module unload · 3e984ccc
      Johan Hovold authored
      [ Upstream commit 1e948479 ]
      
      Make sure to deregister the SPI driver before releasing the tty driver
      to avoid use-after-free in the SPI remove callback where the tty
      devices are deregistered.
      
      Fixes: 72d4724e ("serial: ifx6x60: Add modem power off function in the platform reboot process")
      Cc: stable <stable@vger.kernel.org>     # 3.8
      Cc: Jun Chen <jun.d.chen@intel.com>
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      3e984ccc
    • Geert Uytterhoeven's avatar
      serial: ifx6x60: Remove dangerous spi_driver casts · 191c13c5
      Geert Uytterhoeven authored
      [ Upstream commit 9a499db0 ]
      
      Casting spi_driver pointers to "void *" when calling
      spi_{,un}register_driver() bypasses all type checking.
      
      Remove the superfluous casts to fix this.
      Signed-off-by: default avatarGeert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      191c13c5
    • Johan Hovold's avatar
      Revert "tty_port: register tty ports with serdev bus" · 95a639d1
      Johan Hovold authored
      [ Upstream commit d3ba126a ]
      
      This reverts commit 8ee3fde0.
      
      The new serdev bus hooked into the tty layer in
      tty_port_register_device() by registering a serdev controller instead of
      a tty device whenever a serdev client is present, and by deregistering
      the controller in the tty-port destructor. This is broken in several
      ways:
      
      Firstly, it leads to a NULL-pointer dereference whenever a tty driver
      later deregisters its devices as no corresponding character device will
      exist.
      
      Secondly, far from every tty driver uses tty-port refcounting (e.g.
      serial core) so the serdev devices might never be deregistered or
      deallocated.
      
      Thirdly, deregistering at tty-port destruction is too late as the
      underlying device and structures may be long gone by then. A port is not
      released before an open tty device is closed, something which a
      registered serdev client can prevent from ever happening. A driver
      callback while the device is gone typically also leads to crashes.
      
      Many tty drivers even keep their ports around until the driver is
      unloaded (e.g. serial core), something which even if a late callback
      never happens, leads to leaks if a device is unbound from its driver and
      is later rebound.
      
      The right solution here is to add a new tty_port_unregister_device()
      helper and to never call tty_device_unregister() whenever the port has
      been claimed by serdev, but since this requires modifying just about
      every tty driver (and multiple subsystems) it will need to be done
      incrementally.
      
      Reverting the offending patch is the first step in fixing the broken
      lifetime assumptions. A follow-up patch will add a new pair of
      tty-device registration helpers, which a vetted tty driver can use to
      support serdev (initially serial core). When every tty driver uses the
      serdev helpers (at least for deregistration), we can add serdev
      registration to tty_port_register_device() again.
      
      Note that this also fixes another issue with serdev, which currently
      allocates and registers a serdev controller for every tty device
      registered using tty_port_device_register() only to immediately
      deregister and deallocate it when the corresponding OF node or serdev
      child node is missing. This should be addressed before enabling serdev
      for hot-pluggable buses.
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      Reviewed-by: default avatarRob Herring <robh@kernel.org>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      95a639d1
    • Rob Herring's avatar
      tty_port: register tty ports with serdev bus · 1520f7e7
      Rob Herring authored
      [ Upstream commit 8ee3fde0 ]
      
      Register a serdev controller with the serdev bus when a tty_port is
      registered. This creates the serdev controller and create's serdev
      devices for any DT child nodes of the tty_port's parent (i.e. the UART
      device).
      Signed-off-by: default avatarRob Herring <robh@kernel.org>
      Reviewed-By: default avatarSebastian Reichel <sre@kernel.org>
      Tested-By: default avatarSebastian Reichel <sre@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      1520f7e7