1. 07 Jun, 2017 25 commits
    • Jan Kara's avatar
      xfs: Fix missed holes in SEEK_HOLE implementation · 6a46eeae
      Jan Kara authored
      commit 5375023a upstream.
      
      XFS SEEK_HOLE implementation could miss a hole in an unwritten extent as
      can be seen by the following command:
      
      xfs_io -c "falloc 0 256k" -c "pwrite 0 56k" -c "pwrite 128k 8k"
             -c "seek -h 0" file
      wrote 57344/57344 bytes at offset 0
      56 KiB, 14 ops; 0.0000 sec (49.312 MiB/sec and 12623.9856 ops/sec)
      wrote 8192/8192 bytes at offset 131072
      8 KiB, 2 ops; 0.0000 sec (70.383 MiB/sec and 18018.0180 ops/sec)
      Whence	Result
      HOLE	139264
      
      Where we can see that hole at offset 56k was just ignored by SEEK_HOLE
      implementation. The bug is in xfs_find_get_desired_pgoff() which does
      not properly detect the case when pages are not contiguous.
      
      Fix the problem by properly detecting when found page has larger offset
      than expected.
      
      Fixes: d126d43fSigned-off-by: default avatarJan Kara <jack@suse.cz>
      Reviewed-by: default avatarBrian Foster <bfoster@redhat.com>
      Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6a46eeae
    • Yisheng Xie's avatar
      mlock: fix mlock count can not decrease in race condition · aef16f4c
      Yisheng Xie authored
      commit 70feee0e upstream.
      
      Kefeng reported that when running the follow test, the mlock count in
      meminfo will increase permanently:
      
       [1] testcase
       linux:~ # cat test_mlockal
       grep Mlocked /proc/meminfo
        for j in `seq 0 10`
        do
       	for i in `seq 4 15`
       	do
       		./p_mlockall >> log &
       	done
       	sleep 0.2
       done
       # wait some time to let mlock counter decrease and 5s may not enough
       sleep 5
       grep Mlocked /proc/meminfo
      
       linux:~ # cat p_mlockall.c
       #include <sys/mman.h>
       #include <stdlib.h>
       #include <stdio.h>
      
       #define SPACE_LEN	4096
      
       int main(int argc, char ** argv)
       {
      	 	int ret;
      	 	void *adr = malloc(SPACE_LEN);
      	 	if (!adr)
      	 		return -1;
      
      	 	ret = mlockall(MCL_CURRENT | MCL_FUTURE);
      	 	printf("mlcokall ret = %d\n", ret);
      
      	 	ret = munlockall();
      	 	printf("munlcokall ret = %d\n", ret);
      
      	 	free(adr);
      	 	return 0;
      	 }
      
      In __munlock_pagevec() we should decrement NR_MLOCK for each page where
      we clear the PageMlocked flag.  Commit 1ebb7cc6 ("mm: munlock: batch
      NR_MLOCK zone state updates") has introduced a bug where we don't
      decrement NR_MLOCK for pages where we clear the flag, but fail to
      isolate them from the lru list (e.g.  when the pages are on some other
      cpu's percpu pagevec).  Since PageMlocked stays cleared, the NR_MLOCK
      accounting gets permanently disrupted by this.
      
      Fix it by counting the number of page whose PageMlock flag is cleared.
      
      Fixes: 1ebb7cc6 (" mm: munlock: batch NR_MLOCK zone state updates")
      Link: http://lkml.kernel.org/r/1495678405-54569-1-git-send-email-xieyisheng1@huawei.comSigned-off-by: default avatarYisheng Xie <xieyisheng1@huawei.com>
      Reported-by: default avatarKefeng Wang <wangkefeng.wang@huawei.com>
      Tested-by: default avatarKefeng Wang <wangkefeng.wang@huawei.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Joern Engel <joern@logfs.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Xishi Qiu <qiuxishi@huawei.com>
      Cc: zhongjiang <zhongjiang@huawei.com>
      Cc: Hanjun Guo <guohanjun@huawei.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      aef16f4c
    • Punit Agrawal's avatar
      mm/migrate: fix refcount handling when !hugepage_migration_supported() · 85190aa1
      Punit Agrawal authored
      commit 30809f55 upstream.
      
      On failing to migrate a page, soft_offline_huge_page() performs the
      necessary update to the hugepage ref-count.
      
      But when !hugepage_migration_supported() , unmap_and_move_hugepage()
      also decrements the page ref-count for the hugepage.  The combined
      behaviour leaves the ref-count in an inconsistent state.
      
      This leads to soft lockups when running the overcommitted hugepage test
      from mce-tests suite.
      
        Soft offlining pfn 0x83ed600 at process virtual address 0x400000000000
        soft offline: 0x83ed600: migration failed 1, type 1fffc00000008008 (uptodate|head)
        INFO: rcu_preempt detected stalls on CPUs/tasks:
         Tasks blocked on level-0 rcu_node (CPUs 0-7): P2715
          (detected by 7, t=5254 jiffies, g=963, c=962, q=321)
          thugetlb_overco R  running task        0  2715   2685 0x00000008
          Call trace:
            dump_backtrace+0x0/0x268
            show_stack+0x24/0x30
            sched_show_task+0x134/0x180
            rcu_print_detail_task_stall_rnp+0x54/0x7c
            rcu_check_callbacks+0xa74/0xb08
            update_process_times+0x34/0x60
            tick_sched_handle.isra.7+0x38/0x70
            tick_sched_timer+0x4c/0x98
            __hrtimer_run_queues+0xc0/0x300
            hrtimer_interrupt+0xac/0x228
            arch_timer_handler_phys+0x3c/0x50
            handle_percpu_devid_irq+0x8c/0x290
            generic_handle_irq+0x34/0x50
            __handle_domain_irq+0x68/0xc0
            gic_handle_irq+0x5c/0xb0
      
      Address this by changing the putback_active_hugepage() in
      soft_offline_huge_page() to putback_movable_pages().
      
      This only triggers on systems that enable memory failure handling
      (ARCH_SUPPORTS_MEMORY_FAILURE) but not hugepage migration
      (!ARCH_ENABLE_HUGEPAGE_MIGRATION).
      
      I imagine this wasn't triggered as there aren't many systems running
      this configuration.
      
      [akpm@linux-foundation.org: remove dead comment, per Naoya]
      Link: http://lkml.kernel.org/r/20170525135146.32011-1-punit.agrawal@arm.comReported-by: default avatarManoj Iyer <manoj.iyer@canonical.com>
      Tested-by: default avatarManoj Iyer <manoj.iyer@canonical.com>
      Suggested-by: default avatarNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Signed-off-by: default avatarPunit Agrawal <punit.agrawal@arm.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Wanpeng Li <wanpeng.li@hotmail.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      85190aa1
    • Patrik Jakobsson's avatar
      drm/gma500/psb: Actually use VBT mode when it is found · e7d2e465
      Patrik Jakobsson authored
      commit 82bc9a42 upstream.
      
      With LVDS we were incorrectly picking the pre-programmed mode instead of
      the prefered mode provided by VBT. Make sure we pick the VBT mode if
      one is provided. It is likely that the mode read-out code is still wrong
      but this patch fixes the immediate problem on most machines.
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78562Signed-off-by: default avatarPatrik Jakobsson <patrik.r.jakobsson@gmail.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20170418114332.12183-1-patrik.r.jakobsson@gmail.comSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e7d2e465
    • Thomas Gleixner's avatar
      slub/memcg: cure the brainless abuse of sysfs attributes · aa2f9ae3
      Thomas Gleixner authored
      commit 478fe303 upstream.
      
      memcg_propagate_slab_attrs() abuses the sysfs attribute file functions
      to propagate settings from the root kmem_cache to a newly created
      kmem_cache.  It does that with:
      
           attr->show(root, buf);
           attr->store(new, buf, strlen(bug);
      
      Aside of being a lazy and absurd hackery this is broken because it does
      not check the return value of the show() function.
      
      Some of the show() functions return 0 w/o touching the buffer.  That
      means in such a case the store function is called with the stale content
      of the previous show().  That causes nonsense like invoking
      kmem_cache_shrink() on a newly created kmem_cache.  In the worst case it
      would cause handing in an uninitialized buffer.
      
      This should be rewritten proper by adding a propagate() callback to
      those slub_attributes which must be propagated and avoid that insane
      conversion to and from ASCII, but that's too large for a hot fix.
      
      Check at least the return value of the show() function, so calling
      store() with stale content is prevented.
      
      Steven said:
       "It can cause a deadlock with get_online_cpus() that has been uncovered
        by recent cpu hotplug and lockdep changes that Thomas and Peter have
        been doing.
      
           Possible unsafe locking scenario:
      
                 CPU0                    CPU1
                 ----                    ----
            lock(cpu_hotplug.lock);
                                         lock(slab_mutex);
                                         lock(cpu_hotplug.lock);
            lock(slab_mutex);
      
           *** DEADLOCK ***"
      
      Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1705201244540.2255@nanosSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reported-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Acked-by: default avatarDavid Rientjes <rientjes@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      aa2f9ae3
    • Alexander Tsoy's avatar
      ALSA: hda - apply STAC_9200_DELL_M22 quirk for Dell Latitude D430 · 2c41aea2
      Alexander Tsoy authored
      commit 1fc2e41f upstream.
      
      This model is actually called 92XXM2-8 in Windows driver. But since pin
      configs for M22 and M28 are identical, just reuse M22 quirk.
      
      Fixes external microphone (tested) and probably docking station ports
      (not tested).
      Signed-off-by: default avatarAlexander Tsoy <alexander@tsoy.me>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2c41aea2
    • Nicolas Iooss's avatar
      pcmcia: remove left-over %Z format · e102d49a
      Nicolas Iooss authored
      commit ff5a2016 upstream.
      
      Commit 5b5e0928 ("lib/vsprintf.c: remove %Z support") removed some
      usages of format %Z but forgot "%.2Zx".  This makes clang 4.0 reports a
      -Wformat-extra-args warning because it does not know about %Z.
      
      Replace %Z with %z.
      
      Link: http://lkml.kernel.org/r/20170520090946.22562-1-nicolas.iooss_linux@m4x.orgSigned-off-by: default avatarNicolas Iooss <nicolas.iooss_linux@m4x.org>
      Cc: Harald Welte <laforge@gnumonks.org>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e102d49a
    • Alex Deucher's avatar
      drm/radeon/ci: disable mclk switching for high refresh rates (v2) · efaeb8c1
      Alex Deucher authored
      commit 58d7e3e4 upstream.
      
      Even if the vblank period would allow it, it still seems to
      be problematic on some cards.
      
      v2: fix logic inversion (Nils)
      
      bug: https://bugs.freedesktop.org/show_bug.cgi?id=96868Acked-by: default avatarChristian König <christian.koenig@amd.com>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      efaeb8c1
    • Sebastian Reichel's avatar
      i2c: i2c-tiny-usb: fix buffer not being DMA capable · b54c9caa
      Sebastian Reichel authored
      commit 5165da59 upstream.
      
      Since v4.9 i2c-tiny-usb generates the below call trace
      and longer works, since it can't communicate with the
      USB device. The reason is, that since v4.9 the USB
      stack checks, that the buffer it should transfer is DMA
      capable. This was a requirement since v2.2 days, but it
      usually worked nevertheless.
      
      [   17.504959] ------------[ cut here ]------------
      [   17.505488] WARNING: CPU: 0 PID: 93 at drivers/usb/core/hcd.c:1587 usb_hcd_map_urb_for_dma+0x37c/0x570
      [   17.506545] transfer buffer not dma capable
      [   17.507022] Modules linked in:
      [   17.507370] CPU: 0 PID: 93 Comm: i2cdetect Not tainted 4.11.0-rc8+ #10
      [   17.508103] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
      [   17.509039] Call Trace:
      [   17.509320]  ? dump_stack+0x5c/0x78
      [   17.509714]  ? __warn+0xbe/0xe0
      [   17.510073]  ? warn_slowpath_fmt+0x5a/0x80
      [   17.510532]  ? nommu_map_sg+0xb0/0xb0
      [   17.510949]  ? usb_hcd_map_urb_for_dma+0x37c/0x570
      [   17.511482]  ? usb_hcd_submit_urb+0x336/0xab0
      [   17.511976]  ? wait_for_completion_timeout+0x12f/0x1a0
      [   17.512549]  ? wait_for_completion_timeout+0x65/0x1a0
      [   17.513125]  ? usb_start_wait_urb+0x65/0x160
      [   17.513604]  ? usb_control_msg+0xdc/0x130
      [   17.514061]  ? usb_xfer+0xa4/0x2a0
      [   17.514445]  ? __i2c_transfer+0x108/0x3c0
      [   17.514899]  ? i2c_transfer+0x57/0xb0
      [   17.515310]  ? i2c_smbus_xfer_emulated+0x12f/0x590
      [   17.515851]  ? _raw_spin_unlock_irqrestore+0x11/0x20
      [   17.516408]  ? i2c_smbus_xfer+0x125/0x330
      [   17.516876]  ? i2c_smbus_xfer+0x125/0x330
      [   17.517329]  ? i2cdev_ioctl_smbus+0x1c1/0x2b0
      [   17.517824]  ? i2cdev_ioctl+0x75/0x1c0
      [   17.518248]  ? do_vfs_ioctl+0x9f/0x600
      [   17.518671]  ? vfs_write+0x144/0x190
      [   17.519078]  ? SyS_ioctl+0x74/0x80
      [   17.519463]  ? entry_SYSCALL_64_fastpath+0x1e/0xad
      [   17.519959] ---[ end trace d047c04982f5ac50 ]---
      Signed-off-by: default avatarSebastian Reichel <sebastian.reichel@collabora.co.uk>
      Reviewed-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Acked-by: default avatarTill Harbaum <till@harbaum.org>
      Signed-off-by: default avatarWolfram Sang <wsa@the-dreams.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b54c9caa
    • Davide Caratti's avatar
      sctp: fix ICMP processing if skb is non-linear · 698f506e
      Davide Caratti authored
      
      [ Upstream commit 804ec7eb ]
      
      sometimes ICMP replies to INIT chunks are ignored by the client, even if
      the encapsulated SCTP headers match an open socket. This happens when the
      ICMP packet is carried by a paged skb: use skb_header_pointer() to read
      packet contents beyond the SCTP header, so that chunk header and initiate
      tag are validated correctly.
      
      v2:
      - don't use skb_header_pointer() to read the transport header, since
        icmp_socket_deliver() already puts these 8 bytes in the linear area.
      - change commit message to make specific reference to INIT chunks.
      Signed-off-by: default avatarDavide Caratti <dcaratti@redhat.com>
      Acked-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Acked-by: default avatarVlad Yasevich <vyasevich@gmail.com>
      Reviewed-by: default avatarXin Long <lucien.xin@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      698f506e
    • Wei Wang's avatar
      tcp: avoid fastopen API to be used on AF_UNSPEC · 4507a04e
      Wei Wang authored
      
      [ Upstream commit ba615f67 ]
      
      Fastopen API should be used to perform fastopen operations on the TCP
      socket. It does not make sense to use fastopen API to perform disconnect
      by calling it with AF_UNSPEC. The fastopen data path is also prone to
      race conditions and bugs when using with AF_UNSPEC.
      
      One issue reported and analyzed by Vegard Nossum is as follows:
      +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
      Thread A:                            Thread B:
      ------------------------------------------------------------------------
      sendto()
       - tcp_sendmsg()
           - sk_stream_memory_free() = 0
               - goto wait_for_sndbuf
      	     - sk_stream_wait_memory()
      	        - sk_wait_event() // sleep
                |                          sendto(flags=MSG_FASTOPEN, dest_addr=AF_UNSPEC)
      	  |                           - tcp_sendmsg()
      	  |                              - tcp_sendmsg_fastopen()
      	  |                                 - __inet_stream_connect()
      	  |                                    - tcp_disconnect() //because of AF_UNSPEC
      	  |                                       - tcp_transmit_skb()// send RST
      	  |                                    - return 0; // no reconnect!
      	  |                           - sk_stream_wait_connect()
      	  |                                 - sock_error()
      	  |                                    - xchg(&sk->sk_err, 0)
      	  |                                    - return -ECONNRESET
      	- ... // wake up, see sk->sk_err == 0
          - skb_entail() on TCP_CLOSE socket
      
      If the connection is reopened then we will send a brand new SYN packet
      after thread A has already queued a buffer. At this point I think the
      socket internal state (sequence numbers etc.) becomes messed up.
      
      When the new connection is closed, the FIN-ACK is rejected because the
      sequence number is outside the window. The other side tries to
      retransmit,
      but __tcp_retransmit_skb() calls tcp_trim_head() on an empty skb which
      corrupts the skb data length and hits a BUG() in copy_and_csum_bits().
      +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
      
      Hence, this patch adds a check for AF_UNSPEC in the fastopen data path
      and return EOPNOTSUPP to user if such case happens.
      
      Fixes: cf60af03 ("tcp: Fast Open client - sendmsg(MSG_FASTOPEN)")
      Reported-by: default avatarVegard Nossum <vegard.nossum@oracle.com>
      Signed-off-by: default avatarWei Wang <weiwan@google.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4507a04e
    • Eric Dumazet's avatar
      ipv6: fix out of bound writes in __ip6_append_data() · 1d31de23
      Eric Dumazet authored
      
      [ Upstream commit 232cd35d ]
      
      Andrey Konovalov and idaifish@gmail.com reported crashes caused by
      one skb shared_info being overwritten from __ip6_append_data()
      
      Andrey program lead to following state :
      
      copy -4200 datalen 2000 fraglen 2040
      maxfraglen 2040 alloclen 2048 transhdrlen 0 offset 0 fraggap 6200
      
      The skb_copy_and_csum_bits(skb_prev, maxfraglen, data + transhdrlen,
      fraggap, 0); is overwriting skb->head and skb_shared_info
      
      Since we apparently detect this rare condition too late, move the
      code earlier to even avoid allocating skb and risking crashes.
      
      Once again, many thanks to Andrey and syzkaller team.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Tested-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Reported-by: <idaifish@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1d31de23
    • Bjørn Mork's avatar
      qmi_wwan: add another Lenovo EM74xx device ID · cc870923
      Bjørn Mork authored
      
      [ Upstream commit 486181bc ]
      
      In their infinite wisdom, and never ending quest for end user frustration,
      Lenovo has decided to use a new USB device ID for the wwan modules in
      their 2017 laptops.  The actual hardware is still the Sierra Wireless
      EM7455 or EM7430, depending on region.
      Signed-off-by: default avatarBjørn Mork <bjorn@mork.no>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cc870923
    • David S. Miller's avatar
      ipv6: Check ip6_find_1stfragopt() return value properly. · ef4656af
      David S. Miller authored
      
      [ Upstream commit 7dd7eb95 ]
      
      Do not use unsigned variables to see if it returns a negative
      error or not.
      
      Fixes: 2423496a ("ipv6: Prevent overrun when parsing v6 header options")
      Reported-by: default avatarJulia Lawall <julia.lawall@lip6.fr>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ef4656af
    • Craig Gallek's avatar
      ipv6: Prevent overrun when parsing v6 header options · 5ca68dbb
      Craig Gallek authored
      
      [ Upstream commit 2423496a ]
      
      The KASAN warning repoted below was discovered with a syzkaller
      program.  The reproducer is basically:
        int s = socket(AF_INET6, SOCK_RAW, NEXTHDR_HOP);
        send(s, &one_byte_of_data, 1, MSG_MORE);
        send(s, &more_than_mtu_bytes_data, 2000, 0);
      
      The socket() call sets the nexthdr field of the v6 header to
      NEXTHDR_HOP, the first send call primes the payload with a non zero
      byte of data, and the second send call triggers the fragmentation path.
      
      The fragmentation code tries to parse the header options in order
      to figure out where to insert the fragment option.  Since nexthdr points
      to an invalid option, the calculation of the size of the network header
      can made to be much larger than the linear section of the skb and data
      is read outside of it.
      
      This fix makes ip6_find_1stfrag return an error if it detects
      running out-of-bounds.
      
      [   42.361487] ==================================================================
      [   42.364412] BUG: KASAN: slab-out-of-bounds in ip6_fragment+0x11c8/0x3730
      [   42.365471] Read of size 840 at addr ffff88000969e798 by task ip6_fragment-oo/3789
      [   42.366469]
      [   42.366696] CPU: 1 PID: 3789 Comm: ip6_fragment-oo Not tainted 4.11.0+ #41
      [   42.367628] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-1ubuntu1 04/01/2014
      [   42.368824] Call Trace:
      [   42.369183]  dump_stack+0xb3/0x10b
      [   42.369664]  print_address_description+0x73/0x290
      [   42.370325]  kasan_report+0x252/0x370
      [   42.370839]  ? ip6_fragment+0x11c8/0x3730
      [   42.371396]  check_memory_region+0x13c/0x1a0
      [   42.371978]  memcpy+0x23/0x50
      [   42.372395]  ip6_fragment+0x11c8/0x3730
      [   42.372920]  ? nf_ct_expect_unregister_notifier+0x110/0x110
      [   42.373681]  ? ip6_copy_metadata+0x7f0/0x7f0
      [   42.374263]  ? ip6_forward+0x2e30/0x2e30
      [   42.374803]  ip6_finish_output+0x584/0x990
      [   42.375350]  ip6_output+0x1b7/0x690
      [   42.375836]  ? ip6_finish_output+0x990/0x990
      [   42.376411]  ? ip6_fragment+0x3730/0x3730
      [   42.376968]  ip6_local_out+0x95/0x160
      [   42.377471]  ip6_send_skb+0xa1/0x330
      [   42.377969]  ip6_push_pending_frames+0xb3/0xe0
      [   42.378589]  rawv6_sendmsg+0x2051/0x2db0
      [   42.379129]  ? rawv6_bind+0x8b0/0x8b0
      [   42.379633]  ? _copy_from_user+0x84/0xe0
      [   42.380193]  ? debug_check_no_locks_freed+0x290/0x290
      [   42.380878]  ? ___sys_sendmsg+0x162/0x930
      [   42.381427]  ? rcu_read_lock_sched_held+0xa3/0x120
      [   42.382074]  ? sock_has_perm+0x1f6/0x290
      [   42.382614]  ? ___sys_sendmsg+0x167/0x930
      [   42.383173]  ? lock_downgrade+0x660/0x660
      [   42.383727]  inet_sendmsg+0x123/0x500
      [   42.384226]  ? inet_sendmsg+0x123/0x500
      [   42.384748]  ? inet_recvmsg+0x540/0x540
      [   42.385263]  sock_sendmsg+0xca/0x110
      [   42.385758]  SYSC_sendto+0x217/0x380
      [   42.386249]  ? SYSC_connect+0x310/0x310
      [   42.386783]  ? __might_fault+0x110/0x1d0
      [   42.387324]  ? lock_downgrade+0x660/0x660
      [   42.387880]  ? __fget_light+0xa1/0x1f0
      [   42.388403]  ? __fdget+0x18/0x20
      [   42.388851]  ? sock_common_setsockopt+0x95/0xd0
      [   42.389472]  ? SyS_setsockopt+0x17f/0x260
      [   42.390021]  ? entry_SYSCALL_64_fastpath+0x5/0xbe
      [   42.390650]  SyS_sendto+0x40/0x50
      [   42.391103]  entry_SYSCALL_64_fastpath+0x1f/0xbe
      [   42.391731] RIP: 0033:0x7fbbb711e383
      [   42.392217] RSP: 002b:00007ffff4d34f28 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
      [   42.393235] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fbbb711e383
      [   42.394195] RDX: 0000000000001000 RSI: 00007ffff4d34f60 RDI: 0000000000000003
      [   42.395145] RBP: 0000000000000046 R08: 00007ffff4d34f40 R09: 0000000000000018
      [   42.396056] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000400aad
      [   42.396598] R13: 0000000000000066 R14: 00007ffff4d34ee0 R15: 00007fbbb717af00
      [   42.397257]
      [   42.397411] Allocated by task 3789:
      [   42.397702]  save_stack_trace+0x16/0x20
      [   42.398005]  save_stack+0x46/0xd0
      [   42.398267]  kasan_kmalloc+0xad/0xe0
      [   42.398548]  kasan_slab_alloc+0x12/0x20
      [   42.398848]  __kmalloc_node_track_caller+0xcb/0x380
      [   42.399224]  __kmalloc_reserve.isra.32+0x41/0xe0
      [   42.399654]  __alloc_skb+0xf8/0x580
      [   42.400003]  sock_wmalloc+0xab/0xf0
      [   42.400346]  __ip6_append_data.isra.41+0x2472/0x33d0
      [   42.400813]  ip6_append_data+0x1a8/0x2f0
      [   42.401122]  rawv6_sendmsg+0x11ee/0x2db0
      [   42.401505]  inet_sendmsg+0x123/0x500
      [   42.401860]  sock_sendmsg+0xca/0x110
      [   42.402209]  ___sys_sendmsg+0x7cb/0x930
      [   42.402582]  __sys_sendmsg+0xd9/0x190
      [   42.402941]  SyS_sendmsg+0x2d/0x50
      [   42.403273]  entry_SYSCALL_64_fastpath+0x1f/0xbe
      [   42.403718]
      [   42.403871] Freed by task 1794:
      [   42.404146]  save_stack_trace+0x16/0x20
      [   42.404515]  save_stack+0x46/0xd0
      [   42.404827]  kasan_slab_free+0x72/0xc0
      [   42.405167]  kfree+0xe8/0x2b0
      [   42.405462]  skb_free_head+0x74/0xb0
      [   42.405806]  skb_release_data+0x30e/0x3a0
      [   42.406198]  skb_release_all+0x4a/0x60
      [   42.406563]  consume_skb+0x113/0x2e0
      [   42.406910]  skb_free_datagram+0x1a/0xe0
      [   42.407288]  netlink_recvmsg+0x60d/0xe40
      [   42.407667]  sock_recvmsg+0xd7/0x110
      [   42.408022]  ___sys_recvmsg+0x25c/0x580
      [   42.408395]  __sys_recvmsg+0xd6/0x190
      [   42.408753]  SyS_recvmsg+0x2d/0x50
      [   42.409086]  entry_SYSCALL_64_fastpath+0x1f/0xbe
      [   42.409513]
      [   42.409665] The buggy address belongs to the object at ffff88000969e780
      [   42.409665]  which belongs to the cache kmalloc-512 of size 512
      [   42.410846] The buggy address is located 24 bytes inside of
      [   42.410846]  512-byte region [ffff88000969e780, ffff88000969e980)
      [   42.411941] The buggy address belongs to the page:
      [   42.412405] page:ffffea000025a780 count:1 mapcount:0 mapping:          (null) index:0x0 compound_mapcount: 0
      [   42.413298] flags: 0x100000000008100(slab|head)
      [   42.413729] raw: 0100000000008100 0000000000000000 0000000000000000 00000001800c000c
      [   42.414387] raw: ffffea00002a9500 0000000900000007 ffff88000c401280 0000000000000000
      [   42.415074] page dumped because: kasan: bad access detected
      [   42.415604]
      [   42.415757] Memory state around the buggy address:
      [   42.416222]  ffff88000969e880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [   42.416904]  ffff88000969e900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [   42.417591] >ffff88000969e980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [   42.418273]                    ^
      [   42.418588]  ffff88000969ea00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [   42.419273]  ffff88000969ea80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [   42.419882] ==================================================================
      Reported-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarCraig Gallek <kraig@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5ca68dbb
    • Soheil Hassas Yeganeh's avatar
      tcp: eliminate negative reordering in tcp_clean_rtx_queue · 94107068
      Soheil Hassas Yeganeh authored
      
      [ Upstream commit bafbb9c7 ]
      
      tcp_ack() can call tcp_fragment() which may dededuct the
      value tp->fackets_out when MSS changes. When prior_fackets
      is larger than tp->fackets_out, tcp_clean_rtx_queue() can
      invoke tcp_update_reordering() with negative values. This
      results in absurd tp->reodering values higher than
      sysctl_tcp_max_reordering.
      
      Note that tcp_update_reordering indeeds sets tp->reordering
      to min(sysctl_tcp_max_reordering, metric), but because
      the comparison is signed, a negative metric always wins.
      
      Fixes: c7caf8d3 ("[TCP]: Fix reord detection due to snd_una covered holes")
      Reported-by: default avatarRebecca Isaacs <risaacs@google.com>
      Signed-off-by: default avatarSoheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: default avatarNeal Cardwell <ncardwell@google.com>
      Signed-off-by: default avatarYuchung Cheng <ycheng@google.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      94107068
    • Eric Dumazet's avatar
      sctp: do not inherit ipv6_{mc|ac|fl}_list from parent · 56fd34c6
      Eric Dumazet authored
      
      [ Upstream commit fdcee2cb ]
      
      SCTP needs fixes similar to 83eaddab ("ipv6/dccp: do not inherit
      ipv6_mc_list from parent"), otherwise bad things can happen.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Tested-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      56fd34c6
    • Xin Long's avatar
      sctp: fix src address selection if using secondary addresses for ipv6 · dc03072a
      Xin Long authored
      
      [ Upstream commit dbc2b5e9 ]
      
      Commit 0ca50d12 ("sctp: fix src address selection if using secondary
      addresses") has fixed a src address selection issue when using secondary
      addresses for ipv4.
      
      Now sctp ipv6 also has the similar issue. When using a secondary address,
      sctp_v6_get_dst tries to choose the saddr which has the most same bits
      with the daddr by sctp_v6_addr_match_len. It may make some cases not work
      as expected.
      
      hostA:
        [1] fd21:356b:459a:cf10::11 (eth1)
        [2] fd21:356b:459a:cf20::11 (eth2)
      
      hostB:
        [a] fd21:356b:459a:cf30::2  (eth1)
        [b] fd21:356b:459a:cf40::2  (eth2)
      
      route from hostA to hostB:
        fd21:356b:459a:cf30::/64 dev eth1  metric 1024  mtu 1500
      
      The expected path should be:
        fd21:356b:459a:cf10::11 <-> fd21:356b:459a:cf30::2
      But addr[2] matches addr[a] more bits than addr[1] does, according to
      sctp_v6_addr_match_len. It causes the path to be:
        fd21:356b:459a:cf20::11 <-> fd21:356b:459a:cf30::2
      
      This patch is to fix it with the same way as Marcelo's fix for sctp ipv4.
      As no ip_dev_find for ipv6, this patch is to use ipv6_chk_addr to check
      if the saddr is in a dev instead.
      
      Note that for backwards compatibility, it will still do the addr_match_len
      check here when no optimal is found.
      Reported-by: default avatarPatrick Talbert <ptalbert@redhat.com>
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      dc03072a
    • Yuchung Cheng's avatar
      tcp: avoid fragmenting peculiar skbs in SACK · b1ff990a
      Yuchung Cheng authored
      
      [ Upstream commit b451e5d2 ]
      
      This patch fixes a bug in splitting an SKB during SACK
      processing. Specifically if an skb contains multiple
      packets and is only partially sacked in the higher sequences,
      tcp_match_sack_to_skb() splits the skb and marks the second fragment
      as SACKed.
      
      The current code further attempts rounding up the first fragment
      to MSS boundaries. But it misses a boundary condition when the
      rounded-up fragment size (pkt_len) is exactly skb size.  Spliting
      such an skb is pointless and causses a kernel warning and aborts
      the SACK processing. This patch universally checks such over-split
      before calling tcp_fragment to prevent these unnecessary warnings.
      
      Fixes: adb92db8 ("tcp: Make SACK code to split only at mss boundaries")
      Signed-off-by: default avatarYuchung Cheng <ycheng@google.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarSoheil Hassas Yeganeh <soheil@google.com>
      Acked-by: default avatarNeal Cardwell <ncardwell@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b1ff990a
    • Julian Wiedmann's avatar
      s390/qeth: avoid null pointer dereference on OSN · de66696e
      Julian Wiedmann authored
      
      [ Upstream commit 25e2c341 ]
      
      Access card->dev only after checking whether's its valid.
      Signed-off-by: default avatarJulian Wiedmann <jwi@linux.vnet.ibm.com>
      Reviewed-by: default avatarUrsula Braun <ubraun@linux.vnet.ibm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      de66696e
    • Julian Wiedmann's avatar
      s390/qeth: unbreak OSM and OSN support · 4c814829
      Julian Wiedmann authored
      
      [ Upstream commit 2d2ebb3e ]
      
      commit b4d72c08 ("qeth: bridgeport support - basic control")
      broke the support for OSM and OSN devices as follows:
      
      As OSM and OSN are L2 only, qeth_core_probe_device() does an early
      setup by loading the l2 discipline and calling qeth_l2_probe_device().
      In this context, adding the l2-specific bridgeport sysfs attributes
      via qeth_l2_create_device_attributes() hits a BUG_ON in fs/sysfs/group.c,
      since the basic sysfs infrastructure for the device hasn't been
      established yet.
      
      Note that OSN actually has its own unique sysfs attributes
      (qeth_osn_devtype), so the additional attributes shouldn't be created
      at all.
      For OSM, add a new qeth_l2_devtype that contains all the common
      and l2-specific sysfs attributes.
      When qeth_core_probe_device() does early setup for OSM or OSN, assign
      the corresponding devtype so that the ccwgroup probe code creates the
      full set of sysfs attributes.
      This allows us to skip qeth_l2_create_device_attributes() in case
      of an early setup.
      
      Any device that can't do early setup will initially have only the
      generic sysfs attributes, and when it's probed later
      qeth_l2_probe_device() adds the l2-specific attributes.
      
      If an early-setup device is removed (by calling ccwgroup_ungroup()),
      device_unregister() will - using the devtype - delete the
      l2-specific attributes before qeth_l2_remove_device() is called.
      So make sure to not remove them twice.
      
      What complicates the issue is that qeth_l2_probe_device() and
      qeth_l2_remove_device() is also called on a device when its
      layer2 attribute changes (ie. its layer mode is switched).
      For early-setup devices this wouldn't work properly - we wouldn't
      remove the l2-specific attributes when switching to L3.
      But switching the layer mode doesn't actually make any sense;
      we already decided that the device can only operate in L2!
      So just refuse to switch the layer mode on such devices. Note that
      OSN doesn't have a layer2 attribute, so we only need to special-case
      OSM.
      
      Based on an initial patch by Ursula Braun.
      
      Fixes: b4d72c08 ("qeth: bridgeport support - basic control")
      Signed-off-by: default avatarJulian Wiedmann <jwi@linux.vnet.ibm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4c814829
    • Ursula Braun's avatar
      s390/qeth: handle sysfs error during initialization · 03effc4a
      Ursula Braun authored
      
      [ Upstream commit 9111e788 ]
      
      When setting up the device from within the layer discipline's
      probe routine, creating the layer-specific sysfs attributes can fail.
      Report this error back to the caller, and handle it by
      releasing the layer discipline.
      Signed-off-by: default avatarUrsula Braun <ubraun@linux.vnet.ibm.com>
      [jwi: updated commit msg, moved an OSN change to a subsequent patch]
      Signed-off-by: default avatarJulian Wiedmann <jwi@linux.vnet.ibm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      03effc4a
    • Eric Dumazet's avatar
      dccp/tcp: do not inherit mc_list from parent · 4bb305d0
      Eric Dumazet authored
      
      [ Upstream commit 657831ff ]
      
      syzkaller found a way to trigger double frees from ip_mc_drop_socket()
      
      It turns out that leave a copy of parent mc_list at accept() time,
      which is very bad.
      
      Very similar to commit 8b485ce6 ("tcp: do not inherit
      fastopen_req from parent")
      
      Initial report from Pray3r, completed by Andrey one.
      Thanks a lot to them !
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarPray3r <pray3r.z@gmail.com>
      Reported-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Tested-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4bb305d0
    • Eric Dumazet's avatar
      netem: fix skb_orphan_partial() · 3aad7706
      Eric Dumazet authored
      commit f6ba8d33 upstream.
      
      I should have known that lowering skb->truesize was dangerous :/
      
      In case packets are not leaving the host via a standard Ethernet device,
      but looped back to local sockets, bad things can happen, as reported
      by Michael Madsen ( https://bugzilla.kernel.org/show_bug.cgi?id=195713 )
      
      So instead of tweaking skb->truesize, lets change skb->destructor
      and keep a reference on the owner socket via its sk_refcnt.
      
      Fixes: f2f872f9 ("netem: Introduce skb_orphan_partial() helper")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarMichael Madsen <mkm@nabto.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3aad7706
    • Greg Kroah-Hartman's avatar
      Revert "stackprotector: Increase the per-task stack canary's random range from... · 2bc281eb
      Greg Kroah-Hartman authored
      Revert "stackprotector: Increase the per-task stack canary's random range from 32 bits to 64 bits on 64-bit platforms"
      
      This reverts commit 609a3e81 which is
      commit 5ea30e4e upstream.
      
      It shouldn't have been backported to 3.18, as we do not have
      get_random_long() in that kernel tree.
      Reported-by: default avatarPhilip Müller <philm@manjaro.org>
      Cc: Daniel Micay <danielmicay@gmail.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Arjan van Ven <arjan@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: kernel-hardening@lists.openwall.com
      Cc: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      2bc281eb
  2. 25 May, 2017 15 commits
    • Greg Kroah-Hartman's avatar
      Linux 3.18.55 · 6b65a8f6
      Greg Kroah-Hartman authored
      6b65a8f6
    • Maksim Salau's avatar
      usb: misc: legousbtower: Fix memory leak · 5d5ea138
      Maksim Salau authored
      commit 0bd193d6 upstream.
      
      get_version_reply is not freed if function returns with success.
      
      Fixes: 942a4873 ("usb: misc: legousbtower: Fix buffers on stack")
      Reported-by: default avatarHeikki Krogerus <heikki.krogerus@linux.intel.com>
      Signed-off-by: default avatarMaksim Salau <maksim.salau@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5d5ea138
    • Julius Werner's avatar
      drivers: char: mem: Check for address space wraparound with mmap() · ea3c00fd
      Julius Werner authored
      commit b299cde2 upstream.
      
      /dev/mem currently allows mmap() mappings that wrap around the end of
      the physical address space, which should probably be illegal. It
      circumvents the existing STRICT_DEVMEM permission check because the loop
      immediately terminates (as the start address is already higher than the
      end address). On the x86_64 architecture it will then cause a panic
      (from the BUG(start >= end) in arch/x86/mm/pat.c:reserve_memtype()).
      
      This patch adds an explicit check to make sure offset + size will not
      wrap around in the physical address type.
      Signed-off-by: default avatarJulius Werner <jwerner@chromium.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ea3c00fd
    • Lukas Wunner's avatar
      PCI: Freeze PME scan before suspending devices · cc932a80
      Lukas Wunner authored
      commit ea00353f upstream.
      
      Laurent Pinchart reported that the Renesas R-Car H2 Lager board (r8a7790)
      crashes during suspend tests.  Geert Uytterhoeven managed to reproduce the
      issue on an M2-W Koelsch board (r8a7791):
      
        It occurs when the PME scan runs, once per second.  During PME scan, the
        PCI host bridge (rcar-pci) registers are accessed while its module clock
        has already been disabled, leading to the crash.
      
      One reproducer is to configure s2ram to use "s2idle" instead of "deep"
      suspend:
      
        # echo 0 > /sys/module/printk/parameters/console_suspend
        # echo s2idle > /sys/power/mem_sleep
        # echo mem > /sys/power/state
      
      Another reproducer is to write either "platform" or "processors" to
      /sys/power/pm_test.  It does not (or is less likely) to happen during full
      system suspend ("core" or "none") because system suspend also disables
      timers, and thus the workqueue handling PME scans no longer runs.  Geert
      believes the issue may still happen in the small window between disabling
      module clocks and disabling timers:
      
        # echo 0 > /sys/module/printk/parameters/console_suspend
        # echo platform > /sys/power/pm_test    # Or "processors"
        # echo mem > /sys/power/state
      
      (Make sure CONFIG_PCI_RCAR_GEN2 and CONFIG_USB_OHCI_HCD_PCI are enabled.)
      
      Rafael Wysocki agrees that PME scans should be suspended before the host
      bridge registers become inaccessible.  To that end, queue the task on a
      workqueue that gets frozen before devices suspend.
      
      Rafael notes however that as a result, some wakeup events may be missed if
      they are delivered via PME from a device without working IRQ (which hence
      must be polled) and occur after the workqueue has been frozen.  If that
      turns out to be an issue in practice, it may be possible to solve it by
      calling pci_pme_list_scan() once directly from one of the host bridge's
      pm_ops callbacks.
      
      Stacktrace for posterity:
      
        PM: Syncing filesystems ... [   38.566237] done.
        PM: Preparing system for sleep (mem)
        Freezing user space processes ... [   38.579813] (elapsed 0.001 seconds) done.
        Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
        PM: Suspending system (mem)
        PM: suspend of devices complete after 152.456 msecs
        PM: late suspend of devices complete after 2.809 msecs
        PM: noirq suspend of devices complete after 29.863 msecs
        suspend debug: Waiting for 5 second(s).
        Unhandled fault: asynchronous external abort (0x1211) at 0x00000000
        pgd = c0003000
        [00000000] *pgd=80000040004003, *pmd=00000000
        Internal error: : 1211 [#1] SMP ARM
        Modules linked in:
        CPU: 1 PID: 20 Comm: kworker/1:1 Not tainted
        4.9.0-rc1-koelsch-00011-g68db9bc8 #3383
        Hardware name: Generic R8A7791 (Flattened Device Tree)
        Workqueue: events pci_pme_list_scan
        task: eb56e140 task.stack: eb58e000
        PC is at pci_generic_config_read+0x64/0x6c
        LR is at rcar_pci_cfg_base+0x64/0x84
        pc : [<c041d7b4>]    lr : [<c04309a0>]    psr: 600d0093
        sp : eb58fe98  ip : c041d750  fp : 00000008
        r10: c0e2283c  r9 : 00000000  r8 : 600d0013
        r7 : 00000008  r6 : eb58fed6  r5 : 00000002  r4 : eb58feb4
        r3 : 00000000  r2 : 00000044  r1 : 00000008  r0 : 00000000
        Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
        Control: 30c5387d  Table: 6a9f6c80  DAC: 55555555
        Process kworker/1:1 (pid: 20, stack limit = 0xeb58e210)
        Stack: (0xeb58fe98 to 0xeb590000)
        fe80:                                                       00000002 00000044
        fea0: eb6f5800 c041d9b0 eb58feb4 00000008 00000044 00000000 eb78a000 eb78a000
        fec0: 00000044 00000000 eb9aff00 c0424bf0 eb78a000 00000000 eb78a000 c0e22830
        fee0: ea8a6fc0 c0424c5c eaae79c0 c0424ce0 eb55f380 c0e22838 eb9a9800 c0235fbc
        ff00: eb55f380 c0e22838 eb55f380 eb9a9800 eb9a9800 eb58e000 eb9a9824 c0e02100
        ff20: eb55f398 c02366c4 eb56e140 eb5631c0 00000000 eb55f380 c023641c 00000000
        ff40: 00000000 00000000 00000000 c023a928 cd105598 00000000 40506a34 eb55f380
        ff60: 00000000 00000000 dead4ead ffffffff ffffffff eb58ff74 eb58ff74 00000000
        ff80: 00000000 dead4ead ffffffff ffffffff eb58ff90 eb58ff90 eb58ffac eb5631c0
        ffa0: c023a844 00000000 00000000 c0206d68 00000000 00000000 00000000 00000000
        ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
        ffe0: 00000000 00000000 00000000 00000000 00000013 00000000 3a81336c 10ccd1dd
        [<c041d7b4>] (pci_generic_config_read) from [<c041d9b0>]
        (pci_bus_read_config_word+0x58/0x80)
        [<c041d9b0>] (pci_bus_read_config_word) from [<c0424bf0>]
        (pci_check_pme_status+0x34/0x78)
        [<c0424bf0>] (pci_check_pme_status) from [<c0424c5c>] (pci_pme_wakeup+0x28/0x54)
        [<c0424c5c>] (pci_pme_wakeup) from [<c0424ce0>] (pci_pme_list_scan+0x58/0xb4)
        [<c0424ce0>] (pci_pme_list_scan) from [<c0235fbc>]
        (process_one_work+0x1bc/0x308)
        [<c0235fbc>] (process_one_work) from [<c02366c4>] (worker_thread+0x2a8/0x3e0)
        [<c02366c4>] (worker_thread) from [<c023a928>] (kthread+0xe4/0xfc)
        [<c023a928>] (kthread) from [<c0206d68>] (ret_from_fork+0x14/0x2c)
        Code: ea000000 e5903000 f57ff04f e3a00000 (e5843000)
        ---[ end trace 667d43ba3aa9e589 ]---
      
      Fixes: df17e62e ("PCI: Add support for polling PME state on suspended legacy PCI devices")
      Reported-and-tested-by: default avatarLaurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
      Reported-and-tested-by: default avatarGeert Uytterhoeven <geert+renesas@glider.be>
      Signed-off-by: default avatarLukas Wunner <lukas@wunner.de>
      Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      Reviewed-by: default avatarLaurent Pinchart <laurent.pinchart@ideasonboard.com>
      Acked-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
      Cc: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
      Cc: Simon Horman <horms+renesas@verge.net.au>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cc932a80
    • David Woodhouse's avatar
      PCI: Fix pci_mmap_fits() for HAVE_PCI_RESOURCE_TO_USER platforms · fb0aa10a
      David Woodhouse authored
      commit 6bccc7f4 upstream.
      
      In the PCI_MMAP_PROCFS case when the address being passed by the user is a
      'user visible' resource address based on the bus window, and not the actual
      contents of the resource, that's what we need to be checking it against.
      Signed-off-by: default avatarDavid Woodhouse <dwmw@amazon.co.uk>
      Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fb0aa10a
    • Thomas Gleixner's avatar
      tracing/kprobes: Enforce kprobes teardown after testing · 9713a7c4
      Thomas Gleixner authored
      commit 30e7d894 upstream.
      
      Enabling the tracer selftest triggers occasionally the warning in
      text_poke(), which warns when the to be modified page is not marked
      reserved.
      
      The reason is that the tracer selftest installs kprobes on functions marked
      __init for testing. These probes are removed after the tests, but that
      removal schedules the delayed kprobes_optimizer work, which will do the
      actual text poke. If the work is executed after the init text is freed,
      then the warning triggers. The bug can be reproduced reliably when the work
      delay is increased.
      
      Flush the optimizer work and wait for the optimizing/unoptimizing lists to
      become empty before returning from the kprobes tracer selftest. That
      ensures that all operations which were queued due to the probes removal
      have completed.
      
      Link: http://lkml.kernel.org/r/20170516094802.76a468bb@gandalf.local.homeSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Acked-by: default avatarMasami Hiramatsu <mhiramat@kernel.org>
      Fixes: 6274de49 ("kprobes: Support delayed unoptimizing")
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9713a7c4
    • Al Viro's avatar
      osf_wait4(): fix infoleak · 5e0fb40f
      Al Viro authored
      commit a8c39544 upstream.
      
      failing sys_wait4() won't fill struct rusage...
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5e0fb40f
    • Johan Hovold's avatar
      uwb: fix device quirk on big-endian hosts · 02a8cea4
      Johan Hovold authored
      commit 41318a2b upstream.
      
      Add missing endianness conversion when using the USB device-descriptor
      idProduct field to apply a hardware quirk.
      
      Fixes: 1ba47da5 ("uwb: add the i1480 DFU driver")
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      02a8cea4
    • Daniel Micay's avatar
      stackprotector: Increase the per-task stack canary's random range from 32 bits... · 609a3e81
      Daniel Micay authored
      stackprotector: Increase the per-task stack canary's random range from 32 bits to 64 bits on 64-bit platforms
      
      commit 5ea30e4e upstream.
      
      The stack canary is an 'unsigned long' and should be fully initialized to
      random data rather than only 32 bits of random data.
      Signed-off-by: default avatarDaniel Micay <danielmicay@gmail.com>
      Acked-by: default avatarArjan van de Ven <arjan@linux.intel.com>
      Acked-by: default avatarRik van Riel <riel@redhat.com>
      Acked-by: default avatarKees Cook <keescook@chromium.org>
      Cc: Arjan van Ven <arjan@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: kernel-hardening@lists.openwall.com
      Link: http://lkml.kernel.org/r/20170504133209.3053-1-danielmicay@gmail.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      609a3e81
    • James Hogan's avatar
      metag/uaccess: Check access_ok in strncpy_from_user · 1897f50c
      James Hogan authored
      commit 3a158a62 upstream.
      
      The metag implementation of strncpy_from_user() doesn't validate the src
      pointer, which could allow reading of arbitrary kernel memory. Add a
      short access_ok() check to prevent that.
      
      Its still possible for it to read across the user/kernel boundary, but
      it will invariably reach a NUL character after only 9 bytes, leaking
      only a static kernel address being loaded into D0Re0 at the beginning of
      __start, which is acceptable for the immediate fix.
      Reported-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarJames Hogan <james.hogan@imgtec.com>
      Cc: linux-metag@vger.kernel.org
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1897f50c
    • James Hogan's avatar
      metag/uaccess: Fix access_ok() · 1ee04c4b
      James Hogan authored
      commit 8a8b5663 upstream.
      
      The __user_bad() macro used by access_ok() has a few corner cases
      noticed by Al Viro where it doesn't behave correctly:
      
       - The kernel range check has off by 1 errors which permit access to the
         first and last byte of the kernel mapped range.
      
       - The kernel range check ends at LINCORE_BASE rather than
         META_MEMORY_LIMIT, which is ineffective when the kernel is in global
         space (an extremely uncommon configuration).
      
      There are a couple of other shortcomings here too:
      
       - Access to the whole of the other address space is permitted (i.e. the
         global half of the address space when the kernel is in local space).
         This isn't ideal as it could theoretically still contain privileged
         mappings set up by the bootloader.
      
       - The size argument is unused, permitting user copies which start on
         valid pages at the end of the user address range and cross the
         boundary into the kernel address space (e.g. addr = 0x3ffffff0, size
         > 0x10).
      
      It isn't very convenient to add size checks when disallowing certain
      regions, and it seems far safer to be sure and explicit about what
      userland is able to access, so invert the logic to allow certain regions
      instead, and fix the off by 1 errors and missing size checks. This also
      allows the get_fs() == KERNEL_DS check to be more easily optimised into
      the user address range case.
      
      We now have 3 such allowed regions:
      
       - The user address range (incorporating the get_fs() == KERNEL_DS
         check).
      
       - NULL (some kernel code expects this to work, and we'll always catch
         the fault anyway).
      
       - The core code memory region.
      
      Fixes: 373cd784 ("metag: Memory handling")
      Reported-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarJames Hogan <james.hogan@imgtec.com>
      Cc: linux-metag@vger.kernel.org
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1ee04c4b
    • Keno Fischer's avatar
      mm/huge_memory.c: respect FOLL_FORCE/FOLL_COW for thp · 6afa4514
      Keno Fischer authored
      commit 8310d48b upstream.
      
      In commit 19be0eaf ("mm: remove gup_flags FOLL_WRITE games from
      __get_user_pages()"), the mm code was changed from unsetting FOLL_WRITE
      after a COW was resolved to setting the (newly introduced) FOLL_COW
      instead.  Simultaneously, the check in gup.c was updated to still allow
      writes with FOLL_FORCE set if FOLL_COW had also been set.
      
      However, a similar check in huge_memory.c was forgotten.  As a result,
      remote memory writes to ro regions of memory backed by transparent huge
      pages cause an infinite loop in the kernel (handle_mm_fault sets
      FOLL_COW and returns 0 causing a retry, but follow_trans_huge_pmd bails
      out immidiately because `(flags & FOLL_WRITE) && !pmd_write(*pmd)` is
      true.
      
      While in this state the process is stil SIGKILLable, but little else
      works (e.g.  no ptrace attach, no other signals).  This is easily
      reproduced with the following code (assuming thp are set to always):
      
          #include <assert.h>
          #include <fcntl.h>
          #include <stdint.h>
          #include <stdio.h>
          #include <string.h>
          #include <sys/mman.h>
          #include <sys/stat.h>
          #include <sys/types.h>
          #include <sys/wait.h>
          #include <unistd.h>
      
          #define TEST_SIZE 5 * 1024 * 1024
      
          int main(void) {
            int status;
            pid_t child;
            int fd = open("/proc/self/mem", O_RDWR);
            void *addr = mmap(NULL, TEST_SIZE, PROT_READ,
                              MAP_ANONYMOUS | MAP_PRIVATE, 0, 0);
            assert(addr != MAP_FAILED);
            pid_t parent_pid = getpid();
            if ((child = fork()) == 0) {
              void *addr2 = mmap(NULL, TEST_SIZE, PROT_READ | PROT_WRITE,
                                 MAP_ANONYMOUS | MAP_PRIVATE, 0, 0);
              assert(addr2 != MAP_FAILED);
              memset(addr2, 'a', TEST_SIZE);
              pwrite(fd, addr2, TEST_SIZE, (uintptr_t)addr);
              return 0;
            }
            assert(child == waitpid(child, &status, 0));
            assert(WIFEXITED(status) && WEXITSTATUS(status) == 0);
            return 0;
          }
      
      Fix this by updating follow_trans_huge_pmd in huge_memory.c analogously
      to the update in gup.c in the original commit.  The same pattern exists
      in follow_devmap_pmd.  However, we should not be able to reach that
      check with FOLL_COW set, so add WARN_ONCE to make sure we notice if we
      ever do.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Link: http://lkml.kernel.org/r/20170106015025.GA38411@juliacomputing.comSigned-off-by: default avatarKeno Fischer <keno@juliacomputing.com>
      Acked-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Willy Tarreau <w@1wt.eu>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      [AmitP: Minor refactoring of upstream changes for linux-3.18.y,
              where follow_devmap_pmd() doesn't exist.]
      Signed-off-by: default avatarAmit Pundir <amit.pundir@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6afa4514
    • Takashi Iwai's avatar
      xc2028: Fix use-after-free bug properly · 9c9b24b2
      Takashi Iwai authored
      commit 22a1e778 upstream.
      
      The commit 8dfbcc43 ("[media] xc2028: avoid use after free") tried
      to address the reported use-after-free by clearing the reference.
      
      However, it's clearing the wrong pointer; it sets NULL to
      priv->ctrl.fname, but it's anyway overwritten by the next line
      memcpy(&priv->ctrl, p, sizeof(priv->ctrl)).
      
      OTOH, the actual code accessing the freed string is the strcmp() call
      with priv->fname:
      	if (!firmware_name[0] && p->fname &&
      	    priv->fname && strcmp(p->fname, priv->fname))
      		free_firmware(priv);
      
      where priv->fname points to the previous file name, and this was
      already freed by kfree().
      
      For fixing the bug properly, this patch does the following:
      
      - Keep the copy of firmware file name in only priv->fname,
        priv->ctrl.fname isn't changed;
      - The allocation is done only when the firmware gets loaded;
      - The kfree() is called in free_firmware() commonly
      
      Fixes: commit 8dfbcc43 ('[media] xc2028: avoid use after free')
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab@s-opensource.com>
      Signed-off-by: default avatarAmit Pundir <amit.pundir@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9c9b24b2
    • Kristina Martsenko's avatar
      arm64: documentation: document tagged pointer stack constraints · dd216cef
      Kristina Martsenko authored
      commit f0e421b1 upstream.
      
      Some kernel features don't currently work if a task puts a non-zero
      address tag in its stack pointer, frame pointer, or frame record entries
      (FP, LR).
      
      For example, with a tagged stack pointer, the kernel can't deliver
      signals to the process, and the task is killed instead. As another
      example, with a tagged frame pointer or frame records, perf fails to
      generate call graphs or resolve symbols.
      
      For now, just document these limitations, instead of finding and fixing
      everything that doesn't work, as it's not known if anyone needs to use
      tags in these places anyway.
      
      In addition, as requested by Dave Martin, generalize the limitations
      into a general kernel address tag policy, and refactor
      tagged-pointers.txt to include it.
      
      Fixes: d50240a5 ("arm64: mm: permit use of tagged pointers at EL0")
      Reviewed-by: default avatarDave Martin <Dave.Martin@arm.com>
      Acked-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarKristina Martsenko <kristina.martsenko@arm.com>
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      dd216cef
    • Mark Rutland's avatar
      arm64: uaccess: ensure extension of access_ok() addr · 033ed44a
      Mark Rutland authored
      commit a06040d7 upstream.
      
      Our access_ok() simply hands its arguments over to __range_ok(), which
      implicitly assummes that the addr parameter is 64 bits wide. This isn't
      necessarily true for compat code, which might pass down a 32-bit address
      parameter.
      
      In these cases, we don't have a guarantee that the address has been zero
      extended to 64 bits, and the upper bits of the register may contain
      unknown values, potentially resulting in a suprious failure.
      
      Avoid this by explicitly casting the addr parameter to an unsigned long
      (as is done on other architectures), ensuring that the parameter is
      widened appropriately.
      
      Fixes: 0aea86a2 ("arm64: User access library functions")
      Acked-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      033ed44a